Transform Meta Info

Display Name: Mirror: Email addresses found
Transform Name: WebsiteToEmailAddress_Mirror
Short Description: This transform uses Gary's Ruby website mirror to spider the site and extract email addresses
Data Source: Paterva/Host
Owner: Paterva
Author: Roelof Temmingh (roelof@paterva.com), Gary Oleary-Steele (garyo@sec-1.com)
Input: Website
Output: EmailAddress

Description

This transform makes a (partial) mirror of the website and extracts all email addresses found on it. The slider plays a big role in this transform because it sets the time-out for the mirroring process: the higher (further to the right) the slider is set, the deeper the mirroring process goes and, hopefully, the more results you get. The process runs via a caching server that is local to the box, so running the transform again does not transfer the same data from the site twice, except of course where the first round did not manage to fetch the entire site. Also keep in mind that not all sites are mirror friendly. Flash-based sites will cause problems, as will sites with exotic JavaScript menus and redirects. Email addresses that are obfuscated using non-standard techniques will also not be picked up.
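
To make the role of the time-out concrete, here is a minimal Python sketch of a time-bounded mirror that collects email addresses as it goes. It is not the transform's actual Ruby implementation: the function name `mirror_and_extract`, the two regular expressions, and the in-memory `SITE` dictionary (standing in for real HTTP fetches) are all assumptions made for illustration.

```python
import re
import time
from collections import deque

# Simple pattern for plainly written addresses; obfuscated ones are missed.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Naive link extraction; menus built by JavaScript or Flash are not seen.
HREF_RE = re.compile(r'href="([^"]+)"')


def mirror_and_extract(fetch, start, timeout_s):
    """Breadth-first crawl from `start` until the time budget runs out."""
    deadline = time.monotonic() + timeout_s
    seen = {start}
    queue = deque([start])
    emails = set()
    while queue and time.monotonic() < deadline:
        path = queue.popleft()
        html = fetch(path)
        if html is None:  # page missing or not fetchable
            continue
        emails.update(EMAIL_RE.findall(html))
        for link in HREF_RE.findall(html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return emails


# Hypothetical in-memory site used instead of real network access.
SITE = {
    "/": '<a href="/contact">Contact</a> <a href="/about">About</a>',
    "/contact": "Write to info@example.com or sales@example.com",
    "/about": 'Webmaster: admin at example dot com <a href="/">Home</a>',
}

found = mirror_and_extract(SITE.get, "/", timeout_s=5.0)
print(sorted(found))
```

A larger `timeout_s` lets the loop reach pages deeper in the link graph, which mirrors the effect of moving the slider to the right. The obfuscated address on the `/about` page is not matched by the pattern, in line with the limitation noted above.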

Typical Use Case

URL --> Website ==> Email Addresses

Example

Starting with the URL for our contact page, we get a website entity from the URL. From the website entity we can run the mirror transform to extract all the email addresses found on the site.
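
The first step of this example, going from a URL to a website, can be sketched in Python as follows. This only illustrates the idea of reducing a URL to its host name; the helper name `url_to_website` and the example URL are hypothetical and are not part of the transform.

```python
from urllib.parse import urlsplit


def url_to_website(url):
    """Return the host name of a URL, i.e. the value of a website entity."""
    return urlsplit(url).hostname


# Hypothetical contact-page URL used only for illustration.
website = url_to_website("http://www.example.com/contact.html")
print(website)  # → www.example.com
```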