Transform Meta Info
|Display Name||To Files (Office) [using Search Engine]|
|Short Description||This transform will search for the given phrase and show documents (Office[tm]) containing the term.|
|Data Source||Bing API|
|Author||Roelof Temmingh (email@example.com)|
General to all Search engine transforms
There are a couple of transforms that use search engines - all of them very similar. The basic recipe for these transforms is as follows:
- Expand the question. The question is the input from the GUI - be that a person's name, a domain or a phone number. When looking at a person's name, for instance, the name 'Kosie Kramer' will be expanded to searches like '"Kosie Kramer"', '"K Kramer"', 'Kramer Kosie' etc. In the case of a telephone number the search will be expanded to include most telephone notations used.
- Assign confidence levels. A search for '"Kosie Kramer"' is more likely to return good results than a search for 'KramerK', so the confidence level for the first search is higher. The confidence levels are also used to assign preference to certain file types when searching for documents (these are configurable in the transform). For example, an XLS file containing the word is likely more interesting than a PDF file.
- Perform each search. The searches are performed and the snippets are obtained. It is important to note that only snippets are parsed. To parse entire pages you need to output the results as URLs and process those URLs separately. Snippet length varies between search engines.
- Parse for output entities. Depending on what output is required the snippets are parsed for entities - in some cases the web site's name is all that's required.
- Calculate weight. The weight is calculated from various factors - the confidence of the search, the frequency of the result, the importance of the web site where the result came from, and in some cases a correlation to the input.
- Normalise. The weights are now normalised using a fairly interesting algorithm that involves the mean and standard deviation of the spread of weights. It is important to understand that a search result with an equal spread of weights is mostly useless.
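The expansion, confidence and normalisation steps above can be sketched roughly as follows. This is only an illustration, not the transform's actual code: the function names `expand_person_name` and `normalise` and the confidence values are hypothetical, and a plain z-score stands in for the undocumented normalisation algorithm, which is only known to involve the mean and standard deviation.

```python
from statistics import mean, pstdev


def expand_person_name(name):
    """Return (query, confidence) pairs for a two-part name.

    The confidence values are made up for illustration; the real
    transform's values are not documented here.
    """
    first, last = name.split()
    return [
        (f'"{first} {last}"', 1.0),     # exact quoted phrase: most trusted
        (f'"{first[0]} {last}"', 0.6),  # initial plus surname
        (f"{last} {first}", 0.4),       # reversed, unquoted
        (f"{last}{first[0]}", 0.1),     # username-style guess: least trusted
    ]


def normalise(weights):
    """Z-score the weights (an assumed stand-in for the real algorithm).

    When all weights are equal the standard deviation is zero, so every
    result gets 0.0: nothing stands out from the rest.
    """
    mu = mean(weights)
    sigma = pstdev(weights)
    if sigma == 0:
        return [0.0 for _ in weights]
    return [(w - mu) / sigma for w in weights]
```

Under these assumptions, `normalise([5, 5, 5])` gives all zeros, which matches the point that an equal spread of weights tells you very little, while a single heavy result among many light ones gets a clearly positive score.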
General notes when using search engine transforms
Maltego will sometimes give you results that seem plain wrong. You need to keep in mind that the application will get pretty desperate when it does not get results. So - when you are searching for a person called "Vaxynutus Grabounill" and that person simply left no marks on the Internet Maltego will eventually go after a search term "VG" - with a super low confidence - but you will still get some results. These results could seem completely off the mark, but should have very low weights. Always look at the weights.
Problems with parsing results
Some entities are hard to parse. Telephone numbers are notoriously hard to parse. There is always a trade-off between missing numbers and parsing non-telephone numbers as phone numbers. With the current transform we hope to have reached the optimal balance.
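To make the trade-off concrete, here is a small sketch comparing a strict and a loose pattern on the same snippet. Both regular expressions and the sample snippet are invented for illustration and are not the patterns the transform uses.

```python
import re

# Strict (hypothetical): requires "+", a country code, then 2 to 4 digit groups.
STRICT = re.compile(r"\+\d{1,3}(?:[ -]\d{2,4}){2,4}")

# Loose (hypothetical): any run of 7 to 15 digits with optional separators.
LOOSE = re.compile(r"\+?\d(?:[ -]?\d){6,14}")

snippet = "Call +27 12 345 6789 or 012 345 6789. Order ref 2008 1234 5678."

strict_hits = STRICT.findall(snippet)
loose_hits = LOOSE.findall(snippet)

print(strict_hits)  # ['+27 12 345 6789']
print(loose_hits)   # ['+27 12 345 6789', '012 345 6789', '2008 1234 5678']
```

The strict pattern misses the number written in local notation, while the loose pattern finds it but also reports the order reference as a phone number.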