Machines (Transform Macros)

Modified on: Fri, 9 Jul, 2021 at 1:40 PM

Overview

The Maltego scripting language allows you to automate the running of Transforms in Maltego. Machines run on Entities just like Transforms. The input Entity of the Machine will be the same as the input Entity of the first Transform of the Machine.

Every Machine starts with a header that describes the Machine:

machine("maltego.test", 
    displayName:"Testing it", 
    author:"Paul Richards",
    description: "This is simply a test machine") {
    start {
        //machine body goes here
    }
}

Concepts of a Machine:

Pipeline – a set of Transforms and filters that are executed in sequence. Think of it as a macro.
Trigger – a graph condition and a Transform. Think: when this happens on the graph, run this Transform.
Feeder – a mechanism to feed Entities into Maltego.
Machine – a combination of pipelines, triggers and feeders.

Heads up! At the time of writing, feeders and triggers are not implemented. As such these are not covered at the moment, but may be in the future.

Pipelines

The basics

A pipeline is a sequence of filters and Transforms. Let's look at the following example:

//simple machine
machine("maltego.test", 
    displayName:"Testing it", 
    author:"Roelof Temmingh",
    description: "This is simply a test machine") {

    start{
        run("paterva.v2.DomainToMXrecord_DNS")
        run("paterva.v2.DNSNameToIPAddress_DNS")
        run("paterva.v2.IPAddressToNetblock_Cuts")
    }
}

The example above shows that ‘//’ starts a comment. Each Machine also has a unique name, a display name, an author, and a description. Maltego uses these to describe the Machine and displays them as follows:

In the example, there are three Transforms that are executed – top to bottom. The first one takes a domain name as input and generates an MX record. At that stage in the pipeline, we have an MX record, so the next Transform takes the MX record to an IP address which in turn is Transformed into a netblock. The resultant graph looks like this (we used ‘paterva.com’ as input):

The pipeline looks like this (always left to right):

It’s important to know that the pipeline does not accumulate Entity types. It only contains what the previous step has generated. This is also true for filters which we’ll get to later.
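
As a minimal sketch of this behavior, the three-step pipeline from above can be annotated with what the pipe holds at each stage. The comments restate the rule in this section; the log command used at the end is described later in this guide, and the snippet has not been run against a Maltego installation.

```
start{
    run("paterva.v2.DomainToMXrecord_DNS")
    //the pipe now holds only MX records; the input domain is no longer in it
    run("paterva.v2.DNSNameToIPAddress_DNS")
    //the pipe now holds only IP addresses; the MX records are no longer in it
    log("end of pipe",showEntities:true)
}
```

If the rule holds as described, the final log line should list IP addresses only, even though the graph itself contains the domain, the MX records, and the IP addresses.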

Parallel paths

To run Transforms in parallel, use a ‘paths’ block. Again, let’s look at an example:

start{
    paths{
        run("paterva.v2.DomainToMXrecord_DNS")
        run("paterva.v2.DomainToNSrecord_DNS")
        run("paterva.v2.DomainToDNSName_DNSBrute")
    }
    //now resolve these to IP addresses
    run("paterva.v2.DNSNameToIPAddress_DNS")
}

The pipeline for this looks as follows:

The resultant graph (running on ‘facebook.com’) looks like this:

Note that there are MX, NS, and DNS Name records on the graph.

A single Transform can resolve all of them because the MX record Entity and the NS record Entity inherit from the DNSName Entity, so all three types can be resolved to an IP address with one Transform.

Whenever you want to split a pipe into two or more sections the syntax is as follows:

paths{
  Parallel-1
  Parallel-2
  ...
  Parallel-N
}

Commands inside the curly brackets will be executed in parallel.

Serial Paths

Now consider how to build a pipeline shaped like this:

Here each branch is a sequence of Transforms that forms its own pipeline, and the branches have to be executed in parallel. In the Maltego scripting language this can be achieved as follows:

paths {
  path {
    run("paterva.v2.DomainToMXrecord_DNS")
    run("paterva.v2.MXrecordToDomain_SharedMX")
  }
  path {
    run("paterva.v2.DomainToNSrecord_DNS")
    run("paterva.v2.NSrecordToDomain_SharedNS")
  }
}

In this example, each ‘sub pipeline’ sits in its own ‘path’ clause. The Transforms within a path run in sequence, while the paths themselves run in parallel.

The resultant graph when running on ‘logica.com’ looks like this:

From this graph, it becomes apparent that some form of user filtering is needed before we proceed with the next steps. In this example, the NS records ns1.logica.com and ns2.logica.com yield good info, but Logica also uses Outlook as an MX, and looking at shared domains is not particularly useful. We cover filters in subsequent sections – keep reading!

Transform settings, slider values

Transforms can optionally be passed extra information. Slider value (how many results to return), as well as Transform settings, can be specified:

run("transformname",slider:N)
//runs the transform and returns at most N results

Let’s look at an example:

run("paterva.v2.MXrecordToDomain_SharedMX",slider:255)

Transform settings can be specified as follows:

run("transform", "setting_name":"value")
//will run transform with certain transform settings

As an example:

run("paterva.v2.DomainToDocument_SE","engine":"google",slider:50)

The setting’s actual name (i.e. not the display name) can be found in the Transform Manager:

Filters

The basics

There are two types of filters – user filters and pipeline filters. A user filter will give the user the chance to interact with the information in the pipe, effectively stopping the flow until the user has decided what to keep in the pipe and what to delete. A pipeline filter (or just filter) will filter the information based on certain parameters and requires no user interaction.
To stop the flow and send the output to the user for manual filtering, simply insert the command ‘userFilter()’. Let’s use it in an example. We’ll use the same pipeline as before, but this time give the user the option of editing the MX and NS records before passing them along to the sharedMX/NS Transforms. The script thus looks like this:

start{
  paths {
    path {
      run("paterva.v2.DomainToMXrecord_DNS")
      userFilter()
      run("paterva.v2.MXrecordToDomain_SharedMX")
    }
    path {
      run("paterva.v2.DomainToNSrecord_DNS")
      userFilter()
      run("paterva.v2.NSrecordToDomain_SharedNS")
    }
  }
}

When running this inside of Maltego it looks as follows:

In this case, we’re telling Maltego that we should not pass the logica-com.mail.protection.outlook.com Entity onto the next Transform. Once this user filter has been completed (the user clicked on Next), the pipeline flows again.

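User filter options

The dialog shown by the user filter can be customized by passing options to ‘userFilter()’, for example:

userFilter(title:"Remove hosted NSes",heading:"NS records",description:"Please remove the NS records that are hosted. We will see what's shared on the selected ones.",proceedButtonText:"Next>")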

The following options are available:

title (string) – the title of the dialog
heading (string) – the heading of the first column
description (string) – the description text shown in the dialog
proceedButtonText (string) – the text on the proceed button
icon (string) – the name of an icon to display
removePromptText (string) – the text on the "remove unselected entities" checkbox
removePromptChecked (boolean) – the default value of said checkbox
showIncomingLinks (boolean) – display the incoming links column
showOutgoingLinks (boolean) – display the outgoing links column
selectEntities (boolean) – default selection state of entities

Consider the following screenshot:

The user filter that produces the dialog above is as follows:

userFilter(title:"TITLE",heading:"HEADING",description:"DESCRIBE",proceedButtonText:"PROCEED", removePromptText:"RPT", removePromptChecked:true, showIncomingLinks:true)

Remember that strings should be enclosed in quotes (") while Booleans are either true or false and are not enclosed in quotes.
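
To show the options in context, here is a hedged sketch of a user filter placed between two Transforms. It uses only option names from the list above; the title, heading, and description strings are made up for illustration, and the snippet has not been run against a Maltego installation.

```
start{
    run("paterva.v2.DomainToNSrecord_DNS")
    //pause here: nothing is pre-selected, and the outgoing links column is shown
    userFilter(title:"Choose NS records",heading:"NS records",description:"Select the NS records to expand.",proceedButtonText:"Next>",selectEntities:false,showOutgoingLinks:true)
    //only the entities the user kept continue to this Transform
    run("paterva.v2.NSrecordToDomain_SharedNS")
}
```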

Pipeline Filter

Pipeline filters stack: each filter line narrows the selection left by the line before it, making the overall filter more specific. Consider the following script:

start{
  run("paterva.v2.URLToPerson_NLP")

  //filter just person entities
  type("maltego.Person")
  run("paterva.v2.PersonToEmailAddress_SamePGP")
}

The pipeline for this looks as follows:

Note that the named Entity recognition (NER) Transform will return three types of Entities – Person, Location, and Phrase (which is the organization/company). To get the proper name for the Entity you can look in the detail view:

When we feed this Transform the BlackHat Europe 2010 speaker lineup (http://www.blackhat.com/html/bh-eu-10/bh-eu-10-speakerbios.html) the result is as follows:

If we needed to do something with the other types in the pipe (e.g. the phrases) the pipeline should look like this:

The code for this looks as follows:

start{
  run("paterva.v2.URLToPerson_NLP")

  paths{
    path{
      //filter just person names
      type("maltego.Person")
      run("paterva.v2.PersonToEmailAddress_SamePGP")
    }
    path {
      //filter on phrase
      type("maltego.Phrase")
      run("paterva.v2.PhraseToWebsite_SE")
    }
  }
}

The following can be used to filter in pipelines:

Type. Use type()
Number of incoming links. Use incoming()
Number of outgoing links. Use outgoing()
Total number of links. Use degree()
Value of the Entity. Use value()
Age of the Entity – when used with perpetual Machines. Use age()
Property of an Entity. Use property(propertyname,value)
Filter based on bookmark color. Use bookmarked()

When working with strings, for instance in a value or property filter, the following applies:

value("frikkie") 
//only matches "frikkie"

value(like: "frikkie", ignoreCase:true) 
//matches "frikkiesbeer", "Its me Frikkie" and "befrikkied"

property("ipaddress.internal", equalTo:true)
//matches all internal IPs

property("ipaddress.internal", like: "true", ignoreCase:true) 
//matches all internal IPs but uses a like query

When working with numbers – for incoming, outgoing, age etc. the following applies:

outgoing(2) 
//matches exactly 2 outgoing links

incoming(moreThan:0) 
//matches if the entity has any incoming links

outgoing(lessThan:5, moreThan:2) 
//matches if the entity has 3 or 4 outgoing links

age(moreThan:400) 
//matches nodes older than 400 seconds
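
The string and numeric forms above can be stacked inside a pipeline. The following is a minimal sketch assembled from commands that appear in this guide; the search term "mail" and the choice to bookmark the result are illustrative assumptions, and the snippet has not been run against a Maltego installation.

```
start{
    run("paterva.v2.DomainToDNSName_DNSBrute")
    //keep only DNS names whose value contains "mail", in any letter case
    value(like:"mail", ignoreCase:true)
    run("paterva.v2.DNSNameToIPAddress_DNS")
    //of the resulting IP addresses, keep those with more than one incoming link
    incoming(moreThan:1)
    //mark what is left in the pipe green (1)
    bookmark(1,overwrite:true)
}
```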

Making filters global

In many cases, you want to collect nodes from all over the graph and not just from the current location in the pipeline. To do this you can add the argument ‘scope:"global"’ to filter any Entity on the graph. Consider the following script:

//prune leaf nodes
start{
    incoming(1,scope:"global")
    outgoing(0)
    delete()
}

When you run this Machine on any node this script will simply prune leaf nodes on a graph. Leaf nodes have no outgoing links and just one incoming link – which is precisely what the filter combination does. Another way of writing the filter would be:

outgoing(0,scope:"global")
incoming(1)
delete()

Here we are simply reversing the order of the filters but it ends up being the same thing.

Keep in mind that a filter with global scope selects from the ENTIRE graph, not from what the previous filter left in the pipe. A second global filter therefore discards the result of the first one, so this would be wrong…

outgoing(0,scope:"global")
incoming(1,scope:"global")
delete()

…as the second filter line also contains the global scope and negates the first filter.
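
A correct chain puts the scope on the first filter only and lets the following lines refine that selection. The sketch below applies the same pattern to a different task; the threshold of 5 links and the ‘maltego.Domain’ Entity type are illustrative assumptions, and the snippet has not been run against a Maltego installation.

```
start{
    //select every entity on the graph with more than 5 outgoing links
    outgoing(moreThan:5,scope:"global")
    //no scope here, so this line narrows the selection made above
    type("maltego.Domain")
    //mark the remaining entities red (4)
    bookmark(4,overwrite:true)
}
```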

Actions

The following sections explain the actions that Machines can perform.

Deleting Nodes

You can instruct Maltego to delete nodes. This is very useful when using perpetual Machines (otherwise the graph grows out of control over time). In certain scenarios, you may want to delete not just the node but also its parents and/or children. The delete command always deletes the node it is running on. The command to delete nodes is delete().

delete() 
//deletes the current nodes in the pipeline

delete(parents:2) 
//deletes myself plus all my parents and grandparents, i.e. two levels up

delete(children:2)
//deletes myself plus all children and grandchildren, i.e. two levels down

deletebranch()
// delete the entire branch
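
Deleting is usually combined with a filter that decides which nodes go. As a minimal sketch, the following resolves DNS names and then removes the internal IP addresses again, using the property filter shown earlier in this guide. It assumes the Machine is run on a DNS name Entity and has not been run against a Maltego installation.

```
start{
    run("paterva.v2.DNSNameToIPAddress_DNS")
    //keep only the internal IP addresses in the pipe...
    property("ipaddress.internal", equalTo:true)
    //...and delete them from the graph
    delete()
}
```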

More Actions

bookmark()
//Bookmarks entities currently in the machine pipeline

bookmark(2,overwrite:true)
//Bookmarks entities yellow (2, see the list below) and overwrites existing bookmarks

The bookmark colors correspond to the following numbers:

-1: NONE
0: BLUE
1: GREEN
2: YELLOW
3: PURPLE
4: RED

clearBookmark()
//Clears bookmarks from entities that are currently in the machine pipeline

setLayout(type)
//Sets the layout of the graph.

Options for the graph layout types are:

Block
Organic
Circular
Hierarchical
Interactive Organic

For example:

setLayout("circular")

Debug, logging and status messages

To help debug Machines you can use the ‘log’ command. Whenever you call this command the Maltego GUI will tell you what’s in the pipe at that stage. Consider the following script:

start {
    log("this is my input",showEntities:true)
    run("paterva.v2.DomainToMXrecord_DNS")
    log("after toMX",showEntities:true)
    run("paterva.v2.DNSNameToIPAddress_DNS")
    log("after toIP",showEntities:true)
}

Inside of Maltego, you’ll see the following in the output screen:

The log command shows which entities are in the pipeline at the stage where it was called.

When called with showEntities:false the command acts as a scrolling status update in the text window.

To set the status at the top you may use the status command. This sets the top label on the Machine window. Keep in mind that when setting the status label in parallel paths only one of the labels will be visible, as the commands are executed in parallel and whichever label is set last will stick. Here is an example of the status command and the corresponding output:

start {
    status("Starting test machine")
    log("Getting MX records",showEntities:false)
    run("paterva.v2.DomainToMXrecord_DNS")
    log("Resolving to IP address",showEntities:false)
    run("paterva.v2.DNSNameToIPAddress_DNS")
}

Saving graphs/images

You can easily save a graph as the native Maltego format (MTGL at the time of writing) or export it as an image (PNG). The following actions are available:

exportImage(filename)
//saves as PNG

saveAs(filename)
//saves as Maltego Graph File

Extra parameters:

suffixDate (boolean) – whether the date and time should be appended. Default is true.
dateFormat (string) – the format in which the date and time should be appended. Default is yyyyMMdd-HHmmssSSS.
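
The following is a hedged sketch of how these actions might be used at the end of a pipeline. The file name "mx-footprint" is made up, and passing dateFormat as a named argument is an assumption based on how the other commands in this guide take their options. The shorter pattern is likewise an assumption modeled on the default, and the snippet has not been run against a Maltego installation.

```
start{
    run("paterva.v2.DomainToMXrecord_DNS")
    run("paterva.v2.DNSNameToIPAddress_DNS")
    //export a PNG; by default the date and time are appended
    exportImage("mx-footprint")
    //save the graph with a shorter timestamp (assumed named-argument form)
    saveAs("mx-footprint", dateFormat:"yyyyMMdd-HHmm")
}
```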

Perpetual Machines

It is possible to have Machines that run until terminated. Think of this more as a monitoring Machine. Use onTimer(X) {} instead of start{} in your script. This will start the Machine every X seconds. Consider the following script:

machine("maltego.twitter.monitor", 
    displayName:"Twitter Monitor", 
    author:"Roelof Temmingh",
    description: "This machine monitors Twitter for hashtags, and named entities mentioned around a certain phrase") {

    //run every minute and a half
    onTimer(90) {
        run("paterva.v2.PhraseToTwit_Search",slider:30)
        paths{
            run("paterva.v2.pullHashTags")
            run("paterva.v2.toEntitiesNERTwitter")
            run("paterva.v2.pullURLs")
        }

        //delete Tweets with no info, after one round
        age(moreThan:100, scope:"global")
        type("maltego.Twit")
        outgoing(0)
        delete()

        //delete Tweets as they get older than 5 minutes
        age(moreThan:300, scope:"global")
        type("maltego.Twit")
        delete()

        //after a while, when nothing links to it, remove the orphans
        age(moreThan:500, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

As you can see from the Machine, running the Transforms is fairly straightforward. The more interesting code decides when to delete nodes, and which ones, so that the graph is always up to date. While a Machine is still running, Maltego will not start another instance of the same Machine.

Automating Machines

To automate the process of running Machines, you can start Maltego with a new Machine running with the following command:

maltego --machine "<name> <entity type>=<entity value>"

The options listed below are also available:

-f Full screen
-q Exit Maltego after Machine completion
-i# Run perpetual Machine for # iterations