Warning: These libraries are no longer supported. Use the Python Maltego-TRX library instead.
Developing iTDS transforms requires that you set up another server to host your transform code. This server will need to have access to the data that you wish to integrate with Maltego and will also need to be accessible by the iTDS server that you are using. This guide provides details on how to set up your own transform host server.
- PHP - An extensive PHP library is provided for the iTDS. The library and documentation can be found on our Libraries page.
- Python - The rest of this document focuses on the Python development environment.
In the past Paterva only provided a CGI-based Python library. This is easy to work with but has an obvious limitation: a new instance of Python is started for every transform that runs. This can easily drain resources on the host and is not very efficient. The more efficient method is to use WSGI.
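To make the difference concrete, here is a minimal sketch of a WSGI application using only the standard library. It is not part of TRX; the counter and the response text are purely illustrative. The point it shows is that module-level state is created once when the server loads the file and is then reused for every request, whereas a CGI script would start from scratch each time. mod_wsgi looks for a callable named application by default, which is why that name is used here.

```python
# Minimal WSGI application (standard library only). Illustrative, not TRX code.
import itertools

# Module-level state is created once, when the server loads this module.
# Under CGI, this line would run again for every single request.
_request_counter = itertools.count(1)


def application(environ, start_response):
    """WSGI entry point: called once per request inside a long-lived process."""
    n = next(_request_counter)
    path = environ.get("PATH_INFO", "/")
    body = ("request #%d for %s\n" % (n, path)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Calling the function twice in the same process yields request #1 and then request #2, which is exactly the persistence that CGI lacks.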
What is TRX?
TRX is a framework (although this is perhaps a bit presumptuous) that facilitates coding Python transforms for Maltego using the iTDS. TRX can be split into two parts:
- A Python transform library that assists you with building remote transforms.
- A transform dispatcher that runs transforms called over HTTPS by an iTDS server. The dispatcher uses a web facing Ubuntu server with Apache2, with mod_WSGI, Bottle and Python.
The remainder of this guide will explain how to set up a transform host server to run TRX transforms. A library guide for the TRX transform library can be found on the TRX transform library page.
Provisioning your Transform Host Server
We recommend using Ubuntu Server with Apache2, Python 2.7, mod_wsgi and bottle. Assuming a stock standard Ubuntu server with Python installed, the following is a recipe for quickly building your environment:
$ sudo apt-get update
$ sudo apt-get install apache2
$ sudo apt-get install libapache2-mod-wsgi
$ sudo apt-get install python-setuptools
$ sudo easy_install bottle
Copy & extract TRX files
$ cd /tmp
$ wget -O "TRX_Ubuntu.tgz" 'https://docs.maltego.com/helpdesk/attachments/2015015325059'
$ tar -xvzf TRX_Ubuntu.tgz
Edit Apache configuration
Edit the file /etc/apache2/ports.conf and add the line:
Listen 9001
Add it just below the line that reads Listen 80 so the file looks like so:
NameVirtualHost *:80
Listen 80
Listen 9001
Copy the TRX Apache configuration file:
$ sudo cp /tmp/etc.apache2.sites-available/TRX /etc/apache2/sites-available/
Now enable the site:
$ sudo a2ensite TRX.conf
Install TRX files
$ sudo mkdir /var/www/TRX
$ cd /var/www/TRX
$ sudo cp /tmp/var.www.TRX.tgz .
$ sudo tar -xvzf var.www.TRX.tgz
An example transform DNS2IP is provided (it’s a function defined in DNSTRANSFORMS.py, but more about that later). In the next section we will see how to configure the iTDS to use this transform.
$ sudo /etc/init.d/apache2 restart
Feel free to inspect the configuration of the ‘TRX.config’ site defined in /etc/apache2/sites-available and customize this to your liking. The configuration will route all traffic on port 9001 to the WSGI script TRX.wsgi located in /var/www/TRX.
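After restarting Apache, it can be useful to confirm that something is actually listening on port 9001 before you register any transforms. The helper below is a small sketch for that purpose and is not part of TRX; the function name and the default timeout are our own choices.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True
```

On the transform host itself, port_is_open("127.0.0.1", 9001) should return True once the TRX site is enabled and Apache has been restarted. Remember that the iTDS server also needs to reach this port, so a local check does not rule out a firewall in between.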
Changing to the Bottle Server
One of the painful realities of working with WSGI is that you need to restart the Apache server every time you make a change to the code. There are some scripts on the Internet available to make this easier but this method of coding is altogether unintuitive for most coders. Luckily there’s a way around it.
Bottle has its own web server built in and this server can be set to reload whenever a change in code happens – this goes for changes to the stub as well as any modules/libraries that it depends on. Let’s look a little closer.
Go to /var/www/TRX. In here you’ll see a file called debugTRX_Server.py. This file is a near exact copy of TRX.wsgi – but it can run as a standalone server. You’ll see that we’ve uncommented the bottle HTTP server and commented the WSGI server (at the end of the script). In this case it’s listening on the IP 10.77.0.106 and port 9001. Before you can run this script you need to disable the Apache server (because it is grabbing port 9001):
$ sudo /etc/init.d/apache2 stop
Now you can start the (sub optimal, but great for debugging) server:
$ sudo python debugTRX_Server.py
It starts up as follows:
Bottle v0.11.6 server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:9001/
Hit Ctrl-C to quit.
What’s great about this server is that it automatically reloads every time you make a change to the code. Better still, it verbosely complains about any mistake you might have made in your code. Here’s a screenshot of that happening:
In the example above we made a silly mistake in a transform by using m.slider and not m.Slider (note the difference in case). The bottle server complained bitterly about it. We fixed the code and as soon as we saved the file the server reloaded with the changes. (If you are wondering: we had our server listen on 10.77.0.106 and not on 0.0.0.0.)
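The reload and error-reporting behaviour described in this section corresponds to two options of Bottle's run() function, reloader and debug. The sketch below shows how such a call might look. It is an assumption about how a debug script could be written, not a copy of debugTRX_Server.py; the names DEV_SERVER_OPTIONS and start_dev_server are hypothetical.

```python
# Options for Bottle's built-in development server.
# host and port mirror the values used in this guide; adjust them as needed.
DEV_SERVER_OPTIONS = {
    "host": "0.0.0.0",   # listen on all interfaces
    "port": 9001,        # same port as the Apache site, so stop Apache first
    "reloader": True,    # restart the server when a source file changes
    "debug": True,       # show full tracebacks for errors in transform code
}


def start_dev_server(options=None):
    """Start Bottle's built-in server (blocks until Ctrl-C)."""
    # Imported here so this module can be loaded without Bottle installed.
    from bottle import run
    run(**(options or DEV_SERVER_OPTIONS))
```

Calling start_dev_server() from the end of a script that defines your routes starts the standalone server. For Apache, a common Bottle pattern is to expose application = bottle.default_app() instead of calling run(); whether the shipped TRX.wsgi does exactly this should be checked in the file itself.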
Deploying Transforms with Apache
The main reason why we don’t want to run the bottle server in production is because it’s not optimized for heavy load and we don’t want to have to struggle with start-up scripts.
Once you are happy with your transforms, stop the bottle server, uncomment the first couple of lines, change the server to run as WSGI, and start Apache again:
$ sudo /etc/init.d/apache2 start
Another way to do this is to run the bottle server on a different port. You might want to add another seed (‘DEV’?) and register the transforms on this port. Once you want to move it into production you can simply change the port and insert it into the production seed.
Components of TRX
There are basically three files that are of interest. All of them are located in /var/www/TRX/.
1. Dispatcher (TRX.wsgi)
This is the dispatcher (TRX.wsgi and debugTRX_Server.py are essentially the same script: the former is used with Apache, the latter when debugging). Let’s look at the file in a little bit more detail:
Each transform has a small router stub in here that defines the URL of the transform and which function it routes to. The ‘request.body.getvalue()’ call reads the entire POST body of the request. This XML is sent to the function ‘MaltegoMsg’, which is defined in the Maltego library. The function returns an object which can be used to easily extract all the information from the XML. This object is passed along to the actual transform. The transform does its work and returns XML, which is sent back to the TDS. We’ll look into the transform code in more detail shortly.
In the example we’ve included the library ‘DNSTRANSFORMS’. This is the transform code library, i.e. where the transform itself is defined. In the router stub for the first transform (DNS2IP), the function that gets called is ‘trx_DNS2IP’. Of course, as you go along you may choose to add more libraries.
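The flow described above (look up the transform by URL, read the POST body, parse it, call the transform, return its output) can be sketched with the standard library alone. This is a stand-in for illustration only: the real dispatcher uses Bottle routes and MaltegoMsg, and every name below (parse_message, trx_echo, ROUTES) is hypothetical.

```python
# Stdlib-only stand-in for the dispatcher; not the actual TRX.wsgi.

def parse_message(raw_body):
    """Placeholder for MaltegoMsg: wraps the raw POST body in a simple object."""
    return {"raw": raw_body}


def trx_echo(msg):
    """Placeholder transform: builds a response body from the request."""
    return b"<echo>" + msg["raw"] + b"</echo>"


# One entry per transform: URL path -> transform function.
ROUTES = {
    "/Echo": trx_echo,
}


def application(environ, start_response):
    transform = ROUTES.get(environ.get("PATH_INFO", ""))
    if transform is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown transform\n"]

    # Read the entire POST body sent by the TDS.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw_body = environ["wsgi.input"].read(length)

    # Parse the request, run the transform and send its output back.
    response_body = transform(parse_message(raw_body))
    start_response("200 OK", [
        ("Content-Type", "text/xml"),
        ("Content-Length", str(len(response_body))),
    ])
    return [response_body]
```

Adding a transform in this sketch means adding one entry to ROUTES, which mirrors adding one router stub per transform in the real dispatcher.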
2. Transform library (e.g. DNSTRANSFORMS.py)
You can include as many libraries as you want. For our example we used DNSTRANSFORMS.py. Let’s take a look at what’s really happening:
This is the code that actually performs the work of the transform. As input it gets an object that contains all the details about the request – this is called ‘m’ in this case. Next, it creates a ‘vessel’ that will eventually contain the response – here it’s called ‘TRX’. It then does the DNS lookup and adds an entity of type ‘maltego.IPv4Address’ with the value of the lookup to the vessel. It also adds some UIMessages. Finally the transform returns the vessel’s XML to the router which in turn sends it back to the TDS.
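The steps in the paragraph above can be sketched as follows. This is not the shipped DNSTRANSFORMS.py: Request and Response are simplified stand-ins for the objects provided by the Maltego library, the attribute name Value and the method names are assumptions, and the resolve parameter exists only so the function can be exercised without a network. Unlike the real transform, the sketch returns the vessel itself rather than its XML.

```python
import socket


class Request(object):
    """Stand-in for the parsed request object (called 'm' in this guide)."""
    def __init__(self, value):
        self.Value = value  # assumed attribute name for the input entity value


class Response(object):
    """Stand-in for the response vessel (called 'TRX' in this guide)."""
    def __init__(self):
        self.entities = []     # list of (entity_type, value) pairs
        self.ui_messages = []  # list of message strings

    def add_entity(self, entity_type, value):
        self.entities.append((entity_type, value))

    def add_ui_message(self, text):
        self.ui_messages.append(text)


def trx_dns2ip(m, resolve=socket.gethostbyname):
    """Resolve m.Value and add the result as a maltego.IPv4Address entity."""
    trx = Response()
    try:
        trx.add_entity("maltego.IPv4Address", resolve(m.Value))
    except socket.error as exc:
        # Report lookup failures to the user instead of crashing the transform.
        trx.add_ui_message("Lookup failed for %s: %s" % (m.Value, exc))
    return trx
```

Catching the lookup error and turning it into a UI message keeps a failed DNS query from surfacing as a server error in the Maltego client.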
3. Maltego Library (maltego.py)
This is the library that shuffles XML to an object and back. You are more than welcome to optimize and tinker with this library should you feel inclined to do so. But keep in mind that should we change the library you’ll need to redo any changes you’ve made to it.
The documentation for this library can be found on the TRX Python Library page.
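To give a feel for what "shuffling XML to an object and back" involves, here is a small sketch built on xml.etree.ElementTree. It is not maltego.py and handles only a fraction of what that library does. The element and attribute names reflect our understanding of the Maltego transform message format and should be treated as assumptions; check them against the library before relying on them.

```python
import xml.etree.ElementTree as ET


def parse_request(xml_bytes):
    """Extract the first entity's type and value from a request message."""
    root = ET.fromstring(xml_bytes)
    entity = root.find(".//Entity")
    return {"type": entity.get("Type"), "value": entity.findtext("Value")}


def build_response(entities, ui_messages=()):
    """Serialize (type, value) pairs and UI messages into a response message."""
    root = ET.Element("MaltegoMessage")
    resp = ET.SubElement(root, "MaltegoTransformResponseMessage")
    ents = ET.SubElement(resp, "Entities")
    for entity_type, value in entities:
        ent = ET.SubElement(ents, "Entity", Type=entity_type)
        ET.SubElement(ent, "Value").text = value
    msgs = ET.SubElement(resp, "UIMessages")
    for text in ui_messages:
        ET.SubElement(msgs, "UIMessage", MessageType="Inform").text = text
    return ET.tostring(root)
```

A transform would call parse_request on the POST body, do its work with the extracted value, and pass its results to build_response to obtain the XML that goes back to the TDS.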