Friday, November 26, 2010

Massive speed and memory improvements

Today marks the last day of working on speed / load / memory improvements for Maltego 3.0.3. The rest of the time is dedicated to bug fixes. We think we're at the edge of the 80/20 boundary on this. And it holds true - in roughly 4 weeks we've made MASSIVE improvements on the speed and load performance of Maltego. 3.0.3 is FASSST. The last performance release I tested was blindingly fast and yet we haven't looked at things like caching results, link compression and license key caching. So there is room to optimize even more.

I am super excited about 3.0.3. Even though there are no new features, it's something we should have done a long time ago. You'll see the difference straight away....

Wednesday, November 24, 2010

Transform Tuesday++ : Facebook,SOA,SPF and Shodan integration!

Hey guys,

Transform Tuesday is here again! Sure it might be a Wednesday, but unfortunately the Fremont data center that two of our Linodes are hosted on went down yesterday (http://status.linode.com/2010/11/fremont-power-issues-rfo.html) !

Nonetheless, there is a small possibility that it's Tuesday somewhere in the world, much like the fabled "it's 5 o'clock somewhere in the world".

This installment has a bunch of new transforms and I'm going to show off some that @achillean did integrating both Shodan and Exploit-db! Our transforms show integration with the Facebook graphAPI as well as some new DNS transforms (SPF and SOA).

Let's get to the good stuff!

Facebook GraphAPI:
I'd like to preface this by saying that we are not trying to break the terms of use of Facebook and we think that we completely abide by the principles listed on http://developers.facebook.com/policy/:

Create a great user experience

  • Build social and engaging applications
  • Give users choice and control
  • Help users share expressive and relevant content

Be trustworthy

  • Respect privacy
  • Don't mislead, confuse, defraud, or surprise users
  • Don't spam - encourage authentic communications
However, if we are asked to take down the transforms we will, of course.

What are the Facebook transforms?
  • toFacebookObject - search Facebook via the graphAPI and return the results. Think of this as your Facebook search engine transform!
  • toFacebookAffiliation - convert the above objects to a Facebook affiliation with the profile picture and a link to the profile.
  • toPhrase - try and extract the phrase from the Facebook object (what the status message was).
  • toPersonFromProfile/toPerson - extracts the person from the Facebook object that made the post so you can use this with the normal people searching transforms.
  • toEntitiesNER - take the phrase from the Facebook object and try and extract terms/locations from this.
  • toFacebookObjectPerson - simply search Facebook for this person's name.
  • toEntitiesNERTwitter - technically not dealing with Facebook, but also allows the same functionality as the above toEntitiesNER transform - but on tweets!
A quick example:
So let's take a look at the phrase 'TSA' on Facebook. Simply drag a phrase onto your graph, set your slider to 255 (3rd notch) and run the 'toFacebookObject' transform. You should see results like this:

Next you can take all of these entities to other entities with the use of Named Entity Recognition (NER) via the 'toEntitiesNER' transform. This will try and extract things like Companies, Locations, and other entity types from the messages. NER is not perfect, as you will notice things like 'Pat Downs' as a person. Keep in mind that NER is very difficult to do! However you can immediately get some results from what is being said in the public Facebook space, such as:


From the screenshot above you can see that the term has connections to the US, deals with two agencies and Mr 'Pat Downs' - someone I think many people can now relate to at this point in time!

Another thing you can do is take each of the facebookObjects to Person entities so that you can then perform other searches on these people or identify people commenting a lot on the phrase you originally searched for. To do this you can simply select all the facebookObjects as before and run the 'toPersonFromFacebook' transform:


So why are these Facebook transforms useful:
  • Tracking spam: you can use a phrase that you know is used in spam, take this to facebookObjects, then take each of these to a phrase ('toPhrase' transform), and then again search Facebook for these phrases, rinse and repeat until you have identified all the spammers.
  • Tracking what is said about a specific term, who says it the most, how often they talk about it and who they are. You can also identify locations/companies/other useful information by taking these objects and performing Named Entity Recognition on them.
  • If it was possible to identify friends of an individual (think the typeahead bug) you could identify the spheres of influence around people on Facebook that you have found via your graphAPI queries.
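
The "rinse and repeat" idea in the spam tracking bullet is really just a loop that keeps expanding until nothing new shows up. Here is a rough sketch of that loop in Python. Nothing in it talks to Facebook or Maltego: the search and to_phrase callables are stand-ins for the 'toFacebookObject' and 'toPhrase' transforms, and the "author" and "message" keys and the max_rounds cap are assumptions made purely for illustration.

```python
from collections import deque


def expand_phrases(seed, search, to_phrase, max_rounds=3):
    """Repeat phrase -> objects -> phrases until no new phrases turn up.

    search(phrase) stands in for 'toFacebookObject' and to_phrase(obj)
    for 'toPhrase'. Both are hypothetical callables supplied by the caller.
    """
    seen_phrases = {seed}
    authors = set()
    frontier = deque([(seed, 0)])
    while frontier:
        phrase, depth = frontier.popleft()
        for obj in search(phrase):
            authors.add(obj["author"])  # assumed key, not a real API field
            new_phrase = to_phrase(obj)
            # Only queue phrases we have not searched for yet, and stop
            # expanding once the (assumed) round limit is reached.
            if new_phrase not in seen_phrases and depth + 1 < max_rounds:
                seen_phrases.add(new_phrase)
                frontier.append((new_phrase, depth + 1))
    return authors, seen_phrases


# Made-up toy data so the sketch can be run without any network access.
FAKE_POSTS = {
    "cheap pills": [{"author": "account_a", "message": "buy now"}],
    "buy now": [{"author": "account_b", "message": "cheap pills"}],
}

authors, phrases = expand_phrases(
    "cheap pills",
    search=lambda phrase: FAKE_POSTS.get(phrase, []),
    to_phrase=lambda obj: obj["message"],
)
```

In Maltego you would do the same thing by hand: select the new phrases, run the search transform again, and stop when the graph stops growing.
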
How to get these transforms:

Entity: http://ctas.paterva.com/TDSTransforms/GraphAPI/facebookObject.mtz
Seed: https://cetas.paterva.com/TDS/runner/showseed/SocialMedia


SPF/SOA Transforms:
Recently the topic of spam came up in the office and why SPF(txt) records were never implemented - they seem to be a viable means to stopping spam. We looked at the implementation a bit and noticed some very cool things, such as:
  • Admins are lazy and want the ability to move their mail servers around so they give their entire IP range in the SPF records
  • SPF records often include other SPF records which show other domains relating to the one you are interested in
Secondly, a transform RT has always wanted is one that looks at the SOA records for domains to get the zone's administrative email address and the primary name server (where the zone was created - this is not necessarily one of the current nameservers). These transforms often provide information that's not found in the normal enumeration process.
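
To make the two observations above a bit more concrete, here is a minimal Python sketch of what can be pulled out of these records once you have them as text. It is not the code behind the transforms. It only parses strings you hand it (no DNS lookups), the function names are made up for this example, and the sample record uses documentation-only addresses and domains.

```python
import ipaddress


def parse_spf(record):
    """Return (netblocks, domains) found in an SPF TXT record string."""
    netblocks, domains = [], []
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return netblocks, domains
    for term in terms[1:]:
        # Drop an optional qualifier (+, -, ~, ?) in front of the mechanism.
        term = term.lstrip("+-~?")
        if term.lower().startswith("redirect="):
            domains.append(term.split("=", 1)[1])
            continue
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6") and value:
            # A single address becomes a /32 (or /128) network.
            netblocks.append(ipaddress.ip_network(value, strict=False))
        elif name == "include" and value:
            domains.append(value)
        # Other mechanisms (a, mx, all, ...) are ignored in this sketch.
    return netblocks, domains


def soa_rname_to_email(rname):
    """Turn the SOA responsible-person field into an email address.

    The first unescaped dot separates the mailbox from the domain.
    """
    rname = rname.rstrip(".")
    i = 0
    while i < len(rname):
        if rname[i] == "\\":
            i += 2  # skip an escaped character such as "\."
            continue
        if rname[i] == ".":
            local = rname[:i].replace("\\.", ".")
            return local + "@" + rname[i + 1:]
        i += 1
    return rname


sample = "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 include:_spf.example.net ~all"
blocks, related = parse_spf(sample)
```

The netblocks are what make lazy admins interesting, and the include/redirect domains are the "other domains relating to the one you are interested in".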

We have developed two transforms specifically aimed at these:
  • DomainToSOAInformation
  • DomainToSPFInformation
Here are some examples of using these transforms:

SOA Transform:
Compare the NS of pentagon.mil (left) to the NS found in the SOA record (right):


SPF Transform:
Quickly and easily identify Google's netblocks from their SPF records:




These transforms have been added to the standard infrastructure seed which can be found at: https://cetas.paterva.com/TDS/runner/showseed/Infrastructure

Shodan:
This week there has been a lot of coverage of the Shodan transforms, developed on the TDS. The transforms essentially allow the integration with the fantastic shodanhq.com as well as exploitdb.com.

The transforms are as follows:
  • searchExploitDB - Search the Exploit DB archive's exploit descriptions.
  • getHostProfile - Returns the list of banners for the given IPv4 as well as general host information (hostname, location, etc.).
  • searchShodanDomain - Search the Shodan database for information on the given domain name.
  • searchShodanNetblock - Searches Shodan for hosts contained in the given netblock.
  • searchShodan - Use the Shodan search engine to locate computers.
Some examples:
Identify hosts belonging to google.com:
  • Drag the domain google.com onto the graph and run the searchShodanDomain transform or run the searchShodanNetblock on one of the netblocks found with the SPF transforms (see earlier):



  • Verify these results by running the getHostProfile on one of the returned IP addresses:

  • Search for host responses with the word 'scada' in them by dragging the phrase 'scada' onto the graph and running the 'searchShodan' transform:

  • Identify Vulnerabilities that have 'scada' in the name or description by using the same phrase and running the 'searchExploitDB' transform:

Overall these transforms are awesome, and it is great to see people building (and releasing) transforms via the TDS! Hopefully we can see improvements on these such as:
  • Ability to search the returned banners against exploitdb
  • Ability to search the builtwith.com results against exploitdb
  • Exploits with a link to where one can find the specific exploit
Where can I get the shodan transforms?
The Shodan transforms can be found at http://maltego.shodanhq.com/

Finally, apologies for the Goliath of a blogpost. When I started it this morning it didn't seem like that much, but it's grown quite a bit. Special thanks to the Shodan guys for developing some awesome transforms.

Damn the man. Save the Empire.
-AM

Saturday, November 20, 2010

Maltego 3.0.3 is looming

3.0.3 What's in there and when it's released
Maltego 3.0.3 will be released before the end of this year (or perhaps in the first week of 2011). There is good news and bad news. The bad news is that 3.0.3 will not really have any new features. The good news is that it is a 'performance and stability' release. Which means - it will be fast. Very fast. And stable, very stable.

We've been working for the last 3 weeks with one goal in mind - making Maltego work better with large graphs. The target is to comfortably work with 10K node graphs on a 1GB JVM and Dual Core processor. And we're getting there. Of course, if you happen to own an I7 with 8GB of RAM, your experience will be much better and you'll be able to handle many more nodes. The side effect is that smaller graphs will be super fast to navigate, select and run transforms on.

We've also made a list of 'well known bugs' that have never made it to the priority list. Things that irritate us but that we've learned to live with. Many of these will be squashed. It's just that time now - time to grow up and fix it. It's hard to not add features - there are so many that we all want in Maltego.

New transforms coming + Shodan transforms
We are also releasing some new transforms this coming week. We've been sitting on a couple for a while and now it's time to make them public. Lastly it was GREAT to see that the folks of Shodan have adopted the TDS and made some transforms. This is exactly what we had in mind with the TDS. You can catch all the action at [http://maltego.shodanhq.com]

Crisp out,
RT

Tuesday, November 9, 2010

New infra enum transforms - with sweet example

We're happy to release a couple of simple transforms via the TDS that assist with the foot printing / enumeration of infrastructure. These are:

  • NetblockToNetblocks
Essentially this transform breaks large networks into smaller chunks of networks. This is useful when you have transforms (such as reverse DNS, portscans etc) that only work on class C networks...and you are stuck with a class B.

  • NetblockToIPs
Shows every IP within the netblock as a separate IP address entity. Useful when you need to run a transform on an IP address itself, and want to repeat the process over all the IP addresses in the network. An example of this will follow.

  • WebsitetoDNSName
  • NStoDNSName
  • MXtoDNSName
These transforms simply convert the NS, MX or website to a DNS name so that the enumerate numerically transform can work on it. In other words - see the next transform.

  • enumerateHostNamesNumerically
This transform will test for the existence of DNS names that end with the same name, but another number. As an example - if run on mx1.domain.com it will check for mx1, mx2, mx3.domain.com. The range and padding can be set with transform settings.
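
For the curious, the numerical enumeration idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the transform's actual code: the function names and the start, end and padding parameters are assumptions standing in for the transform settings, and the existence check simply asks your local resolver.

```python
import re
import socket


def numbered_candidates(hostname, start=1, end=3, padding=1):
    """Generate sibling names that differ only in the trailing number.

    Only the first label is touched, so mx1.domain.com yields
    mx1, mx2, mx3 under the same domain.
    """
    first, dot, rest = hostname.partition(".")
    match = re.match(r"^(.*?)(\d+)$", first)
    if not match:
        return []  # nothing numeric to enumerate
    prefix = match.group(1)
    return [f"{prefix}{n:0{padding}d}{dot}{rest}" for n in range(start, end + 1)]


def existing_names(candidates):
    """Keep only the candidates that resolve (performs real DNS lookups)."""
    found = []
    for name in candidates:
        try:
            socket.gethostbyname(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
        found.append(name)
    return found


candidates = numbered_candidates("mx1.domain.com")
padded = numbered_candidates("ns1.domain.com", start=0, end=2, padding=2)
```

Calling existing_names(candidates) would then weed out the names that do not resolve. It is left out of the example run above because it needs network access.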


Examples

How is this interesting at all (because frankly, on the surface it looks pretty boring) ? Let's look at examples. Let's assume we are foot printing a domain called eop.gov (if you missed that class - EOP is the Executive Office of the President - which, network wise, is a lot more interesting than whitehouse.gov). We run the 'Find common DNS name' transform on this, and end up with a graph like this:


Clearly ns1 is a good candidate to be enumerated numerically. And so we shall:

The transform will ask us for some transform settings:


And ends up producing a graph looking like so:


With a couple of more transforms, a little re-arrangements and manual linking we get:


The resultant DNS entries (at the bottom of the screen shot, and produced by looking at reverse DNS within those netblocks) also look yummy for numerical enum, so we'll run them too (but perhaps from 0 to 99 with one digit padding). You end up with a graph looking like this:

In the end we'll take all of the DNS names, copy them to a new graph and resolve them to IP addresses. This gives us:

For the next step we'll use one of the other new transforms. We'll take the two blocks, and enum them to individual IP address entities. Why? You'll soon see. But first, this is what it should look like:


The blue dots are the IP addresses. The 'hands' sticking out at the sides are IP addresses that were discovered from two transforms, resolving the DNS names, and the enum. So now what? Now, we'll put every IP address into a search engine and see if there are any results. EH? Well, when anyone browses the 'net the site that they browse probably records the IP address in a log...and sometimes, just sometimes...those logs get indexed by a search engine. So - we end up with a graph that gives us a list of websites that were visited by that IP address. You might think that doesn't happen a lot - but you'll be surprised. Here is the resultant graph:

The blue dots are IP addresses, the pink ones are websites where that IP address was found. This is the edge weighted view, so the larger the sphere, the more IP addresses pointed there. Of course, IP addresses don't just end up in logs that get indexed. This closeup shows you why:

In fact, the more interesting sites are the ones that are only visited once or twice. We can also weed out the false positives (sorry Rob, in this case that's you) by searching our graph for words like 'usage stats' and the like. The results then start looking a lot better - here is a small portion of the graph:


In the detail view we can see when and what were visited:


If you missed the point of this whole mission - it was to see if we can figure out to which web sites the people in the Whitehouse browsed to..

Anyhow - this was just a *brief* idea of where you can go with these transforms. On their own they are boring and bland, but when used with others they sparkle.

OK, initially I thought "brief" and then I ended up spending 45 minutes on it (most of the time copy and pasting the graphs, cropping them and struggling with this web interface blog editor).
Also, before I forget, and your reward for reading all of this - the seed for these transforms can be found here:
  • https://cetas.paterva.com/TDS/runner/showseed/Infrastructure
You may use instructions on [this blog post] to see how to get these into Maltego. They don't need any special entities. So it's load, discover and play.

Crisp out,
RT

Free BuiltWith.com Transforms!

Builtwith.com is a fantastic site for enumerating technologies used on a website, things such as JQuery, Google analytics and additional server information such as the type (Apache/IIS).

For example, if you had to perform a lookup for www.paterva.com/web5/ you will receive a results page (as seen on the left). This page includes that our website uses/is run on:
  • Apache
  • Ubuntu
  • Mod_SSL
  • JQuery
  • Google Analytics


But why Andrew, WHY?
So out the bag this may not seem that exciting, anyone could simply go and have a look at the source of a website and look for keywords relating to the technology, or even look for key directories (Wordpress' /wp-admin/ directory for example).

However imagine you were looking at a large number of websites, in this example gov.za space ( I placed a domain 'gov.za' onto my graph and then ran the "To Website DNS [using Search Engine]" Transform with my slider set to 255):



How would you correlate which technologies were being used with which websites? Well simple, with Maltego of course (and a bit of code to integrate with the BuiltWith.com API)!

The next step would be simply to select all of the websites found and run the "ToServerTechnologiesWebsite" transform. This will then return the technologies used for each site as seen in this example with just 1 website:


You would then run this across all of your websites to view what kind of technologies were being returned!

Initially we noticed that there was a lot of excess data coming through, technologies such as Javascript, CSS and SEO_H1 which we have since discarded and are only looking at results in the following categories:
"cms","framework","server","Apache Module","Database","Hosting","Interent Communication Server","J2EE Server","Security","Server","Web Accelerator","Web Host","Web Master","Web Server","Web Server Plugin","Web Technology","analytics","javascript"
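
As a rough idea of what that filtering step looks like, here is a small Python sketch. It does not call the BuiltWith.com API. The list-of-dicts input with "name" and "category" keys is a made-up shape for illustration, the allowlist is only a handful of the categories from the list above, and the sample entries (including the "seo" and "css" category names) are invented. Matching is exact because the list above contains both "server" and "Server".

```python
# A few of the categories from the list above (exact, case-sensitive match).
KEPT_CATEGORIES = {
    "cms",
    "framework",
    "server",
    "Server",
    "Web Server",
    "analytics",
}


def keep_interesting(technologies, kept=KEPT_CATEGORIES):
    """Drop every technology whose category is not on the allowlist."""
    return [tech for tech in technologies if tech["category"] in kept]


# Invented sample results, only here so the sketch can be run.
sample_results = [
    {"name": "Apache", "category": "Web Server"},
    {"name": "SEO_H1", "category": "seo"},
    {"name": "Google Analytics", "category": "analytics"},
    {"name": "CSS", "category": "css"},
]

interesting = keep_interesting(sample_results)
names = [tech["name"] for tech in interesting]
```

An allowlist keeps the graph small by default: anything in a category nobody asked for never becomes an entity.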

If there are others you would like to see from the API ( http://api.builtwith.com/ ) feel free to let us know.

So where to from here?
Well now that you have a list of all the technologies used you can see interesting correlation between the data, such as:
  • Are all the websites running the same technology or are there odd cases, and if so why? (Does some dev have a machine running that's accessible from the internet running a vulnerable server)
  • Is the infrastructure mostly Windows or *nix based? (Apache vs IIS)
  • Which CMS' are used, and where? Are they all the same or are there variants?
  • Which websites need their technologies upgraded!
  • What is the most common technology used between all the websites?
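
Outside of Maltego, the first and last questions above boil down to counting. Here is a minimal Python sketch, assuming you have exported your results into a dictionary of website to technology names. The function names, the threshold parameter and the sample data are all made up for illustration.

```python
from collections import Counter


def technology_usage(site_technologies):
    """Count on how many websites each technology was seen."""
    counts = Counter()
    for techs in site_technologies.values():
        counts.update(set(techs))  # count each technology once per site
    return counts


def odd_ones_out(site_technologies, threshold=1):
    """Return sites that use a technology seen on at most `threshold` sites."""
    counts = technology_usage(site_technologies)
    result = {}
    for site, techs in site_technologies.items():
        rare = sorted(t for t in set(techs) if counts[t] <= threshold)
        if rare:
            result[site] = rare
    return result


# Invented sample data, only here so the sketch can be run.
sample_sites = {
    "www.example.com": ["Apache", "Joomla"],
    "intranet.example.com": ["IIS"],
    "news.example.com": ["Apache", "Joomla"],
    "old.example.com": ["Apache", "WordPress"],
}

usage = technology_usage(sample_sites)
most_common = usage.most_common(1)
oddities = odd_ones_out(sample_sites)
```

The count per technology is the same number you get from sorting by incoming links, and the odd ones out are the machines worth a second look.
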
Show me the money!

In edge weighted view we can quickly identify the most common technologies used:


And by using the Entity List view I can easily identify the top used technologies by searching for 'BuiltWith' (it's the type) and then sorting by incoming links:


Give me the Entity and the Seed!

Entity: http://ctas.paterva.com/TDSTransforms/BuiltWith/BuiltWithTechnology.mtz
SeedURL: https://cetas.paterva.com/TDS/runner/showseed/builtWith

You can follow our previous post on importing these here: http://maltego.blogspot.com/2010/11/transform-tuesdays-free-maltego.html

Last but not least
I'd just like to send out a thank you to Gary Brewer from BuiltWith.com for some changes we requested to do HTTPS as well as HTTP and for helping us out generally with our BuiltWith.com queries!

I'd also like to point out that these transforms run on not only a website but also an IP address as well as URL. Please also note that BuiltWith.com does not currently follow redirects so it will simply try connect to the website, IP address or URL and return the information based on that single page (no spidering).

We look forward to seeing the community respond with more transforms like this!

Transform Tuesdays! Free Maltego Transforms!

Yes, it's not nearly as popular (yet) as Patch Tuesday, but at least it's an alliteration.

We have been working on a bunch of new TDS based transforms that we would like to share with the community in the hope that the community responds with more transforms of their own.

Today we will be releasing two sets of transforms:

BuiltWith.com integration via their fantastic API
Enumerate server side technologies of Websites and URLs. These include things like CMS (Joomla, Wordpress), Server information (Apache, IIS) and other technologies used (Jquery, Youtube, Silverlight, etc)

Various useful infrastructure transforms
A couple of transforms to help with infrastructure enumeration including Netblock to IP addresses and Netblock to Netblocks.
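
If you want a feel for what the two netblock transforms do, Python's standard ipaddress module can do the same arithmetic. This is just a sketch of the idea, not the transform code. The function names mirror the transform names for readability, and the /24 default chunk size is an assumption.

```python
import ipaddress


def netblock_to_netblocks(netblock, new_prefix=24):
    """Break a large network into smaller networks of the given prefix."""
    network = ipaddress.ip_network(netblock, strict=False)
    if network.prefixlen >= new_prefix:
        return [network]  # already as small as (or smaller than) requested
    return list(network.subnets(new_prefix=new_prefix))


def netblock_to_ips(netblock):
    """List every address in the network, first and last address included."""
    return [str(ip) for ip in ipaddress.ip_network(netblock, strict=False)]


chunks = netblock_to_netblocks("172.16.0.0/16")
addresses = netblock_to_ips("192.0.2.0/30")
```

Splitting first and expanding second keeps the lists manageable: a /16 is 256 chunks of 256 addresses rather than one list of 65536.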

How do I use these Transforms within Maltego?
You will need two things to use any of the upcoming TDS transforms (and any we post in the future).

  1. Maltego Entity Objects file (mtz) with any custom entities that are used for these transforms. NOTE: This is only needed if there are custom entities.

  2. Seed URL: This will point your Maltego interface to where it can find new transforms.
Enough! Show me with pictures!

Importing custom entities (Where needed)

Click on Import Entities

Select the supplied Maltego Entity Objects file (mtz)

Select the entities you wish to add

Add it to a group

Enjoy your crisp new entities


Discovering Transforms:

Click discover transforms


Add a name for your seed and the supplied URL


Neeeeeext

Neeext

Neeeext

Finished! You now have some crisp new transforms!


In the upcoming posts we will release the mtz as well as the Seed URLs for the new transforms.