Wednesday, February 11, 2015

Building your own LovelyHorse monitoring system with Maltego (even the free version) - it's easy!

Someone linked me to the [LovelyHorse] thingy. If you missed it - it's basically a GCHQ NSA document that was leaked containing a list of a few security related Twitter accounts that the GCHQ NSA was supposedly monitoring. Seeing that, since the last release, we have some interesting Twitter functionality in Maltego, I figured it would be interesting to see how we can replicate their work.

First - manually

Before even starting with Maltego I first spent some time thinking about what I really wanted from this and did it all by hand (still in Maltego, but before we start to automate the process). As a start I'd need to get the people's Twitter handles. Well that's easy - the document lists them all. In Maltego I can start with an alias and run the transform 'AliasToTwitterUser' to get the actual Twitter handle:


I want to get the Tweets that the people wrote. There's a transform for that too - 'To Tweets [that this person wrote]'.


OK great - now I have the last 12 Tweets (my slider was set to 12). What information can I extract from the Tweet itself - keeping in mind that I want to end up doing this across 36 different handles? Well - possibly extract any hashtag, any URL mentioned in the Tweet and any other Twitter user's handle. There are transforms for all those.
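To make the idea concrete, here is a rough Python sketch of what pulling hashtags, URLs and aliases out of a Tweet boils down to. It is not how the Maltego transforms are implemented: the regular expressions and the `extract_entities` name are assumptions made for illustration, and unlike the transforms it makes no attempt to resolve t.co links.

```python
import re

# Simplified, assumed patterns; Twitter's real tokenisation rules are more involved.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")


def extract_entities(tweet_text):
    """Return the hashtags, URLs and mentioned handles found in one Tweet."""
    urls = URL_RE.findall(tweet_text)
    # Strip URLs first so a '#fragment' or '@' inside a link is not
    # mistaken for a hashtag or a mention.
    text_without_urls = URL_RE.sub(" ", tweet_text)
    return {
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(text_without_urls)],
        "urls": urls,
        "aliases": MENTION_RE.findall(text_without_urls),
    }


# Made-up Tweet text, purely for illustration.
example = extract_entities(
    "New #Malware write-up by @someone http://t.co/abc123 #infosec"
)
print(example)
```

Lower-casing the hashtags is a design choice in this sketch so that '#Malware' and '#malware' end up as the same node when Tweets from different people are compared.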


Running those on a single Tweet we get something like this:


Note how the http://t.co links are nicely resolved. This Tweet didn't contain any other aliases - so you only see the hashtags and the URLs.

If we select all the Tweets and run the 3 transforms across them we see that there are some matches - in this case on the hashtags 'infosec' and 'malware':


Now, this is not really interesting at all but it's a starting point. When I do the same for the last 12 Tweets of all of the lovely horses (as I'll call this group of Twitter handles) I might see some pattern.

Now - all the horses

I copy the text from the PDF and paste it into a text editor - Notepad will do. Clean it up a bit and we have:

Select all and paste into Maltego. It will result in every line being mapped as a phrase. Select all the entities (control A) and change the type to 'Alias' and run the 'AliasToTwitterUser' transform on all the phrases - like we did at the beginning, except now we're doing it on all the aliases. It should look something like this:

At this stage I can get rid of the Aliases because I am not going to use them anymore. I click on 'Select by Type' on the ribbon (Investigate) and select 'Alias'. Delete - and they're gone. I do a re-layout, select them all and run the 'To Tweets [that this person wrote]' - this time on all of them. Essentially I am repeating the entire process we did - but this time on all the lovely horses.

When the transforms complete the graph now looks like this:


All that's left is to run the 3 transforms (Pull URL/hashtag/alias) on all the Tweets. To select all the Tweets quickly I use 'Select by type' - Twit again. This takes a while to complete...but when Maltego has pulled out all the hashtags, URLs and aliases from the last 12 Tweets of all the lovely horses it looks like this:

No doubt this looks like ass. It's because the block layout is not really suited for this type of graph. But click on Bubble View:


and you get:


Let's get real

I won't lie - I've been spoon feeding up to now. Let's stop now - else this blog post is going to morph into a book. I am going to assume that you have a bit of Maltego experience under the belt by now.

The way we've been doing it up to now is really not terribly interesting or accurate. We're getting the last 12 Tweets. What we really want is all the Tweets in the last X seconds. Imagine that one horse hasn't been on Twitter in 14 days - then matching his/her Tweets to what's happening right now does not make a lot of sense (in a monitoring scenario). We need to introduce the idea of a sliding time window. The Twitter transforms did not support that.

Didn't. Does now. Well - the 'To Tweets [that this person wrote]' does now. I hacked it quickly. Anton will not approve...but it works as it says on the tin. I've added a transform setting called 'Window' - by default 0 but when changed implements this 'in the last X seconds'. 
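The 'in the last X seconds' idea is simple enough to sketch in a few lines of Python. This is only a model of the behaviour described above, not the transform's actual code; the `tweets_in_window` function and the (timestamp, text) tuple layout are assumptions.

```python
from datetime import datetime, timedelta, timezone


def tweets_in_window(tweets, window_seconds, now=None):
    """Keep only Tweets written in the last `window_seconds` seconds.

    `tweets` is a list of (created_at, text) tuples with timezone-aware
    datetimes. A window of 0 switches the filter off, mirroring the
    default value of the 'Window' setting described in the post.
    """
    if window_seconds == 0:
        return list(tweets)
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=window_seconds)
    return [(created, text) for created, text in tweets if created >= cutoff]


# A fixed 'now' keeps the example reproducible.
now = datetime(2015, 2, 11, 12, 0, tzinfo=timezone.utc)
tweets = [
    (now - timedelta(minutes=10), "fresh Tweet"),
    (now - timedelta(days=14), "Tweet from a horse that has been away"),
]
recent = tweets_in_window(tweets, 1800, now=now)  # half-hour window
print(recent)
```

With a half-hour window only the 10 minute old Tweet survives; the one from 14 days ago never reaches the graph.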

Now it becomes interesting when combined with machines (scripted transforms) - especially perpetual machines.  Consider the following machine:

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(240) {
        type("maltego.affiliation.Twitter",scope:"global")
        run("paterva.v2.twitter.tweets.from",slider:15,"window":"1800")
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //half hour + half hour = one hour
        //the entities will be deleted if older than half hour
        //but the transforms time frame adds another half hour
        age(moreThan:1800, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:1800, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

Let's take a look. We run our sequence every 4 minutes (4 x 60s = 240s). We grab all the Twitter handles and get the Tweets - but at most 15 (the slider value) and only if they were written in the last half an hour (30 x 60 = 1800s - we set the window parameter to 1800). After this it's plain sailing - we get the Aliases, hashtags, URLs and words.

At some stage we need to get rid of old Tweets - else our graph will just grow and grow and grow. So there's a little logic to delete nodes when they're older than half an hour. This  means that at any stage we have a one hour view on the activity of the horses. One hour - because on the limit the initial transform can contain a Tweet that's 30 minutes old and it will stay on the graph for another 30 minutes. 
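The two delete steps can be modelled in Python like this. It is a sketch under assumptions: the graph is a plain dict plus a set of edges, ages are in seconds, and `prune` is a made-up name - the real work is done by the machine's age(), type(), incoming(), outgoing() and delete() calls.

```python
def prune(nodes, edges, now, max_age):
    """Model of the machine's clean-up.

    nodes: {name: (entity_type, added_at_seconds)}
    edges: set of (source, destination) links
    """
    # Step 1: drop Tweets older than max_age, plus every link touching them.
    old_tweets = {
        name for name, (kind, added) in nodes.items()
        if kind == "Twit" and now - added > max_age
    }
    nodes = {n: v for n, v in nodes.items() if n not in old_tweets}
    edges = {(s, d) for s, d in edges
             if s not in old_tweets and d not in old_tweets}

    # Step 2: drop old entities that are left with no links at all.
    linked = {s for s, _ in edges} | {d for _, d in edges}
    nodes = {n: v for n, v in nodes.items()
             if n in linked or now - v[1] <= max_age}
    return nodes, edges


# One horse, one old Tweet with an old hashtag, one recent Tweet with a new one.
nodes = {
    "@horse": ("Twitter", 0),
    "t1": ("Twit", 0), "#old": ("Hashtag", 0),
    "t2": ("Twit", 1500), "#new": ("Hashtag", 1500),
}
edges = {("@horse", "t1"), ("t1", "#old"), ("@horse", "t2"), ("t2", "#new")}
kept_nodes, kept_edges = prune(nodes, edges, now=2000, max_age=1800)
print(sorted(kept_nodes))
```

In this example the old Tweet goes first, which leaves '#old' without any links, so the second step removes it too.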

The resulting graph will show us when they are Tweeting the same keyword (courtesy  of the 'TweetToWords' transform), hashtag, mention the same website or mention the same Twitter handle in their Tweets. And if they are not active on Twitter - then the graph won't contain outdated info. 

Of course - the values can be changed depending on how closely you want to monitor the situation - if the resolution is a day then the values should be (24 x 60 x 60) / 2 = 43200 and you should 1) up the number of Tweets returned in the slider value (as TheGrugq Tweets waaay more than 15 Tweets in a day) and 2) you shouldn't have to poll every 4 minutes.
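The arithmetic as a tiny Python helper, in case you want to try other resolutions. The function name and the dictionary keys are made up for illustration; the only thing taken from the post is that the view is split evenly between the transform's window and the age at which entities are deleted.

```python
def machine_settings(resolution_seconds):
    """Split the desired view evenly between the 'window' transform
    setting and the age(moreThan:...) value used for deleting."""
    half = resolution_seconds // 2
    return {"window": half, "delete_older_than": half}


one_hour = machine_settings(60 * 60)
one_day = machine_settings(24 * 60 * 60)
print(one_hour)
print(one_day)
```

A one hour resolution gives the 1800 used in the machine above; a one day resolution gives 43200.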

Advanced
Right - so what we REALLY want is something that can tell us when more than X horses' Tweets are linking to the same thing (be that a website/hashtag/whatever). For that we can't just use the 'incoming()' filter because one person could be sending ten Tweets mentioning the same website and it would mean that the website has ten incoming links. No - it has to have unique starting nodes (the horses).

We have that filter. It's called 'rootAncestorCount()'. So now - with a combo of bookmarks and this filter hackery we build something like:

        //if an entity links to moreThan 2 horses & 
        //we haven't seen it before  - mail
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"Multiple horses mentioned: !value!",EmailSubject:"Horse Alert")
       
        //this is to ensure we don't email over & over
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)


Basically what happens here is that we check for all entities with more than one incoming link (this can only be hashtags/URLs/words/aliases) and find the ones that have more than 2 unique grandparents (i.e. horses). If we find them, and we haven't seen them before (this Boolean flag is implemented with a bookmark) we mail the value out. We do the mailing with a transform that we wrote (and for obvious reasons cannot make public else it will be used for spam). It's not rocket science tho.
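The difference between counting incoming links and counting unique horses is easy to see in a small Python model. This is not Maltego code and says nothing about how 'rootAncestorCount()' is implemented internally; it assumes a graph of horse -> Tweet -> entity links and treats any node without a parent as a root.

```python
from collections import defaultdict


def incoming_and_root_counts(edges):
    """edges: iterable of (parent, child) links.

    Returns {node: (incoming_link_count, unique_root_count)}.
    """
    parents = defaultdict(set)
    nodes = set()
    for parent, child in edges:
        parents[child].add(parent)
        nodes.update((parent, child))

    def roots_of(node):
        # Walk upwards and collect every ancestor that has no parent.
        found, stack, seen = set(), [node], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            ups = parents.get(current, set())
            if not ups and current != node:
                found.add(current)
            stack.extend(ups)
        return found

    return {n: (len(parents.get(n, ())), len(roots_of(n))) for n in nodes}


edges = [
    # one horse Tweeting the same hashtag three times
    ("@a", "t1"), ("@a", "t2"), ("@a", "t3"),
    ("t1", "#spam"), ("t2", "#spam"), ("t3", "#spam"),
    # three different horses Tweeting the same hashtag once each
    ("@b", "t4"), ("@c", "t5"), ("@d", "t6"),
    ("t4", "#malware"), ("t5", "#malware"), ("t6", "#malware"),
]
counts = incoming_and_root_counts(edges)
print(counts["#spam"], counts["#malware"])
```

Both hashtags have three incoming links, but only '#malware' traces back to three different horses - which is exactly the case we want to be told about.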

Such a machine can run for days...resultant graph for today (it's almost midnight), when configured with a one day window looks like this:


Highlighted with entire path here is the hashtag 'security' - no surprise here. The other one was the alias DaveAitel (again not surprising). Below is the email received. Remember that we'll only receive email ONCE per alert, that it's only when more than 2 horses link to it and only if it happened within a day.
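The 'only ONCE per alert' behaviour is just a seen-before flag. In the machine that flag is a bookmark; in this Python sketch it is a plain set. The `new_alerts` name and the sample counts are assumptions made for illustration.

```python
def new_alerts(root_counts, already_alerted, threshold=2):
    """Return the entities that more than `threshold` unique horses link to
    and that have not been reported before, then remember them."""
    fresh = sorted(
        entity for entity, count in root_counts.items()
        if count > threshold and entity not in already_alerted
    )
    already_alerted.update(fresh)  # plays the role of bookmark(1)
    return fresh


seen = set()
# Made-up counts of unique horses per entity.
root_counts = {"#security": 3, "#quiet": 2}
first_run = new_alerts(root_counts, seen)
second_run = new_alerts(root_counts, seen)
print(first_run, second_run)
```

The first pass reports '#security'; the second pass over the same graph reports nothing, so the mailbox stays quiet until something new shows up.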



The complete machine looks like this (please change values as needed - speed / resolution /etc):

machine("axeaxe.LovelyHorse", 
    displayName:"LovelyHorse", 
    author:"RT",
    description: "Simulates the GCHQ's LH program") {

    onTimer(600) {
        //find Twitter handles on graph
        type("maltego.affiliation.Twitter",scope:"global")
        
        //run to Tweets transform
        run("paterva.v2.twitter.tweets.from",slider:30,"window":"43200")
        
        //extract Alias/Hashtags/URLs and uncommon words
        paths{
            run("paterva.v2.pullAliases")
            run("paterva.v2.pullHashTags")
            run("paterva.v2.pullURLs")
            run("paterva.v2.TweetToWords")
        }

        //if an entity links to more than 2 unique horses & 
        //we haven't seen it before  - mail it out
        //comment this entire section if you don't have a mailing transform

        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmarked(1,invert:true)
        
        run("paterva.v2.sendEmailFromEntity",EmailAddress:"roelof@paterva.com",EmailMessage:"More than 2 horses mentioned: !value!",EmailSubject:"Horse Alert")
        
        //this is to ensure we don't email over & over
        incoming(moreThan:1, scope:"global")
        rootAncestorCount(moreThan:2)
        bookmark(1)
        
        
        //delete nodes when they grow old
        //12 hours + 12 hours = one day
        //the entities will be deleted if older than 12 hours
        //but the transform's time frame adds another 12 hours
        age(moreThan:43200, scope:"global")
        type("maltego.Twit")
        delete()
        
        age(moreThan:43200, scope:"global")
        incoming(0)
        outgoing(0)
        delete()
    }
}

I hope you've enjoyed this (waaaaay too long) blog post on how our thinking goes. Of course - you get a lot more understanding of these things if you do it yourself. All of the above functionality exists in the (free) community edition of Maltego too - although there you probably want to monitor shorter intervals (say 15 minutes) as you can only display 12 Tweets per person. All in all - that's probably better.. ;)

'laters,
RT

PS: for more information on machines check out our newly built dev portal at [http://dev.paterva.com/developer]. The machine syntax etc. is located in 'Advanced'. 

And also - we made a video some time ago that shows the same kind of principle - it's [ here ]

Tuesday, February 3, 2015

Calling all transform writers! and Maltego Chlorine details! and MORE!

All,

In a few weeks we'll be releasing a new version of Maltego. We're calling it Maltego CHLORINE! (we're sure the malware analysts will love the name as ...you know...chlorine..germs...bugs....that.)



There are SO many new things. Where to start...??..But - let's start at 1).

1. New context menu
We've totally redesigned the context menu. The main reason for this is that it was getting a bit cumbersome / fat / lazy / had too many Doritos. If you had a lot of transforms you had to really know your way around the GUI to find them all. We took some time, looked at what users mainly use and designed this:

After some weeks of tweaking the design it ended up looking like this in the GUI:

We think it rocks and you will too. YOU WILL LIKE IT! and if you don't YOU WILL LEARN TO LIKE IT! Like cauliflower and Brussels sprouts. No actually, if we're serious, it's a vast improvement from the previous context menu.

2. Java 8 support.
Yeah - eventually. The reason this took a while is because we had to do an end-to-end test of Maltego on the new platform before we were confident that we could release it. Like a new girlfriend every Java version has its own unique quirks. Things that worked perfectly well in Java 7 need a lot more TLC in Java 8.

3. Better OSX support
Everybody knows how much we love Macs (cough). We've decided to make the Mac install and startup a lot more robust - and easier for end users. Keep in mind that with 3 versions of Java floating around (6, 7 and 8) and Windows / Linux / Mac support it's not always so easy to make sure Maltego installs and runs perfectly in all 9 environments. Not to talk about the small differences between Mavericks and Yosemite, Windows 7 vs Windows 8 and easing the install on Flubber Linux 8.15.221.

...but the most exciting feature...

4. Transform Hub
When we started Maltego almost 8 years ago our vision was always that other people can build their own transforms. As Maltego became more mature this dream was becoming a reality and today many interesting projects are using Maltego as their front end. The problem was that sharing these transforms with the rest of the world was a bit... tedious. This is why we decided to build functionality that allows you to see which cool transforms other people have made available. We call it the Transform Hub.

Note that we don't call this the Transform Store - but I guess we could have. It's basically the same thing with the exception that we hope most of the transforms will be free. It's up to all the 3rd party transform writers to decide if / how they want to price transforms. 

It basically means that when you start Maltego you'll get a list of 3rd party transforms and you get to choose which ones you want to use. Here's what it's going to look like (note that the items in the Transform Hub haven't been finalized!):

At startup you'll see the Transform Hub.


If you want to quickly see what the transforms are all about you can just mouse-over on them.


If you want to see more details - click on 'details' (so original). Here you can see things like the transform writer's web page, if it's commercial or not, where to register (if you need to) and where you can contact the transform writers. 

Once you're ready to install the transforms simply click on 'Install'.  


And that's pretty much it. No more seed URLs to enter. With the new TDS we push entity definitions to the client during install as well! One click install. <in a very soft/low voice/and spoken very quickly: "This applies for TDS transforms only">. Yeah of course.

With this fantastic development in place we call all transform writers! Let us know what you've been brewing up and we'll add you to ..<cave reverb, lots of echo> THE TRANSFORM HUB..Hub...hub...ub...b.

One more thing - you can always add your private server to the list too. And - if the transforms you are using are hosted on a 3rd party's own TDS server your traffic will only go to this 3rd party - we don't see it!

If you are interested in getting your own TDS - please let us know. We're keen to sell you one. And you can sell transforms. And you can sell your transforms on a TDS server to other people. Together we can make lots of money! (sorry, marketing INSISTED on this paragraph, they're not the most creative bunch. In the long run we actually see a lot of free stuff on there. We hope. Actually - it's in your hands really.)

PS - so... the public TDS is not 100% there in terms of all the bells and whistles of the shiny new commercial TDS - but Andrew promised us that he will be making tea for everyone every 3rd day until he's done porting the commercial TDS to the public TDS.

5. Development portal & forum
Because of 4) we decided it would be a great idea if people actually knew how to build their own transforms! In the past we've been...well...not so great at that - we confess. But this has all changed! 

We spent days and weeks building a really snazzy looking development website. It's at [http://dev.paterva.com/developer/] and it's packed with all sorts of nice goodies. We're still working on it so not *all* the sections are 100% completed but it should be a great resource for people wanting to write transforms. 

And also - the forum is back. Well - the development forum. Until the spammers list their shit on there again. Then we'll put our famous Maltego Community Edition CAPTCHAS on there!

The Chlorine release should be ready to go by the middle of February 2015 - we're really looking forward to seeing the feedback from our users. We'll start off (as always) with the commercial release and the community release (and the Kali Linux release) will follow soon afterwards.

Thanks for reading all of this - wow - it's a lot - we've been damn busy!

Baby seals,
RT