The monkeywrench Javascript deobfuscator

These days, there are all kinds of cool Javascript exploits floating around in the wild. Typically, your browser is directed to a page from an email or from an embedded iframe on a hijacked page -- but there are all kinds of ways to inject Javascript without your noticing it. And sure, Javascript has a pretty good security model. But injected Javascript can still redirect your browser, or download objects which target known vulnerabilities on your system (a corrupt Flash movie, for instance).

Of course, if all it took was to scan your incoming connection for something like "document.write('malicious link here')", there would be no problem. But these guys are smarter than that -- they obfuscate the Javascript. A simple obfuscation might look like

document.write('htt' + 'p:/' + '/badguysrus.co' + 'm');

but you can still scan for "document.write", so they obfuscate a little more:

eval('docu' + 'ment.w' + 'rite(\'htt\' +' + '\'p:/\' + \'/badguysrus.co\' + \'m\')')

And of course it can get arbitrarily complicated. The reason this makes sense is that signature-based virus scanners check files for "virus fingerprints", which are fixed byte sequences. The obfuscation above already makes it impossible to simply check for a link to badguysrus.com, and an attacker can assign values to variables and shuffle them around at will to foil more sophisticated attempts to scan for fixed sequences. At least one obfuscator I've seen uses a variable key, so that different runs of the obfuscator produce entirely different byte sequences.
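To make the point concrete, here is a minimal sketch of a fixed-sequence scan run against the plain and the obfuscated form of the same payload. The fingerprint list and the scan() function are made up for illustration; a real scanner works on byte signatures, not a two-entry list of strings.

```javascript
// Hypothetical fingerprint list, for illustration only.
const fingerprints = ["document.write", "badguysrus.com"];

// Return every fingerprint that appears verbatim in the source text.
function scan(source) {
  return fingerprints.filter((fp) => source.includes(fp));
}

// The unobfuscated payload, and the eval-wrapped version from above.
// String.raw keeps the backslashes exactly as they appear in the sample.
const plain = "document.write('http://badguysrus.com');";
const obfuscated = String.raw`eval('docu' + 'ment.w' + 'rite(\'htt\' +' + '\'p:/\' + \'/badguysrus.co\' + \'m\')')`;

const plainHits = scan(plain);           // both fingerprints are found
const obfuscatedHits = scan(obfuscated); // neither fingerprint is found
```

The two strings do the same thing when a browser runs them, but only the first contains anything a fixed-sequence scan can latch onto.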

But no matter how obfuscated the code, ultimately it must still contain every bit of the information we need to de-obfuscate it. Why? Because your browser has to run it, for it to fulfill its nefarious purposes.
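One way to exploit that, sketched here under some assumptions: since the script eventually has to hand its real code to eval, we can swap eval for a recorder and let the script unwrap itself. The captureEval() helper is hypothetical, not part of any existing tool, and it does execute the sample, so it belongs in a standalone, sandboxed interpreter and nowhere near a browser.

```javascript
// Run `source` with eval replaced by a recorder, and return every string
// the sample tried to evaluate. Hypothetical helper, for illustration.
function captureEval(source) {
  const captured = [];
  const fakeEval = (code) => {
    captured.push(String(code)); // record the code instead of running it
  };
  // Passing the recorder in as a parameter named "eval" shadows the
  // built-in eval for everything inside the sample.
  new Function("eval", source)(fakeEval);
  return captured;
}

// The eval-wrapped sample from earlier in this post.
const sample = String.raw`eval('docu' + 'ment.w' + 'rite(\'htt\' +' + '\'p:/\' + \'/badguysrus.co\' + \'m\')')`;

const layers = captureEval(sample);
// layers[0] is the reassembled inner code, which begins with "document.write("
```

This peels off exactly one layer. A sample that nests eval inside eval would need the same treatment applied again to each captured string.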

It would be nice if you could point a tool at a URL, and immediately get some magical analysis of whether the URL contains malicious Javascript and, if so, what it is attempting to do. Ultimately, this is an impossible task -- it demands automated analysis of what human beings intended their code to do, which may even be mathematically impossible, but is certain to be damned difficult. Still, this project is an attempt to start down the path towards such an analysis tool, packaged as an online suite. If you're interested, by all means get in touch; the lifeblood of this kind of thing is community interest. Besides, mixing languages in a programming project is fun. No matter how far I get down this road, I'm going to stumble across some really cool specimens.

This tool suite is called "the monkeywrench" because it relies on the open-source Javascript interpreter "SpiderMonkey", which is maintained by the Mozilla folks and used in their open-source browser. The thing about a command-line Javascript interpreter is that (1) there's no browser to corrupt, so security-wise, you're quite safe, but (2) there's no browser at all -- meaning that there's no "document" object to write to, no "location" object to get an href from, and none of the many other things obfuscators check for to make sure people like you and me aren't trying to deobfuscate their evil handiwork. So besides the Perl that will glue everything together, there must also be some Javascript to provide convenient substitutes for the various objects ordinarily found in a browser. This technique originally caught my eye when it was featured on the SANS Internet Storm Center (ISC) security blog.
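As a rough idea of what those substitutes might look like, here is a minimal sketch. None of this is the project's actual code: makeFakeBrowser(), runWithStubs() and the example.invalid URL are invented for illustration, and only the handful of members the sample touches are stubbed. Like any approach that runs the specimen, it should only be tried in a sandboxed, browserless interpreter.

```javascript
// Hypothetical stand-ins for the browser objects a script expects.
function makeFakeBrowser(href) {
  const written = [];
  return {
    written,
    document: {
      // Record what the script writes instead of rendering it.
      write: (html) => { written.push(String(html)); },
      writeln: (html) => { written.push(String(html) + "\n"); },
    },
    // Lets a script read location.href without a real page behind it.
    location: { href: href },
  };
}

// Run `source` with the fake objects in scope and return what it wrote.
// The sample's own eval calls still run, and they see the same stubs.
function runWithStubs(source, href) {
  const browser = makeFakeBrowser(href);
  new Function("document", "location", source)(browser.document, browser.location);
  return browser.written;
}

// The eval-wrapped sample from earlier in this post.
const specimen = String.raw`eval('docu' + 'ment.w' + 'rite(\'htt\' +' + '\'p:/\' + \'/badguysrus.co\' + \'m\')')`;

const output = runWithStubs(specimen, "http://example.invalid/page.html");
// output[0] is the reassembled link: "http://badguysrus.com"
```

Recording into an array rather than printing keeps the stubs independent of whichever output function the interpreter happens to provide.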

Ultimately, of course, the task as a whole is a dauntingly large one: intelligent analysis of the obfuscational tricks of some pretty smart people. I have no idea how possible it will be to automate -- there will have to be a great deal of interactivity involved, and I frankly have no real concept of how to organize that, especially in an online tool which should ideally be available to the public for anonymous use.

And -- just to make the whole thing harder -- I don't really want to limit the scope of the project to Javascript. There are lots of similar exploits of other code-based resources online, like SWF exploits and even MP3 exploits, of all things. I'd honestly like to have a single tool that unites a whole raft of decoders for these resources and checks them for exploits. Hey -- other people collect beetles. I like spam and exploits, what can I say? The criminal coding mind is an endless source of fascination for me.