Monday, September 17, 2012

Fuzzing Like A Boss with Pythonect

In my previous post, Automated Static Malware Analysis with Pythonect, I wrote about how to use Pythonect to automate static malware analysis. In this post I'll describe how to use Pythonect and all of its perks to fuzz file formats, network protocols, and command-line arguments. The examples provided are only a sampling of what can be done. There are obviously many more possibilities, and you are encouraged to experiment. Before you read this tutorial you should have at least a basic knowledge of fuzz testing, Python, and Pythonect (I recommend reading the Pythonect Tutorial: Learn By Example).

Let's see some code!
['A', 'a', '0', '!', '$', '%', '*', '+', ',', '-', '.', '/', ':', '?', '@', '^', '_'] \
    -> [_ * n for n in [256, 512, 1024, 2048, 4096]] \
        -> os.system('/bin/ping ' + _)
The code above tries to fuzz the command-line arguments of a *nix command-line tool (e.g. /bin/ping). Let's go line by line and explain what's going on with these 3 lines of code.

The first line defines a list of inputs to try (i.e. ['A', 'a', '0', ...]), the second line defines a list of length parameters (i.e. [256, 512, 1024, ...]), and the last line executes the command-line tool with the generated argument as argv[1] (e.g. /bin/ping "AAAAAA ... 256 times"). In addition, this fuzzer is multi-threaded and uses asynchronous communication. What does that mean? It means that it does not wait for a thread to finish before continuing with the loop, and as a result it is not guaranteed to fuzz in sorted order (i.e. A * 256, A * 512, A * 1024, ...).
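For readers more at home in plain Python, here is a minimal sketch of roughly what the three-line flow expands to. It is written for Python 3 and is not how Pythonect itself is implemented. The names build_payloads and run_target are made up for this illustration. One deliberate difference: it uses subprocess.run with an argument list instead of os.system, so characters such as '$' and '*' reach the target untouched instead of being interpreted by the shell.

```python
import subprocess

SEEDS = ['A', 'a', '0', '!', '$', '%', '*', '+', ',', '-', '.', '/', ':', '?', '@', '^', '_']
LENGTHS = [256, 512, 1024, 2048, 4096]


def build_payloads(seeds=SEEDS, lengths=LENGTHS):
    """Return every seed repeated to every length, in (seed, length) order."""
    return [seed * n for seed in seeds for n in lengths]


def run_target(payload, target='/bin/ping', timeout=5):
    """Run the target once with the payload as argv[1] and return its exit status.

    On POSIX a negative return code means the process was killed by a
    signal, which is the interesting case for a fuzzer. None means the
    run timed out (hypothetical 5 second default, tune as needed).
    """
    try:
        return subprocess.run([target, payload], timeout=timeout,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL).returncode
    except subprocess.TimeoutExpired:
        return None
```

Calling build_payloads() with the defaults yields 17 seeds times 5 lengths, so 85 arguments in total; looping over them with run_target gives the sequential version of the fuzzer.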

You can easily extend the code above to include testing for format string vulnerabilities:
['%s', '%n', 'A', 'a', '0', '!', '$', '%', '*', '+', ',', '-', '.', '/', ':', '?', '@', '^', '_'] \
    -> [_ * n for n in [256, 512, 1024, 2048, 4096]] \
        -> os.system('/bin/ping ' + _)
If you want the format string testing inputs to run first (i.e. fuzz in sorted order), change the forward pipe operator from asynchronous to synchronous:
['%s', '%n', 'A', 'a', '0', '!', '$', '%', '*', '+', ',', '-', '.', '/', ':', '?', '@', '^', '_'] \
    | [_ * n for n in [256, 512, 1024, 2048, 4096]] \
        -> os.system('/bin/ping ' + _)
If you also want the length parameters to run in sorted order (i.e. '%s' * 256, '%s' * 512, '%s' * 1024, ...), change the 2nd forward pipe operator to synchronous as well:
['%s', '%n', 'A', 'a', '0', '!', '$', '%', '*', '+', ',', '-', '.', '/', ':', '?', '@', '^', '_'] \
    | [_ * n for n in [256, 512, 1024, 2048, 4096]] \
        | os.system('/bin/ping ' + _)
Keep in mind that the latter is no longer multi-threaded, because it waits for both the inputs and the length threads to finish.
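The ordering trade-off between the two operators can be illustrated in plain Python 3. This is only a rough analogy under my own assumptions (a thread pool standing in for the asynchronous operator, a plain loop standing in for the synchronous one), not a description of Pythonect's internals. The function names and the workers parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fuzz_sequential(payloads, run):
    """Call run() on each payload in turn; results come back in input order."""
    return [(p, run(p)) for p in payloads]


def fuzz_threaded(payloads, run, workers=8):
    """Call run() on each payload from a thread pool.

    Results are collected as the calls finish, so their order is not
    guaranteed to match the order of the payloads.
    """
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run, p): p for p in payloads}
        for future in as_completed(futures):
            results.append((futures[future], future.result()))
    return results
```

Both functions produce the same set of (payload, result) pairs; only the threaded one may return them out of order, which is the price paid for running the targets in parallel.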

Moving on, here is an example of a generic file format fuzzer:
open('dana.jpg', 'r').read() \
    -> itertools.permutations \
        -> open('output_' + hex(_.__hash__()) + '.jpg', 'w').write(''.join(_))
The code above reads the content of dana.jpg and passes it to itertools.permutations, which in turn yields every possible ordering of the file's bytes as a tuple the same length as dana.jpg. Bytes are treated as unique by position, so no position is reused within a tuple.
Each dana.jpg-length tuple is saved into a unique output_ prefixed file. Afterwards, testing the JPEG libraries is as easy as: eog *.jpg or zgv *.jpg
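One thing worth keeping in mind with the permutation approach is that the number of orderings grows factorially with the input length, so for a real image the flow will never run to completion; in practice you stop it after enough files have been written. Here is a small Python 3 sketch of that idea. The helper names and the limit parameter are mine, not part of Pythonect or of the example above.

```python
import itertools
import math


def count_orderings(data):
    """Number of tuples itertools.permutations yields for data: len(data)!"""
    return math.factorial(len(data))


def first_permutations(data, limit):
    """Return at most `limit` permutations of data, each joined back into bytes."""
    perms = itertools.islice(itertools.permutations(data), limit)
    return [bytes(p) for p in perms]
```

Even a 20-byte input has 20! (about 2.4 * 10^18) orderings, so capping the output with itertools.islice, or feeding in only a small slice of the file such as a header, keeps the number of generated files manageable.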

This is another example of a generic file format fuzzer:
open('dana.jpg', 'r').read() \
    -> [list(_) + [os.urandom(1) for n in xrange(0, len(_))]] \
        -> [tuple(random.sample(_, len(_)/2)) for i in xrange(0, len(_)*2)] \
            -> open('output_' + hex(_.__hash__()) + '.jpg', 'w').write(''.join(_))
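The code above reads the content of dana.jpg, generates a buffer of random bytes of the same length, joins the two, and then draws random samples from the combined buffer, each as long as dana.jpg. As written, the number of samples is len(_)*2 of the combined buffer, i.e. four times the length of dana.jpg.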
Each dana.jpg-length chunk is saved into a unique output_ prefixed file. Again, testing the JPEG libraries is as easy as: eog *.jpg or zgv *.jpg
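The same mutation idea can be sketched in plain Python 3, where file content is handled as bytes. This is an approximation of the flow above under my own assumptions: the function name mutate and the count and rng parameters are hypothetical, and the number of outputs is left to the caller instead of being derived from the file length.

```python
import os
import random


def mutate(data, count, rng=random):
    """Yield `count` byte strings, each len(data) bytes long.

    Each one is a random sample (without replacement) from a pool made of
    the original bytes plus an equal number of random bytes.
    """
    pool = list(data) + list(os.urandom(len(data)))
    for _ in range(count):
        yield bytes(rng.sample(pool, len(pool) // 2))
```

Passing a seeded random.Random instance as rng makes the sampling order repeatable, although the random half of the pool still comes from os.urandom and changes on every call.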

Last but not least, here's a network protocol (FTP) fuzzer:
ftplib.FTP('localhost') \
    -> _.login().startswith('230') \
    -> [_.mkd(s) for s in reduce(lambda x,y: x+y, map(lambda c: [chr(c) * 2**l for l in range(8,13)], xrange(1, 255)))]
The code above uses the ftplib module to connect to an FTP server, logs in as anonymous, generates strings by repeating each byte value from 1 to 254 (xrange(1, 255) stops before 255) 256, 512, 1024, 2048, and 4096 times, and passes each string as the pathname for MKD.
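The reduce/map/lambda expression is dense, so here is a Python 3 sketch of the same payload list written as a comprehension, plus a loop that records how the server reacts to each MKD. The names mkd_payloads and send_all are made up for this illustration. The ftp argument is assumed to be a logged-in ftplib.FTP object, or anything else with an mkd method.

```python
import ftplib


def mkd_payloads(first=1, last=254, exponents=range(8, 13)):
    """Each character code in [first, last] repeated 2**8 ... 2**12 times."""
    return [chr(c) * 2 ** e for c in range(first, last + 1) for e in exponents]


def send_all(ftp, payloads):
    """Send MKD for every payload and return (payload length, outcome) pairs.

    The outcome is the server reply on success, or the exception raised.
    ValueError is caught as well, on the assumption that the local ftplib
    may refuse some payloads (for example ones containing CR or LF)
    before anything is sent.
    """
    outcomes = []
    for payload in payloads:
        try:
            outcomes.append((len(payload), ftp.mkd(payload)))
        except ftplib.all_errors + (ValueError,) as exc:
            outcomes.append((len(payload), exc))
    return outcomes
```

With the defaults, mkd_payloads() returns 254 * 5 = 1270 strings. A dropped connection typically surfaces as an exception from ftp.mkd, which is usually the first hint that a payload upset the server.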

Lastly, if you have suggestions on how we can make Pythonect better, head over to Pythonect's GitHub page and create a new ticket or fork. Enjoy the examples and have fun with Pythonect!