Sat, 7 Aug 2010

BlackHat Write-up: go-derper and mining memcaches

[Update: Disclosure and other points discussed in a little more detail here.]

Why memcached?

At BlackHat USA last year we spoke about attacking cloud systems; while the thinking was broadly applicable, we focused on specific providers (overview). This year we continued in the same vein, except that we focused on a particular piece of software used in numerous large-scale applications, including many cloud services. In the realm of "software that enables cloud services", there appears to be a handful of "go to" applications that are consistently re-used, and it's curious that a security practitioner's perspective has not as yet been applied to them (disclaimer: I'm not aware of parallel work).

We chose to look at memcached, a "Free & open source, high-performance, distributed memory object caching system" [1]. It's not outwardly sexy from a security standpoint and it doesn't have a large and exposed codebase (total LOC is a smidge over 11k). However, what's of interest is the type of application in which memcached is deployed. Memcached is most often used in web applications to speed up page loads. These sites are almost [2] always dynamic and either have many clients (i.e. require horizontal scaling) or process piles of data (and look to reduce processing time), or oftentimes both. This implies that sites using memcached contain more interesting info than simple static sites, so the presence of memcached is an indicator of a potentially interesting site. Prominent users of memcached include LiveJournal (memcached was originally written by Brad Fitzpatrick for LJ), Wikipedia, Flickr, YouTube and Twitter.

I won't go into how memcached works; suffice it to say that, since data tends to be read more often than written in common use cases, the idea is to pre-render and store the finalised content inside the in-memory cache. When future requests ask for the page or data, it doesn't need to be regenerated but can simply be regurgitated from the cache. The memcached wiki contains more background.
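To make the read-mostly idea concrete, here is a minimal sketch of the caching pattern described above. The PageCache class and its names are made up for illustration, and a plain Hash stands in for a memcached client so the snippet runs without a server.

```ruby
# Minimal cache-aside sketch. A Hash stands in for the memcached client;
# a real deployment would call a client library's get/set instead.
# PageCache is a hypothetical name used only for this illustration.
class PageCache
  attr_reader :renders

  def initialize(store = {})
    @store = store   # stand-in for memcached
    @renders = 0     # counts how often the expensive path runs
  end

  # Return the cached page if present, otherwise render it and store it.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @renders += 1
    @store[key] = yield
  end
end

cache = PageCache.new
2.times { cache.fetch("page:/home") { "<html>rendered home page</html>" } }
puts cache.renders # prints 1: the second request was served from the cache
```

The point for an attacker is the flip side of this convenience: whatever the application considered worth pre-rendering is sitting in memory in its finished form.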


We released go-derper, a tool for playing with memcached instances. It supports three basic modes of operation:
  1. Fingerprinting memcacheds to determine interesting servers
  2. Extracting a (user-limited) copy of the cache
  3. Writing data into the cache
The tool has minor requirements: a recent Ruby and the memcache-client gem. What follows are basic use cases.


Let's assume you've scanned a hosting provider and found 239 potential targets using a basic .nse that hunts down open memcached instances [3]. You need to separate the wheat from the chaff and figure out which servers are potentially interesting; one way to do that is by extracting a bunch of metrics from each cache. Start small against one cache:

insurrection:demo marco$ ruby go-derper.rb -f x.x.x.x
[i] Scanning x.x.x.x
x.x.x.x:11211
==============================
memcached 1.4.5 (1064) up 54:10:01:27, sys time Wed Aug 04 10:34:36 +0200 2010, utime=369388.17, stime=520925.98
Mem: Max 1024.00 MB, max item size = 1024.00 KB
Network: curr conn 18, bytes read 44.69 TB, bytes written 695.93 GB
Cache: get 514, set 93.41b, bytes stored 825.73 MB, curr item count 1.54m, total items 1.54m, total slabs 3
Stats capabilities: (stat) slabs settings items (set) (get)

44 terabytes read from the cache in 54 days with 1.5 million items stored? This cache is used quite frequently. There's an anomaly here in that the cache reports only 514 reads with 93 billion writes; however it's still worth exploring if only for the size.
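For readers who want to see where numbers like these come from, the sketch below parses the reply to memcached's text-protocol "stats" command ("STAT name value" lines terminated by "END"). It is not go-derper's own code; the helper names and the sample reply are made up, and fetch_stats is defined but never called so the snippet runs without a server.

```ruby
require "socket"

# Ask a memcached server for its general statistics over the text protocol.
# Not called below; shown only to illustrate the exchange.
def fetch_stats(host, port = 11211)
  TCPSocket.open(host, port) do |sock|
    sock.write("stats\r\n")
    raw = +""
    while (line = sock.gets)
      raw << line
      break if line.strip == "END"
    end
    raw
  end
end

# Turn "STAT <name> <value>" lines into a Hash of strings.
def parse_stats(raw)
  raw.each_line.with_object({}) do |line, stats|
    _, name, value = line.strip.split(" ", 3)
    stats[name] = value if line.start_with?("STAT ")
  end
end

# Render a byte count with a binary unit suffix.
def human_bytes(n)
  units = %w[B KB MB GB TB]
  value = n.to_f
  unit = units.shift
  while value >= 1024 && !units.empty?
    value /= 1024
    unit = units.shift
  end
  format("%.2f %s", value, unit)
end

# Invented sample reply, trimmed to a few real stat names.
sample = "STAT version 1.4.5\r\nSTAT curr_connections 18\r\n" \
         "STAT cmd_get 514\r\nSTAT limit_maxbytes 1073741824\r\nEND\r\n"
stats = parse_stats(sample)
puts "memcached #{stats['version']}, max #{human_bytes(stats['limit_maxbytes'].to_i)}, " \
     "curr conn #{stats['curr_connections']}"
```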

We can run the same fingerprint scan against multiple hosts using

ruby go-derper.rb -f host1,host2,host3,...,hostn

or, if the hosts are in a file (one per line):

ruby go-derper.rb -F file_with_target_hosts

Output is either human-readable multiline (the default), or CSV. The latter helps for quickly rearranging and sorting the output to determine potential targets, and is enabled with the "-c" switch:

ruby go-derper.rb -c csv -f host1,host2,host3,...,hostn

Lastly, the monitor mode (-m) will loop forever while retrieving certain statistics and keep track of differences between iterations, in order to determine whether the cache appears to be in active use.
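The idea behind monitoring can be sketched as a diff between two snapshots of the counters. This is an illustration only, not the tool's implementation: the choice of counters and the "any counter moved" test are assumptions, though the stat names themselves are standard memcached counters.

```ruby
# Counters compared between snapshots (an assumed selection).
COUNTERS = %w[cmd_get cmd_set bytes_read bytes_written total_items].freeze

# Difference per counter between two stats snapshots (Hashes of strings).
def stat_deltas(previous, current, counters = COUNTERS)
  counters.each_with_object({}) do |name, deltas|
    deltas[name] = current.fetch(name, 0).to_i - previous.fetch(name, 0).to_i
  end
end

# Treat the cache as in active use if any counter moved forward.
def active?(deltas)
  deltas.values.any?(&:positive?)
end

# Invented snapshots taken a few seconds apart.
before = { "cmd_get" => "514", "cmd_set" => "1000", "bytes_read" => "4096" }
after  = { "cmd_get" => "514", "cmd_set" => "1250", "bytes_read" => "9216" }

deltas = stat_deltas(before, after)
puts deltas.inspect
puts active?(deltas) ? "cache appears active" : "cache appears idle"
```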


Once you've identified a potentially interesting target, it's time to mine that cache. The basic leech switch is "-l":

insurrection:demo marco$ ruby go-derper.rb -l -s x.x.x.x
[w] No output directory specified, defaulting to ./output
[w] No prefix supplied, using "run1"

This will extract data from the cache in the form of a key and its value, and save each value in a file under the "./output" directory by default (if this directory doesn't exist the tool will exit, so make sure it's present). This means a separate file is created for every retrieved value. Output directories and file prefixes are adjustable with "-o" and "-r" respectively; however, it's usually safe to leave these alone.

By default, go-derper fetches 10 keys per slab (see the memcached docs for a discussion on slabs; basically similar-sized entries are grouped together.) This default is intentionally low; on an actual assessment this could run into six figures. Use the "-K" switch to adjust:

ruby go-derper.rb -l -K 100 -s x.x.x.x
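One way key names can be enumerated per slab is with memcached's "stats items" and "stats cachedump <slab> <limit>" debugging commands. The sketch below only parses their replies; the sample replies are invented, the helper names are hypothetical, and whether go-derper issues exactly these commands is an assumption.

```ruby
# Slab class ids that currently hold items, from a "stats items" reply.
def slab_ids(stats_items_raw)
  stats_items_raw.scan(/^STAT items:(\d+):number \d+/).flatten.map(&:to_i).uniq
end

# One cachedump command per slab, capped at keys_per_slab keys each.
def cachedump_commands(ids, keys_per_slab = 10)
  ids.map { |id| "stats cachedump #{id} #{keys_per_slab}\r\n" }
end

# Key names, sizes and expiry times from a "stats cachedump" reply.
def cachedump_keys(cachedump_raw)
  cachedump_raw.scan(/^ITEM (\S+) \[(\d+) b; (\d+) s\]/).map do |key, size, expiry|
    { key: key, bytes: size.to_i, expiry: expiry.to_i }
  end
end

# Invented sample replies.
items_raw = "STAT items:1:number 3\r\nSTAT items:1:age 120\r\n" \
            "STAT items:5:number 1\r\nEND\r\n"
dump_raw  = "ITEM session:abc [42 b; 1280910876 s]\r\nITEM user:17 [128 b; 0 s]\r\nEND\r\n"

puts cachedump_commands(slab_ids(items_raw), 100).inspect
cachedump_keys(dump_raw).each { |entry| puts entry.inspect }
```

Each key found this way can then be fetched with an ordinary "get", which is what makes a per-slab key limit a useful throttle on large caches.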

As mentioned, retrieved data is stored in the "./output" directory (or elsewhere if "-o" is used). Within this directory, each new run of the tool produces a set of files prefixed with "runN" in order to keep multiple runs separate. The files produced are:

  • runN-index, an index file containing metadata about each entry retrieved
  • runN-<md5>, a file containing the bytestream from a retrieved value
The mapping between key and file in which the value is stored occurs in the index file, which is useful in that potentially malicious data (keynames) aren't used when interacting with your local filesystem APIs.
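The benefit of keeping key names out of file paths can be sketched as follows. This is an illustration under assumptions: the text does not say what the MD5 is computed over (the key is assumed here), and the tab-separated index layout is hypothetical rather than go-derper's actual format.

```ruby
require "digest/md5"
require "tmpdir"

# Store a value under a digest-derived filename and record the key in an index.
# The attacker-controlled key never becomes part of a path.
def save_entry(dir, prefix, server, key, value)
  name = "#{prefix}-#{Digest::MD5.hexdigest(key)}" # assumption: digest of the key
  File.binwrite(File.join(dir, name), value)
  File.open(File.join(dir, "#{prefix}-index"), "a") do |index|
    # key.inspect escapes newlines and tabs so one entry stays on one line
    index.puts([name, server, key.inspect].join("\t"))
  end
  name
end

Dir.mktmpdir do |dir|
  # A hostile key name that would escape the output directory if used as a path.
  name = save_entry(dir, "run1", "x.x.x.x:11211", "../../etc/passwd", "cached value")
  puts name
  puts File.read(File.join(dir, "run1-index"))
end
```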

At this point, there will (hopefully) be a large number of files in your output directory, which may contain useful info. Start grepping.

What we found with a bit of field experience was that mining large caches can take some time, and repeating grep gets quite boring. The tool permits you to supply your own set of regular expressions which will be applied to each retrieved value; matches are printed to the screen and this provides a scroll-by view of bits of data that may pique your interest (things like URLs, email addresses, session IDs, strings starting with "user", "pass" or "auth", cookies, IP addresses etc). The "-R" switch enables this feature and takes a file containing regexes as its sole argument:

ruby go-derper.rb -l -K 100 -R regexs.txt -s x.x.x.x
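As a rough illustration of the regex pass, the sketch below compiles one pattern per line and reports every match in a retrieved value. The one-regex-per-line file format, the helper names and the sample patterns are assumptions, not taken from the tool.

```ruby
# Compile one regular expression per non-empty line.
def load_patterns(lines)
  lines.map(&:strip).reject(&:empty?).map { |line| Regexp.new(line) }
end

# Return [key, pattern, matched text] for every match in the value.
def scan_value(key, value, patterns)
  text = value.scrub # cached values may be binary; replace invalid bytes first
  hits = []
  patterns.each do |pattern|
    text.scan(pattern) { hits << [key, pattern.source, Regexp.last_match[0]] }
  end
  hits
end

# Stand-in for the lines of a regex file; single quotes keep backslashes literal.
pattern_lines = [
  '[\w.+-]+@[\w-]+\.[\w.-]+', # email addresses
  'session_id=\w+'            # session identifiers
]
patterns = load_patterns(pattern_lines)

scan_value("run1-key", "user=bob@example.com; session_id=abc123", patterns).each do |key, source, match|
  puts "#{key}: /#{source}/ matched #{match}"
end
```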


In this blog entry I don't cover the kinds of data we discovered (that will be the subject of a separate entry); however, you may discover an interesting cache entry that you'd like to overwrite. Recall that entries were stored in "./output" by default, with a prefix of "runN". If the interesting entry was stored in "output/run1-e94aae85bd3469d929727bee5009dddd", edit the file in whatever manner you see fit and save it to your local disk. Then, tell go-derper to write the entry back into the cache with:

ruby go-derper.rb -w output/run1-e94aae85bd3469d929727bee5009dddd

This syntax is simple since go-derper will figure out the target server and key from the run's index file.
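At the protocol level, overwriting an entry comes down to a "set" command: a header line carrying the key, flags, expiry and byte count, followed by the data block. The sketch below only builds that command; the helper name and the default flags and expiry are assumptions, and go-derper itself goes through a client library rather than raw sockets.

```ruby
# Build a memcached text-protocol "set" command for the given key and value.
def build_set_command(key, value, flags: 0, exptime: 0)
  # memcached keys are at most 250 bytes with no whitespace or control characters
  if key.empty? || key.bytesize > 250 || key.match?(/\s/) || key.match?(/[[:cntrl:]]/)
    raise ArgumentError, "invalid key"
  end

  data = value.b # treat the value as raw bytes
  # Note: client libraries often use flags to mark serialised or compressed
  # values, so a faithful overwrite may need the original entry's flags.
  "set #{key} #{flags} #{exptime} #{data.bytesize}\r\n".b + data + "\r\n".b
end

cmd = build_set_command("page:/home", "<html>edited</html>")
puts cmd.inspect
# A server that accepts the write replies with "STORED\r\n".
```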

And so?

Go-derper permits basic manipulations of a memcached instance. We haven't covered finding open instances or the kinds of data one may come across; these will be the subject of followup posts. Below are the slides from the talk; click through to SlideShare for the downloadable PDF.

[2] We're hedging here, but we've not come across a static memcached site.

[3] If so, you may be as surprised as we were in finding this many open instances.

Memcached talk update

Wow. At some point our talk hit HackerNews and then SlashDot after swirling around the Twitters for a few days. The attention is quite astounding given the relative lack of technical sexiness to this; explanations for the interest are welcome!

We wanted to highlight a few points that didn't make the slides but were mentioned in the talk:

  • and GoWalla repaired the flaws extremely quickly, prior to the talk.
  • PBS didn't get back to us.
  • GlobWorld is in beta and isn't publicly available yet.
For those blaming admins or developers, I think the criticism is overly harsh (certainly I'm not much of a dev as the "go-derper" source will show). The issues we found were in cloud-based systems and an important differentiating factor between deploying apps on local systems as opposed to in the cloud is that developers become responsible for security issues that were never within their job descriptions; network-level security is oftentimes a foreign language to developers who are more familiar with app-level controls. With cloud deployments (such as those found in small startups without dedicated network-security people) the devs have to figure all this out.

The potential risk assigned to exposed memcacheds hasn't as yet been publicly demonstrated so it's unsurprising that you'll find memcacheds around. I imagine this issue will flare and be hunted into extinction, at least on the public interwebs.

Lastly, the major interest seems to be on mining data from exposed caches. An equally disturbing issue is overwriting entries in the cache and this shouldn't be underestimated.

Mon, 31 May 2010

SensePost at BlackHat USA 2010

A brief update from South Africa on some recent talks as well as the upcoming BH USA: our talk proposal has been accepted for BH USA 2010 which makes it the ninth year running that SensePost is talking in Las Vegas. One more and we qualify for free milkshakes at the Peppermill. This year we'll be discussing caching in large scale web apps and why exposing caches to the interwebs is a Very Bad Thing. We'll also be looking at caching services, an idea whose time should never come.

This is a follow-on to last year's talk on hacking cloud providers, which was subsequently the topic of invited talks at TROOPERS10, CSI Filter, a BH Webcast and IS Labs. The talk generated much interest and we got fair mileage from it. This year's talk is a natural extension; we're poking at some of the technologies used under the hood to build large apps in the cloud.

Finally, mandatory shameless training plug (or I get fired): we're also training in Vegas.

Fri, 29 Jan 2010

Is the writing on the wall for general purpose computing?

The Apple iPad announcement set the interwebs alight, and there is no shortage of people blogging or tweeting about how it will or won't change their lives. I'm going to ignore those topics almost completely to make one of those predictions that serve mainly to let people laugh at me later for being so totally wrong..

Here's my vision.. It's not just the hipsters and college kids who get iPads, it's the execs and CEOs. They are happy for a short while using it just as an e-reader, movie watcher and couch-based web browser, but the app store keeps growing to support the new form factor. Apps like iWork for iPad (at only $10) mean that sooner or later they are relatively comfortable spreadsheeting or document pushing on their iPad.. It doesn't take too long for them to realize that they don't have much heavier computing requirements anyway, and besides.. the instant-on experience is what they always wanted..

Now despite the fact that it didn't take people like taviso or Charlie Miller long to exploit the iPhone, the device's security model does present a security benefit over the traditional end-user computing model. Sandboxed applications, signed-code restrictions and a rudimentary app store check mean that the device has not been hammered with malware or exploited en masse. Now the CEO hears the CFO complaining about his latest desktop virus episode, or patch-day drama. "If only your desktop could work like my tablet..". Apple currently runs OS X, iPhone OS for iPad and iPhone OS for Touch/iPhones. Why not a version of iPhone OS that runs on its desktops?

You get the App store and access to all the apps across all your devices.. and its pretty, and it just works..

At this point I have to misquote Martin Niemöller: First they came for the mp3 players, and I did not speak out - because I never really had one before anyway. Then they came for the cell phones, and I did not speak out - because it was really cool. Then they came for the tablets, and I did not speak out - because it was just a tablet. Then they came for our desktops - and it made perfect sense...

Security practitioners have long lamented the fact that we seem to be losing the war. Too much runs on our machines, the surface area is too large to defend, and bad code is being written and deployed faster than we can test it.. Moving iPhone OS to the desktop allows a contained, controlled computing platform that has the potential to be pushed through the organization from the top down. I think this is an important difference. Techies and geeks can debate the pros and cons of wireless for ages, but it just takes one member of exco to need it and wireless deployments will happen. CEOs and execs with iPads will push cloud and tablet computing at a quick pace too. Despite the relatively tame initial response to the iPad, the stars seem well aligned for this to be an inflection point that leaves us with fewer computers and more consumer electronic devices.

Of course all this comes at a cost.. You trade some measure of control and surrender to the will of our Cupertino overlords..

-shrug- or maybe I'm just smoking my socks... :>


Wed, 16 Dec 2009

We are famous (almost!)

Last week two "cloud-security" related articles hit the inter-webs.. After our Vegas09 talk on "clobbering the cloud" we had a brief chat with Rob Lemos, who called us up again, so we ended up adding a soundbite to his piece in Technology Review, alongside guys like Moxie Marlinspike and Danny MacPherson.

We also showed up on Read/Write Web, where we were called "security nerds" and "black hats"

Ahhh.. roll on 2010!