Grey bar Blue bar
Share this:

Mon, 9 Mar 2015

Break the Web at BlackHat Singapore

Web application security training in 2015?

It's a valid question we get asked sometimes. With the amount of books available on the subject, the tools that seemingly automate the process coupled with the fact that findings bugs in web apps should be harder now that frameworks and developers are more likely to produce secure code, is there a need to still train people up in the art of application exploitation?

Our response is yes. Our application assessment course constantly changes. We look at the thousands of assessments that we perform for our customers and take those vulnerabilities discovered, new architectures and designs and try and build practical exploitation scenarios for our students. We love breaking the web, the cloud, 'the box that's hosted somewhere you can't recall but just works', as there's always new approaches and methods one can take to own the application layer.

Last month I discovered a vulnerability in Redhat's OpenStack Platform. What was cool about this vulnerability is that it's not a new class of vulnerability but when deployed in an organisation, it allows an authenticated user the ability to read files on the filesystem with the permissions of the web server. Owning organisations is all about exploiting flaws and chaining them together to achieve the end goal.

We want to teach you the same process: from learning how to own the application layer whilst having fun doing it at BlackHat Asia - Singapore.

During the course we will have a view on:

  • Introduction to the hacker mindset (more important than it sounds!)

  • Reconnaissance techniques.

...and...breaking the web...

  • Understanding and exploiting data validation issues such as SQLi, XSS, XML and LDAP injections.

  • Session Management issues...oh cookies!

  • Attacking web services.

  • Glance of client-side technologies such as Silverlight, ActiveX.

This course will be hands on. It won't just be me standing up and speaking but you learning how to own web apps and exploit common vulnerabilities just like the best.

Come and join us! It will be fun :)




Mon, 22 Apr 2013

Google Docs XSS - no bounty today

A few days ago, during one of those nights with the baby crying at 2:00 am and the only thing you can do is to read emails, I realised that Gmail shows the content of compressed files when reading them in Google Docs. As often is the case at SensePost, the "think evil (tm)" came to me and I started to ponder the possibilities of injecting HTML inside the file listing. The idea is actually rather simple. Looking at the file format of a .zip file we see the following:

Every file in the compressed file must have two entries; ZipFileRecord and ZipDirEntry. Both of these entries contain the filename, but only the first one contains the length of filename (it must match the actual length). Our first test case is obvious; if we could modify this name once the file was compressed, would Google sanitise it? Thankfully, the answer is, yes! (go Google!)

As you can see, Google shows the file name inside the compressed file but the tag is displayed with HTML entities. If we then try to see the contents of the file, Google responds by telling us it's not possible to read the content of the file (it's empty) and shows you the file "without formatting" after a few seconds:

Finally, the filename is shown but not sanitised:

Why this is possible?

Remember that the zip format has the name of the compressed files twice. Google uses the first one (ZipFileRecord) for displaying the file names, but in the vulnerable page it uses the second one (ZipDirEntry).

Possible attack vectors

Going back to the 'thinking evil (tm)' mindset, it is now possible to leave a "comprehensive" name in the first entry and inject the malicious payload in the second one. When I first discovered the possibility of doing this, I contacted Google, however, the XSS is in the domain, which Google's security team described as a "sandbox" domain (i.e. we aren't injecting into the DOM of and therefore not worthy of a bounty. Which I accept, if I had to prove usefulness this could be used as part of a simple social engineering attack, for example:

Leading the victim to my phishing site:

Which then proceeds to steals their Google session, or allows the attacker to use BeEF:

Granted, there are simpler ways of achieving the same result. I just wanted to demonstrate how you can use file meta-information for such an attack.

Thu, 3 Nov 2011

Squinting at Security Drivers and Perspective-based Biases

While doing some thinking on threat modelling I started examining what the usual drivers of security spend and controls are in an organisation. I've spent some time on multiple fronts, security management (been audited, had CIOs push for priorities), security auditing (followed workpapers and audit plans), pentesting (broke in however we could) and security consulting (tried to help people fix stuff) and even dabbled with trying to sell some security hardware. This has given me some insight (or at least an opinion) into how people have tried to justify security budgets, changes, and findings or how I tried to. This is a write up of what I believe these to be (caveat: this is my opinion). This is certainly not universalisable, i.e. it's possible to find unbiased highly experienced people, but they will still have to fight the tendencies their position puts on them. What I'd want you to take away from this is that we need to move away from using these drivers in isolation, and towards more holistic risk management techniques, of which I feel threat modelling is one (although this entry isn't about threat modelling).


The tick box monkeys themselves, they provide a useful function, and are so universally legislated and embedded in best practise, that everyone has a few decades of experience being on the giving or receiving end of a financial audit. The priorities audit reports seem to drive are:

  • Vulnerabilities in financial systems. The whole audit hierarchy was created around financial controls, and so sticks close to financial systems when venturing into IT's space. Detailed and complex collusion possibilities will be discussed when approving payments, but the fact that you can reset anyone's password at the helpdesk is sometimes missed, and more advanced attacks like token hijacking are often ignored.
  • Audit house priorities. Audit houses get driven just like anyone else. While I wasn't around for Enron, the reverberations could still be felt years later when I worked at one. What's more, audit houses are increasingly finding revenue coming from consulting gigs and need to keep their smart people happy. This leads to external audit selling "add-ons" like identity management audits (sometimes, they're even incentivised to).
  • Auditor skills. The auditor you get could be an amazing business process auditor but useless when it comes to infosec, but next year it could be the other way around. It's equally possibly with internal audit. Thus, the strengths of the auditor will determine where you get nailed the hardest.
  • The Rotation plan. This year system X, next year system Y. It doesn't mean system X has gotten better, just that they moved on. If you spend your year responding to the audit on system Y and ignore X, you'll miss vital stuff.
  • Known systems. External and internal auditors don't know IT's business in detail. There could be all sorts of critical systems (or pivot points) that are ignored because they weren't in the "flow of financial information" spread sheet.
Vendors Security vendors are the love to hate people in the infosec world. Thinking of them invokes pictures of greasy salesmen phoning your CIO to ask if your security chumps have even thought about network admission control (true story). On the other hand if you've ever been a small team trying to secure a large org, you'll know you can't do it without automation and at some point you'll need to purchase some products. Their marketing and sales people get all over the place and end up driving controls; whether it's “management by in-flight magazine”, an idea punted at a sponsored conference, or the result of a sales meeting.

But security vendors prioritisation of controls are driven by:

  • New Problems. Security products that work eventually get deployed everywhere they're going to be deployed. They continue to bring in income, but the vendor needs a new bright shiny thing they can take to their existing market and sell. Thus, new problems become new scary things that they can use to push product. Think of the Gartner hype curve. Whatever they're selling, be it DLP, NAC, DAM, APT prevention or IPS if your firewall works more like a switch and your passwords are all "P@55w0rd" then you've got other problems to focus on first.
  • Overinflated problems. Some problems really aren't as big as they're made out to be by vendors, but making them look big is a key part of the sell. Even vendors who don't mean to overinflate end up doing it just because they spend all day thinking of ways to justify (even legitimate) purchases.
  • Products as solutions. Installing a product designed to help with a problem isn't the same as fixing the problem, and vendors aren't great at seeing that (some are). Take patch management solutions, there are some really awesome, mature products out there, but if you can't work out where your machines are, how many there are or get creds to them, then you've got a long way to go before that product starts solving the problem it's supposed to.

Every year around Black Hat Vegas/Pwn2Own/AddYourConfHere time a flurry of media reports hit the public and some people go into panic mode. I remember The DNS bug, where all that was needed was for people to apply a patch, but which, due to the publicity around it, garnered a significant amount of interest from people who it usually wouldn't, and probably shouldn't have cared so much. But many pentesters trade on this publicity; and some pentesting companies use this instead of a marketing budget. That's not their only, or primary, motivation, and in the end things get fixed, new techniques shared and the world a better place. The cynical view then is that some of the motivations for vulnerability researchers, and what they end up prioritising are:

  • New Attacks. This is somewhat similar to the vendors optimising for "new problems" but not quite the same. When Errata introduced Hamster at ToorCon ‘07, I heard tales of people swearing at them from the back. I wasn't there, but I imagine some of the calls were because Layer 2 attacks have been around and well known for over a decade now. Many of us ignored FireSheep for the same reason, even if it motivated the biggest moves to SSL yet. But vuln researchers and the scene aren't interested, it needs to be shiny, new and leet . This focus on the new, and the press it drives, has defenders running around trying to fix new problems, when they haven't fixed the old ones.
  • Complex Attacks. Related to the above, a new attack can't be really basic to do well, it needs to involve considerable skill. When Mark Dowd released his highly complex flash attack, he was rightly given much kudos. An XSS attack on the other hand, was initially ignored by many. However, one lead to a wide class of prevalent vulns, while the other requires you to be, well, Mark Dowd. This mean some of the issues that should be obvious, that underpin core infrastructure, but that aren't sexy, don't get looked at.
  • Shiny Attacks. Some attacks are just really well presented and sexy. Barnaby Jack had an ATM spitting out cash and flashing "Jackpot", that's cool, and it gets a room packed full of people to hear his talk. Hopefully it lead to an improvement in security of some of the ATMs he targeted, but the vulns he exploited were the kinds of things big banks had mostly resolved already, and how many people in the audience actually worked in ATM security? I'd be interested to see if the con budget from banks increased the year of his talk, even if they didn't, I suspect many a banker went to his talk instead of one that was maybe talking about a more prevalent or relevant class of vulnerabilities their organisation may experience. Something Thinkst says much better here.
Individual Experience

Unfortunately, as human beings, our decisions are coloured by a bunch of things, which cause us to make decisions either influenced or defined by factors other than the reality we are faced with. A couple of those lead us to prioritising different security motives if decision making rests solely with one person:

  • Past Experience. Human beings develop through learning and consequences. When you were a child and put your hand on a stove hot plate, you got burned and didn't do it again. It's much the same every time you get burned by a security incident, or worse, internal political incident. There's nothing wrong with this, and it's why we value experience; people who've been burned enough times not to let mistakes happen again. However, it does mean time may be spent preventing a past wrong, rather than focusing on the most likely current wrong. For example, one company I worked with insisted on an overly burdensome set of controls to be placed between servers belonging to their security team and the rest of the company network. The reason for this was due to a previous incident years earlier, where one of these servers had been the source of a Slammer outbreak. While that network was never again a source of a virus outbreak, their network still got hit by future outbreaks from normal users, via the VPN, from business partners etc. In this instance, past experience was favoured over a comprehensive approach to the actual problem, not just the symptom.
  • New Systems. Usually, the time when the most budget is available to work on a system is during its initial deployment. This is equally true of security, and the mantra is for security to be built in at the beginning. Justifying a chunk of security work on the mainframe that's been working fine for the last 10 years on the other hand is much harder, and usually needs to hook into an existing project. The result is that it's easier to get security built into new projects than to force an organisation to make significant “security only” changes to existing systems. The result in those that present the vulnerabilities pentesters know and love get less frequently fixed.
  • Individual Motives. We're complex beings with all sorts of drivers and motivations, maybe you want to get home early to spend some time with your kids, maybe you want to impress Bob from Payroll. All sorts of things can lead to a decision that isn't necessarily the right security one. More relevantly however, security tends to operate in a fairly segmented matter, while some aspects are “common wisdom”, others seem rarely discussed. For example, the way the CISO of Car Manufacturer A and the CISO of Car Manufacturer B set up their controls and choose their focus could be completely different, but beyond general industry chit-chat, there will be little detailed discussion of how they're securing integration to their dealership network. They rely on consultants, who've seen both sides for that. Even then, one consultant may think that monitoring is the most important control at the moment, while another could think mobile security is it.
So What?

The result of all of this is that different companies and people push vastly different agendas. To figure out a strategic approach to security in your organisation, you need some objective risk based measurement that will help you secure stuff in an order that mirrors the actual risk to your environment. While it's still a black art, I believe that Threat Modelling helps a lot here, a sufficiently comprehensive methodology that takes into account all of your infrastructure (or at least admits the existence of risk contributed by systems outside of a “most critical” list) and includes valid perspectives from above tries to provide an objective version of reality that isn't as vulnerable to the single biases described above.

Sun, 29 May 2011

Incorporating cost into appsec metrics for organisations

A longish post, but this wasn't going to fit into 140 characters. This is an argument pertaining to security metrics, with a statement that using pure vulnerability count-based metrics to talk about an organisation's application (in)security is insufficient, and suggests an alternative approach. Comments welcome.

Current metrics

Metrics and statistics are certainly interesting (none of those are infosec links). Within our industry, Verizon's Data Breach Investigations Report (DBIR) makes a splash each year, and Veracode are also receiving growing recognition for their State of Software Security (SOSS). Both are interesting to read and contain much insight. The DBIR specifically examines and records metrics for breaches, a post-hoc activity that only occurs once a series of vulnerabilities have been found and exploited by ruffians, while the SOSS provides insight into the opposing end of a system's life-cycle by automatically analysing applications before they are put into production (in a perfect world... no doubt they also examine apps that are already in production). Somewhat tangentially, Dr Geer wrote recently about a different metric for measuring the overall state of Cyber Security, we're currently at a 1021.6. Oh noes!

Apart from the two bookends (SOSS and DBIR), other metrics are also published.

From a testing perspective, WhiteHat releases perhaps the most well-known set of metrics for appsec bugs, and in years gone by, Corsaire released statistics covering their customers. Also in 2008, WASC undertook a project to provide metrics with data sourced from a number of companies, however this too has not seen recent activity (last edit on the site was over a year ago). WhiteHat's metrics measure the number of serious vulnerabilities in each site (High, Critical, Urgent) and then slice and dice this based on the vulnerability's classification, the organisation's size, and the vertical within which they lie. WhiteHat is also in the fairly unique position of being able to record remediation times with a higher granularity than appsec firms that engage with customers through projects rather than service contracts. Corsaire's approach was slightly different; they recorded metrics in terms of the classification of the vulnerability, its impact and the year within which the issue was found. Their report contained similar metrics to the WhiteHat report (e.g. % of apps with XSS), but the inclusion of data from multiple years permitted them to extract trends from their data. (No doubt WhiteHat have trending data, however in the last report it was absent). Lastly, WASC's approach is very similar to WhiteHat's, in that a point in time is selected and vulnerability counts according to impact and classification are provided for that point.

Essentially, each of these approaches uses a base metric of vulnerability tallies, which are then viewed from different angles (classification, time-series, impact). While the metrics are collected per-application, they are easily aggregated into organisations.

Drawback to current approaches

Problems with just counting bugs are well known. If I ask you to rate two organisations, the Ostrogoths and the Visigoths, on their effectiveness in developing secure applications, and I tell you that the Ostrogoths have 20 critical vulnerabilities across their applications, while the Visigoths only have 5, without further data it seems that the Visigoths have the lead. However, if we introduce the fact that the Visigoths have a single application in which all 5 issues appear, while the Ostrogoths spread their 20 bugs across 10 applications, then it's not so easy to crow for the Visigoths, who average 5 bugs per application as oppossed to the Ostrogoth's 2. Most reports take this into account, and report on a percentage of applications that exhibit a particular vulnerability (also seen as the probability that a randomly selected application will exhibit that issue). Unfortunately, even taking into account the number of applications is not sufficient; an organisation with 2 brochure-ware sites does not face the same risk as an organisation with 2 transaction-supporting financial applications, and this is where appsec metrics start to fray.

In the extreme edges of ideal metrics, the ability to factor in chains of vulnerabilities that individually present little risk, but combined is greater than the sum of the parts, would be fantastic. This aspect is ignored by most (including us), as a fruitful path isn't clear.

Why count in the first place?

Let's take a step back, and consider why we produce metrics; with the amount of data floating around, it's quite easy to extract information and publish, thereby earning a few PR points. However, are the metrics meaningful? The quick test is to ask whether they support decision making. For example, does it matter that external attackers were present in an overwhelming number incidents recorded in the DBIR? I suspect that this is an easy "yes", since this metric justifies shifting priorities to extend perimeter controls rather than rolling out NAC.

One could just as easily claim that absolute bug counts are irrelevant and that they need to be relative to some other scale; commonly the number of applications an organisation has. However in this case, if the metrics don't provide enough granularity to accurately position your organisation with respect to others that you actually care about, then they're worthless to you in decision making. What drives many of our customers is not where they stand in relation to every other organisation, but specifically their peers and competitors. It's slightly ironic that oftentimes the more metrics released, the less applicable they are to individual companies. As a bank, knowing you're in the top 10% of a sample of banking organisations means something; when you're in the highest 10% of a survey that includes WebGoat clones, the results are much less clear.

In Seven Myths About Information Security Metrics, Dr Hinson raises a number of interesting points about security metrics. They're mostly applicable to security awareness, however they also carry across into other security activities. At least two serve my selfish needs, so I'll quote them here:

Myth 1: Metrics must be “objective” and “tangible”

There is a subtle but important distinction between measuring subjective factors and measuring subjectively. It is relatively easy to measure “tangible” or objective things (the number of virus incidents, or the number of people trained). This normally gives a huge bias towards such metrics in most measurement systems, and a bias against measuring intangible things (such as level of security awareness). In fact, “intangible” or subjective things can be measured objectively, but we need to be reasonably smart about it (e.g., by using interviews,surveys and audits). Given the intangible nature of security awareness, it is definitely worth putting effort into the measurement of subjective factors, rather than relying entirely on easy-to-measure but largely irrelevant objective factors. [G Hinson]


Myth 3: We need absolute measurements

For some unfathomable reason, people often assume we need “absolute measures”—height in meters, weight in pounds, etc. This is nonsense!
If I line up the people in your department against a wall, I can easily tell who is tallest, with no rulers in sight. This yet again leads to an unnecessary bias in many measurement systems. In fact, relative values are often more useful than absolute scales, especially to drive improvement. Consider this for instance: “Tell me, on an (arbitrary) scale from one to ten, how security aware are the people in your department are? OK, I'll be back next month to ask you the same question!” We need not define the scale formally, as long as the person being asked (a) has his own mental model of the processes and (b) appreciates the need to improve them. We needn't even worry about minor variations in the scoring scale from month to month, as long as our objective of promoting improvement is met. Benchmarking and best practice transfer are good examples of this kind of thinking. “I don't expect us to be perfect, but I'd like us to be at least as good as standard X or company Y. [G Hinson]

While he writes from the view of an organisation trying to decide whether their security awareness program is yielding dividends, the core statements are applicable for organisations seeking to determine the efficacy of their software security program. I'm particularly drawn by two points: the first is that intangibles are as useful as concrete metrics, and the second is that absolute measurements aren't necessary, comparative ordering is sometimes enough.

Considering cost

It seems that one of the intangibles that currently published appsec metrics don't take into account, is cost to the attacker. No doubt behind each vulnerability's single impact rating are a multitude of factors that contribute, one of which may be something like "Complexity" or "Ease of Exploitation". However, measuring effort in this way is qualitative and only used as a component in the final rating. I'm suggesting that cost (interchangeable with effort) be incorporated into the base metric used when slicing datasets into views. This will allow you to understand the determination an attacker would require when facing one of your applications. Penetration testing companies are in a unique position to provide this estimate; a tester unleashed on an application project is time-bounded and throws their experience and knowledge at the app. At the end, one can start to estimate how much effort was required to produce the findings and, over time, gauge whether your testers are increasing their effort to find issues (stated differently, do they find fewer bugs in the same amount of time?). If these metrics don't move in the right direction, then one might conclude that your security practices are also not improving (providing material for decision making).

Measuring effort, or attacker cost, is not new to security but it's mostly done indirectly through the sale of exploits (e.g. iDefence, ZDI). Even here, effort is not directly related to the purchase price, which is also influenced by other factors such as the number of deployed targets etc. In any case, for custom applications that testers are mostly presented with, such public sources should be of little help (if your testers are submitting findings to ZDI, you have bigger problems). Every now and then, an exploit dev team will mention how long it took them to write an exploit for some weird Windows bug; these are always interesting data points, but are not specific enough for customers and the sample size is low.

Ideally, any measure of an attacker's cost can take into account both time and their exclusivity (or experience), however in practice this will be tough to gather from your testers. One could base it on their hourly rate, if your testing company differentiates between resources. In cases where they don't, or you're seeking to keep the metric simple, then another estimate for effort is the number of days spent on testing.

Returning to our sample companies, if the 5 vulnerabilities exposed in the Visigoth's each required, on average, a single day to find, while the Ostrogoth's 20 bugs average 5 days each, then the effort required by an attacker is minimised by choosing to target the Visigoths. In other words, one might argue that the Visigoths are more at risk than the Ostrogoths.

Metricload, take 1

In our first stab at incorporating effort, we selected an estimator of findings-per-day (or finding rate) to be the base metric against which the impact, classification, time-series and vertical attributes would be measured. From this, it's apparent that, subject to some minimum, the number of assessments performed is less important than the number of days worked. I don't yet have a way to answer what the minimum number of assessments should be, but it's clear that comparing two organisations where one has engaged with us 17 times and the other once, won't yield reliable results.

With this base metric, it's then possible to capture historical assessment data and provide both internal-looking metrics for an organisation as well as comparative metrics, if the testing company is also employed by your competitors. Internal metrics are the usual kinds (impact, classification, time-series), but the comparison option is very interesting. We're in the fortunate position of working with many top companies locally, and are able to compare competitors using this metric as a base. The actual ranking formulae is largely unimportant here. Naturally, data must be anonymised so as to protect names; one could provide the customer with their rank only. In this way, the customer has an independent notion of how their security activities rate against their peers without embarrassing the peers.

Inverting the findings-per-day metric provide the average number of days to find a particular class of vulnerability, or impact level. That is, if a client averages 0.7 High or Critical findings per testing day, then on average it takes us 1.4 days of testing to find an issue of great concern, which is an easy way of expressing the base metric.

What, me worry?

Without doubt, the findings-per-day estimator has drawbacks. For one, it doesn't take into consideration the tester's skill level (but this is also true of all appsec metrics published). This could be extended to include things like hourly rates, which indirectly measure skill. Also, the metric does not take into account functionality exposed by the site; if an organisation has only brochure-ware sites then it's unfair to compare them against transactional sites; this is mitigated at the time of analysis by comparing against peers rather than the entire sample group and also, to a degree, in the scoping of the project as a brochure-ware site would receive minimum testing time if scoped correctly.

As mentioned above, a minimum number of assessments would be needed before the metric is reliable; this is a hint at the deeper problems that randomly selected project days are not independent. An analyst stuck on a 4 week project is focused on a very small part of the broader organisation's application landscape. We counter this bias by including as many projects of the same type as possible.


If you can tease it out of them, finding rates could be an interesting method of comparing competing testing companies; ask "when testing companies of our size and vertical, what is your finding rate?", though there'd be little way to verify any claims. Can you foresee a day when testing companies advertise using their finding rate as the primary message? Perhaps...

This metric would also be very useful to include in each subsequent report for the customer, with every report containing an evaluation against their longterm vulnerability averages.

Field testing

Using the above findings-per-day metric as a base, we performed an historical analysis for a client on work performed over a number of years, with a focus on answering the following questions for them:
  1. On average, how long does it take to find issues at each Impact level (Critical down to Informational)?
  2. What are the trends for the various vulnerability classes? Does it take more or less time to find them year-on-year?
  3. What are the Top 10 issues they're currently facing?
  4. Where do they stand in relation to anonymised competitor data?
In preparation for the exercise, we had to capture a decent number of past reports, which was most time-consuming. What this highlighted for us was how paper-based reports and reporting is a serious hinderance to extracting useful data, and has provided impetus internally for us to look into alternatives. The derived statistics were presented to the client in a workshop, with representatives from a number of the customer's teams present. We had little insight into the background to many of the projects, and it was very interesting to hear the analysis and opinions that emerged as they digested the information. For example, one set of applications exhibited particularly poor metrics from a security standpoint. Someone highlighted the fact that these were outsourced applications, which raised a discussion within the client about the pros and cons on using third party developers. It also suggests that many further attributes can be attached to the data that is captured: internal or third party, development lifecycle model (is agile producing better code for you than other models?), team size, platforms, languages, frameworks etc.

As mentioned above, a key test for metrics is where they support decision making, and the feedback from the client was positive in this regard.

And now?

In summary, current security metrics as they relate to talking about an organisation's application security suffers from a resolution problem; they're not clear enough. Attacker effort is not modeled when discussing vulnerabilities, even though it's a significant factor when trying to get a handle on the ever slippery notion of risk. One approximation for attacker effort is to create a base-metric of the number of findings-per-day for a broad set of applications belonging to an organisation, and use those to evaluate which kinds of vulnerabilities are typically present while at the same time clarifying how much effort an attacker requires in order to exploit it.

This idea is still being fleshed out. If you're aware of previous work in this regard or have suggestions on how to improve it (even abandon it) please get in contact.

Oh, and if you've read this far and are looking for training, we're at BH in August.

Thu, 10 Jun 2010

SensePost Corporate Threat(Risk) Modeler

Since joining SensePost I've had a chance to get down and dirty with the threat modeling tool. The original principle behind the tool, first released in 2007 at CSI NetSec, was to throw out existing threat modeling techniques (it's really attack-focused risk) and start from scratch. It's a good idea and the SensePost approach fits nicely between the heavily formalised models like Octave and the quick-n-dirty's like attack trees. It allows fairly simple modeling of the organisation/system to quickly produce an exponentially larger list of possible risks and rank them.

We've had some time and a few bits of practical work to enhance the tool and our thinking about it. At first, I thought it would need an overhaul, mostly because I didn't like the terminology (hat tip to Mr Bejtlich). But, in testament to Charl's original thinking & the flexibility of the tool, no significant changes to the code were required. We're happy to announce version 2.1 is now available at our new tools page. In addition, much of our exploration of other threat modeling techniques was converted into a workshop of which the slides are available (approx 30MB).

The majority of the changes were in the equation. The discussion below will give you a good idea of how you can play with the equation to fundamentally change how the tool works.

There are 5 values you can play with in the equation:

  1. imp - the impact of a risk being realised
  2. lik - the likelihood of the risk occurring
  3. int - the value of an asset (represented by an interface to that asset)
  4. usr & loc - the measurable trust placed in a user & location respectively
The current default formula is:

In English that translates to: The risk is equal to; the average of the impact of the attack and it's likelihood, combined with the value of the asset (exposed through a particular interface), and reduced by the trust of the user performing the attack and the location they are performing it from.

We felt there were two problems with this equation:

  1. It doesn't acknowledge impact as linked to value. e.g. You can't have a huge impact on something of low value.
  2. It doesn't see trust as linked to likelihood. e.g. a trusted user in a trusted location is less likely to commit an attack.
  3. It double weights trust with location and user trust counting at full weight.
  4. It's maybe a little far from semi-consensual views on the subject
After much internal wrangling, and some actual work on modeling fairly complex stuff, we came up with a new equation. While we feel this works better, it does mean the way things are modeled changes, and hence backwards compatibility with existing models is broken (but you don't need to use this equation). The new equation (consider the risk= implied) is:

Once again in English: The risk of an attack is; the likelihood of the attack reduced by the average of both the trust in the user & location, combined with, the value of the asset reduced by the potential impact of the attack (value at risk). (The 0.2 & 2.5 are just to make it fit the scales. Specifically, the 0.2 is because the scale of the entities is 1-5 and we're looking to make a percentage, and the 2.5 is to fit the 0-25 scale on the final graph.)

The key change which breaks backward compatibility here is that impact now becomes a moderator on value. i.e. the impact of an attack determines how much of the asset's value is exposed.

The way things are now modeled, interfaces represent the value of a system. For the most part, all a system's interfaces should have the same value, because as we often see, even minor interfaces that expose limited functionality can often be abused for a full compromise. However, the actual attack (called threats in the tool) determined how much of that value is exposed. For example, a worst-case XSS is (depending on the system of course) probably going to expose less of the system's value than a malicious sysadmin publicly pwning it (once again, dependent on the system and controls in place).

Unfortunately, there's still no provable way to perform threat modeling, but we feel we can go quite far in providing a quick and useful way of enumerating and prioritising attacks (and hence defenses) across complex system.

In a future blog post, I hope to cover some of the really cool scenario planning the tool can let you do, and the pretty graphs it gave us an excuse to justify budgets with.

[ Credit to the Online LaTeX Equation Editor for the formulas, although if you'd like to copy paste the formula described above into the tool, here's an ascii version:

( ( ( lik * ( ( ( (6 - usr) + (6 - loc) ) / 2 ) * 0.2 ) ) + ( int * ( imp * 0.2 ) ) ) * 2.5 )