Risky Business


Risk is a topic that I like to talk about a lot, mainly because I managed to get it ‘wrong’ for a very long time, and when I finally did realise what I was missing, everything else I struggled with fell into place around it. For me, therefore, Risk is the tiny cog in the big machine that, if it is not understood, greased and maintained, will snarl up everything else.

In the early days of my career, risk was something to be avoided, whatever the cost. Or rather, it needed to be Managed, Avoided, Transferred or Accepted down to the lowest possible levels across the board. Of course, I wasn't so naive as to think all risks could be reduced to nothing, but they had to be reduced, and "accepting" a risk was what you did once it had been reduced. Imagine my surprise when I learned that you could "accept" a risk before you had even treated it!

There are many aspects of risk that everyone should understand before they start their risk management programme, in whatever capacity they are in, but here are my top three:

Accepting the risk

If you want to know how not to accept a risk, look no further than this short music video (which I have no affiliation with, honestly). Just accepting something because it is easy, and because you get to blame your predecessor or team, is no way to deal with risks. Crucially, there is no reason why high-level risks cannot be accepted, as long as whoever accepts them is qualified to do so, cognisant of the potential fallout, and senior enough to have the authority to make that call. Certain activities and technologies are inherently high risk; think of AI, IoT, or oil and politics in Russia. But that doesn't mean you should not be doing those activities.

A company that doesn’t take risks is a company that doesn’t grow, and security risks are not the only ones that are being managed daily by the company leadership. Financial, geographic, market, people, and legal risks are just some things that need to be reviewed.

Your role as the security risk expert in your organisation is to deliver the measurement of the risks as clearly as possible. That includes ensuring everyone understands how the score is derived, the logic behind it and the implications of that score. This brings us neatly to the second "Top Tip":

Measuring the risk

Much has been written about how risks should be measured: quantitatively or qualitatively, financially or reputationally, for instance. Should you use a red/amber/green approach to scoring, a percentage, or a figure out of five? What is the best way to present it? In Word, PowerPoint or Excel? (Other popular office software is available.)

The reality is that, surprisingly, it doesn’t matter. What matters is choosing an approach and giving it a go; see if it works for you and your organisation. If it doesn’t, then look at different ways and methods. Throughout it all, however, it is vital that everyone involved in creating, owning and using the approach knows precisely how it works, what the assumptions are, and the implications of decisions being made from the information presented.

Nothing exemplifies this more than the NASA approach to risk. Now, you would expect NASA, having the tough job of putting people into space via some of the most complicated machines in the world, to have a very rigorous, detailed and even complex approach to risk; after all, people's lives are at stake. And yet their risk matrix comprises a five-by-five grid with probability on one axis and consequence on the other. Each cell in the grid is then scored Low, Medium or High:

Seriously. That's it. It doesn't get much simpler than that. However, a 30-page supporting document explains precisely how the scores are derived, how probability and consequence should be measured, how the results can be verified, and so on. The simple measurement itself is not what is important; it is what sits behind it that is.
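To make that shape concrete, here is a minimal sketch in Python of a five-by-five probability/consequence grid banded Low, Medium or High. The band thresholds here are my own illustrative assumptions, not NASA's actual scoring rules:

```python
# Illustrative 5x5 risk matrix: probability (1-5) on one axis, consequence (1-5)
# on the other, each cell banded Low/Medium/High. The thresholds are assumptions
# for illustration only, not NASA's real criteria.

def matrix_band(probability: int, consequence: int) -> str:
    """Return an illustrative Low/Medium/High band for a cell of the 5x5 grid."""
    if not (1 <= probability <= 5 and 1 <= consequence <= 5):
        raise ValueError("probability and consequence must be between 1 and 5")
    score = probability * consequence   # 1 (remote/negligible) .. 25 (probable/catastrophic)
    if score >= 15:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"

print(matrix_band(probability=5, consequence=5))  # High - the top right-hand box
print(matrix_band(probability=1, consequence=2))  # Low
```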

Incidents and risk

Just because you understand risk now does not mean you can predict everything that might happen to you. For example, "Black Swan" events (from Nassim Nicholas Taleb's book of the same name) cannot be predicted; they only become apparent once they are already happening.

By this very fact, creating a risk register to predict unpredictable, potentially catastrophic events seems pointless. However, that is not how a good approach to risk works. Your register allows you to continuously update the organisational viewpoint on risk. This provides supporting evidence of your security function's work in addressing said risks and enables you to help define a consensus view of the business's risk appetite.

When a Black Swan event subsequently occurs (and it will), the incident response function will step up and address it as it would any incident. Learning points and advisories would be produced as part of the documented procedures they follow (You have these, right?), including future areas to look out for. This output must be reviewed and included in the risk register as appropriate. The risk register is then reviewed annually (or more frequently as required), and controls are updated, added or removed to reflect the current risk environment and appetite. Finally, the incident response team will review the risk register, safe in the knowledge it contains fresh and relevant data, and ensure their procedures and documentation are updated to reflect the most current risk environment.

Only by having an interconnected and symbiotic relationship between the risk function and the incident response function will you benefit most from understanding and communicating risks to the business.

So there you have it, three things to remember about risk that will help you not only be more effective when dealing with the inevitable incident but also help you communicate business benefits and support the demands of any modern business.

Risk is not a dirty word.


Shameless Coronavirus Special Promotion – Risk Edition!

Many, many moons ago, my good friend and learned colleague Javvad Malik and I came up with a way to explain how a risk model works by using an analogy to a pub fight. I have used it in a presentation that has been given several times, and the analogy has really helped people understand risk, and especially risk appetite, more clearly (or so they tell me). I wrote a brief overview of the presentation and the included risk model in this blog some years back.

And now the Coronavirus has hit humanity AND the information security industry. Everyone is losing their minds deciding if they should self-isolate, quarantine, or even just generally ignore advice from the World Health Organisation (like some governments have shown a propensity to do), carry on as usual and listen to the Twitter experts. During a conversation of this nature, Javvad and I realised that the Langford/Malik model could be repurposed not only to help those who struggle with risk generally (most humans) but also those in our own industry who really struggle to know what to do about it (most humans, again).

Disclaimer: we adopted the ISO 27005:2018 approach to measuring risk as it is comprehensive enough to cover most scenarios, yet simple enough that even the most stubborn of Board members could understand it. If you happen to have a copy you can find it in section E.2.2, page 48, Table E.1.


The approach is that an arbitrary, yet predefined (and globally understood), value is given to the Likelihood of Occurrence – Threat, the Ease of Exploitation, and the Asset Value of the thing being "risk measured". This generates a number from 0 to 8, running from little risk to high risk. The scores can then be banded together to define whether they are High, Medium or Low, and can be treated in accordance with your organisation's risk appetite and risk assessment procedures.
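As a rough illustration of the mechanics, a minimal Python sketch might look like the following. Check the exact cell values against your copy of Table E.1, and note that the High/Medium/Low band boundaries below are assumptions you would set from your own risk appetite:

```python
# Sketch of the ISO 27005 Annex E style scoring described above: asset value (0-4)
# plus small indices for likelihood of occurrence and ease of exploitation (0-2 each)
# gives a value from 0 (little risk) to 8 (high risk). Verify against the standard;
# the banding thresholds are illustrative assumptions only.

LIKELIHOOD = {"low": 0, "medium": 1, "high": 2}   # likelihood of occurrence - threat
EASE = {"low": 0, "medium": 1, "high": 2}         # ease of exploitation

def risk_value(asset_value: int, likelihood: str, ease: str) -> int:
    """Combine the three predefined inputs into a 0-8 risk value."""
    if not 0 <= asset_value <= 4:
        raise ValueError("asset value must be between 0 and 4")
    return asset_value + LIKELIHOOD[likelihood] + EASE[ease]

def band(value: int) -> str:
    """Band the 0-8 value into High/Medium/Low (thresholds are assumptions)."""
    if value >= 6:
        return "High"
    if value >= 3:
        return "Medium"
    return "Low"

score = risk_value(asset_value=3, likelihood="medium", ease="high")
print(score, band(score))  # 6 High
```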

In our model, all one would have to do is define the importance of their role, from "Advocate" (low) to "Sysadmin" (high), their personality type (how outgoing you are), and the level of human interaction their role is defined as requiring. Once ascertained, you can read off your score and see where you sit in the risk model.

In order to make things easier for you, dear reader, we then created predefined actions in the key below the model based upon that derived risk score, so you know exactly what to do. In these troubled times you can now rest easy in the knowledge that not only do you understand risk better, but you also know what to do in a pandemic.
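Purely to show how the repurposing works, here is the same 0-8 mechanic with the pandemic inputs. The intermediate role labels, the scales and the action lookup are hypothetical placeholders for illustration, not the actual values from the Langford/Malik model:

```python
# Hypothetical re-use of the 0-8 scoring for the pandemic model. All labels and
# values below are placeholders for illustration, not the real model's contents.

ROLE = {"advocate": 0, "analyst": 1, "engineer": 2, "ciso": 3, "sysadmin": 4}   # importance of role
PERSONALITY = {"introvert": 0, "ambivert": 1, "extrovert": 2}                   # how outgoing you are
INTERACTION = {"remote": 0, "occasional meetings": 1, "front of house": 2}      # human interaction required

def pandemic_score(role: str, personality: str, interaction: str) -> int:
    """Read off a 0-8 score from the three inputs, just as with the ISO table above."""
    return ROLE[role] + PERSONALITY[personality] + INTERACTION[interaction]

score = pandemic_score("sysadmin", "extrovert", "front of house")
print(score)  # 8 - look up the corresponding action in the key below the model
```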

You’re welcome.

Note: Not actual medical advice. Do I really need to state this?


Keeping It Supremely Simple, the NASA way

Any regular reader (hello to both of you) will know that I also follow an ex-NASA engineer/manager by the name of Wayne Hale. Having been at NASA for much of his adult life, and having been involved across the board, he brings a fascinating view of the complexities of space travel and, just as interestingly, of risk.

His recent post is about damage to the Space Shuttle's foam insulation on the external fuel tank (the big orange thing), and the steps NASA went through to return the shuttle to active service after it was found that loose foam was what had damaged the heat shield of Columbia, resulting in its destruction. His insight into the machinations of NASA, the undue influence of Politics as well as politics, and the fact that ultimately everything comes down to a risk-based approach make his writing compelling and above all educational. This is writ large in the hugely complex world of space travel, something I would hazard a guess virtually none of us are involved in!

It was when I read the following paragraph that my jaw dropped a little, as I realised that even at NASA many decisions are based on a very simple presentation of risk, something of which I am a vehement supporter:

NASA uses a matrix to plot the risks involved in any activity.  Five squares by five squares; rating risk probability from low to high and consequence from negligible to catastrophic.  The risk of foam coming off part of the External Tank and causing another catastrophe was in the top right-hand box:  5×5:  Probable and Catastrophic.  That square is colored red for a reason.

What? The hugely complex world of NASA is governed by a five by five matrix like this?

Isn't this a hugely simplistic approach that just sweeps over the complexities and nuances of an immensely complex environment where lives are at stake and careers and reputations are constantly on the line? Then the following sentence made absolute sense, and underscored the reason why risk is so often poorly understood and managed:

But the analysts did more than just present the results; they discussed the methodology used in the analysis.

It seems simple and obvious, but the infosec industry very regularly talks about how simple models like a traffic light approach to risk just don't reflect the environment we operate in, and how we have to look at things in a far more complex way to ensure the nuance and complexity of our world is better understood. "Look at the actuarial sciences," they will say. I can say now that I don't subscribe to this.

The key difference with NASA, though, is that the decision makers understand how the scores are derived and then discuss that methodology, so the interpretation of that traffic light colour is far better understood. In his blog, Wayne talks of how the risk was talked down based upon the shared knowledge in the room and careful consideration of the environment in which the risks were presented. In fact, the risk as it was initially presented was de-escalated and a decision to go ahead was made.

Imagine if that process hadn’t happened; decisions may have been made based on poor assumptions and poor understanding of the facts, the outcome of which had the potential to be catastrophic.

The key point I am making is that a simple approach can be taken to complex problems, and that, ironically, it can be harder to make that happen. Everyone around the table needs to understand how the measures are derived, be educated on the implications, and be in a position to discuss the results in a collaborative way. Presenting an overly complex, hard-to-read but "accurate" picture of risks will waste everyone's time.

And if they don’t have time now, how will they be able to read Wayne’s blog?


The Power of Silence

Not so many years ago in the dim and distant past, the very first full length public talk I did was called “An Anatomy of a Risk Assessment”; it was a successful talk and one I was asked to present several times again in the following years. Below is a film of the second time I presented it, this time at BSides London:

My presentation style left a lot to be desired, and I seemed unable to stop using note cards until almost eighteen months later, despite not needing them for other talks I gave! (Top speaking tip, folks: never use printed notes when speaking; it conditions your mind to think it can only deliver when using them.) But that is not the focus of this message.

One of the pieces of "anatomy" that I spoke about in terms of risk assessments was the ears. The principle being that since you have two ears and one mouth, when auditing or assessing you should be listening twice as much as you are speaking. This is important for two reasons, the second of which may not be as obvious as the first:

  1. If you are assessing someone or something, you should be drawing information from them. When you are speaking, you are not gaining any information from them, which is a wasted opportunity. As a consequence,
  2. There will be periods of silence which you must not feel tempted to break. Just as nature fills a vacuum, so a human wants to fill a silence. Silence, therefore, will encourage the target of the assessment to open up even more, just so as not to feel awkward!

Interestingly, after my very first presentation of this talk, a member of the audience asked me if I had ever been in the Police Force. "I haven't," I replied.

Well, some of the techniques you just described are exactly like police interrogation techniques, especially the silence. I should know, I used them every day!

Flattered though I was, I did become a little concerned! Was I taking this risk assessment malarkey a little too seriously? Was I subjecting people to what amounted to an interrogation?

Obviously this was not the case, but it occurred to me that in the many books I have read on risk assessment and audit, the softer side of the process is never covered. We tend to focus on the technology, or the boxes that need to be ticked, when actually we can simply sit back and let others do the talking. I also employ humour very often to help people relax, and I do it even when I am on the other side of the table too. It can make a gruelling and mindless activity far more engaging and allow you to connect with the other person more effectively.

It engenders trust.

You can apply many of the techniques described in the presentation in your daily work lives, especially when on a discovery programme or wanting to get to the bottom of an incident. In fact, I can’t think of anything easier than having a (one-sided) chat with someone and getting the assessment completed.

Or as Will Rogers, actor and vaudeville performer in the early 1900s, put it:

Never miss a good chance to shut up


On another note, look out for a new series of YouTube films coming from me in the next few weeks.

I give you, The Lost CISO


What 80s pop can teach us about rocket failure and incident management


Most accidents originate in actions committed by reasonable, rational individuals who were acting to achieve an assigned task in what they perceived to be a responsible and professional manner.

(Peter Harle, Director of Accident Prevention, Transportation Safety Board of Canada and former RCAF pilot, 'Investigation of human factors: The link to accident prevention.' In Johnston, N., McDonald, N., & Fuller, R. (Eds.), Aviation Psychology in Practice, 1994)

I don't just read infosec blogs or cartoons that are vaguely related to infosec, I also read other blogs from "normal" people. One such blog is from a chap called Wayne Hale, who was a Flight Director (amongst other things) at NASA until fairly recently. As a career NASA'ite he saw NASA from its glory days, through the doldrums, and back to the force it is today. There are a number of reasons I like his blog, but mostly I have loved the idea of space since I was a little kid – I still remember the first space shuttle touching down, watching it on telly and whooping with joy, much to my mother's consternation and chagrin. The whole space race has captured my imagination, as a small child and as an overweight adult. I encourage anyone to head to his blog for not only fascinating insider stories of NASA, but also the engineering behind space flight.

What Wayne's blog frequently shows is one thing: space is hard. It is an unforgiving environment that will exploit every weakness, known and unknown, to destroy you. Even just getting into space is hard. Here is Wayne describing a particular incident the Russians had:

The Russians had a spectacular failure of a Proton rocket a while back – check out the video on YouTube of a huge rocket lifting off and immediately flipping upside down to rush straight into the ground. The ‘root cause’ was announced that some poor technician had installed the guidance gyro upside down. Reportedly the tech was fired. I wonder if they still send people to the gulag over things like that.

This seems like such a stupid mistake to make, and one that is easy to diagnose; the gyro was installed upside down by an idiot engineer. Fire the engineer, problem solved. But this barely touches the surface of root cause analysis. Wayne continues:

better ask why did the tech install the gyro upside down? Were the blueprints wrong? Did the gyro box come from the manufacturer with the ‘this side up’ decal in the wrong spot? Then ask – why were the prints wrong, or why was the decal in the wrong place. If you want to fix the problem you have to dig deeper. And a real root cause is always a human, procedural, cultural, issue. Never ever hardware.

What is really spooky here is that the latter part of the above quote could so easily apply to our industry, especially the last sentence – it’s never the hardware.

A security breach could be traced back to a piece of poor coding in an application:

1. The developer coded it incorrectly. Fire the developer? or…

2. Ascertain that the developer had never had secure coding training, and…

3. The project was delivered on tight timelines and with no margins, and…

4. As a result the developers were working 80-100 hrs a week for three months, which…

5. Resulted in errors being introduced into the code, and…

6. The errors were not found because timelines dictated no vulnerability assessments were carried out, but…

7. A cursory port scan of the application by unqualified staff didn't highlight any issues.

It's a clumsy example, I know, but there are clearly a number of points (funnily enough, seven) throughout the lifecycle of the environment that would have highlighted the possibility of vulnerabilities, all of which should have been acknowledged as risks, assessed, and decisions made accordingly. Some of these may fall outside the direct bailiwick of the information security group, for instance working hours, but the impact is clearly felt with a security breach.
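As a toy illustration of the habit of asking "why?" past the first convenient answer, the chain above could be walked like this (none of this is a real tool or methodology, just the clumsy example re-encoded):

```python
# Toy sketch: walk the "why" chain from the clumsy example above. The point is
# that the deepest links are procedural and cultural issues, never the hardware
# (or the lone developer) that the first answer points at.

why_chain = [
    "The developer coded it incorrectly",
    "The developer had never had secure coding training",
    "The project was delivered on tight timelines with no margins",
    "Developers were working 80-100 hour weeks for three months",
    "Errors were introduced into the code",
    "Timelines dictated no vulnerability assessments were carried out",
    "A cursory port scan by unqualified staff didn't highlight any issues",
]

def root_cause(chain):
    """Print each 'why?' step and return the deepest cause in the chain."""
    for step, cause in enumerate(chain, start=1):
        print(f"Why? ({step}) {cause}")
    return chain[-1]

print("Root cause territory:", root_cause(why_chain))
```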

A true root cause analysis should always go beyond just the first response of "what happened?". If in doubt, just recall the immortal words of Bronski Beat:

“Tell me why? Tell me why? Tell me why? Tell me why….?”