What ’80s pop can teach us about rocket failure and incident management


Most accidents originate in actions committed by reasonable, rational individuals who were acting to achieve an assigned task in what they perceived to be a responsible and professional manner.

(Peter Harle, Director of Accident Prevention, Transportation Safety Board of Canada and former RCAF pilot, ‘Investigation of human factors: The link to accident prevention.’ In Johnston, N., McDonald, N., & Fuller, R. (Eds.), Aviation Psychology in Practice, 1994)

I don’t just read infosec blogs, or cartoons vaguely related to infosec; I also read blogs from “normal” people. One such blog is from a chap called Wayne Hale, who was a Flight Director (amongst other things) at NASA until fairly recently. As a career NASA’ite he saw NASA from its glory days, through the doldrums, and back to the force it is today. There are a number of reasons I like his blog, but mostly it is because I have loved the idea of space since I was a little kid – I still remember watching the first space shuttle touch down on the telly and whooping with joy, much to my mother’s consternation and chagrin. The whole space race has captured my imagination, as a small child and as an overweight adult. I encourage anyone to head to his blog, not only for fascinating insider stories of NASA, but also for the engineering behind space flight.

What Wayne’s blog frequently shows is one thing: space is hard. It is an unforgiving environment that will exploit every weakness, known and unknown, to destroy you. Even just getting into space is hard. Here is Wayne describing a particular incident the Russians had:

The Russians had a spectacular failure of a Proton rocket a while back – check out the video on YouTube of a huge rocket lifting off and immediately flipping upside down to rush straight into the ground. The ‘root cause’ was announced that some poor technician had installed the guidance gyro upside down. Reportedly the tech was fired. I wonder if they still send people to the gulag over things like that.

This seems like such a stupid mistake to make, and one that is easy to diagnose; the gyro was installed upside down by an idiot engineer. Fire the engineer, problem solved. But this barely touches the surface of root cause analysis. Wayne continues:

better ask why did the tech install the gyro upside down? Were the blueprints wrong? Did the gyro box come from the manufacturer with the ‘this side up’ decal in the wrong spot? Then ask – why were the prints wrong, or why was the decal in the wrong place. If you want to fix the problem you have to dig deeper. And a real root cause is always a human, procedural, cultural, issue. Never ever hardware.

What is really spooky here is that the latter part of the above quote could so easily apply to our industry, especially the last sentence – it’s never the hardware.

A security breach could be traced back to a piece of poor coding in an application:

1. The developer coded it incorrectly. Fire the developer? or…

2. Ascertain that the developer had never had secure coding training, and…

3. The project was delivered on tight timelines and with no margins, and…

4. As a result, the developers were working 80-100 hours a week for three months, which…

5. Resulted in errors being introduced into the code, and…

6. The errors were not found because timelines dictated that no vulnerability assessments were carried out, but…

7. A cursory port scan of the application by unqualified staff didn’t highlight any issues.

It’s a clumsy example, I know, but there are clearly a number of points (funnily enough, seven) throughout the lifecycle of the environment that would have highlighted the possibility of vulnerabilities, all of which should have been acknowledged as risks, assessed, and decided upon accordingly. Some of these may fall outside the direct bailiwick of the information security group, for instance working hours, but the impact is clearly felt when there is a security breach.
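To make that chain of whys a little more concrete, here is a minimal sketch in Python (purely illustrative; the `Factor` class, the category labels and the breach scenario are my own inventions, not any standard tooling) showing how each answer to “why?” can be recorded as a link back to a deeper, usually human or procedural, cause rather than stopping at the first one:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Factor:
    """One link in a chain of 'whys' behind an incident."""
    description: str
    category: str                          # e.g. "people", "process", "culture" -- never "hardware"
    caused_by: Optional["Factor"] = None   # the next "why" down, if we kept asking


def why_chain(immediate_cause: Factor) -> List[Factor]:
    """Walk from the immediate cause down to the deepest contributing factor recorded."""
    chain: List[Factor] = []
    current: Optional[Factor] = immediate_cause
    while current is not None:
        chain.append(current)
        current = current.caused_by
    return chain


# The hypothetical breach from the list above, encoded as a chain of whys.
no_margins = Factor("Project timelines left no margin for review or testing", "process")
long_hours = Factor("Developers worked 80-100 hour weeks for three months", "culture", no_margins)
bad_code = Factor("A vulnerability was introduced into the application code", "people", long_hours)
breach = Factor("The application was breached via the vulnerable code", "technical", bad_code)

if __name__ == "__main__":
    for depth, factor in enumerate(why_chain(breach), start=1):
        print(f"why #{depth}: {factor.description} [{factor.category}]")
```

The point of walking the whole chain, rather than stopping at the first link, is that the deepest factors are the procedural and cultural ones you can actually fix.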

A true root cause analysis should always go beyond just the first answer to “what happened?” If in doubt, just recall the immortal words of Bronski Beat:

“Tell me why? Tell me why? Tell me why? Tell me why….?”


Making the world angrier, one process at a time

I have recently set up Family Sharing on my iOS devices so that I can monitor and control what apps go on my kids’ devices without having to be in the room with them. Previously they would ask for an app, I would type in my Apple ID password, and that was that. Unfortunately, with my new role I am travelling so much now that the thought of waiting a week before they could get an app was causing apoplectic grief with my kids. Family Sharing was the solution, and once I had finally worked it out we were good to go, and it works well. I can now authorise a purchase from anywhere in the world. I get woken up at 3am with a request for a BFF makeover or car crash game (one girl, one boy), but my kids are happy.

One problem, however, was that for some reason my daughter’s date of birth was incorrect, indicating that she was an adult and thereby breaking the whole “app approval” process. Straightforward to fix? Not at all.

I won’t bore you with the details, but it was the most frustrating process I have encountered in a long time. I admit I misinterpreted the instructions along the way (they were a bit asinine, in my defence), but it came down to the fact that I had to have a credit card, not a debit card, as the default payment method for my family account, simply to authorise the change of status of my daughter from an adult to a child. In other words, I had to jump through hoops to restrict her account rather than give it more privilege. Not only that, but the change was being made from an account that already had those privileges in the first place. There didn’t seem to be any element of trust along the way.

I am sure there is a good, formal response from Apple along the lines of “we take your security seriously”, “strong financial controls”, etc., but as an experience for me it sucked, and if I could have worked around it I would have. Thankfully not all of Apple’s ecosystem works like this!

This is a problem for many information security organisations when they introduce procedures to support organisational change or request mechanisms. For instance, how many times have you seen a change request process require CISO, CIO and potentially even higher approvals for even simple changes? Often this is due to a lack of enablement in the organisation, an inability to trust people at all levels, and often a simple lack of accountability. It seems we regularly trust neither our own business folks nor our own employees to make the right decisions.

Procedures like this fail in a number of places:

  1. They place huge pressure on executives to approve requests they have little context on, and little time to review.
  2. The operational people in the process gain no experience in investigating and approving requests, as they simply escalate upwards.
  3. The original requestors are frustrated by slow progress and no updates, as the requests sit in the queues of senior management and above.
  4. The requestors often work around the procedure, avoid it, or simply do the opposite of what finally comes out of the request, as work pressures dictate a quicker response.
  5. The owners of the procedure respond with even tighter regulations and processes in order to reduce the ability of the requestor to work around them.

And so the cycle continues.

The approach I have regularly used in situations like this comprises two tenets:

  1. Consider the experience of the user first, then the desirable outcomes of the process second.
  2. Whatever process you then come up with, simplify it further. And at least once more.

Why should you consider the experience of the user first? Who is the process for the benefit of: you, as information security, or them, as the end user? If you answered the former, then go to the back of the class. We are not doing security for our own benefit; it is not security for the sake of security, it is to allow the users, our customers, to do more. If we make their experience bad as they do their best to make more money, sell more beer, do more of whatever, security becomes an irrelevance at best and a barrier to successful business at worst.

By making the requestor’s experience as painless and as straightforward as possible (perhaps even throwing in a bit of education there?), they are encouraged not only to see the long-term benefits of using the procedure as we defined it, but also to become fanatical advocates of it.

Secondly, why should we keep it simple? Well, not only to support the above points, but also because guess who is going to have to support the process when it is running? Of course: you and your team. If the process itself is bulky and unmanageable, then more time will be spent running the process than doing the work the process needs to support. If that burden becomes too onerous over time, then the process itself breaks down, the reporting on the process becomes outdated, and ultimately the process itself becomes irrelevant and is considered a waste of time by those it affects.
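To make this a little more concrete, here is a minimal sketch (Python, entirely hypothetical; the risk tiers, approver titles and `ChangeRequest` type are my own inventions rather than any prescribed process) of a change-approval flow that trusts people at the appropriate level and only escalates the requests that genuinely need senior eyes:

```python
from dataclasses import dataclass

# Hypothetical risk tiers and approvers -- the tiers, titles and examples are
# illustrative only; real values would come out of your own risk assessment.
APPROVERS_BY_RISK = {
    "low": "requestor's line manager",   # trusted to sign off locally, with no escalation
    "medium": "service owner",           # someone with real context on the system affected
    "high": "CISO",                      # only genuinely risky changes reach the top
}


@dataclass
class ChangeRequest:
    summary: str
    risk: str  # "low", "medium" or "high"


def route_for_approval(request: ChangeRequest) -> str:
    """Return who should approve this request; anything unrecognised escalates to the top."""
    return APPROVERS_BY_RISK.get(request.risk, "CISO")


if __name__ == "__main__":
    requests = [
        ChangeRequest("Open a firewall port between two internal test hosts", "low"),
        ChangeRequest("Grant a vendor remote access to production", "high"),
    ]
    for req in requests:
        print(f"{req.summary!r} -> approved by: {route_for_approval(req)}")
```

The point is not the code, of course, but the shape of it: most requests never go near an executive, the people closest to the work build experience approving them, and the CISO only sees the handful of changes where that context genuinely matters.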

Putting your requestors at the centre of your simplified process universe will always make that process more robust, better understood, more beneficial and of course more relevant to the business, and who can argue with that?

InfoSecurity Europe

I spoke at this year’s InfoSecurity Europe in London a few months back, on articulating risk to senior management. Peter Wood did an excellent job as moderator of the panel, and even revitalised my faith in panels after too many very poor experiences earlier this year.