Incorporating human factors in bowtie risk assessment: 3 simple methods for handling human error
A common saying starts with “To Err is Human…”
We as humans are not infallible: any activity involving human action is vulnerable to human failure. These failures can include:
- errors in unit conversion, which led to the loss of the Mars Climate Orbiter
- serious violations, such as the sabotage of the Maroochy Shire sewerage system in 2000
It is, therefore, important that factors that impact human performance or lead to human failure (“human factors”) are accounted for whenever a risk management exercise is undertaken.
This post looks at three key methods of analysing human factors in bowtie risk assessment, namely:
- Human failures treated as escalation factors (failure modes) of hard risk controls
- Human failures treated as causes or threats leading to the top event
- Human failures framed as a separate loss-of-control event (human failure bowtie)
What are human factors?
The concept of human factors aims to move away from stereotyping and a blame culture towards identifying the key “performance shaping factors” that affect how reliably humans perform tasks.
The UK Health & Safety Executive (HSE) defines human factors as:
“Environmental, organisational and job factors, and human and individual characteristics, which influence behaviour at work in a way which can affect health and safety.”
Image Source: HSG48 – Reducing error and influencing behaviour
Some examples of these performance shaping factors are:
| Job factors | Individual factors | Organisational factors |
|---|---|---|
| Task complexity | Fatigue | Staffing level |
| Instruction clarity | Operator competence | Peer pressure |
| Time pressure | Physical fitness or capability | Level and quality of supervision |
| System interface | Workload | Role clarity |
| Inappropriate tools | Motivation | Safety culture |
There are alternative breakdowns of these performance shaping factors, but the groupings largely align. A good example of a more detailed breakdown comes from Queensland Mining and Energy, which uses a Human Factors Analysis and Classification Framework tailored for the mining industry.
Ultimately, the purpose of discussing human factors in risk management is to determine what performance shaping factors could foreseeably result in a human failure – whether it is an error (unintended) or a violation (deliberate). Once these have been identified, the strategies to manage these factors (i.e. controls) can then be defined and implemented.
What’s the worst that can happen?
Human failures erode the effectiveness of the barriers that protect against critical risks. Major accident statistics worldwide consistently indicate that human factors contribute significantly to the root causes of many reportable OHS incidents. Tragic examples of accidents with human factor root causes include:
Zeebrugge Ferry disaster (1987) – 193 passengers and crew dead
The ‘Herald of Free Enterprise’ was a roll-on/roll-off ferry that capsized within 30 minutes of leaving the port of Zeebrugge in Belgium on a clear, calm spring day.
On the fateful day, the ferry departed port with its bow doors open. As the ferry gained speed, the bow dipped lower until water entered through the open doors, causing the ship to capsize within a couple of minutes. The water was just 3 degrees Celsius, and 193 passengers and crew died following cold immersion.
Investigations by the Department of Transport pointed to several human failures leading up to the incident. The key ones were:
- Failure to close bow doors
  - The Assistant Bosun responsible for closing the bow doors was asleep (fatigue) following maintenance and cleaning duties prior to departure.
  - The Bosun, the last individual working at the bow doors, did not close them after loading was completed. He is recorded as stating: “It has never been part of my duties to close the doors or make sure anybody is there to close the doors.”
- Failure to check bow doors are closed prior to departing
  - There was no indicator on the bridge to show the bow door position, and the doors are not visible from the bridge.
  - The officer in charge of loading failed to ensure the bow doors were closed. There was a misunderstanding about who was the officer in charge of loading at the time. Consequently, no officer followed up on the missing Assistant Bosun.
Performance shaping factors identified were:
- Fatigue – the crew were completing the final trip of their 24-hour shift and hence were tired.
- ‘Not my job’ culture
- Poor standard procedures for operations
  - Manning was reduced by one officer because the Dover–Zeebrugge journey was 3 times longer than the Dover–Calais journey, giving officers more time to rest. However, officer duties were not adjusted to account for this.
  - Crew changed frequently and thus understanding of duties varied.
  - Lack of enforcement of standard procedures and instructions from Captains – these were poorly written and loosely interpreted by officers and crew.
- Time pressure – ferry operations at Zeebrugge were organised like those at Calais despite the difference in port facilities, resulting in time pressure at the point of departure to maintain the schedule. Zeebrugge had only a single-deck ramp, so the ferry had to be loaded one deck at a time and its ballast tanks filled to lower it 3 feet in the water so that the ramp could reach the upper deck.
  - Pressure to be present at harbour stations following departure
Kegworth Air disaster (1989) – 47 dead
British Midland Flight 92 crashed whilst attempting an emergency landing following a fire in one of its engines.
After take-off from Heathrow bound for Belfast, the aircraft's No.1 engine (on the left) suffered a mechanical failure during the climb. This led to a series of compressor stalls, which caused the aircraft to shudder and smoke and fumes to enter the flight deck. The decision was made to throttle down the affected engine and make an emergency landing at East Midlands Airport. Unfortunately, the flight crew believed the issue was with the No.2 engine (on the right), so they throttled it down and subsequently shut it down. When the No.2 engine was throttled down the aircraft stopped shuddering, which persuaded the flight crew that they had made the right decision. The aircraft was then redirected to East Midlands Airport for an emergency landing.
Unfortunately, the landing procedure required increased power from the No.1 engine, which produced very high vibration readings that the flight crew did not notice. When the aircraft was only 2.4 nautical miles from the landing point, the No.1 engine failed catastrophically, causing a rapid loss of power. The crew were unable to restart the No.2 engine and the aircraft crashed, resulting in the death of 47 passengers.
According to the Air Accidents Investigation Branch, the key human failure that contributed to the tragedy after the initial engine failure was shutting down the wrong engine. Some of the performance shaping factors that led to this failure were:
- Poor Engine Instrument System dashboard design
  - The Engine Instrument System did not draw attention to which engine had the issue in the high stress scenario.
  - The new cockpit dashboard layout in the 737-400 did not display engine parameters as prominently as the previous 737-300 model. As a result, the pilots did not notice that the high vibration of the No.1 engine was persisting, which led to its catastrophic failure.
- Inadequate training and knowledge of crew
  - Training did not cover scenarios where high engine vibration is accompanied by smoke in the flight deck.
  - Training did not cover decision-making techniques in the event of failures not covered by standard procedures.
  - Flight crew mistakenly believed stabilising of the engine was due to shutting down the No.2 engine and not a result of disengagement of the auto-throttle.
- Failure to follow the aircraft ‘non-normal’ checklist – severe vibration without abnormal engine parameters, as was experienced, is not deemed an event necessitating engine shutdown.
- Cabin crew did not communicate their observations to the flight crew when blue sparks were observed from the No.1 engine.
- High stress
  - The Captain was flying manually after the autopilot was disengaged following the initial failure. This increased cockpit workload whilst the flight crew were investigating the engine failure.
What can be done to manage human failure risks?
Incorporating human factors in bowtie risk assessments can be done in a number of ways. A common approach used requires:
- Identifying safety critical actions in performing a task – these are actions whose failure will result in a major accident or erosion of safety critical controls.
  - This is typically done through systematic task analysis. There are many techniques; a common one is hierarchical task analysis (HTA).
- Determining possible human failures that impact the safety critical action
- Estimating the likelihood of the human failure
- Determining the performance shaping factors influencing the human failure
- Identifying controls that help mitigate the performance shaping factors
Applying the approach across all safety critical tasks produces a list of the safety critical human failures for each task, including the likelihood and consequence of each failure and the controls in place to reduce its likelihood. Such information lends itself to incorporation into existing frameworks for job safety analysis or, eventually, bowtie risk assessment.
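As a rough illustration, the list produced by these steps could be held in a small data structure like the one below. This is only a sketch: the class and field names are assumptions made for this example, and the sample record is loosely based on the Zeebrugge case above, with a hypothetical likelihood rating and a hypothetical control.

```python
from dataclasses import dataclass, field


@dataclass
class PerformanceShapingFactor:
    """A condition that makes a human failure more likely."""
    name: str
    controls: list[str] = field(default_factory=list)


@dataclass
class HumanFailure:
    """One way in which a safety critical action can fail."""
    description: str
    failure_type: str  # "error" (unintended) or "violation" (deliberate)
    likelihood: str    # qualitative rating, e.g. "Possible"
    consequence: str
    factors: list[PerformanceShapingFactor] = field(default_factory=list)

    def factors_without_controls(self) -> list[str]:
        """Names of performance shaping factors that have no control yet."""
        return [f.name for f in self.factors if not f.controls]


@dataclass
class SafetyCriticalAction:
    task: str
    action: str
    failures: list[HumanFailure] = field(default_factory=list)


# Illustrative record only. The likelihood rating and the control
# are hypothetical placeholders, not findings from the investigation.
bow_doors = SafetyCriticalAction(
    task="Prepare ferry for departure",
    action="Close bow doors",
    failures=[
        HumanFailure(
            description="Bow doors left open at departure",
            failure_type="error",
            likelihood="Possible",
            consequence="Water ingress and capsize",
            factors=[
                PerformanceShapingFactor("Fatigue"),
                PerformanceShapingFactor("'Not my job' culture"),
                PerformanceShapingFactor(
                    "Time pressure",
                    controls=["Departure checklist sign-off (hypothetical)"],
                ),
            ],
        )
    ],
)

for failure in bow_doors.failures:
    print(failure.description, "->", failure.factors_without_controls())
```

Listing the factors that have no control attached is one simple way to spot where the last step (identifying controls) is still incomplete.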
In practice, there are over 200 different human factors risk assessment techniques that cater for a variety of failure types (errors vs violations), performance shaping factors and desired outputs (quantitative or qualitative). Which method to use depends on the scenario being analysed, the end goal of the exercise and business requirements.
3 bowtie risk assessment techniques
To demonstrate the three approaches, consider the following scenario:
The operation of a vertical charcoal retort column can be viewed simply as a batch flow where hydraulically powered flaps control feed and discharge flow rates. Wood feedstock is fed in at the top of the vessel and gradually descends through the column under gravity as charcoal is removed. While the wood descends, it is exposed to a controlled flow of hot gas (mainly carbon monoxide – heavier than air) to pyrolyze it into charcoal.
In this process, the hydraulic flaps play multiple safety critical roles:
- Seal the vessel, limiting fugitive gas emissions – the gas is primarily carbon monoxide or carbon dioxide, which can asphyxiate
- Control the flow rate of charcoal through the process – holding charcoal too long will result in smouldering and potentially fires
Human factors enter the equation when we consider the human-led maintenance activities required.
Task: Maintenance overhaul of vertical charcoal retort discharge flaps
A human factors risk assessment of the task identified the following safety critical sub-task, among others (for this example, a simplified qualitative SHERPA approach was used):
| Sub-task | Possible consequence of failure | Estimated failure likelihood | Possible human failure | Performance shaping factors | Existing management measures |
|---|---|---|---|---|---|
| Perform hydraulic hose coupling checks post maintenance | Flap operation reversed – uncontrolled release of heavier-than-air hazardous gas | Possible (has occurred in past company history) | Verification not done or incomplete | Time pressure | Dedicated time allotted to testing and commissioning following planned shutdown; independent check by operations team |
| | | | | Fatigue | Fatigue management plan – maximum hours |
| | | | | Poor visual cues | Colour coding and labelling on equipment |
| | | | | Poor procedures | Periodic Document Review Process – includes task instructions in CMMS; Training Management System |
The above information could be incorporated into a risk assessment as:
| Hazard | Risk event | Cause | Likelihood | Consequence | Existing controls |
|---|---|---|---|---|---|
| Irrespirable atmosphere | Uncontrolled release of asphyxiating gas from charcoal retort | Failure to test discharge flap prior to start-up | Possible | Fatality | Dedicated testing time post shutdowns; independent check by operations team; Fatigue Management Plan; colour coding and labelling of hose connections; periodic review of task instructions; Training Management System |
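The step from the SHERPA-style finding to the risk register row can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed tool: the function name `to_register_row` and the dictionary keys are assumptions made for this example, while the values are transcribed from the two tables above.

```python
# SHERPA-style finding transcribed from the worked example above.
# The dictionary layout is an assumption made for this sketch.
finding = {
    "sub_task": "Perform hydraulic hose coupling checks post maintenance",
    "human_failure": "Verification not done or incomplete",
    "likelihood": "Possible",
    "measures_by_factor": {
        "Time pressure": [
            "Dedicated testing time post shutdowns",
            "Independent check by operations team",
        ],
        "Fatigue": ["Fatigue Management Plan"],
        "Poor visual cues": ["Colour coding and labelling of hose connections"],
        "Poor procedures": [
            "Periodic review of task instructions",
            "Training Management System",
        ],
    },
}


def to_register_row(finding, hazard, event, cause, consequence):
    """Flatten one finding into a single risk register row.

    Measures listed against each performance shaping factor are merged
    into one control list, preserving order and dropping duplicates.
    """
    controls = []
    for measures in finding["measures_by_factor"].values():
        for measure in measures:
            if measure not in controls:
                controls.append(measure)
    return {
        "hazard": hazard,
        "event": event,
        "cause": cause,
        "likelihood": finding["likelihood"],
        "consequence": consequence,
        "controls": controls,
    }


row = to_register_row(
    finding,
    hazard="Irrespirable atmosphere",
    event="Uncontrolled release of asphyxiating gas from charcoal retort",
    cause="Failure to test discharge flap prior to start-up",
    consequence="Fatality",
)
print(row["controls"])
```

Note that flattening loses the link between each control and the performance shaping factor it addresses, which is one reason to keep the underlying task analysis alongside the register.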
Incorporating this information in bowtie risk assessments can then be approached in a number of ways. The best approach is the one that aligns best with the ultimate goal for the bowtie within your company. However, if you do include human failures or human factors, avoid being too generic, as generic entries add little value. Three commonly used approaches are:
1. Human failures are treated as escalation factors
This is the approach recommended by the Center for Chemical Process Safety (CCPS) and the Energy Institute (EI). It aligns with the philosophy that human failures do not in themselves create an unwanted event but erode the function of the barriers put in place. A potential issue is that a bowtie can easily accumulate too many escalation factors linked to human failures, making it hard to read. Use escalation factors judiciously, highlighting only the failures that add value to the scenario portrayed in the bowtie model.
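A minimal sketch of this structure, using the retort scenario above, might look as follows. The class names, the `crowded_barriers` helper and its `max_factors` threshold are assumptions made for this illustration; they are not taken from the CCPS / EI guidance.

```python
from dataclasses import dataclass, field


@dataclass
class EscalationFactor:
    """A condition (here a human failure) that degrades a barrier."""
    description: str
    controls: list[str] = field(default_factory=list)


@dataclass
class Barrier:
    """A hard risk control on the bowtie."""
    name: str
    escalation_factors: list[EscalationFactor] = field(default_factory=list)


def crowded_barriers(barriers, max_factors=3):
    """Names of barriers with more escalation factors than max_factors.

    max_factors is an arbitrary readability threshold chosen for this
    sketch, not a figure from any guideline.
    """
    return [b.name for b in barriers if len(b.escalation_factors) > max_factors]


# The human failure sits on the barrier it erodes, not on its own threat line.
flap_seal = Barrier(
    name="Discharge flaps seal the retort vessel",
    escalation_factors=[
        EscalationFactor(
            "Hose coupling check not done or incomplete after maintenance",
            controls=[
                "Dedicated testing time post shutdowns",
                "Independent check by operations team",
                "Colour coding and labelling of hose connections",
            ],
        )
    ],
)

print(crowded_barriers([flap_seal]))
```

A check like `crowded_barriers` is one way to flag bowties where escalation factors have started to crowd out the main threat lines.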
2. Human failures are treated as causes / threats
Treating human failures as threats or causes should be considered very carefully, as it may separate the human factor aspects from the safety critical element they influence. Typically, this approach is used when the activity itself can inherently lead to a critical risk, e.g. sampling or flange removal, which can result in loss of containment events.
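For contrast, the sketch below places the human failure from the retort example on its own threat line leading to the top event. The class names and the `threat_lines` helper are assumptions made for this illustration; the threat, top event and barriers are taken from the risk register row above.

```python
from dataclasses import dataclass, field


@dataclass
class Threat:
    """A cause that can lead directly to the top event."""
    description: str
    barriers: list[str] = field(default_factory=list)


@dataclass
class Bowtie:
    hazard: str
    top_event: str
    threats: list[Threat] = field(default_factory=list)

    def threat_lines(self):
        """Render each threat line as 'threat -> barrier(s) -> top event'."""
        return [
            " -> ".join([t.description, *t.barriers, self.top_event])
            for t in self.threats
        ]


retort = Bowtie(
    hazard="Irrespirable atmosphere",
    top_event="Uncontrolled release of asphyxiating gas from charcoal retort",
    threats=[
        Threat(
            "Failure to test discharge flap prior to start-up",
            barriers=[
                "Dedicated testing time post shutdowns",
                "Independent check by operations team",
            ],
        )
    ],
)

for line in retort.threat_lines():
    print(line)
```

In this layout the human failure has its own preventive barriers, but the model no longer shows which hard control (the flap seal) the failure undermines, which is the separation this method risks.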
3. Human failures are framed as a separate loss-of-control event (human failure bowtie)
This approach is used to address high-level generic failures and to ensure that such factors are covered explicitly in the risk management process. These bowties more closely resemble the outcomes of human factors assessments on safety critical tasks.
The key to tackling human factors effectively as part of bowtie risk assessments is to recognise that the “right” method depends entirely on the organisation and the logic underpinning the method used. For example, an oil and gas business needing to demonstrate bowties as part of a safety assessment may prefer to model human failures onto the bowtie risk assessments for each major accident event. This makes all the logic behind the risk assessment (and critical control selection) evident in a single model for each major accident hazard. An organisation in aviation, by contrast, may prefer to model human factors more comprehensively as a standalone bowtie which feeds into other hazard-specific bowties.
Any of the three methods described above can be effective for human factors. The key is for the business (and the facilitator of the risk assessment) to understand the logic behind the method used and apply it consistently.