Thousands of alarms are built into the control systems of many automated plants and platforms. An unintended consequence is the large number of unnecessary or nuisance alarms, which can cause frustration and anxiety for the operator. Incident investigations have revealed that consequential or repeat alarm warnings from automatic systems can distract staff dealing with a problem and increase stress while concealing important new information.

Poor human-machine interface design and alarm prioritization can hinder an effective response. Even small operators can lose up to US $100 million each year to plant upsets and shutdowns.

Managing alarms
Restoring the effectiveness of alarm systems must be a priority so assets can be operated safely and cost-efficiently.

Asset managers need to understand how significant improvements can be achieved within reasonable time and cost constraints.

Highly automated systems are efficient, reliable, and generally safe. Most of the time, the operator simply monitors the overall situation and responds to the occasional minor alarm. When something more significant does happen, often signaled by a barrage of noise and flashing lights, boredom can quickly turn to panic.


Once the plant is operational, alarm system performance should be continuously reviewed and improved. (Images courtesy of Amor Group)

Excessive alarms have two consequences. One is that a large number of alarms or a high repetition rate reduces the probability that the operator will notice an individual alarm and respond to it effectively.

The second is more subtle. As the first few alarms sound, the operator moves quickly from vigilance to analysis to action.

Evidence suggests that once in action, people rarely reevaluate their initial analysis, even if subsequent information indicates it is wrong. There is a powerful human tendency to confirm rather than hypothesize, and this bias may be more prevalent under stress.

Alarm systems commonly have three major problems:
1. Standing alarms (which remain active even though the asset is operating normally) can obscure more important information. They usually are caused by instrument faults, inappropriate alarm limit settings, and out-of-service equipment. Because they are already active, they cannot sound again if conditions change.
These alarms are relatively easy to deal with – instrument faults can be fixed, limits can be adjusted, and processes can be devised to suppress alarms on out-of-service equipment. This should be part of a process of continuous improvement.
2. Nuisance and repeating alarms, which can be endlessly activated after being silenced, are distracting. Engineering Equipment and Materials Users Association (EEMUA) guidelines and UK Health and Safety Executive (HSE) research indicate that 50% of alarms come from a small number of triggers. Nuisance alarms are typically caused by faulty instruments, alarm limits being set too close to normal operating conditions, and ineffective use of mechanisms designed to minimize repeating alarms.
Reducing repeating alarms is not technically difficult but requires sustained effort.
3. Alarm floods are by far the most serious issue. A sustained rate of two alarms/minute is as much as process operators can handle, with four or five/minute perhaps being acceptable, but only for a short period. It is common for several hundred alarms to occur in the first 10 minutes following an issue on an asset. Peak rates of one alarm/second are not unusual.

The result is that the operator effectively abandons the system, acknowledging alarms without looking at them, which means information can be missed or misinterpreted.

Most electronic systems have too many alarms. The HSE has set a target of no more than 10 alarms being activated in the first 10 minutes following a major issue.
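The 10-alarms-in-10-minutes target can be checked against a log of activation times. The following is a minimal sketch, assuming the activation timestamps have already been exported as Python datetime values; the function name and the sample burst are hypothetical and do not come from any particular control system.

```python
from collections import deque
from datetime import datetime, timedelta


def peak_alarms_in_window(timestamps, window=timedelta(minutes=10)):
    """Return the largest number of activations seen in any sliding window."""
    recent = deque()
    peak = 0
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop activations that fall outside the window ending at ts.
        while ts - recent[0] >= window:
            recent.popleft()
        peak = max(peak, len(recent))
    return peak


start = datetime(2024, 1, 1, 12, 0, 0)
# Hypothetical upset: one alarm every 5 seconds for 10 minutes.
burst = [start + timedelta(seconds=5 * i) for i in range(120)]

peak = peak_alarms_in_window(burst)
exceeds_target = peak > 10  # target quoted in the text: no more than 10
```

Running the same function over a longer period of history gives a rough picture of how often, and by how much, the target is exceeded.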

The design process provides opportunities to deal with alarm problems at the source through prevention rather than cure. If an alarm is not configured, it cannot become a standing or nuisance alarm later.

Each alarm should require formal justification and a defined operator response.

Adjusting operations
The HAZOP technique should be revised to treat alarms in the context of the overall operation. Alarm reduction techniques should be considered only where there is both a need and a realistic chance of success. Once the plant is operational, alarm system performance should be continuously reviewed and improved.

An alarm system improvement exercise on an existing asset should be anchored to a formal set of principles and policies. It is sometimes best to begin by reviewing the design, configuration, and performance of the existing system. This requires gathering the numbers and priorities of the alarms, the number of standing alarms, evidence of nuisance and repeating alarms, and activation rates.

Most modern systems provide an auto-documentation facility, which usually is the best way to determine how many alarms are configured and their priorities. Systems also provide a “current alarm” report, which can be run at regular intervals under normal operations to identify standing alarms.
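One way to use repeated "current alarm" reports is to look for alarms that are active in every report. The sketch below assumes each report has been reduced to a collection of alarm tags; the tag names are invented for illustration.

```python
def standing_alarms(snapshots):
    """Return the tags that are active in every snapshot."""
    if not snapshots:
        return set()
    return set.intersection(*(set(s) for s in snapshots))


# Hypothetical "current alarm" reports taken under normal operation.
snapshots = [
    {"PT-101-HI", "LT-200-LO", "FT-310-LO"},
    {"PT-101-HI", "LT-200-LO"},
    {"PT-101-HI", "LT-200-LO", "TT-404-HI"},
]

candidates = standing_alarms(snapshots)
```

Tags that appear in every report are only candidates. Each still needs to be traced to an instrument fault, an inappropriate limit, or out-of-service equipment.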

An alarm/event historian facility can be used to investigate nuisance and repeating alarms and to obtain some measure of average and peak alarm rates. If a problem exists, an alarm rationalization exercise may be necessary.
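As an illustration of this kind of historian analysis, the sketch below counts activations per alarm tag and reports how much of the total load comes from the most frequent tags. The (timestamp, tag) layout and the tag names are assumptions for the example, not a vendor export format.

```python
from collections import Counter


def top_contributors(events, n=10):
    """Return the n most frequently activated tags and their share of all activations.

    events: iterable of (timestamp, tag) pairs (assumed layout).
    """
    counts = Counter(tag for _, tag in events)
    total = sum(counts.values())
    top = counts.most_common(n)
    share = sum(count for _, count in top) / total if total else 0.0
    return top, share


# Hypothetical log: one repeating level alarm dominates the activations.
events = (
    [(t, "LT-200-LO") for t in range(6)]
    + [(t, "PT-101-HI") for t in range(3)]
    + [(0, "TT-404-HI")]
)

top, share = top_contributors(events, n=1)
```

A high share for a handful of tags suggests that fixing those few alarms would remove much of the nuisance load.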

Basic principles include eliminating duplicate alarms, ensuring normal or expected events are not triggers, and making sure there is no more than one pre-alarm for the cause of each trip.
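The one-pre-alarm-per-trip principle lends itself to a simple configuration check. This is a sketch only, assuming the configuration can be listed as pairs of pre-alarm tag and trip cause; the tags and the pairing are hypothetical.

```python
from collections import defaultdict


def excess_pre_alarms(pre_alarms):
    """Return the trip causes that have more than one pre-alarm configured.

    pre_alarms: iterable of (pre_alarm_tag, trip_cause) pairs (assumed layout).
    """
    by_cause = defaultdict(list)
    for tag, cause in pre_alarms:
        by_cause[cause].append(tag)
    return {cause: tags for cause, tags in by_cause.items() if len(tags) > 1}


# Hypothetical configuration extract.
configured = [
    ("PAH-101", "PAHH-101 trip"),
    ("PAH-101B", "PAHH-101 trip"),
    ("LAH-200", "LAHH-200 trip"),
]

violations = excess_pre_alarms(configured)
```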

The number of high-priority alarms should be limited. EEMUA guidelines suggest that no more than 10% be at the highest priority and 20% at medium, with the remainder at low priority.

A risk-based methodology similar to that used in the IEC 61508 standard for assessing instrumented protective systems can be effective in determining alarm priorities if the relevant questions are asked:
• Can the operator do anything?
• Is there time for action?
• What happens if action doesn’t take place in time?
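The three questions can be read as a simple decision sequence. The sketch below is one possible interpretation, not a method taken from IEC 61508 or EEMUA: the consequence scale, the mapping to priorities, and the idea of returning no priority when the operator cannot act in time are all assumptions made for illustration.

```python
def alarm_priority(operator_can_act, time_available_s, response_time_s, consequence):
    """Return 'low', 'medium', or 'high', or None if no alarm is justified.

    consequence is 'minor', 'moderate', or 'severe' (hypothetical scale).
    """
    # Question 1: can the operator do anything?
    if not operator_can_act:
        return None
    # Question 2: is there time for action?
    if time_available_s < response_time_s:
        return None
    # Question 3: what happens if action does not take place in time?
    return {"minor": "low", "moderate": "medium", "severe": "high"}[consequence]


# Hypothetical case: 10 minutes available, 2 minutes needed, severe outcome.
example = alarm_priority(True, 600, 120, "severe")
```

In this reading, a result of None flags a condition that may need an automatic response or a design change rather than an operator alarm.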

Risks must be assessed in relation to personnel, the environment, equipment, and production. With the possible exception of fire and gas systems, alarms should not be relied upon to prevent human injury or serious environmental damage.

An effective alarm review should take other aspects of plant operability into account. The ergonomics of displays and the robustness of automatic controls impact the operator’s ability to maintain “situational awareness” and return the plant rapidly to normal.

A number of techniques can assist with achieving good alarm system performance in practice. It is common for installations to have thousands of individual alarms, so an effective solution is to categorize them as far as possible. All alarms in a category, such as “failure to trip on demand,” can then be assigned the same priority and characteristics. This approach requires the reviewer to have a good understanding of the process and its operation.

On one offshore platform, Amor Group identified 4,397 alarms, of which 48% were high priority, 5% were medium priority, and 47% low priority. A changeover to standby pumps generated a flurry of alarms, few of which were of any value.

Following rationalization, the number of alarms was reduced to 2,554, with the new configuration having only 8% high-priority alarms, 10% medium, and 82% low priority.
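The before-and-after figures can be checked against the EEMUA-based limits quoted earlier (no more than 10% high priority and 20% medium). The sketch below uses only the percentages and totals given in the text; the function and variable names are illustrative.

```python
# Limits quoted in the text, as percentages of configured alarms.
LIMITS = {"high": 10.0, "medium": 20.0}


def over_limit(distribution, limits=LIMITS):
    """Return the priorities whose percentage exceeds the stated limit."""
    return {
        priority: pct
        for priority, pct in distribution.items()
        if priority in limits and pct > limits[priority]
    }


before = {"high": 48.0, "medium": 5.0, "low": 47.0}  # 4,397 alarms
after = {"high": 8.0, "medium": 10.0, "low": 82.0}   # 2,554 alarms

problems_before = over_limit(before)
problems_after = over_limit(after)

# Percentage reduction in the number of configured alarms.
reduction_pct = (4397 - 2554) / 4397 * 100
```

On these figures, only the original high-priority share breaches the limits, and the total number of configured alarms falls by roughly 42%.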

A number of other measures also were taken to reduce transient and consequential alarms as well as those from out-of-service equipment.

If the entire export system is shut down, all alarms are suppressed apart from those indicating high pressure, high level, or failure to trip (hazardous conditions under any circumstances). Only these conditions have potentially serious consequences, so their alarms are the only ones set to high priority.
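The suppression rule described here can be expressed as a small predicate. This is a sketch under stated assumptions: the alarm type labels are invented, and a real system would implement the logic in the control system's own configuration rather than in Python.

```python
# Conditions the text describes as hazardous under any circumstances.
ALWAYS_ANNUNCIATE = {"high_pressure", "high_level", "failure_to_trip"}


def should_annunciate(alarm_type, export_shut_down):
    """Decide whether an export-system alarm should reach the operator."""
    if not export_shut_down:
        return True
    # During a shutdown, only the always-hazardous conditions get through.
    return alarm_type in ALWAYS_ANNUNCIATE


# Hypothetical alarm types raised while the export system is shut down.
raised = ["low_flow", "high_pressure", "low_pressure", "failure_to_trip"]
shown = [a for a in raised if should_annunciate(a, export_shut_down=True)]
```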

Given the number of alarms that need to be examined, software tools are essential. Spreadsheets can be used to analyze data; however, a specialized database package such as TiPS LogMate provides additional facilities for long-term maintenance and improvement work.

Following the initial review and rationalization exercise, achieving and sustaining good alarm system performance requires long-term commitment to a process of continuous review and improvement.