Criticality Analysis: What It Is and Why It’s Important

Jonathan Trout, Noria Corporation
Tags: maintenance and reliability

Criticality analysis is defined as the process of assigning assets a criticality rating based on their potential risk of failure.

What Is Criticality Analysis?

Criticality analysis is defined as the process of assigning assets a criticality rating based on their potential risk. Risk is defined as "the effect of uncertainty on objectives," according to ISO 31000:2009 – Risk Management – Principles and Guidelines. Since it can't truly be quantified, risk, in this case, is thought of as all the possible ways assets can fail and the effects that failure can have on the system and operation as a whole. Given this, criticality analysis is closely related to a failure modes and effects analysis (FMEA) and a failure modes, effects and criticality analysis (FMECA), which will be discussed later. Once a criticality analysis has been performed, an FMEA typically is performed on the top 20 percent of the most critical assets.

What is the Purpose of Criticality Analysis?

So, why is criticality analysis important? You're constantly hearing about criticality: doing a criticality analysis to prioritize assets for a total productive maintenance (TPM) plan, a condition-based monitoring program or a root cause analysis on high-priority equipment. Criticality plays a role in nearly all types of maintenance. It comes down to risk and what makes each piece of equipment critical. Criticality analysis lets you understand the potential risks each asset poses to your operation. It ensures reliability is viewed through a risk-based lens rather than through each person's opinion.

According to the Life Cycle Institute, a criticality analysis model should cover multiple areas of your organization including:

Customer impact
Impact on safety and environment
Ability to isolate single-point failures
Preventive maintenance (PM) history
Corrective maintenance history
Mean time between failures (MTBF)
Spare parts lead time
Probability of failure

Because the criticality model deals with multiple areas of an organization, a criticality analysis should be a company-wide effort. Including departments that deal with operations, engineering, maintenance, procurement, and health and safety ensures the analysis considers all functions of the operation as a whole. You must understand that risk can be defined differently across various teams. Having a diverse team providing input helps with the subjectivity of assigning risk.

Criticality analysis is also important because it can be used across a variety of scenarios within an organization. Some of these scenarios might look like this:

A criticality score can be employed as an input to help determine the final priority ranking for maintenance tasks, which in turn can be used together with work order priority.
It can help identify high-level risk mitigation strategies for specific equipment. For example, this could involve applying a condition monitoring technique to high-criticality assets.
It can assist with figuring out the optimum number of spare parts for each piece of equipment.
It can provide valuable input for budgeting discussions, so high-criticality equipment is given higher priority for upgrades or replacement.
Criticality analysis helps reliability engineers focus their efforts and energy on the most critical assets.

How to Perform a Criticality Analysis

It's important to note there isn't one definitive approach for performing a criticality analysis. Following are two widely used methods: a simple approach to get started and a more in-depth method.

So, where should you start? Many organizations just want to know which assets should be included in a criticality assessment. Instead of assuming all your assets are critical, make a list of the key assets your team thinks are critical and calculate the cost of downtime and repairs. You might be surprised by the results. For example, you may have hundreds of motors in constant motion, which are fairly critical, but the most critical asset is the boiler making steam to keep those motors in motion.

Since the point of this approach is to find a good starting point, let's take a look at some action steps you can take to get started on a criticality plan.

Compile a shortlist of candidate critical assets that doesn't exceed 20 percent of all assets. Best practice is a ratio of 5-to-1 or greater between total assets and shortlisted assets.
Put together a team of personnel from the operations, maintenance, engineering and procurement side of the organization to conduct a survey of the plant equipment. Equipment operators should be included in this team as well.
Next, rank the criticality of the assets using an established formula. Lifetime Reliability Solutions uses the following formula to determine the financial impact of an asset: Equipment Criticality = Failure Frequency (per year) x Cost Consequence ($) = Risk ($ per year). The cost consequence in this formula is the cost of lost production plus the repair costs. For example, if you have a lot of identical machines, machine downtime might be $400 per hour, per machine.
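The formula above can be sketched in a few lines of Python. This is a minimal illustration, not Lifetime Reliability Solutions' own tool: the asset names, failure frequencies and cost figures are invented, and splitting the cost consequence into downtime hours, hourly downtime cost and repair cost is an assumption made to keep the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    failures_per_year: float       # failure frequency (per year)
    downtime_hours: float          # hours of lost production per failure
    downtime_cost_per_hour: float  # cost of lost production ($ per hour)
    repair_cost: float             # repair cost per failure ($)


def cost_consequence(asset: Asset) -> float:
    """Cost of lost production plus repair cost for one failure ($)."""
    return asset.downtime_hours * asset.downtime_cost_per_hour + asset.repair_cost


def annual_risk(asset: Asset) -> float:
    """Equipment criticality = failure frequency x cost consequence ($ per year)."""
    return asset.failures_per_year * cost_consequence(asset)


# Hypothetical figures, chosen only to illustrate the arithmetic.
assets = [
    Asset("Motor 17", failures_per_year=2.0, downtime_hours=4.0,
          downtime_cost_per_hour=400.0, repair_cost=1_500.0),
    Asset("Boiler", failures_per_year=0.5, downtime_hours=24.0,
          downtime_cost_per_hour=20_000.0, repair_cost=30_000.0),
]

# Rank assets from highest to lowest annual risk.
ranked = sorted(assets, key=annual_risk, reverse=True)
for asset in ranked:
    print(f"{asset.name}: ${annual_risk(asset):,.0f} per year")
```

With these made-up numbers the boiler ranks first even though it fails less often than the motor, because each boiler failure costs far more.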

Now that you have a basic idea on how to get started, let's look at a more in-depth, streamlined approach to criticality analysis. This method includes three steps: agree on the risk matrix to be used, assemble your equipment hierarchy and assess the failure risks for each asset.

Agree on the risk matrix. This mainly refers to existing corporate risk matrices and how most of these matrices may need to be adjusted to include an equipment criticality assessment. Two key areas where modifications might be needed are agreeing on risk levels from a corporate and equipment level, and combining the overlapping risk categories. On a corporate level, a failure that leads to a loss of $1 million in revenue might be considered minor or moderate, but on an equipment or plant operational level, it might be seen as major.
Secondly, risk matrices that include separate categories for things like health, safety, the environment and community can be combined, because if one of these categories is impacted by a failure, the others will be as well. Combining categories will speed up the criticality analysis.
Assemble your asset hierarchy. It is recommended that your equipment or asset hierarchy be laid out along functional lines, meaning your plant floor has a certain number of process units, those process units are made up of their own equipment systems, and each of those systems is made up of individual pieces of equipment. This lets you perform a criticality analysis much quicker than if your asset hierarchy is organized by equipment class lines. Even if your assets are already organized along functional lines, they should still be reviewed to make sure nothing is out of line. Having a properly assembled hierarchy at the beginning speeds up the criticality analysis later.
Assess each asset's failure risks. When assessing the failure risks to help determine equipment criticality, consider the following points:
1. understand risk relates to events, not equipment;
2. choose only one event – the maximum reasonable outcome (MRO) event;
3. look at only the dimension with the highest risk level; and
4. start at the top of the hierarchy and work your way down.
Secondly, each piece of equipment can have a myriad of possible failure events, and the risks associated with each of those events are different. It would be extremely time-consuming to try and identify all these possible events. Multiple reliability consultants and experts recommend choosing only one event – the one that best portrays the maximum reasonable outcome (MRO) in terms of risk for that particular piece of equipment. This means you should look for an event that is most likely and one in which the overall risk is determined to be the highest.

This determination should take place in a workshop-type environment, as individuals from different departments will have varying opinions on the MRO event. Included in this discussion should be people who know the equipment best and those who understand the consequences of a failure from a business perspective.

Thirdly, consider assessing only one risk dimension – the one with the highest risk level – to avoid wasting time. As mentioned earlier, assessing each risk dimension individually usually ends up being a waste of time, as many directly affect the others. Often, it's fairly obvious which risk dimension comes with the highest level of risk. For example, if you're evaluating the criticality of a pressure relief valve at a natural gas plant, the risks associated with safety are what you'll be looking at (including the environmental and community impact). If you're assessing a component that provides electricity to operate plant equipment, you'll most likely consider the economic impact of that failing.

Finally, to ensure your criticality analysis approach is streamlined and efficient, start at the top of the equipment hierarchy and work your way down. The best thing about this approach is, by logic, any asset or piece of equipment on the lower level of the hierarchy cannot have a higher criticality ranking than the asset above it. In other words, as soon as you've identified a piece of equipment in one of the lower categories in your hierarchy where criticality ratings are low, any item below this piece of equipment must also belong in the same category, eliminating the need to analyze its criticality. As you can imagine, this stresses the importance of building your hierarchy correctly from the start.
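The top-down rule can be sketched as a tree walk that stops assessing as soon as it reaches a low-criticality node. This is only an illustration under stated assumptions: the hierarchy, the 1-to-5 rating scale, the low_threshold value and the workshop_scores lookup (standing in for the team's workshop judgment) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def _inherit(node: Node, rating: int, results: Dict[str, int]) -> None:
    """Give every descendant the same rating without assessing it."""
    for child in node.children:
        results[child.name] = rating
        _inherit(child, rating, results)


def assess_top_down(node: Node, assess: Callable[[Node], int], low_threshold: int,
                    parent_rating: Optional[int] = None,
                    results: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Rate the hierarchy from the top, skipping everything under low-rated nodes."""
    if results is None:
        results = {}
    rating = assess(node)
    if parent_rating is not None:
        rating = min(rating, parent_rating)  # a child cannot outrank its parent
    results[node.name] = rating
    if rating <= low_threshold:
        _inherit(node, rating, results)      # no need to analyze anything below
    else:
        for child in node.children:
            assess_top_down(child, assess, low_threshold, rating, results)
    return results


# Hypothetical plant hierarchy and workshop ratings (1 = low, 5 = high).
plant = Node("Plant", [
    Node("Steam system", [Node("Boiler"), Node("Feedwater pump")]),
    Node("Site lighting", [Node("Lamp circuit A"), Node("Lamp circuit B")]),
])
workshop_scores = {"Plant": 5, "Steam system": 5, "Boiler": 5, "Feedwater pump": 3,
                   "Site lighting": 1, "Lamp circuit A": 4, "Lamp circuit B": 1}

assessed: List[str] = []  # records which nodes actually needed a discussion


def assess(node: Node) -> int:
    assessed.append(node.name)
    return workshop_scores[node.name]


results = assess_top_down(plant, assess, low_threshold=1)
print(results)
print("Assessed:", assessed)
```

In this example the two lamp circuits are never assessed. They inherit the low rating of the lighting system above them, which is where the time saving comes from.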

Visualizing Criticality Analysis

When it comes to laying out criticality rankings visually, you'll find there are many theories on the best way to do this. One of the most common approaches is to use a 6x6 grid, which plots the probability of a failure against the severity of the failure, resulting in a risk priority number (RPN).
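A 6x6 grid of this kind can be generated as follows. This is a sketch that assumes both axes are scored from 1 to 6 and that the RPN for a cell is the product of probability and severity. Multiplication is a common convention, but an organization may combine the two scores differently.

```python
GRID_SIZE = 6  # 6x6 grid: both axes are scored from 1 to 6 (assumed scale)


def risk_priority_number(probability: int, severity: int) -> int:
    """Return the RPN for one cell, assuming RPN = probability x severity."""
    for label, value in (("probability", probability), ("severity", severity)):
        if not 1 <= value <= GRID_SIZE:
            raise ValueError(f"{label} must be between 1 and {GRID_SIZE}, got {value}")
    return probability * severity


def build_grid() -> list:
    """Rows run from highest probability (top) to lowest; columns from severity 1 to 6."""
    return [
        [risk_priority_number(p, s) for s in range(1, GRID_SIZE + 1)]
        for p in range(GRID_SIZE, 0, -1)
    ]


# Print the grid with the probability score as a row label.
for probability, row in zip(range(GRID_SIZE, 0, -1), build_grid()):
    print(f"P={probability} | " + " ".join(f"{rpn:>2}" for rpn in row))
```

The highest RPN (36) sits in the top-right corner, where both probability and severity are at their maximum.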

Perhaps a more common approach is evaluating all major categories (operational, health, safety and environment, reliability, etc.) individually to figure out the worst-case failure. In this type of analysis, team members assign each consequence a risk number, and these numbers are then either added together or multiplied to give a final RPN. Most organizations use a criticality score derived from a defined ranking scale for each category, running from 0 to 6 or from 0 to 10, with a 0 having no impact and a 6 (or 10) having the most impact. For example, if you're scoring the safety, health and environmental impact risk of an asset, you might define the impact a failure would have based on the following:

This way of performing and visualizing a criticality analysis should be done in two phases. The first phase is the initial analysis from a cross-functional team with input from operations; maintenance; engineering; procurement; and environment, health and safety (EH&S). The second phase is keeping the analysis process evergreen, or maintaining the criticality analysis process throughout the asset's life cycle. This helps you figure out when risk has been mitigated or if there are any significant changes with each asset.

Creating a visual for your process of performing a criticality analysis and determining final criticality ratings can be done in 10 steps:

Step 1: Choose the characteristics by which you want to evaluate each asset. These characteristics should cover multiple aspects of the business, such as the impact on customers, the EH&S impact, the ability to isolate and recover from single-point failures, preventive maintenance history, corrective maintenance history, etc.
Step 2: Weight each characteristic using a scale of 0 to 10 to portray its significance to the business. You can also use a larger scale (the larger the scale, the easier it will be to identify critical assets), but the scale shouldn't exceed 100.
Step 3: Define each characteristic's description on the scale for accuracy.
Step 4: List (or import) your asset hierarchy.
Step 5: Define the main function of each asset to identify a single-point failure.
Step 6: Analyze the effect a single-point failure would have for each asset across all characteristics.
Step 7: Calculate the criticality rating for each asset by dividing the raw score (sum of all characteristics) by the total weighted points possible, multiplied by 100.
Step 8: Identify the top 10-20 percent of the critical assets.
Step 9: Review your analysis and find the characteristics that make each asset critical.
Step 10: Finally, identify the assets that are most significant to important areas of the business, such as reliability, cost, replacement value, maintenance plan development, etc.
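Steps 2, 7 and 8 can be sketched in Python as shown here. The sketch reads "weight" as the maximum number of points a characteristic can receive, which is one reasonable interpretation rather than the only one. The characteristic names, weights and asset scores are all hypothetical.

```python
import math

# Step 2 (assumed reading): each weight is the maximum points for that characteristic.
WEIGHTS = {
    "customer_impact": 10,
    "ehs_impact": 10,
    "single_point_failure": 8,
    "pm_history": 5,
    "cm_history": 5,
}


def criticality_rating(scores: dict) -> float:
    """Step 7: raw score divided by total weighted points possible, times 100."""
    for name, max_points in WEIGHTS.items():
        if not 0 <= scores[name] <= max_points:
            raise ValueError(f"{name} must be between 0 and {max_points}")
    raw_score = sum(scores[name] for name in WEIGHTS)
    return raw_score / sum(WEIGHTS.values()) * 100


def top_critical(ratings: dict, percent: int = 20) -> list:
    """Step 8: names of the top `percent` of assets by rating (at least one)."""
    count = max(1, math.ceil(len(ratings) * percent / 100))
    return sorted(ratings, key=ratings.get, reverse=True)[:count]


# Step 6 output for five hypothetical assets.
asset_scores = {
    "Boiler": {"customer_impact": 10, "ehs_impact": 9, "single_point_failure": 8,
               "pm_history": 3, "cm_history": 4},
    "Air compressor": {"customer_impact": 7, "ehs_impact": 4, "single_point_failure": 6,
                       "pm_history": 4, "cm_history": 2},
    "Cooling pump": {"customer_impact": 5, "ehs_impact": 5, "single_point_failure": 8,
                     "pm_history": 1, "cm_history": 1},
    "Conveyor": {"customer_impact": 6, "ehs_impact": 3, "single_point_failure": 4,
                 "pm_history": 2, "cm_history": 3},
    "Dock door": {"customer_impact": 1, "ehs_impact": 2, "single_point_failure": 0,
                  "pm_history": 1, "cm_history": 0},
}

ratings = {name: criticality_rating(scores) for name, scores in asset_scores.items()}
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(f"{name}: {ratings[name]:.1f}")
print("Top 20 percent:", top_critical(ratings, percent=20))
```

Expressing every rating as a percentage of the points possible keeps assets comparable even if characteristics are later added or reweighted.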

Criticality Analysis: The FMECA Approach

Failure modes, effects and criticality analysis (FMECA) was developed in the late 1940s by the United States military to transition from an "identify failure and fix it" approach to an "anticipate failure and prevent it" approach. This methodology was later standardized and published as a military standard: MIL-STD-1629A. FMECA involves quantitative failure analysis, meaning it uses quantities and numbers to assess risk and failure potential.

FMECA and FMEA are closely related tools used to perform a criticality analysis; one is a qualitative tool (FMEA) that looks at "what-if" scenarios, while the other (FMECA) is the quantitative tool that considers RPNs. Utilizing FMEA with FMECA, you can perform a criticality analysis to ensure certain areas of the business like design, operations and costs are optimized.

The FMEA portion of this criticality approach involves defining the system, constructing system boundary and parameter diagrams, identifying failure modes, analyzing failure effects, determining root causes of the failure modes, and providing the results to the design team. The FMECA portion includes transferring everything learned from the FMEA to the FMECA, classifying failure effects by severity, performing criticality calculations, ranking failure mode criticality and determining the highest risk items, taking actions to mitigate failure and documenting the remaining risk, and following up on corrective action effectiveness.
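The "criticality calculations" and ranking steps might look like the sketch that follows. It assumes the failure mode criticality number used in MIL-STD-1629A's quantitative approach, Cm = beta x alpha x failure rate x operating time. The article itself does not give a formula, and the failure modes and figures below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    item: str
    mode: str
    severity_class: str     # e.g. "I" (catastrophic) through "IV" (minor)
    beta: float             # conditional probability that the failure effect occurs
    alpha: float            # failure mode ratio: share of item failures in this mode
    failure_rate: float     # item failure rate (failures per hour)
    operating_hours: float  # operating time considered (hours)

    def criticality_number(self) -> float:
        """Cm = beta x alpha x failure rate x operating time (assumed formula)."""
        return self.beta * self.alpha * self.failure_rate * self.operating_hours


# Hypothetical failure modes carried over from an FMEA worksheet.
failure_modes = [
    FailureMode("Relief valve", "fails closed", "I", 1.0, 0.1, 5e-6, 8000),
    FailureMode("Feedwater pump", "bearing seizure", "II", 1.0, 0.4, 2e-5, 8000),
    FailureMode("Feedwater pump", "seal leak", "III", 0.5, 0.6, 2e-5, 8000),
]

# Rank failure modes from highest to lowest criticality number.
ranked_modes = sorted(failure_modes, key=FailureMode.criticality_number, reverse=True)
for fm in ranked_modes:
    print(f"{fm.item} / {fm.mode} (class {fm.severity_class}): "
          f"Cm = {fm.criticality_number():.3f}")
```

A ranking by criticality number alone can place a severe but unlikely failure near the bottom, as with the relief valve here, so the severity class is usually reviewed alongside it.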

Performing a criticality analysis using the FMECA methodology provides benefits in design and development, operations and cost, including:

Design and development benefits include increased asset reliability, better equipment quality, higher safety margins and a decrease in development time and redesign.
Operations benefits include a more effective way to reduce cost, optimized preventive and predictive maintenance (PdM) programs, reliability growth analysis during product development, and less waste and fewer non-value-added operations, in line with lean manufacturing principles.
Cost benefits include being able to recognize and mitigate failures before they happen, when they are less costly to fix, as well as minimized warranty costs and increased sales due to customer satisfaction.

Because it is fairly time-consuming to put into practice, the FMECA approach isn't generally the "go-to" method of conducting a criticality analysis; however, some reliability consulting groups have resources to help you should your organization choose this method.

Criticality Analysis: The Bottom Line

Criticality analysis is a great tool for identifying the priority of maintenance tasks. A good way to look at it is that maintenance task priority should be established by the risk level that comes with not performing that task. This level of risk is, in turn, determined by the consequences of the potential failure that could happen if the task isn't completed and the likelihood of that failure occurring if the task isn't done at a predetermined time.

Once you have your criticality ratings, a criticality analysis can help you choose a proper risk mitigation strategy that you can apply to each asset. For example:

[Figure: Criticality analysis mitigation strategy]