Criticality analysis is defined as the process of assigning assets a criticality rating based on their potential risk. Risk is defined as "the effect of uncertainty on objectives," according to ISO 31000:2009 – Risk Management – Principles and Guidelines. Since it can't truly be quantified, risk, in this case, is thought of as all the possible ways assets can fail and the effects that failure can have on the system and operation as a whole. Given this, criticality analysis is closely related to failure modes and effects analysis (FMEA) and failure modes, effects and criticality analysis (FMECA), which will be discussed later. Once a criticality analysis has been performed, an FMEA is typically performed on the most critical 20 percent of assets.
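As a rough illustration of that last step, here is a minimal Python sketch that picks the most critical 20 percent of assets from a set of ratings. The function name, the asset names and the ratings are all hypothetical, and rounding the count up is an assumption made so that a short asset list still yields at least one FMEA candidate.

```python
def top_critical_assets(ratings, percent=20):
    """Return asset names in the top `percent` by criticality rating."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Highest rating first.
    ranked = sorted(ratings, key=ratings.get, reverse=True)
    # Integer ceiling division (assumption: round up, not down).
    count = (len(ranked) * percent + 99) // 100
    return ranked[:count]


# Hypothetical ratings, for illustration only.
ratings = {
    "boiler": 95,
    "compressor": 80,
    "conveyor": 55,
    "pump A": 40,
    "exhaust fan": 20,
}
fmea_candidates = top_critical_assets(ratings)
print(fmea_candidates)
```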
So, why is criticality analysis important? You constantly hear about criticality: doing a criticality analysis to prioritize assets for a total productive maintenance (TPM) plan, a condition-based monitoring program or a root cause analysis on high-priority equipment. Criticality plays a role in nearly all types of maintenance. It comes down to risk and what makes each piece of equipment critical. Criticality analysis lets you understand the potential risks each asset poses to your operation. It ensures reliability is viewed through a risk-based lens rather than through each person's opinion.
According to the Life Cycle Institute, a criticality analysis model should cover multiple areas of your organization including:
Because the criticality model deals with multiple areas of an organization, a criticality analysis should be a company-wide effort. Including departments that deal with operations, engineering, maintenance, procurement, and health and safety ensures the analysis considers all functions of the operation as a whole. Keep in mind that different teams can define risk differently. Having a diverse team provide input helps offset the subjectivity of assigning risk.
Criticality analysis is also important because it can be used across a variety of scenarios within an organization. Some of these scenarios might look like this:
It's important to note there isn't one definitive approach for performing a criticality analysis. Following are two widely used methods: a simple approach to get started and a more in-depth method.
So, where should you start? Many organizations just want to know which assets should be included in a criticality assessment. Instead of assuming all your assets are critical, make a list of the key assets your team thinks are critical and calculate the cost of downtime and repairs. You might be surprised by the results. For example, you may have hundreds of motors in constant motion, which are fairly critical, but the most critical asset is the boiler making steam to keep those motors in motion.
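The downtime-and-repair calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` fields, the cost formula (downtime hours times hourly downtime cost, plus repair cost) and every figure below are hypothetical, not values from any real plant.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    downtime_hours_per_year: float  # expected unplanned downtime
    downtime_cost_per_hour: float   # lost production while the asset is down
    repair_cost_per_year: float     # parts and labor

    @property
    def annual_failure_cost(self) -> float:
        # Assumed formula: downtime cost plus repair cost.
        downtime_cost = self.downtime_hours_per_year * self.downtime_cost_per_hour
        return downtime_cost + self.repair_cost_per_year


def rank_by_cost(assets):
    """Most expensive failures first."""
    return sorted(assets, key=lambda a: a.annual_failure_cost, reverse=True)


# Made-up figures: one motor is cheap to lose, but the boiler stops everything.
assets = [
    Asset("motor (typical)", 6, 500, 2_000),
    Asset("boiler", 10, 20_000, 15_000),
    Asset("packaging line", 12, 3_000, 8_000),
]

for asset in rank_by_cost(assets):
    print(f"{asset.name}: {asset.annual_failure_cost:,.0f}")
```

With these illustrative numbers the boiler comes out well ahead of any individual motor, which mirrors the point made above.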
Since the point of this approach is to find a good starting point, let's take a look at some action steps you can take to get started on a criticality plan.
Now that you have a basic idea of how to get started, let's look at a more in-depth, streamlined approach to criticality analysis. This method includes three steps: agree on the risk matrix to be used, assemble your equipment hierarchy, and assess the failure risks for each asset.
First, risk matrices that include separate categories for things like health, safety, the environment and community can be combined, because if one of these categories is impacted by a failure, the others usually are as well. Combining categories will speed up the criticality analysis.
Secondly, each piece of equipment can have myriad possible failure events, and the risks associated with each of those events are different. It would be extremely time-consuming to try to identify all these possible events. Multiple reliability consultants and experts recommend choosing only one event – the one that best portrays the maximum reasonable outcome (MRO) in terms of risk for that particular piece of equipment. This means you should look for the event that is reasonably likely and for which the overall risk is determined to be the highest.
This determination should take place in a workshop-type environment, as individuals from different departments will have varying opinions on the MRO event. Included in this discussion should be people who know the equipment best and those who understand the consequences of a failure from a business perspective.
Thirdly, consider assessing only one risk dimension – the one with the highest risk level – to avoid wasting time. As mentioned earlier, looking at each event individually usually ends up being a waste of time, as many directly affect the others. Often, it's fairly obvious which risk dimension comes with the highest level of risk. For example, if you're evaluating the criticality of a pressure relief valve at a natural gas plant, the risks associated with safety are what you'll be looking at (including the environmental and community impact). If you're assessing a component that provides electricity to operate plant equipment, you'll most likely consider the economic impact of that component failing.
Finally, to ensure your criticality analysis approach is streamlined and efficient, start at the top of the equipment hierarchy and work your way down. The best thing about this approach is that, by logic, any asset or piece of equipment on a lower level of the hierarchy cannot have a higher criticality ranking than the asset above it. In other words, as soon as you've identified a piece of equipment whose criticality rating falls in the lowest category, any item below it in the hierarchy must also belong in that category, eliminating the need to analyze its criticality. As you can imagine, this stresses the importance of building your hierarchy correctly from the start.
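The top-down shortcut can be sketched as a tree walk. The following Python is a minimal illustration, not a prescribed implementation: the `Node` class, the `assess` callback (standing in for the team's workshop judgment), the `low_threshold` cutoff and the example hierarchy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def _inherit(node, rating, results):
    """Give every descendant the same low rating without assessing it."""
    for child in node.children:
        results[child.name] = rating
        _inherit(child, rating, results)


def assess_top_down(node, assess, low_threshold, parent_rating=None, results=None):
    """Rate a hierarchy from the top, skipping everything under a low-rated node."""
    results = {} if results is None else results
    rating = assess(node)
    if parent_rating is not None:
        # A child can never outrank the asset above it.
        rating = min(rating, parent_rating)
    results[node.name] = rating
    if rating <= low_threshold:
        _inherit(node, rating, results)
    else:
        for child in node.children:
            assess_top_down(child, assess, low_threshold, rating, results)
    return results


# Hypothetical hierarchy and scores; the lamp post and timer have no score
# because the walk should never need to ask for one.
plant = Node("plant", [
    Node("steam system", [Node("boiler"), Node("feed pump")]),
    Node("yard lighting", [Node("lamp post"), Node("timer")]),
])
scores = {"plant": 6, "steam system": 5, "boiler": 5, "feed pump": 3, "yard lighting": 1}
assessed = []


def workshop_rating(node):
    assessed.append(node.name)
    return scores[node.name]


results = assess_top_down(plant, workshop_rating, low_threshold=1)
print(results)
print("assessed:", assessed)
```

In this sketch only five of the seven items are actually assessed; the two items under the low-rated yard lighting simply inherit its rating.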
When it comes to laying out criticality rankings visually, you'll find there are many theories on the best way to do this. One of the most common approaches is to use a 6x6 grid, which plots the probability of a failure against the severity of the failure, resulting in a risk priority number (RPN).
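A 6x6 grid of this kind is easy to generate. The sketch below assumes both axes are scored 1 to 6 and that the RPN is the product of probability and severity; the text above does not specify the formula, so treat the multiplication and the function names as assumptions.

```python
SCALE = range(1, 7)  # 1 = lowest, 6 = highest, on both axes


def risk_priority_number(probability, severity):
    """Assumed formula: RPN = probability x severity."""
    if probability not in SCALE or severity not in SCALE:
        raise ValueError("probability and severity must be whole numbers from 1 to 6")
    return probability * severity


def build_grid():
    """Rows are probability 6..1 (top to bottom); columns are severity 1..6."""
    return [[risk_priority_number(p, s) for s in SCALE] for p in reversed(SCALE)]


for row in build_grid():
    print(" ".join(f"{cell:2d}" for cell in row))
```

The highest-risk cell lands in the top-right corner, where high probability meets high severity.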
Perhaps a more common approach is evaluating all major categories (operational, health, safety and environment, reliability, etc.) individually to figure out the worst-case failure. In this type of analysis, team members assign each consequence category a risk number, and the numbers are then either added together or multiplied to give a final RPN. Most organizations derive the criticality score from a defined scale for each category, anywhere from 0-6 to 0-10, with 0 meaning no impact and 6 (or 10) meaning the most impact. For example, if you're scoring the safety, health and environmental impact risk of an asset, you might define the impact a failure would have based on the following:
This way of performing and visualizing a criticality analysis should be done in two phases. The first phase is the initial analysis by a cross-functional team with input from operations; maintenance; engineering; procurement; and environment, health and safety (EH&S). The second phase is keeping the analysis evergreen, that is, maintaining the criticality analysis process throughout the asset's life cycle. This helps you figure out when risk has been mitigated or if there are any significant changes with each asset.
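The per-category scoring described above can be sketched as follows. This is a hedged example: the category names, the scores and the `method` parameter are hypothetical, and the 0-10 scale is just one of the ranges mentioned.

```python
from math import prod


def criticality_score(category_scores, method="sum", max_score=10):
    """Combine per-category risk numbers by adding or multiplying them."""
    for category, score in category_scores.items():
        if not 0 <= score <= max_score:
            raise ValueError(f"{category}: score must be between 0 and {max_score}")
    values = list(category_scores.values())
    if method == "sum":
        return sum(values)
    if method == "product":
        return prod(values)
    raise ValueError("method must be 'sum' or 'product'")


# Hypothetical scores agreed on by the cross-functional team.
pump = {"operational": 7, "health_safety_environment": 9, "reliability": 4}
print(criticality_score(pump, method="sum"))
print(criticality_score(pump, method="product"))
```

One design point to weigh when choosing between the two: with multiplication, a single category scored 0 drives the whole result to 0, whereas addition keeps the other categories' contributions.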
Creating a visual for your process of performing a criticality analysis and determining final criticality ratings can be done in 10 steps:
Failure modes, effects and criticality analysis (FMECA) was developed in the late 1940s by the United States military to transition from an "identify failure and fix it" approach to an "anticipate failure and prevent it" approach. This methodology was later standardized and published as a military standard: MIL-STD-1629A. FMECA involves quantitative failure analysis, meaning it uses quantities and numbers to assess risk and failure potential.
FMECA and FMEA are closely related tools used to perform a criticality analysis; one is a qualitative tool (FMEA) that looks at "what-if" scenarios, while the other (FMECA) is the quantitative tool that considers RPNs. Utilizing FMEA with FMECA, you can perform a criticality analysis to ensure certain areas of the business like design, operations and costs are optimized.
The FMEA portion of this criticality approach involves defining the system, constructing system boundary and parameter diagrams, identifying failure modes, analyzing failure effects, determining root causes of the failure modes, and providing the results to the design team. The FMECA portion includes transferring everything learned from the FMEA to the FMECA, classifying failure effects by severity, performing criticality calculations, ranking failure mode criticality and determining the highest-risk items, taking actions to mitigate failure and documenting the remaining risk, and following up on corrective action effectiveness.
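For the "performing criticality calculations" step, one commonly cited quantitative formulation, associated with MIL-STD-1629A, computes a failure mode criticality number as Cm = β × α × λp × t and sums those numbers per severity class to get an item criticality. The Python sketch below assumes that formulation; the class and field names, the failure modes and all numeric values are hypothetical, so check the standard itself before relying on it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity_class: str    # e.g. "I" (most severe) through "IV" (least severe)
    beta: float            # conditional probability that the failure effect occurs
    alpha: float           # share of the item's failures due to this mode
    failure_rate: float    # item failure rate, failures per hour
    operating_time: float  # hours

    def criticality(self) -> float:
        # Assumed formulation: Cm = beta * alpha * lambda_p * t
        return self.beta * self.alpha * self.failure_rate * self.operating_time


def item_criticality(modes):
    """Sum failure mode criticality numbers within each severity class."""
    totals = defaultdict(float)
    for mode in modes:
        totals[mode.severity_class] += mode.criticality()
    return dict(totals)


# Made-up failure modes for a single pump.
modes = [
    FailureMode("seal leak", "II", beta=0.5, alpha=0.6, failure_rate=1e-4, operating_time=1000),
    FailureMode("bearing seizure", "II", beta=1.0, alpha=0.3, failure_rate=1e-4, operating_time=1000),
    FailureMode("shaft fracture", "I", beta=1.0, alpha=0.1, failure_rate=1e-4, operating_time=1000),
]
print(item_criticality(modes))
```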
Performing a criticality analysis using the FMECA methodology provides value in design and development, in operations, and in cost, including:
Because it is fairly time-consuming to put into practice, the FMECA approach isn't generally the "go-to" method of conducting a criticality analysis; however, some reliability consulting groups have resources to help you should your organization choose this method.
Criticality analysis is a great tool for identifying the priority of maintenance tasks. A good way to look at it is that maintenance task priority should be established by the risk level that comes with not performing that task. In turn, this level of risk is determined by two things: the consequences of the potential failure that could happen if the task isn't completed, and the likelihood of that failure occurring if the task isn't done at a predetermined time.
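As a small illustration, tasks can be ordered by the risk of skipping them. The sketch assumes risk is the product of a consequence score and a likelihood score; the task names and scores are invented for the example.

```python
def prioritize_tasks(tasks):
    """tasks: list of (name, consequence, likelihood) tuples; highest risk first.

    Assumed formula: risk of not doing the task = consequence * likelihood.
    """
    return sorted(tasks, key=lambda task: task[1] * task[2], reverse=True)


# Hypothetical tasks scored on 1-6 scales.
tasks = [
    ("lubricate conveyor bearings", 3, 4),
    ("inspect pressure relief valve", 6, 3),
    ("replace air filter", 2, 2),
]

for name, consequence, likelihood in prioritize_tasks(tasks):
    print(f"{consequence * likelihood:2d}  {name}")
```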
Once you have your criticality ratings, a criticality analysis can help you choose a proper risk mitigation strategy that you can apply to each asset. For example: