Long a mainstay tool used to enhance the reliability of product designs, failure modes and effects analysis (FMEA) can also serve a valuable purpose for the manufacturing and process reliability engineer … if we make a few modifications. The standard FMEA process, which is detailed in IEC Standard 60812 and elsewhere, works pretty well as it is, but it can be improved. Here are some key points for updating your FMEA process to get it working for you in the plant.

Figure 1. A more modern form of the FMEA to serve industrial plants and operations.

  1. Begin the process by creating functional block diagrams (FBD) for the production processes under review. An FBD is just like a reliability block diagram (RBD) but without the numbers. We typically lack quantifiable reliability performance data at this stage of the game. Once we have it, we'll fill in the blocks to create RBDs. It's usually best to break the plant down into blocks. The functional block diagrams define the functions that will be reviewed in the FMEA process, so be detailed: these are the functions upon which your plant relies to complete the mission.
  2. Take the guesswork out of the process. Begin by collecting data using a systematic failure reporting and corrective action system (FRACAS) process. Usually, within three to six months, you'll see 80 to 90 percent of what goes wrong in your plant. After that time, you can scale back or simplify the data collection process if you wish; but prior to completing your FMEAs, you need plenty of detail about what's really happening. The human memory is fragile and fallible; data is the difference between deciding and guessing. Before undertaking the data collection process, standardize your taxonomies of failure modes and failure causes. This will make the data much more manageable when it comes time to carry out the FMEA process.
  3. Forget about the dimensionless risk priority number (RPN). Quantify your losses in terms of Cost per Event x Number of Events per Year. This yields a monetized effect on the organization in annualized terms. Costs might include repair costs, downtime costs, risk-based costs, energy costs, etc. You may have to spend money to mitigate the targeted loss. To do so, you need a cost benefit analysis, so the risk must be monetized at some point; you may as well do it in the FMEA process. When it comes to getting improvement initiatives approved, dollars talk ... everything else walks. Get ahead of the game and monetize your risks.
  4. Identify the possible contributing causes using a standardized taxonomy of failure causes, and check all that apply. It is rare that complex failures have a single cause. The event is usually brought about by the combined effect of several contributing causes. Carry that information over from FRACAS to FMEA.
  5. Capture and categorize your action decision. In every instance, you must choose one of three paths: act to resolve the problem, choose not to act, or schedule a root cause analysis (RCA) event to explore the problem further. The decision to act is often based upon the cost benefit analysis. If you choose not to act, record your reason why and periodically review to see if the landscape has changed. Sometimes, new technology comes around that will enable you to manage a risk more cost-effectively, or the impact of a risk becomes more significant to the organization due to market changes or changes in the operating context.
  6. If you choose to act, clearly define what mitigating actions are required. Associated with those actions are implementation costs, both up-front and ongoing, and a new estimate of the annualized failure cost. Changes in design, procedures, etc., will serve to either: a) reduce the cost per event, and/or b) reduce the number of events per year. In all cases, the action should reduce the annualized failure cost. Capture that projected new failure cost information on the FMEA sheet.
  7. In order to go forward with any improvement project, you must have a positive return on investment (ROI). Since you've captured your current annualized failure costs, your projected annualized failure costs and your estimated costs to mitigate the risks with corrective actions, you have everything you need to create a five- or seven-year cash flow projection. By applying your company's cost of capital, you can tabulate the internal rate of return (IRR), net present value (NPV) and discounted payback period (DPP). If the numbers look good, go with it. If the proposed change fails to provide an acceptable return, shelve the project, but schedule a re-review to determine whether changes in the market or the organization's operating context alter the ROI calculation, and to evaluate whether a new, lower-cost solution to the problem is available.
  8. Assign the corrective action to an individual or team and define a due-by date. A due-by date converts a wish into a goal.
  9. Trend actual-to-projected costs to implement and returns to the organization. If the returns are greater than projected, turn up the steam to deploy the change into other plants or lines where it is applicable. If the returns are less than you projected, retool the correction or slow/stop further deployment to other plants or lines.
  10. Make the FMEA a living document, a monetized diary of your risk management activities. When you conduct an RCA, capture the results in your FMEA database. The traditional approach specified by the FMEA standard is still valid. We just need to modify it a little for use in industry.
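
As a rough illustration of steps 3 and 6, the monetized ranking can be sketched in a few lines of Python. This is only a minimal sketch: the `FailureMode` class, the helper names and all of the dollar figures and frequencies below are hypothetical, invented for illustration rather than taken from any plant or standard.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA line item, monetized as described in step 3 (hypothetical structure)."""
    name: str
    cost_per_event: float    # dollars per event: repair, downtime, energy, etc.
    events_per_year: float   # expected number of events per year

    @property
    def annualized_cost(self) -> float:
        # Cost per Event x Number of Events per Year
        return self.cost_per_event * self.events_per_year


def rank_by_annualized_cost(modes):
    """Return failure modes sorted from highest to lowest annualized cost."""
    return sorted(modes, key=lambda m: m.annualized_cost, reverse=True)


def projected_annual_savings(current: FailureMode, projected: FailureMode) -> float:
    """Step 6: savings expected once a mitigating action is in place."""
    return current.annualized_cost - projected.annualized_cost


# Illustrative, made-up failure modes.
modes = [
    FailureMode("Pump seal leak", 3_500.0, 10),
    FailureMode("Conveyor bearing seizure", 12_000.0, 4),
    FailureMode("Gearbox overheating", 40_000.0, 0.5),
]

for mode in rank_by_annualized_cost(modes):
    print(f"{mode.name}: ${mode.annualized_cost:,.0f} per year")

# A hypothetical mitigation that cuts bearing seizures from 4 to 1 per year.
after_fix = FailureMode("Conveyor bearing seizure", 12_000.0, 1)
print(f"Projected savings: ${projected_annual_savings(modes[1], after_fix):,.0f} per year")
```

Sorting on annualized dollars rather than on an RPN puts the ranking in the same units as the cost benefit analysis that follows.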

To summarize our key points:

  • Start with data collection. Data is the difference between deciding and guessing.

  • Dollarize your findings. You're spending money to correct problems; your benefits must outweigh the expense.

  • While FMEA is used extensively in Reliability-Centered Maintenance, don't limit its application to maintenance in the plant. Your problems are cross-functional. Your solutions should also be cross-functional!
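
To make the "benefits must outweigh the expense" test concrete, here is a minimal Python sketch of the cash flow projection described in step 7. The function names, the 10 percent cost of capital and every dollar figure are assumptions chosen for illustration. The IRR routine is a simple bisection that assumes a conventional project (one up-front outlay followed by net savings); it is not a general-purpose solver.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today (year 0)."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def discounted_payback_years(rate, cash_flows):
    """First whole year in which cumulative discounted cash flow is non-negative.

    Returns None if the project never pays back within the projection.
    """
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf / (1 + rate) ** year
        if cumulative >= 0:
            return year
    return None


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-7):
    """Internal rate of return by bisection.

    Assumes NPV is positive at `low` and negative at `high`, which holds for
    an outlay followed by savings. Returns None if that assumption fails.
    """
    if not (npv(low, cash_flows) > 0 > npv(high, cash_flows)):
        return None
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Hypothetical inputs carried over from an FMEA sheet.
current_failure_cost = 48_000.0     # annualized, before the fix
projected_failure_cost = 12_000.0   # annualized, after the fix
upfront_cost = 60_000.0             # one-time implementation cost
ongoing_cost = 4_000.0              # added yearly cost of the fix
cost_of_capital = 0.10              # assumed discount rate
years = 5

net_annual_benefit = current_failure_cost - projected_failure_cost - ongoing_cost
cash_flows = [-upfront_cost] + [net_annual_benefit] * years

print(f"NPV: ${npv(cost_of_capital, cash_flows):,.0f}")
print(f"IRR: {irr(cash_flows):.1%}")
print(f"Discounted payback: year {discounted_payback_years(cost_of_capital, cash_flows)}")
```

Re-running the same projection with actual implementation costs and actual failure costs gives the actual-to-projected comparison called for in step 9.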


Drew D. Troyer is a champion of effective reliability management and passionate about helping companies find hidden profits inside their plants. As a highly sought consultant to Fortune 500 manufacturing firms, award-winning columnist and teacher, he understands both management expectations and plant-floor realities. Troyer is a Certified Reliability Engineer (CRE), a Certified Maintenance and Reliability Professional (CMRP), and chairs the standards committee of the Society for Maintenance and Reliability Professionals (SMRP). Contact Drew at 800-597-5460.