Buy New or Rebuild? A Systematic Approach for Difficult Maintenance Decisions

Mike Waltrip, Advanced Technology Services
Tags: maintenance and reliability, inventory management

There is a misconception in the manufacturing world about reliability, and it has significant implications for factories, managers, technicians, reliability professionals and bottom-line results at manufacturing plants worldwide. The misconception is rooted in a risk-based approach to reliability that may fail to properly identify productivity improvements, leading to increased costs.

The mistake some reliability professionals make is this: when a production machine or process fails on the factory floor, they take a risk-based approach that focuses exclusively on the manufacturing process rather than analyzing the root cause of the failure.

Statistics reveal that 50 percent of maintenance and repair orders require a part. Tight control of capital and expense budgets for reliability and maintenance makes it challenging to improve the reliability approach, since up to 20 percent of plant operating expenses are maintenance-related and 73 percent of maintenance, repair and operations (MRO) inventories are inactive.

Fortunately, there is a different approach. Process-level analysis is undoubtedly important, so reliability professionals aren’t wrong to assess the manufacturing process when trying to improve productivity and increase uptime. Where some reliability departments err is in not drilling down into root causes. What’s needed is an approach that takes the process a few steps further, analyzing data trends to develop an understanding of how and why system-level components fail.

This paper will examine the problems inherent in a risk-based approach that focuses exclusively on process-level analysis. It will outline the solution, which is to expand that strategy to include components-level analysis: an examination of contributing factors, the development and implementation of corrective actions that work to eliminate failures, and continuous improvement. The paper will then review the results manufacturers can expect when they modernize their reliability approach.

A Risk-based Approach to Reliability

In manufacturing facilities, reliability engineers typically use a value-stream mapping strategy, identifying each phase of the process that converts raw materials into a finished product and analyzing every step of the workflow in detail. Production equipment is a key part of the value stream, and reliability engineers are tasked with making sure all components are as reliable as possible to ensure peak performance.

As part of their reliability strategy, reliability engineers identify each component — electrical systems, hydraulics, etc. — working through each system methodically and assessing how the failure of any component would impact production.

For example, the reliability engineer might identify a hydraulic line as a failure point, and using the risk-based strategy, create a mitigation plan that might include maintaining a supply of replacement parts for the line or building in redundancies to ensure operations continue or get back on track quickly in the event of a system failure.

These may be necessary steps, but when reliability professionals attack reliability from a process level instead of at the system-component level, they can make decisions that lead to costly and unnecessary expenditures.

Whether the issue is the significant cost of replacing complex process systems or the capital locked up in an inventory of MRO components that might never be deployed, the risk-based approach to reliability is an inefficient and expensive way to prevent production line breakdowns.

The problem with the risk-based approach is that it doesn’t go far enough since it doesn’t address the root cause of failure, which is likely to be components. There are several points of failure for any given component.

Aging components or obsolete products, design flaws and misidentified parts are just a few examples of the factors that can cause a piece of equipment to fail at the component level.

To truly modernize their reliability approach, reliability professionals need to gain an understanding of how and why system-level components fail. A component-level approach to reliability better defines the root cause of the failure and reduces both inventory investment and future CAPEX.

It’s important to remember that there is a story behind every component that fails; reliability professionals must take steps to ensure that the story doesn’t get lost in the scrap and trash bins. Reliability, maintenance and spare parts play a crucial role, as illustrated by Figure 1.

Figure 1. The importance of reliability, maintenance and spare parts

Consider the real-life example of an aging servo-motor drive, an obsolete product that was causing multiple production line failures. Using the risk-based approach, the reliability engineer determines that random board failures indicate a drive beyond its useful life and may implement a CAPEX plan to replace the installed base to minimize production delays.

However, if the reliability engineer pursues a components-level strategy, the examination of the failure point would include a root cause analysis that determines that random circuit board failures are due to age.

The reliability engineer could develop a rebuild procedure for the circuit board, replacing aging components with newer, premium components. In a real-life scenario, this approach resulted in a 54 percent reduction in failures.

By addressing the root cause of the problem in this way instead of focusing exclusively on process-level remedies, the reliability engineer was able not only to extend the useful life of the components but also to decrease their failure rates over time. The trend chart in Figure 2 depicts the reduction in failures correlated to evolving rebuild procedure standards.
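The percentage reductions quoted in these examples come down to a before-and-after comparison of failure counts over comparable periods. A minimal Python sketch of that arithmetic follows. The function name and the counts are hypothetical, chosen only so the result matches the 54 percent figure; they are not data from the case described.

```python
def percent_reduction(failures_before: float, failures_after: float) -> float:
    """Return the percentage drop from failures_before to failures_after."""
    if failures_before <= 0:
        raise ValueError("failures_before must be positive")
    return (failures_before - failures_after) / failures_before * 100.0


# Hypothetical counts for equal-length periods before and after
# the rebuild procedure was introduced (illustrative values only).
baseline_failures = 50
post_rebuild_failures = 23

reduction = percent_reduction(baseline_failures, post_rebuild_failures)
print(f"{reduction:.0f} percent reduction")
```

Comparing equal-length periods on the same installed base matters here; otherwise a change in the number of drives in service could be mistaken for a change in reliability.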

Figure 2. Failure rate reduction correlated to evolving minimum standards

Design problems can also contribute to production equipment failures. For example, a reliability engineer analyzes a recurring control board failure on a motor drive and determines that it fails because the control board overheats.

If the reliability engineer were considering process-level factors only, the corrective action might be to set an inventory minimum/maximum to ensure that a stock of motor drive replacement parts is readily available to minimize production delays.
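For readers unfamiliar with the term, a minimum/maximum policy can be sketched roughly as follows: when the stock position falls to or below the minimum, order enough to bring it back up to the maximum. This is one common convention, assumed here for illustration, and the part number, quantities and class name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MinMaxPolicy:
    part_number: str  # hypothetical identifier
    minimum: int      # reorder point
    maximum: int      # order-up-to level

    def reorder_quantity(self, on_hand: int, on_order: int = 0) -> int:
        """Units to order so the stock position returns to the maximum."""
        position = on_hand + on_order
        if position > self.minimum:
            return 0  # still above the reorder point, nothing to do
        return self.maximum - position


# Illustrative values only
policy = MinMaxPolicy(part_number="DRIVE-CTRL-BOARD", minimum=2, maximum=6)
print(policy.reorder_quantity(on_hand=5))  # above the minimum
print(policy.reorder_quantity(on_hand=2))  # at the minimum, triggers an order
```

Logic like this only ensures a replacement is on the shelf. It records nothing about why the part failed, which is the gap the components-level strategy is meant to close.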

However, if the reliability engineer pursues a components-level strategy, the examination of the failure point would include a root cause analysis that discovers a design flaw. In the real-life example, the motor drive was found to have a control board that was located too close to a heat sink, which caused capacitors to fail.

After identifying the root cause, the manufacturer implemented a corrective plan that included a proactive recall of other installations and an engineered solution that relocated the control board away from the heat sink. The result was a 96 percent reduction in failures, as illustrated in Figure 3.

Figure 3. Scrap rate reduction

A third real-life example of how a process-level approach can lead to unnecessary expenditures and increased incidents of failure involves a factory that had an issue with incorrect pressure transducer installations due to parts misidentification.

The manufacturer stocked multiple models of the transducers for a variety of applications under a single part number. Under the risk-based approach, the corrective action was to set an inventory minimum/maximum to ensure that stock was available when pressure transducers failed.

A reliability engineer conducted a components-level assessment and identified the problem with the incorrectly installed pressure transducers. The reliability engineer implemented a corrective plan to stock transducers by set pressure parameter and establish a new SKU for each parameter. As the plan was implemented, failures decreased 37 percent, as illustrated by the chart in Figure 4.
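One way to picture this corrective action is as a lookup that maps each application's pressure requirement to its own SKU, rather than letting several transducer models share one part number. The Python sketch below is illustrative only: the SKU names, pressure ranges, units and selection rule are assumptions, not details from the plant in the example.

```python
from typing import NamedTuple, Optional


class TransducerSku(NamedTuple):
    sku: str
    min_psi: float
    max_psi: float


# Hypothetical catalog: one SKU per pressure range
CATALOG = [
    TransducerSku("PT-0100", 0.0, 100.0),
    TransducerSku("PT-0500", 0.0, 500.0),
    TransducerSku("PT-3000", 0.0, 3000.0),
]


def select_sku(required_max_psi: float) -> Optional[str]:
    """Return the SKU with the smallest range covering the requirement, or None."""
    candidates = [t for t in CATALOG if t.min_psi <= required_max_psi <= t.max_psi]
    if not candidates:
        return None  # no stocked transducer fits this application
    return min(candidates, key=lambda t: t.max_psi).sku


print(select_sku(250.0))
```

With a distinct SKU per range, the storeroom can issue the correct model for a work order without the technician having to tell apart look-alike parts stored in one bin.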

Figure 4. Transducer failure decrease

As these three examples show, understanding how and why system-level components fail is the key to establishing true reliability performance. By drilling down past the process level, reliability engineers were able to analyze root causes and develop solutions that improved uptime, reduced operating expenses and increased efficiency. New technologies can also play a key role in achieving efficiency.

For example, additive manufacturing (3-D printing) is an emerging technology that can be used to improve component-level reliability. Although the technology is changing quickly, it is already an effective way to engineer solutions that prevent premature wear and failures.

Prototypes can be manufactured in a very cost-effective manner to ensure the design meets fit, form and function requirements of the original design application. Various materials, such as titanium, can be used in the additive manufacturing process to make components much more durable in their operating environments.

There are many examples that illustrate the efficacy of a components-level approach. By capturing information on component failure, reliability engineers can identify simple “quick-win” improvements and create standard work procedures and technician training programs that result in major savings.
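A simple way to surface such quick wins from captured failure data is a Pareto ranking of failures by component and root cause. The sketch below assumes failures are logged as (component, root cause) pairs. The records and the function name are hypothetical and loosely echo the three examples above.

```python
from collections import Counter

# Hypothetical failure records: (component, root_cause)
failure_log = [
    ("servo drive", "aged board components"),
    ("motor drive", "overheated control board"),
    ("pressure transducer", "wrong model installed"),
    ("servo drive", "aged board components"),
    ("motor drive", "overheated control board"),
    ("servo drive", "aged board components"),
]


def pareto(records):
    """Return (component, root_cause, count, cumulative_percent) rows, most frequent first."""
    counts = Counter(records)
    total = sum(counts.values())
    rows, running = [], 0
    for (component, cause), n in counts.most_common():
        running += n
        rows.append((component, cause, n, round(100.0 * running / total, 1)))
    return rows


for row in pareto(failure_log):
    print(row)
```

The rows at the top of the ranking are the natural candidates for a root cause analysis, a standard work procedure or technician training.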

A Component-based Approach to Reliability

Reliability departments that go beyond a process-level analysis to focus on components-level issues within subsystems can significantly improve outcomes, but doing so requires a different way of thinking about failure points and mitigation strategies.

Reliability professionals who are engaged in transitioning to a components-level strategy can benefit from outsider perspectives, such as the methods used by original equipment manufacturers (OEMs) and the expertise of third parties.

Foundational improvements offer additional opportunities for reliability gains. These may include implementing storeroom processes such as 5-S, part identification and storage best practices; testing certification procedures for components used in troubleshooting; and core tracking to ensure that the story doesn’t get dumped into the recycle bin.

Developing maintenance standards can also be a key component, for example by engaging technical resources in the root cause analysis (RCA) process and creating single-point lessons that document component-level tribal knowledge.

Fully embracing the new reliability paradigm requires looking beyond abstract theories and leveraging practical experience to develop new solutions. Accessing team expertise more broadly can lead to better outcomes and a strong commitment to delivering value.

True Reliability Performance

The risk-based approach to reliability may unfortunately lead to increased costs for manufacturers. However, it is still a valuable analysis and should remain a core function of the reliability professional.

But a modern understanding of reliability includes the acknowledgement that the real root cause of failure is likely to be components, rather than the production process itself. Further, it involves embracing new technologies, cutting-edge analytics and best practices to achieve a more reliable, more profitable output.

One example of true reliability performance in action can be found in the experience of a recognized leader in replacement tires for passenger and commercial vehicles that implemented a new approach to managing its repairable parts inventory.

A third-party reliability professional assessed the company’s unique situation and developed a plan to manage its parts program through an onsite repairable parts management program. With a more proactive parts management program, the manufacturer saw part failures drop by 53 percent. The repairable parts management program also was able to reduce part inventory by 15 percent, which resulted in significant savings.

With increased uptime and machine availability, the manufacturer was able to more efficiently meet the growing demand for its products. A senior manager in the company’s corporate maintenance division observed that by gaining access to third-party expertise and focusing on improving the reliability of the company’s repairable assets, the manufacturer was able to reduce its overall costs while improving manufacturing effectiveness.

In another example that illustrates the value of true reliability performance, a tier-one automotive supplier worked with a reliability professional to improve components-level issues and identified a problem with the service life of ball screws. Reliability was an issue because the ball screws were deployed on multiple machines and had to be custom ordered from Germany. That meant the automotive supplier had to contend with long and/or sporadic lead times.

A Smarter Approach to Reliability

Manufacturing companies worldwide are missing out on a golden opportunity by focusing exclusively on process-level improvements, which are an important part of an overall reliability strategy but do not go deep enough to capture all available productivity gains and cost savings. 

By neglecting to analyze the root cause of failures at the component and subsystem level, reliability departments that adhere to a risk-based approach are costing their companies time and money.

As the examples covered in this paper illustrate, there is a better way. Failure points often occur at the components level. Reliability experts who expand their approach beyond a process-level review to include a root-cause analysis can identify and mitigate issues caused by factors such as design flaws, component age and incorrect installation.

By gaining a better understanding of how and why system-level components fail and shifting their thinking to include new perspectives, reliability professionals can improve outcomes at every level. Aided by new technologies such as IoT-enabled parts, and by data analytics that estimate equipment behavior more effectively and support strategies like predictive maintenance programs, modern reliability professionals can make the machine work for them to deliver reliable parts, reliable processes and reliable savings.