Reactive maintenance, also called unscheduled or unplanned maintenance, is often judged to be "too high." This assessment may be based on data or simply on intuition. Either way, it is always an unwanted situation that needs to be addressed. And the obvious first question is WHY? Why is the undesired maintenance downtime in the plant, in the mill, or on the equipment so high or extensive? There can be many reasons, which we will explore in this article. I am going to approach this much like a murder investigation, as we are looking for evidence. So put your Sherlock Holmes hat on, and let's get started.
The first step in the investigation is to seek information, facts and evidence to help answer the question, "Why is the reactive maintenance so high?" And the first place to look is the backlog of work in your database. But what is a backlog? Backlog is defined differently depending on who you ask. For this discussion, a backlog is simply work that needs to be done in the future. Of course, the backlog may contain overdue work, meaning work whose planned start date has passed, which is often a red flag. Work may be overdue for various reasons, such as waiting on materials, lack of resources, waiting for the equipment to be released by production, etc. This requires a deeper investigation to determine the actual reasons. The backlog may also contain work that is partially complete, often for many of the same reasons that work becomes overdue. However, the most essential part of the investigation is looking for the corrective maintenance tasks that have been identified proactively as needing to be completed in the future. If corrective maintenance tasks are absent from the backlog, that should raise a huge red flag.
So, How Do You Quantify This Absence of Tasks?
First, if there is simply a lack of corrective work orders in the backlog, there is an absence of work that has been identified proactively and documented in the work order system. Without the early identification of corrective work needing to be executed on equipment, you have a classic reactive maintenance program despite everything else you might have, such as a CMMS, a PM program, reliability engineers, etc. In other words, if your maintenance system is not detecting problems early enough to plan and schedule corrective maintenance tasks, you are operating a high-cost, inefficient, ineffective and unsafe reactive maintenance program.
Note: I will not address the other areas where backlog may exist, such as on whiteboards, in log books, on shift reports, in people's heads, etc., because none of these are acceptable methods for managing a proactive and efficient work management system. You have overall work management issues if your backlog does reside in any of these areas. Your overall work management business processes are out of line and must be addressed.
But if there are some corrective maintenance work orders in the backlog, how do you determine whether there is a sufficient quantity of them? First, you must understand that backlog is measured by the estimated labor hours this proactive work represents. This requires that every work order in the backlog has an estimated number of labor hours necessary to complete the described work. We all understand that work order estimates have an associated margin of error. For work orders that have not gone through any kind of formal planning system, the margin of error will be much greater than for those that have been planned by a skilled planner and made ready to schedule. Quantifying the backlog is very simple. First, add up the estimated labor hours on all open and authorized work orders in the backlog, with one exception: do not include work orders associated with a scheduled shutdown or outage. Second, add up a crew's total available labor hours in a typical work week. This is referred to as a crew week.
Note: You may have multiple or a single crew, but either way, you want to match each crew with the group of work orders for which each respective crew is responsible.
Next, divide the estimated labor hours on the work orders by the crew's total available labor hours. This will give you the number of crew weeks of backlog. The number you are looking for falls in a range, but the minimum of that range is two crew weeks for a small plant and four crew weeks for a big plant. These are not absolutes but only estimates to help you understand where you stand concerning the size of your backlog.
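To make the arithmetic concrete, here is a minimal Python sketch of the crew-week calculation described above. The `WorkOrder` fields, the status value `"open"`, and the sample numbers are illustrative assumptions, not fields from any particular CMMS; adapt them to however your own system labels open, authorized, and shutdown work.

```python
from dataclasses import dataclass


@dataclass
class WorkOrder:
    # Hypothetical field names; a real CMMS export will differ.
    order_id: str
    estimated_hours: float
    status: str          # assumed values: "open", "closed"
    authorized: bool
    is_shutdown: bool    # tied to a scheduled shutdown or outage


def backlog_crew_weeks(work_orders, crew_hours_per_week):
    """Estimated backlog hours divided by one crew week of available hours."""
    if crew_hours_per_week <= 0:
        raise ValueError("crew_hours_per_week must be positive")
    # Count only open, authorized work; exclude shutdown/outage work orders.
    backlog_hours = sum(
        wo.estimated_hours
        for wo in work_orders
        if wo.status == "open" and wo.authorized and not wo.is_shutdown
    )
    return backlog_hours / crew_hours_per_week


# Made-up sample data for one crew.
orders = [
    WorkOrder("WO-1", 120.0, "open", True, False),
    WorkOrder("WO-2", 80.0, "open", True, False),
    WorkOrder("WO-3", 200.0, "open", True, True),    # shutdown work: excluded
    WorkOrder("WO-4", 40.0, "open", False, False),   # not authorized: excluded
]

# Assumption: 5 technicians with 32 available hours each in a typical week.
crew_week = 5 * 32
print(backlog_crew_weeks(orders, crew_week))  # 200 / 160 = 1.25
```

In this made-up case the crew carries 1.25 crew weeks of backlog, which would fall below the two-crew-week minimum suggested for a small plant. Run the calculation once per crew, using only the work orders that crew is responsible for.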
There is another essential point to consider when you do this exercise: who identified the work that made its way into the backlog. You might ask why that is important, and here is why: if production created the backlog work order, and I am speaking specifically about equipment operators who are NOT highly engaged in any type of essential maintenance or care of the equipment, the work is usually going to be reactive maintenance. There is little time left to plan and schedule that work before the equipment fails. So it is another red flag if much of the backlog comes from this source and not much of it was created by proactive maintenance activities, such as you would expect from a high-functioning preventive maintenance program.
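One way to check who is feeding the backlog is to total the estimated hours by originator. The sketch below assumes each backlog work order can be reduced to an (originator, estimated hours) pair; the originator labels and the numbers are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def backlog_share_by_originator(work_orders):
    """Return each originator's fraction of total estimated backlog hours.

    work_orders: iterable of (originator, estimated_hours) pairs.
    """
    hours = defaultdict(float)
    for originator, estimated_hours in work_orders:
        hours[originator] += estimated_hours
    total = sum(hours.values())
    if total == 0:
        return {}
    return {name: h / total for name, h in hours.items()}


# Made-up sample backlog; the labels are assumptions, not CMMS codes.
sample = [
    ("operations", 60.0),
    ("pm_inspection", 30.0),
    ("operations", 20.0),
    ("predictive", 10.0),
]

for name, share in backlog_share_by_originator(sample).items():
    print(f"{name}: {share:.0%}")
```

In this sample, operations accounts for about two-thirds of the backlog hours, which is the kind of pattern that would warrant a closer look at how well the PM program is detecting problems.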
So, let's assume that you find your backlog to be deficient in the amount of corrective maintenance work that would be considered normal. Your next question is, why? This leads to a series of questions:
- Do you have a formal preventive maintenance (PM) program?
- If yes, how well-documented is your formal PM program?
- How well do you execute the PM tasks?
- If a problem is detected, how does it get addressed?
Let's address each of these questions.
Do You Have a Formal PM Program?
It may come as a surprise, but some locations do not have formal PM programs established. This may be true for only some parts of a plant, some equipment, or the entire plant.
How Well-Documented is Your Formal PM Program?
It may be a subjective assessment of how well-documented your PM program is, but there are a couple of straightforward ways to determine this. First, simply ask the technicians who are assigned to do the PMs how good the current PM program is. They are usually honest and forthcoming with their opinion(s) on how good the PM program is, and you might be surprised by a lot of things they may tell you about your PM program. Second, as reactive maintenance situations happen, see what is written into the PM program and see if the PM task(s) are written in a way that would have allowed a technician executing the task to have detected the problem early on. Then again, you may find that nothing addresses the particular issue in the PM program.
How Well Do You Execute the PM Tasks?
The execution of the existing PM tasks is often an opportunity for improvement, no matter how good the PM program actually is. But the execution can be deficient for several reasons:
- The least qualified technicians are assigned to execute the tasks. Supervisors often assign the newest, most inexperienced technicians to do the PMs. The supervisors who do this have a distorted view of the importance of the PM program. These inexperienced techs simply don't know the equipment, they don't know how the equipment and components fail, and they most likely do not understand how to execute the actual PM task.
- If the maintenance team views the PM program as not very good, they also likely assume that it is not very important to the company and are not likely to take it seriously, even to the point of "pencil-whipping" the documents.
If a Problem is Detected, How Does It Get Addressed?
If a problem is detected, what is the business process for how the technicians or operators deal with it? Does a business process even exist for this? If it does, does anyone know about it or adhere to it? Often, when a problem is found, the PM stops, and the technician goes into repair mode. That's great, but what about the rest of the PM? A higher-priority problem with more serious consequences may exist elsewhere, but it goes undetected because the technician assigned to do the PM never got to the equipment with the bigger problem. Discipline to follow a formal business process is needed to properly deal with these problems as they are detected.
Conclusion
Without a high-functioning PM program that detects problems early, reactive maintenance will be high. A high-functioning PM program that includes early detection allows for enough time to properly process the problem. Properly processing the problem means that each problem is fully planned to address it effectively. Then, the remedy is effectively scheduled, and the repair is efficiently executed to proactively address the problem.
Companies often think that they should not have a high degree of reactive maintenance because they have a maintenance program that includes a CMMS system, planners, schedulers, engineers, predictive maintenance practices, and a PM program. Even with all of these pieces in place, if there is an ineffective PM program, it is almost guaranteed that the organization's reactive maintenance will be excessive. A high-functioning PM program is the key to reducing reactive maintenance.