A Better Way to Allocate and Analyze Downtime

Don Armstrong, Veleda Services
Tags: maintenance and reliability

As part of a reliability improvement program, many process industries assign the losses resulting from each downtime or lost production event to the "responsible" department, such as operations, mechanical or electrical. Frequently this allocation is based on someone's perception of who is to blame, which can have significant negative effects.

I recall the following discussion at a morning meeting in a pulp mill. About 50 tons of production had been lost because of a leak in an elbow of a secondary knotter reject pipe. The secondary knotter is the last stage of coarse debris removal after the pulp leaves the digesters where wood chips are cooked to raw pulp.

Operations superintendent: "The reject pipe is a piece of mechanical equipment, so this is mechanical downtime."

Maintenance superintendent: "Sorry, but the pipe wore out because your operators have been running the chip piles too low and putting a lot of gravel through the system. It's not designed for that. It's operations' fault."

Operations: "But the reason for that is the chip conveyor to the chip pile keeps breaking down, so the pile has gotten so low that they're scraping the bottom, and the chip conveyor problem is definitely mechanical."

Maintenance: "The conveyor breaks down because the operators aren't keeping the conveyor gallery clean, and chips get caught under the drums and run the belt off. That's an operations problem."

Operations: "But we can't clean it because the air line that supplies the lances for blowing under the conveyor is rusted out, and we have no air there. That's mechanical."

Maintenance: "OK, so what's the work order to repair the air line? We'll get on to it tomorrow."

Operations: "We haven't submitted a work order yet. I'll do that right now."

The point is that downtime is downtime, and the focus should be on preventing the problems from recurring, not on who is to blame. Arguing about blame drives a wedge through the operations/maintenance partnership. Such arguments are an inevitable result of the blaming process, and they will be much more frequent (and louder) if a "departmental downtime" measurement is a component of an incentive program, which it often is.

In this example, a prudent manager would have recognized that the root causes of the problem were a lack of communication between operations and maintenance, and a failure to follow the work order/backlog/priority setting/scheduling process. If there is a mechanical inspection program in place, it should also be reviewed, especially if it is missing such fundamental problems as corroded service piping.

Fortunately, there is a better way. Instead of assigning blame, a much more positive and productive approach is to always treat downtime as a joint responsibility of the operations/maintenance partnership. Record all losses against the equipment or event that resulted in the downtime (e.g., "Eq. No. 23-4567, No. 3 hot oil pump" or "raw material delivery delayed by rail strike"). Then, assign the responsibility for action to the department in the best position to initiate and follow through so the problem will not recur. This department may not even be directly involved in the day-to-day operation.
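
For readers who keep their downtime log in a spreadsheet or small database, the recording scheme described above might be sketched as follows. This is only an illustration, not a prescribed system: the field names (`source`, `tons_lost`, `action_owner`), the tonnage figures and the department names are all hypothetical. The point it shows is that losses are totaled by equipment or event, while the department appears only as the owner of the follow-up action.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DowntimeEvent:
    source: str        # equipment or event the loss is recorded against
    tons_lost: float   # production lost (illustrative figures only)
    action_owner: str  # department best placed to prevent recurrence


def losses_by_source(events):
    """Total the losses per equipment/event, largest first."""
    totals = defaultdict(float)
    for event in events:
        totals[event.source] += event.tons_lost
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical log entries; no loss is charged to a department.
events = [
    DowntimeEvent("Eq. No. 23-4567, No. 3 hot oil pump", 30.0, "maintenance"),
    DowntimeEvent("Raw material delivery delayed by rail strike", 120.0, "purchasing"),
    DowntimeEvent("Eq. No. 23-4567, No. 3 hot oil pump", 20.0, "engineering"),
]

ranked = losses_by_source(events)
for source, tons in ranked:
    print(f"{tons:7.1f} tons  {source}")
```

Note that the same pump appears twice with different action owners. The loss still accumulates against the equipment, which keeps the ranking free of any departmental scorekeeping.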

For example, in a pulp mill where a large fiberglass pipe failed during startup, the root cause was determined to be a lack of operator training on the correct startup procedure. Responsibility was assigned to the engineering department to train operators on the fundamentals of pump and piping systems and to develop a standard operating procedure for starting each pump.

After a major downtime event, frequently the first action required is to conduct a root cause analysis. This may involve a few knowledgeable people or require extensive investigation. Of course, managers need to ensure that the people assigned responsibility for preventing the problems take the necessary action. If part of this action is to initiate a work order for some preventive or corrective maintenance or redesign, then it is also necessary to make certain that the work gets done.

If your operation is qualified under ISO 9000, including maintenance in its scope provides a tool to track corrective action requests (CARs) and preventive action requests (PARs). Otherwise, adding a "how initiated" field to the work-order database, with "investigation" as one of the values in its drop-down list, will enable managers to focus on all such work orders. The "reason" field can then be used to further separate "investigation" work orders into safety, operations, environment, etc.
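
As a rough sketch of how such a filter might work, the snippet below pulls the outstanding "investigation" work orders from a small list and groups them by reason. Only the "how initiated" and "reason" fields, and the values "investigation," safety, operations and environment, come from the approach described above. The work-order numbers, the `status` field and the other "how initiated" values are assumptions made for illustration, and a real maintenance system would do this with its own query or report tool.

```python
from collections import defaultdict

# Hypothetical work-order records; IDs, status values and the
# non-"investigation" entries are invented for this example.
work_orders = [
    {"id": "WO-1001", "how_initiated": "investigation", "reason": "operations", "status": "open"},
    {"id": "WO-1002", "how_initiated": "inspection", "reason": "operations", "status": "open"},
    {"id": "WO-1003", "how_initiated": "investigation", "reason": "safety", "status": "scheduled"},
    {"id": "WO-1004", "how_initiated": "investigation", "reason": "environment", "status": "complete"},
]


def outstanding_investigation_orders(orders):
    """Group unfinished 'investigation' work orders by their reason field."""
    grouped = defaultdict(list)
    for order in orders:
        if order["how_initiated"] == "investigation" and order["status"] != "complete":
            grouped[order["reason"]].append(order["id"])
    return dict(grouped)


outstanding = outstanding_investigation_orders(work_orders)
for reason, ids in sorted(outstanding.items()):
    print(f"{reason}: {', '.join(ids)}")
```

Reviewing a list like this at regular intervals is one way for managers to confirm that the work arising from downtime investigations actually gets done.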

Remember, nothing good comes from assigning blame. There is value in addressing individual problems, but the best value can be achieved by improving systems, such as business processes, so that the philosophy of problem avoidance becomes ingrained in the organization's culture.