Maintenance and reliability professionals can make a difference. At most manufacturing plants, that means focused work that allows operations to get more finished product (whether it’s soft drinks, motorcycles, light bulbs, cars, cupcakes, water heaters, etc.) out the door.
At Eli Lilly’s biosynthetic human insulin (BHI) plant in Indianapolis, making a difference takes on added significance.
The Eli Lilly BHI plant’s reliability engineering group includes (from left) senior reliability engineer Vadim Redchanskiy, reliability engineering technician Mary Ann Dust, maintenance and reliability team leader Ken Swank, and senior reliability engineer Mark Lafever.
Nearly 21 million people in America – and 200 million people worldwide – have diabetes. In the U.S. alone, nearly 1 million new cases are diagnosed each year. Many people with diabetes require insulin to control their blood sugar (glucose), and the BHI plant helps fill that need. Opened for production in 1992, the site produces a significant percentage of the world’s medicinal insulin. Maintenance and reliability work that enhances productivity allows the company to get high-quality, life-sustaining medicine in the hands of those who need it.
Eli Lilly has been producing medicinal insulin for more than 80 years.
“Whether you work for a company that makes computer chips or potato chips, when you add value to the stakeholder or to the economy, that can’t be belittled in any way,” says Ken Swank, the plant’s maintenance and reliability team leader. “It doesn’t matter what you do. It’s important. There is one distinct difference, however. I used to work for an industrial coatings company. Our coatings went into everything from golf balls to the Space Shuttle to surgical instruments. When I’d see the end product, I knew I was part of that. But when you work for a pharmaceutical company . . . it means a lot on a personal level.
“I met a couple over the weekend who has a young son. We got to talking and I found out the boy has diabetes. I asked if he had Type 1 or Type 2. The father asked, ‘How do you know so much about diabetes?’ The boy told me he was Type 1. I looked at him and said, ‘Guess what I do for a living? I make insulin. I work at Eli Lilly and make Humulin.’ He said, ‘Thanks. I really like my medicine. It makes me feel a lot better.’
“My department is in charge of making sure this building is making the medicine it is supposed to make every time. There are millions of people who depend on it and reach for it every day.”
The company’s Indianapolis campus contains manufacturing sites and corporate offices.
Increasing Demands
The BHI plant is large and technically complex. It houses more than 17,000 pieces of equipment, 13,000 input/output points and 600 operating units. The processing method to generate the BHI molecule involves several centrifugation steps, a handful of reactions, many purification steps and various solvent exchange steps. As a result, approximately one-third of the operating units are classified as either a high-risk or safety-critical operation.
A few years back, maintenance and reliability leaders decided it was necessary to enact substantial changes in order to maximize the department’s time, skills, resources and potential impact on the plant. BHI was running at more than twice its original design capacity, and business demand continued to increase. Technicians were overcommitted, remediation efforts were frequently reprioritized to address the needs of the moment, and crucial systems were not getting a fair percentage of attention.
Eli Lilly produces 25 percent of the world’s insulin supply.
“We’ve never been concerned about our equipment operating in a qualified state,” says Swank. “But like the rest of the pharmaceutical industry at that time, we didn’t give extra emphasis to our equipment, other than we always wanted the most uptime we could get in order to make the most medicine. But when you look at getting as many kilos as possible out the door, we got to the point where you question, ‘Do we keep adding facilities or do we do it smart from a business perspective and focus on reliability?’”
It was something a host of Eli Lilly facilities were pondering in the late 1990s. For instance, BHI engineering manager Ron Reimer led efforts to increase proactive work and uptime and decrease maintenance costs while at the company’s Clinton (Ind.) Laboratories site. As part of that project, which was then systematized and dubbed Proactive Asset Management, he hired the company’s first reliability engineer.
Direct involvement from all key stakeholders in plant reliability (production, HSE, quality control, finance, engineering and management) helps ensure the success of the reliability prioritization initiative.
BHI’s enhancement efforts began with the addition of a reliability engineer in 1999, and the introduction of Reliability-Centered Maintenance (RCM) and root cause failure analysis (RCFA) projects. Those efforts increased when regulatory agencies such as the Food and Drug Administration and the Environmental Protection Agency began to closely examine maintenance in the pharmaceutical industry. The agencies’ message was simple: maintenance equals plant reliability; plant reliability equals product reliability; and overall reliability equals compliance. “Unreliable” companies can face penalties, including the shutdown of operations.
Communication’s importance
According to maintenance and reliability team leader Ken Swank, communication plays a major role in the success of the BHI plant’s reliability prioritization initiative.
“Communication is one of the most important parts of my job,” he says. “I meet with the production leaders on a frequent basis and explain to them what’s coming down the pipe. I get their commitment and help them understand the value it’s adding. They have to pony up some resources, too. A proper analysis isn’t just our department, obviously. It involves engineering, maintenance, reliability, operations, technical services, automation at times. We spend a great deal of time to ensure that they do understand the value.”
“That’s when, I think, reliability started to take the focus it has today,” says Swank. “On our journey to become a truly reliable plant, the vision is that when production uses a piece of equipment, it’s in a qualified state, it’s available when they need it and it’s going to run at the predetermined performance level. We obviously play a big part in that. The maintenance strategies that we put together deal with keeping that piece of equipment in a qualified state. Also, the depth of the maintenance strategies addresses the utilization or uptime required. There are operations in the facility that run more than others, have less redundancy than others or are more essential than others. Those require more attention and more detailed investigation.”
A prioritization initiative that began in early 2004 has been central in this strategy to deliver uptime and reliability to the operating units that warrant the most attention.
How Eli Lilly defines reliability roles
According to Eli Lilly BHI maintenance and reliability leaders, the job functions and responsibilities of their reliability engineers include:
The job functions and responsibilities of reliability technicians include:
Make It a Priority
Swank recounts the marching orders that would eventually lead to a reliability remedy.
“My boss at the time said, ‘Figure out how we are going to make the BHI facility more reliable. We need to get a handle on this,’” says Swank. “What he really meant to say was, ‘You and your team need to understand the business needs of the facility, determine a method to set a path forward to remediate the correctly prioritized reliability gaps, sell it to the business, execute it and make it sustainable.’”
Sounds easy enough, or so he thought.
“We started in February (2004) and assumed we’d be done in March or April,” he says. “We quickly realized this was more involved and complex than we anticipated. Plus, we wanted to do it right.”
The game plan would be to develop an analysis that uses existing data to prioritize system remediation as a continuous improvement effort outside of the department’s daily support efforts. The analysis requirements were that:
- it would take the systems identified and rank them according to business impact based on data;
- all stakeholders would be represented;
- the analysis could be executed in less than one man-week (40 hours).
This challenge was placed on the shoulders of the reliability engineering portion of the department. The group included senior reliability engineers Mark Lafever, Vadim Redchanskiy and Rod Matasovsky (now retired), and reliability engineering technicians David Doyle, Mary Ann Dust and Matt O’Dell. They began to strategize the content of the analysis.
“They are the smart ones. I was the translator from management to the guys on the floor,” says Swank. “They understood the data systems, and what made sense and what didn’t.”
The group acknowledged that in order to get support for this initiative, the analysis would have to be based on facts and need to directly involve and be meaningful to all key stakeholders in plant reliability – production; health, safety and environmental (HSE); quality control (QC); finance; engineering; and management. This was going to be an incredible balancing act.
“Anyone can go out and pull a bunch of data,” says Lafever. “We had to decide where to pull the data from, how we were going to pull it, and figure out whether the data was going to tell us the information we needed in order to make the right decisions.”
Table 1. Summary of weights for the five scenarios.
Table 2. Example from the first scenario sensitivity analysis.
After several iterations – and “a lot of head-rubbing,” says Lafever – the team settled on an analysis that took the stakeholders into account by using existing data from the past 12 months. This data includes:
1) Hours of Emergency Work, equated to equipment downtime, to satisfy production. This was collected from the plant’s computerized maintenance management system, which tracks all hours charged against each operating unit. Emergency work was defined as “work that can’t wait.” While not a traditional measurement of system downtime, this does directly correlate with the amount of disruption production felt when the system wasn’t functioning properly.
2) Risk Classification, per Lilly’s Globally Integrated Process Safety Management (GIPSM), to satisfy HSE. The classification system has four possibilities: safety-critical operation, the top risk factor; high risk, which involves considerable environmental, health and fire risk; mechanical integrity, which is defined by the Occupational Safety and Health Administration; and no risk or “none of the above.”
3) Number of Deviations to the Process caused by equipment failures to satisfy QC. This targeted deviations that were the result of equipment reliability issues, not operator error or other non-equipment issues. The number of deviations was taken into account along with a level (1, 2 or 3) that pinpointed the deviation’s impact on product quality.
4) Cost of Reactive Work to satisfy finance. This was again taken from the CMMS, which tracks all budgetary charges against the operating units. This cost included all parts and labor associated with reactive work done on the system.
5) Process Engineer Input to satisfy engineering. The process engineer accountable for each system was surveyed on topics such as system age, hours of potential downtime generated by system failure and regulatory impact.
6) State of System’s Maintenance Plan, also to satisfy engineering. This was crafted to include four levels: Level 1, no routine maintenance conducted, which was deemed the most severe; Level 2, preventive maintenance exists on the system; Level 3, Periodic Qualification Evaluations (PQEs), designed to ensure the system is in a constant state of qualification and fit for use, are executed; and, Level 4, an RCM-based analysis was used on the system to generate a maintenance plan.
This data created a “crucial equipment” evaluation that looked at the 420 operating units and identified those with the potential to either stop production or cause an OSHA or EPA reportable incident.
“The way our structure is set up in our CMMS, and the way we control our incident database, operating unit was the best way to go,” says Lafever. “Sometimes, an operating unit is a piece of equipment. Most of the time, it’s a major piece of equipment, plus a whole lot more.”
For example, Redchanskiy says that operating unit EV1411 (evaporation process) includes “50 to 60 pieces of equipment and instrumentation, such as valves, heat exchangers and pumps.”
The evaluation cut the initial list by nearly 70 percent, from 420 operating units to 135.
The facts don’t lie
When it comes to determining the importance of a particular operating unit, it’s hard to argue with the facts. “This also has us looking outside the definition of production equipment. Before the analysis, people forgot to include waste tanks, air handlers, etc. They didn’t think about Tank 1099 in Control Room 2, where all of the floor drains go. The flush goes through the drains and into the tank. If that tank isn’t operational, we have to shut down our purification steps.”
Weights and Measures
To ensure proper prioritization of the remaining 135 units, the group decided to apply a weighting to each data source and performed a handful of sensitivity analyses.
“We didn’t feel that the six criteria were equally weighted,” says Swank. “We felt that safety and quality had a higher impact than, say, the amount of money we were spending on emergency-type work.”
A scoring system (zero to 3, with zero being the least impact and severity and 3 being the most impact and severity) was developed for each data set and applied to each operating unit. The breakdown was as follows:
Hours of Emergency Work (HEW): less than 15 hours (score of zero), 15 to less than 25 hours (one), 25 to less than 40 hours (two) and 40 or more hours (three).
Risk classification (RC): no HSE risk (zero), mechanical integrity system (one), high-risk process (two) and safety-critical operation (three).
Deviations (D): Four groupings were made by taking into account the levels and numbers of deviations. It was determined that a Level 2 deviation was equal to three times a Level 1 deviation and a Level 3 deviation was equal to two times a Level 2 deviation. This made a Level 1 deviation worth one point, a Level 2 deviation worth three points and a Level 3 deviation worth six points. This was applied to every deviation, and the points were totaled for each operating unit. As a result, the values were: two or less (zero), greater than two through five (one), greater than five through eight (two) and greater than eight (three).
Cost of reactive work (CRW): less than $5,000 (zero), $5,000 to $7,499 (one), $7,500 to $14,999 (two) and $15,000 or greater (three).
Process Engineer Input (PEI): minimal impact system (zero) and escalating up to maximum impact system (three).
State of System’s Maintenance Plan (SSMP): RCM analysis performed on the system (zero), PQE executed on a routine basis (one), PMs performed (two) and no routine maintenance performed (three).
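The scoring bands above can be sketched in a few lines of Python. This is only an illustration of the published thresholds, not the team’s actual spreadsheet: the function names, dictionary keys and the sample unit are hypothetical, and the cost band boundaries assume whole-dollar amounts.

```python
from bisect import bisect_right

# Lower bounds of score bands 1, 2 and 3, taken from the breakdown above.
HEW_BOUNDS = [15, 25, 40]            # hours of emergency work
CRW_BOUNDS = [5_000, 7_500, 15_000]  # dollars of reactive work

# Points per deviation level: Level 2 = 3 x Level 1, Level 3 = 2 x Level 2.
DEVIATION_POINTS = {1: 1, 2: 3, 3: 6}

# Category labels are hypothetical keys chosen for this sketch.
RISK_SCORES = {"none": 0, "mechanical integrity": 1,
               "high risk": 2, "safety critical": 3}
PLAN_SCORES = {"rcm": 0, "pqe": 1, "pm": 2, "none": 3}


def band_score(value, lower_bounds):
    """Return 0-3: how many band lower bounds the value meets or exceeds."""
    return bisect_right(lower_bounds, value)


def deviation_score(levels):
    """Score the deviation levels (1, 2 or 3) recorded for one operating unit."""
    points = sum(DEVIATION_POINTS[level] for level in levels)
    if points <= 2:
        return 0
    if points <= 5:
        return 1
    if points <= 8:
        return 2
    return 3


def score_unit(hew_hours, risk_class, deviation_levels,
               reactive_cost, pei, plan_state):
    """Return the six 0-3 scores for one operating unit."""
    return {
        "HEW": band_score(hew_hours, HEW_BOUNDS),
        "RC": RISK_SCORES[risk_class],
        "D": deviation_score(deviation_levels),
        "CRW": band_score(reactive_cost, CRW_BOUNDS),
        "PEI": pei,  # already 0-3 from the process engineer survey
        "SSMP": PLAN_SCORES[plan_state],
    }


# Hypothetical operating unit, invented for illustration only.
example = score_unit(30, "high risk", [1, 2], 8_200, 2, "pm")
print(example)
```

Keeping the thresholds in small tables rather than in nested conditions makes it easy to adjust a band when the analysis is repeated the following year.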
Scores were applied to the 135 operating units. The information was then loaded into a spreadsheet and various weights were applied to emphasize the importance of various data sets. The sensitivity analysis project included five different weighting scenarios to ensure a single data point wasn’t driving prioritization of a system.
The scenarios varied from a fairly even weight distribution (HEW, RC (the HSE criterion), D and SSMP, 20 percent each; CRW and PEI, 10 percent each) to the elimination of two categories (HEW, RC, D and CRW, 20 percent each; PEI and SSMP, zero percent). In the latter scenario, the remaining data sets were “true data” that changed according to the level of reliability demonstrated by the system. The scenarios are shown in Table 1.
Each scenario in the sensitivity analysis took each category’s score and multiplied it by that category’s weight in the scenario. The products were then summed for each operating unit. Table 2 shows an example from the first scenario.
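The weighted sum and the scenario-to-scenario comparison can be sketched as follows. The two weight sets are the ones described in the article (the risk classification category is keyed as RC); the other three scenarios are not spelled out in the text, so they are omitted. The unit names and their scores are hypothetical, and the helper names are assumptions made for this sketch.

```python
# Scenario weights as described in the text, written as fractions.
# As printed, the second set totals 80 percent; a common scale factor
# does not change the ranking within a scenario.
SCENARIOS = {
    "fairly even": {"HEW": 0.20, "RC": 0.20, "D": 0.20,
                    "SSMP": 0.20, "CRW": 0.10, "PEI": 0.10},
    "true data only": {"HEW": 0.20, "RC": 0.20, "D": 0.20,
                       "CRW": 0.20, "PEI": 0.0, "SSMP": 0.0},
}

# Hypothetical 0-3 scores for three made-up operating units.
UNIT_SCORES = {
    "UNIT_A": {"HEW": 3, "RC": 2, "D": 1, "CRW": 2, "PEI": 1, "SSMP": 3},
    "UNIT_B": {"HEW": 1, "RC": 3, "D": 3, "CRW": 0, "PEI": 2, "SSMP": 1},
    "UNIT_C": {"HEW": 0, "RC": 1, "D": 0, "CRW": 3, "PEI": 3, "SSMP": 2},
}


def weighted_score(scores, weights):
    """Multiply each category score by its weight and sum the products."""
    return sum(weights[category] * scores[category] for category in weights)


def rank_units(unit_scores, weights):
    """Return unit names ordered from highest to lowest weighted score."""
    return sorted(unit_scores,
                  key=lambda unit: weighted_score(unit_scores[unit], weights),
                  reverse=True)


# Compare rankings across scenarios; a unit whose position swings widely
# is likely being driven by a single data set.
rankings = {name: rank_units(UNIT_SCORES, weights)
            for name, weights in SCENARIOS.items()}
for name, order in rankings.items():
    print(name, order)
```

A unit that stays near the top under every weighting is a strong remediation candidate, which is the point of running several scenarios instead of one.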
When all five scenarios were completed, the final scores for the operating units were graphed and examined by the reliability team. Before determining final rankings and remediation plans, additional factors were considered. Was the unit in question recently replaced, or is replacement in the capital plan? Can remediation plans for this unit be applied to other units? What functional groups are needed for this remediation, and are they available? What remediation activities have been done in the past?
“For example, one of the centrifuges came out near the top of the list, but we knew another site was doing an RCM on a very similar system,” says Swank. “We didn’t need to duplicate the efforts.”
The final list of proposed remediation activities varied, depending on operating unit, from an in-depth RCM analysis to not remediating the system at all.
At that point, the team knew they had a comprehensive plan.
“It was a grind,” says Lafever. “We came up with a plan three times – it felt like 30 – and we kept saying, ‘This isn’t good enough.’ ‘How would quality control feel about this?’ ‘How would process engineering feel about this?’ We were very perplexed. We had to make sure we addressed all of the facets and all of the potential questions.”
Adds Swank, “The first stab at it was like, ‘Wow, this isn’t even close.’ It became obvious why it hadn’t been done before. It’s a lot of work. There’s also day-to-day stuff that keeps tearing you away from something like this. But we told ourselves that we were going to stick with it and get it done.”
In Writing
On Sept. 21, 2004, Swank formally presented the prioritization model to the plant’s lead team, which includes the site head and all functional managers. The exhausting seven months of work paid off.
“There was no bartering or discussion back and forth,” he says. “They were like, ‘This is great. Proceed.’”
Of course, a head nod and a hand wave only go so far. So, Swank had Lafever create a report that summed up the evaluation process and detailed how the crucial equipment analysis would be performed year over year. The report would serve as a template for future evaluations.
“I told the lead team members to sign it,” says Swank. “They did. I have it in writing. There was no hesitation. That goes to show you that the analysis we did was real strong.”
On Feb. 28, 2005, the crucial equipment analysis was formalized and approved by the lead team, and the remediation activities for the plant’s most vital operating units were included in the plant’s 2005 and 2006 business plans.
The template has made subsequent evaluations almost seamless.
“Last year, it was a breeze to do the analysis,” says Lafever.
The 2006 plan was completed in May.
Remedy’s Results
Maintenance and reliability department leaders at this Eli Lilly plant say they are currently unable to quantify the bottom-line, dollars-and-cents impact of the reliability prioritization initiative.
“The unfortunate part is we don’t always see the results of our work year after year,” says Lafever. “There might be a year delay because it takes time to work through the system.”
But that doesn’t mean there haven’t been benefits and results.
Redchanskiy and Doyle say there are inevitable cost savings just by re-evaluating how maintenance is performed on a given asset.
“In the analysis, we found we were spending a considerable amount of money on a couple of systems that didn’t have any reactive work,” says Redchanskiy. “We spent a ton on preventive maintenance. We overdid it on PMs. We changed the way we did maintenance on those systems.”
“The biggest shift from this is that for some systems, people can now say that it’s actually OK to run to failure,” says Doyle. “If that is our diagnosis and plan for that particular system, it’s OK. That’s a whole different philosophy for us.”
Swank says positives can be drawn from the plant’s productivity levels.
“The fact that we’ve met our inventory levels, and the fact that our business model is moving toward heightened productivity, shows we have already reached our first major milestone,” he says.
Lafever believes involvement with remediation projects has led to increased technical knowledge and increased uptime.
“When most maintenance and operations people come out of an RCM analysis, they could be classified as experts on that system,” he says. “Everybody has a better understanding of the groups’ individual functions and how they work together to perform their part of fixing a piece of equipment or identifying when there’s a problem with a piece of equipment. That interaction itself, I think, reduces the amount of emergency work that develops.”
The best indicator of success?
“It makes our upper management happy,” says Doyle.
“And that makes me happy,” says Swank.
All of these improvements could explain why BHI received the 2005 Making Medicine Award, which goes to the Eli Lilly plant that “best meets the needs of the business and embodies what manufacturing is supposed to be like at the company.”
Other Lilly plants are taking note of the prioritization initiative and are examining the feasibility of adoption. This has led to increased visibility corporate-wide for maintenance and reliability.
“Part of the good has been corporate understanding and awareness of the value that maintenance adds,” says Reimer. “It’s something that we definitely want to take advantage of.”
This team demonstrates daily that maintenance and reliability professionals can and do make a difference.