Friday, November 30, 2012

Ignoring Alarms and Feeling the Fire

I find myself in the Middle East this week, and while I was having lunch today the fire alarm went off in the hotel. The interesting thing is that no one moved or showed much concern, for that matter. One guy in the distance said, "If I do not see fire, smell fire, or feel fire then I am not moving." I have seen similar behavior all over the world. This all got me thinking: how often do our assets give us alarms that we choose not to hear? How many of us ignore the early points on the P-F curve, only to wait for the sight of fire or the smell of smoke? Once we smell the smoke or see the fire, it is likely too late to execute a properly planned job, and because we may not be able to get the repair done quickly enough, the chances of a catastrophic failure are very high. When equipment suffers a catastrophic failure you are subject to higher cost from every angle: more parts cost due to additional damaged components, more shipping cost due to expedited spare parts, more labor cost due to the overtime required to complete the unplanned repair, more contractor cost for specialty repairs and support, and finally more operations losses that can cause the total bill for the repair to skyrocket. The most recent statistic that I have found states that the emergent repair will cost you nearly 5 times as much as the planned and properly executed job. So if we look for the early alarms and heed their warning, we can lower maintenance cost and increase production substantially. We can use operational data like amp draw and differential pressure, or we can use condition-based tools like vibration, ultrasound, and IR to give us these early indications. The key is to set up the alarms correctly so that we can trust the technology and not just sit and listen to it sound.
What early signals might you or your operators be ignoring that, if caught, could reduce repair cost and ancillary damage and increase production reliability?
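
One way to picture "correctly set up" alarms is a two-level scheme: an early alert that leaves time to plan the job, and a later danger level that is the equivalent of smelling smoke. The sketch below is only an illustration of that idea. The names AlarmLimits, classify and lead_time, the limit values, and the vibration readings are all made up for the example and should be replaced with limits that suit your own equipment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AlarmLimits:
    """Hypothetical two-level limits for one measurement point."""
    alert: float   # early warning: time to plan and schedule the repair
    danger: float  # late warning: failure is likely close


def classify(reading: float, limits: AlarmLimits) -> str:
    """Return 'normal', 'alert' or 'danger' for a single reading."""
    if reading >= limits.danger:
        return "danger"
    if reading >= limits.alert:
        return "alert"
    return "normal"


def lead_time(readings: Sequence[float], limits: AlarmLimits) -> Optional[int]:
    """Count the readings between the first alert and the first danger.

    Returns None if the trend never reaches both levels.
    """
    states = [classify(r, limits) for r in readings]
    first_alert = next((i for i, s in enumerate(states) if s != "normal"), None)
    first_danger = next((i for i, s in enumerate(states) if s == "danger"), None)
    if first_alert is None or first_danger is None:
        return None
    return first_danger - first_alert


# Illustrative vibration trend in mm/s (invented numbers, not field data).
limits = AlarmLimits(alert=4.5, danger=7.1)
trend = [2.1, 2.3, 2.8, 3.9, 4.7, 5.6, 7.4]
states = [classify(r, limits) for r in trend]
planning_window = lead_time(trend, limits)
```

In this made-up trend the alert sounds two readings before the danger level is reached. Those two readings are the window for planning the job, and waiting for the danger level throws that window away.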

Listen for the alarms; do not wait for the fire,
because at that point the situation is dire.

Saturday, November 24, 2012

Thanksgiving Leftovers and Lower Cost Reliability

What if, after the Thanksgiving feast, you just stuck the leftover food in your china cabinet until the next time you wanted it? This sounds crazy, yet many times after plant outages, when kitted work is completed, the leftovers are crammed into a tool box or under a desk in an office. A tool box is no more a suitable storage place for leftover parts than a china cabinet is for leftover turkey and potato salad. When you are done with the turkey it goes into the refrigerator, and when you have leftover parts they should be returned to the maintenance storeroom, where their quality can be verified and they can be preserved in a controlled environment.
The questions are:
Do you have a "parts return to stores" process?
Does it make returns easy to prevent "squirreling" of parts on the shop floor?
Does it ensure quality parts are kept in stores and damaged parts are disposed of?
In the end we want to prevent spoilage and ensure that the part and the turkey provide maximum health and reliability and not infant mortality. This way of thinking will get us a lower cost level of reliability now.

Monday, November 19, 2012

5 Reasons to be Thankful for Reliability

Many of you are advocating reliability improvement within your facility on a daily basis. If you want to increase your buy-in from the general population, then one of the best things you can do is tell them what's in it for them.
Here are 5 reasons to be thankful for reliability if you have it and why you might want it if you don't:
5. Better relationships with co-workers
If you are not fighting about lack of access to the equipment, quality problems, over-budget maintenance cost or not meeting customer deadlines, then your coworkers act less like evil trolls and more like people you might want to play golf with on the weekend.
4. Less stress about job stability
When a facility runs reliably then it can be a star within the division as opposed to the whipping boy in last place on the cost curve waiting for the other shoe to drop. Reliable plants are not immune to closures and cutbacks but statistically they outlive their peers.
3. Afternoons with your kids
When you enjoy employment in a reliable facility there is very little forced overtime. This is thanks to great job plans, low reactive behavior and schedules that people adhere to. This lack of forced overtime means you can make more time to spend with your kids playing ball, reading and preparing for the future.
2. Holidays with family
When you have reliable equipment you will have fewer calls to deal with plant issues during your family time because the plant is performing, the outages are executed efficiently, and the staff is trained to deal with the minor issues that arise from time to time. This all means you can eat turkey with your parents, family and friends without pesky plant problems calling you away.
1. Safety at work
A reliable plant is a safe plant. Both Ron Moore in the book "Making Common Sense Common Practice" and a study that is currently being done at the University of Tennessee have demonstrated the statistics to prove reliability improves safety. But think about it logically for a minute. If you can plan and schedule the repair with the right tools, the right parts, and the right people with the right expertise, then surely it must be safer than an emergency repair done on forced overtime after a long, exhausting day, with tools that don't work quite right and parts that don't fit quite right, all while an operations manager continuously rushes you to get it running one way or another.

Tell us why you are thankful for your reliability in the comments below.

Monday, November 12, 2012

Solving the Crime of Unreliability: Elements of a Process for RCA

I was recently watching a popular crime drama on TV and I noticed that the process they follow when solving a crime is very similar to the one I follow when solving a reliability problem in a facility.
The first thing the detectives do is identify the questions they would like answered and then collect all the evidence they can to begin to answer those questions. Then they build a timeline to understand where things fit around the crime. Then they combine the evidence and the timeline and identify the motives and finally the suspects. I have oversimplified all they do, but the core process steps are still there.
Solving the Crime of Unreliability in a facility starts with identification of the questions and the evidence to be collected. Then just like the investigator the next step is the collection of said evidence. I suggest folks use collection kits to help categorize and capture the data in its entirety. There is a blog here about the kits I use and what they contain.
Early on I skipped the element of time and did not complete the timeline or sequence of events prior to the use of other tools. Over time I learned this was a mistake in many cases and caused me to miss details. In two recent RCA investigations that were completed by others and reviewed and refined later by me, we discovered whole new causal chains and missed causes related to rebuild and maintenance execution that were not identified in the initial investigation. This was due to the fact that the original RCA team focused on their preconceived notions and did not look at what happened just prior to the failure in the sequence of events. Completing the sequence of events opened their eyes. It will do two things for you: first, it identifies other potential causes, and second, it clarifies the causes that you have already identified. Just as the crime scene investigators then take the timelines and evidence and begin to look at the relationships, I do the same. I choose a tool like fault tree or logic tree, among others, to attack the evidence in the sequence and to draw the connections and the causal chains.
If you find the crime of unreliability has been committed in your facility, then you may want to make sure you have included each of these steps in your RCA process.
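
The sequence of events step can be as simple as putting every piece of evidence in time order and then looking hard at what happened just before the failure. The sketch below is only an illustration of that step. The Event fields, the function names, the three-day window, and the sample records are all invented for the example and are not taken from a real investigation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    """One piece of time-stamped evidence (hypothetical structure)."""
    when: datetime
    source: str
    description: str


def build_timeline(events: Iterable[Event]) -> List[Event]:
    """Order the evidence by time, whatever order it was collected in."""
    return sorted(events, key=lambda e: e.when)


def events_before(timeline: List[Event], failure_time: datetime,
                  window: timedelta) -> List[Event]:
    """Return the events that fall inside the window leading up to the failure."""
    start = failure_time - window
    return [e for e in timeline if start <= e.when < failure_time]


# Invented sample evidence, collected out of order.
failure = datetime(2012, 11, 1, 14, 0)
evidence = [
    Event(datetime(2012, 11, 1, 10, 30), "operator log", "unusual noise reported"),
    Event(datetime(2012, 10, 15, 8, 0), "vibration route", "reading within limits"),
    Event(datetime(2012, 10, 30, 9, 0), "work order", "pump rebuild completed"),
]

timeline = build_timeline(evidence)
recent = events_before(timeline, failure, timedelta(days=3))
recent_sources = [e.source for e in recent]
```

In this invented case the rebuild shows up two days before the failure, which is exactly the kind of maintenance execution detail that gets missed when the team starts from its preconceived notions instead of the timeline.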

Monday, November 5, 2012

Force versus Finesse

Forcing Misalignment
This weekend I was watching my daughter assemble a toy robot. She is young and does not yet understand the concept of finesse. She is more of the brute force school of assembly. In her mind, if it doesn't go together then what you need to do is push harder. If you need to escalate beyond that, then bang it on something. If that doesn't work then it is obviously broken and therefore can't be assembled. You can see her in the photo, misaligned and still pushing harder.
Alignment Finesse
As I watched this I realized she is not the only one who does this on a regular basis. Many times we get a process or method in our mind and we skip the finesse and go straight for brute force. In our world it may look a bit different, but it could sound like this: "Boss says I have got to get this new process rolled out... I told our folks to just follow the new process... They didn't follow the process so I am going to yell and scream about following the process... If that doesn't work then I will let them know that they are just not going to work out and I will look for someone who will follow the process." OK, maybe a bit simplified and exaggerated, but my money is on it not being that far off.
The real problem may be that the method being installed or implemented does not fit the application, or the process is too different culturally for folks to grasp. Digging deeper, we find that we did not take the time to look at the situation and understand all of the moving parts and how they work together. It could be that we missed an emotional, political, or rational issue with the change, and until we look at each of these areas and address any underlying cause, adoption may escape us.
If you are in that situation, take a look at the three areas shown above left. Think about what could go wrong in each of the dimensions and how you can proactively address that area using finesse in place of sheer force.