If you’re in IT, fixing the same issues again and again may not mean you’re insane, but it’s definitely driving your customers crazy! Stop the Insanity with Problem Management!
Most organizations start looking at formal Problem Management when there’s a highly visible problem that they just can’t seem to fix. Customers are frustrated and complaining. IT’s credibility is on the line.
Take advantage of these hot issues by implementing Basic ITIL Problem Management!
The Basic ITIL Problem Management Process
Problem Management has a single goal: identify and resolve the underlying issues that cause Incidents. Unlike Incident Management, Problem Management is a systematic, methodical process where time to resolution matters much less than identifying and resolving the root cause.
Problem Management process steps:
- Identify a potential Problem
- Raise a Problem Management case
- Categorize and prioritize
- Systematic investigation (Root Cause Analysis)
- Identify change(s) needed to resolve and work through Change Management
- Verify problem has been resolved
- Close out problem
Let’s take a look at each of these steps in more detail.
Identify a Potential Problem
In the early stages of implementing Problem Management, there’s usually no shortage of problems to be resolved. The first case is usually the one that’s right in front of you. Longer term, you’ll want to engage your customers and IT staff to help identify candidates.
Some organizations follow up all Major Incidents with a Problem Management case. Incident reporting is also a great source of possible cases.
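As a rough illustration of mining Incident reports for candidates, the sketch below counts how often the same service and symptom recur. The record layout, the sample data, and the threshold of three are all assumptions made for the example, not part of ITIL.

```python
from collections import Counter

# Hypothetical Incident records: (incident_id, affected_service, symptom).
incidents = [
    ("INC-1", "mail-server", "crash"),
    ("INC-2", "file-share", "slow"),
    ("INC-3", "mail-server", "crash"),
    ("INC-4", "mail-server", "crash"),
    ("INC-5", "vpn", "login failure"),
]

def recurring_candidates(records, min_count=3):
    """Return ((service, symptom), count) pairs seen at least min_count times."""
    counts = Counter((service, symptom) for _, service, symptom in records)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]

candidates = recurring_candidates(incidents)
for (service, symptom), n in candidates:
    print(f"Possible Problem: {service} / {symptom} ({n} Incidents)")
```

A report like this only suggests where to look; whether a recurring Incident deserves a case is still decided in the prioritization step.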
Create a Problem Management Case
This is where you gather as much information about the problem as possible. You are essentially creating a proposal to allocate resources to the problem.
Include details like:
- User contact information
- Specifics about exactly what’s failing (applications, servers, networks, PCs)
- Number of users affected and business impact
- Description of the Incident(s) – failure details, dates, times. Include links to all known related Incident records.
- Details of troubleshooting steps taken and the results. Include any changes implemented.
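If your ticketing tool doesn’t already enforce these fields, the details above could be captured in a simple record like the one sketched here. The class and field names are hypothetical, chosen only to mirror the list.

```python
from dataclasses import dataclass, field, fields
from typing import List

@dataclass
class ProblemCase:
    """Hypothetical Problem record mirroring the details listed above."""
    contact: str
    failing_components: List[str]
    users_affected: int
    business_impact: str
    description: str
    related_incidents: List[str] = field(default_factory=list)
    troubleshooting_log: List[str] = field(default_factory=list)

    def missing_details(self) -> List[str]:
        """Names of fields that are still empty, so gaps show up before review."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

case = ProblemCase(
    contact="jane@example.com",
    failing_components=["mail-server"],
    users_affected=120,
    business_impact="Staff cannot send or receive email during outages",
    description="Server crashes roughly once a day",
    related_incidents=["INC-1", "INC-3", "INC-4"],
)
print(case.missing_details())
```

Here the troubleshooting history has not been filled in yet, so the check flags it before the case goes forward as a proposal.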
Categorize and Prioritize
One of the key differences between Root Cause Analysis as a capability and Problem Management as a process is that Problem Management prioritizes which cases will be worked. Involve the business in the selection process. Don’t make assumptions.
Some criteria for selecting:
- Business Impact of problem (pain point)
- Business value if resolved
- Problem complexity
- Availability of required skill sets
- Estimated time and cost to resolve
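One way to make the selection criteria above comparable is a simple weighted score. This is only a sketch: the 1-to-5 ratings, the weights, and the two sample cases are assumptions, and the business should agree on the real values.

```python
# Hypothetical weights: positive criteria raise priority, negative ones lower it.
WEIGHTS = {
    "business_impact": 3,
    "business_value": 3,
    "complexity": -1,
    "skills_available": 2,
    "cost": -2,
}

def priority_score(ratings):
    """Weighted sum of 1-5 ratings, one rating per selection criterion."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)

cases = {
    "Daily mail-server crash": {
        "business_impact": 5, "business_value": 4,
        "complexity": 3, "skills_available": 4, "cost": 2,
    },
    "Slow file share": {
        "business_impact": 2, "business_value": 2,
        "complexity": 1, "skills_available": 5, "cost": 1,
    },
}

ranked = sorted(cases, key=lambda name: priority_score(cases[name]), reverse=True)
for name in ranked:
    print(name, priority_score(cases[name]))
```

The score is a conversation starter for the business, not a replacement for its judgment.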
Systematic Investigation (Root Cause Analysis)
If you have staff with Root Cause training and expertise, get them engaged. Leverage existing practices where possible.
Use whatever works, but make sure your approach is systematic and thorough.
Some common methods include:
- Kepner-Tregoe
- Fault Isolation
- Ishikawa diagrams
All have strengths and weaknesses. Which you use depends on the specifics of the problem, the environment, complexity, organizational culture, skills and knowledge. It’s common to use elements from several methods.
Keep the goal firmly in mind: identify and resolve the underlying root cause(s). It’s very tempting to dig into a problem and then stop short as soon as someone says “Aha! I found it, I know what’s wrong!”
Problem Management uses carefully selected changes to gain critical new information, or eliminate possible causes. It’s a very data-driven process. (See What is Basic ITIL Problem Management)
How does that work?
Identify and Manage Changes
Remember how the scientific method works from college science?
The Scientific Method steps:
- Make an observation
- Ask a question
- Form a hypothesis
- Conduct an experiment
- Accept/reject hypothesis
That’s pretty much what we want to do in Problem Management – identify exactly what we’re changing and why, what data to collect, and how that will help us verify our hypothesis.
These steps are managed through formal Change Management (See What is Basic Change Management).
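To make the hypothesis-and-experiment idea concrete, here is a minimal sketch of how one change might be written up before it goes to Change Management. The class, its fields, and the sample values are hypothetical; the point is that the prediction is stated before the change is made.

```python
from dataclasses import dataclass

@dataclass
class ChangeExperiment:
    """Hypothetical record of one change treated as an experiment."""
    hypothesis: str      # what we believe is causing the problem
    change: str          # exactly what we are changing
    metric: str          # what data we will collect
    expected_max: float  # the hypothesis predicts the metric stays at or below this

    def evaluate(self, observed: float) -> str:
        """Accept the hypothesis only if the observed data matches the prediction."""
        return "accept" if observed <= self.expected_max else "reject"

experiment = ChangeExperiment(
    hypothesis="Crashes are caused by too little memory",
    change="Add memory to the server",
    metric="crashes in the two weeks after the change",
    expected_max=0,
)
result = experiment.evaluate(observed=1)
print(result)
```

A rejected hypothesis is still progress: it rules out one cause and feeds the next round of investigation.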
Verify Problem Has Been Resolved and Close
Don’t be in a hurry to close a Problem case. Underlying problems are usually very complex. If they weren’t, they would have been fixed long ago!
A single step often changes the equation, but doesn’t eliminate the underlying cause.
A server crashes daily. It appears to be memory utilization related. Memory is added and the daily crashes go away. Problem solved, right?
Until a week later, when it goes down again. What the heck?
The added memory changed the crash cycle time, but didn’t eliminate the underlying problem (an application memory leak).
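The arithmetic behind this is simple if we assume, purely for illustration, a steady leak: time to failure is free memory divided by leak rate. The figures below are made up to match a daily crash that becomes a weekly one.

```python
def hours_until_exhaustion(free_memory_mb: float, leak_mb_per_hour: float) -> float:
    """Hours until a steady leak consumes all free memory (simplified model)."""
    if leak_mb_per_hour <= 0:
        raise ValueError("leak rate must be positive")
    return free_memory_mb / leak_mb_per_hour

LEAK_MB_PER_HOUR = 100  # hypothetical leak rate; unchanged by adding memory

before = hours_until_exhaustion(2_400, LEAK_MB_PER_HOUR)   # original free memory
after = hours_until_exhaustion(16_800, LEAK_MB_PER_HOUR)   # after the upgrade

print(f"Before upgrade: crash about every {before:.0f} hours")
print(f"After upgrade: crash about every {after / 24:.0f} days")
```

The leak rate never changed, so the crash was only postponed.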
Each time a change is implemented, monitor the results. Carefully observe how the change affected the problem and use that information to form the next hypothesis.
Once you’ve verified that a change eliminated the problem, formally close the case. Communicate the results of your Problem Management effort. Celebrate success. Give the team credit for following a systematic process until the root cause was found and eliminated.
Problem Management tends to be one of those things that we all know we should be doing, but we never seem to have the time. (Of course, we always seem to have the time to fix the same things again and again.)
This approach to Problem Management is doable in any organization. It has huge value to the business, and can be started with very little effort.
Stop the Insanity and start eliminating Problems with Basic Problem Management!
Your turn: Let’s hear your experience with Problem Management