How to Implement Basic ITIL® Problem Management

If you’re in IT, fixing the same issues again and again may not mean you’re insane, but it’s definitely driving your customers crazy! Stop the Insanity with Problem Management!

Most organizations start looking at formal Problem Management when there’s a highly visible problem that they just can’t seem to fix. Customers are frustrated and complaining. IT’s credibility is on the line.

Take advantage of these hot issues by implementing Basic ITIL Problem Management!

The Basic ITIL Problem Management Process

Problem Management has one single goal: identify and resolve the underlying issues that cause Incidents. Unlike Incident Management, Problem Management is a systematic, methodical process where time to resolution is much less important than identifying and resolving the root cause.

Problem Management process steps:

  1. Identify a potential Problem
  2. Raise a Problem Management case
  3. Categorize and prioritize
  4. Systematic investigation (Root Cause Analysis)
  5. Identify change(s) needed to resolve and work through Change Management
  6. Verify problem has been resolved
  7. Close out problem

Let’s take a look at each of these steps in more detail.

Identify a Potential Problem

In the early stages of implementing Problem Management, there’s usually no shortage of problems to resolve, and the first case is usually the one right in front of you. Longer term, you’ll want to engage your customers and IT staff to help identify candidates.

Some organizations follow up all Major Incidents with a Problem Management case. Incident reporting is also a great source of possible cases.
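As a rough illustration of mining incident reporting for possible cases, here is a minimal Python sketch that counts incidents per affected item and flags the repeat offenders. The incident IDs, item names, and the threshold of three repeats are all made up for the example; your own reporting data and cutoff will differ.

```python
from collections import Counter

# Hypothetical incident records: (incident_id, affected_item).
incidents = [
    ("INC-101", "mail-server"),
    ("INC-102", "vpn-gateway"),
    ("INC-103", "mail-server"),
    ("INC-104", "mail-server"),
    ("INC-105", "print-queue"),
    ("INC-106", "vpn-gateway"),
]


def problem_candidates(records, min_repeats=3):
    """Return (item, count) pairs for items with at least min_repeats incidents."""
    counts = Counter(item for _, item in records)
    # most_common() sorts by count, highest first.
    return [(item, n) for item, n in counts.most_common() if n >= min_repeats]


print(problem_candidates(incidents))
```

Anything this flags is only a potential Problem. It still needs a case and a prioritization decision.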

Create a Problem Management Case

This is where you gather as much information about the problem as possible. You are essentially creating a proposal to allocate resources to the problem.

Include details like:

  • User contact information
  • Specifics about exactly what’s failing (applications, servers, networks, PCs)
  • Number of users affected and business impact
  • Description of the Incident(s) – failure details, dates, times. Include links to all known related Incident records.
  • Details of troubleshooting steps taken and the results. Include any changes implemented.
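If you track cases in a home-grown tool or a spreadsheet export, the details above could be captured in a simple record. The sketch below is one possible shape, not a standard ITIL schema. Every field name and sample value is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemCase:
    """Hypothetical Problem Management case record."""
    contact: str
    failing_items: list[str]
    users_affected: int
    business_impact: str
    description: str
    related_incidents: list[str] = field(default_factory=list)
    troubleshooting_log: list[str] = field(default_factory=list)

    def missing_details(self) -> list[str]:
        """Names of fields that are still empty (or zero)."""
        return [name for name, value in vars(self).items() if not value]


case = ProblemCase(
    contact="service.desk@example.com",
    failing_items=["mail-server"],
    users_affected=120,
    business_impact="Email outages during business hours",
    description="Server stops responding roughly once a day",
    related_incidents=["INC-101", "INC-103", "INC-104"],
)

print(case.missing_details())
```

A check like `missing_details()` is a cheap way to spot a thin proposal before anyone allocates resources to it.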

Categorize and Prioritize

One of the key differences between Root Cause Analysis as a capability and Problem Management as a process is prioritization: deciding which cases will be worked. Involve the business in the selection process. Don’t make assumptions.

Some criteria for selecting:

  • Business Impact of problem (pain point)
  • Business value if resolved
  • Problem complexity
  • Availability of required skill sets
  • Estimated time and cost to resolve

You may choose a problem that’s easier to resolve to show immediate value and build credibility for Problem Management as an ongoing process.
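One way to make the selection criteria comparable is a simple weighted score. The sketch below is only an illustration. The weights, the 1 to 5 rating scale, and the two candidate problems are invented, and in practice the business should help set both the weights and the ratings.

```python
# Hypothetical weights: positive criteria raise the score, negative ones lower it.
WEIGHTS = {
    "business_impact": 3,
    "business_value": 3,
    "complexity": -1,
    "skills_available": 2,
    "time_and_cost": -2,
}


def priority_score(ratings):
    """Weighted sum of 1-5 ratings, one rating per criterion."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


candidates = {
    "Daily mail-server crash": {
        "business_impact": 5, "business_value": 5, "complexity": 4,
        "skills_available": 3, "time_and_cost": 4,
    },
    "Slow print queue": {
        "business_impact": 2, "business_value": 2, "complexity": 1,
        "skills_available": 5, "time_and_cost": 1,
    },
}

# Highest score first.
ranked = sorted(candidates, key=lambda name: priority_score(candidates[name]), reverse=True)
for name in ranked:
    print(name, priority_score(candidates[name]))
```

Notice that the easy print queue problem scores fairly close to the painful mail-server one. A score like this is a conversation starter with the business, not a substitute for that conversation.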

Systematic Investigation (Root Cause Analysis)

If you have staff with Root Cause training and expertise, get them engaged. Leverage existing practices where possible.

Use whatever works, but make sure your approach is systematic and thorough.

Some common methods include:

  • Chronological
  • Kepner-Tregoe
  • Brainstorming
  • 5-Whys
  • Fault Isolation
  • Ishikawa diagrams

All have strengths and weaknesses. Which you use depends on the specifics of the problem, the environment, complexity, organizational culture, skills and knowledge. It’s common to use elements from several methods.

Keep the goal firmly in mind: Identify and resolve the underlying root cause(s). It’s very tempting to dig into a problem and then stop short as soon as someone says “Aha! I found it, I know what’s wrong!”

Problem Management uses carefully selected changes to gain critical new information, or eliminate possible causes. It’s a very data-driven process. (See What is Basic ITIL Problem Management)

How does that work?

Identify and Manage Changes

Remember how the scientific method works from college science?

The Scientific Method steps:

  • Make an observation
  • Ask a question
  • Form a hypothesis
  • Conduct an experiment
  • Accept/reject hypothesis

That’s pretty much what we want to do in Problem Management – identify exactly what we’re changing and why, what data to collect, and how that will help us verify our hypothesis.

These steps are managed through formal Change Management (See What is Basic Change Management).
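To keep each change tied to a hypothesis, it can help to write the experiment down before the change is raised. The record below is a hypothetical sketch, not part of ITIL or any tool. The field names are assumptions, and the sample values borrow the server example used later in this article, with an invented 30-day prediction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Experiment:
    """Hypothetical record linking one change to the hypothesis it tests."""
    hypothesis: str
    change: str
    data_to_collect: str
    prediction: str
    prediction_held: Optional[bool] = None  # None until results are in

    def verdict(self) -> str:
        if self.prediction_held is None:
            return "pending"
        return "hypothesis supported" if self.prediction_held else "hypothesis rejected"


exp = Experiment(
    hypothesis="Crashes are caused by too little installed memory",
    change="Add memory to the server",
    data_to_collect="Time between crashes; memory utilization trend",
    prediction="No crashes for 30 days",  # invented for the example
)
print(exp.verdict())   # results not in yet

exp.prediction_held = False  # the server went down again a week later
print(exp.verdict())
```

A rejected hypothesis is still progress: it eliminates a possible cause and points to the next experiment.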

Verify Problem Has Been Resolved and Close

Don’t be in a hurry to close a Problem case. Underlying problems are usually very complex. If they weren’t, they would have been fixed long ago!

A single step often changes the equation, but doesn’t eliminate the underlying cause.

Here’s a simple example.

A server crashes daily. It appears to be memory utilization related. Memory is added and the daily crashes go away. Problem solved, right?

Until a week later when it goes down again. What the Heck?

Added memory changed the crash cycle time, but didn’t eliminate the underlying problem (an application memory leak).
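The arithmetic behind this is simple. With a steady leak, the time to the next crash is just the free memory divided by the leak rate. The numbers below are invented to match the daily and weekly crash cycles in the example, and a real leak is rarely this constant.

```python
def hours_until_exhaustion(headroom_gb, leak_gb_per_hour):
    """Hours until a steady leak consumes the available memory headroom."""
    return headroom_gb / leak_gb_per_hour


leak_rate = 0.5  # GB per hour, assumed for illustration

before = hours_until_exhaustion(12, leak_rate)  # headroom before the upgrade
after = hours_until_exhaustion(84, leak_rate)   # headroom after adding memory

print(before / 24, "day(s) between crashes before the upgrade")
print(after / 24, "day(s) between crashes after the upgrade")
```

More headroom stretches the cycle from one day to seven, but the leak rate hasn’t changed, so the crash always comes back.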

Each time a change is implemented, the results are monitored. Carefully observe how the change affected the problem and use that information to form the next hypothesis.

Once you’ve verified that a change eliminated the problem, formally close the case. Communicate the results of your Problem Management effort. Celebrate success. Give the team credit for following a systematic process until the root cause was found and eliminated.

Conclusion

Problem Management tends to be one of those things that we all know we should be doing, but we never seem to have the time. (Of course, we always seem to have the time to fix the same things again and again.)

This approach to Problem Management is doable in any organization. It has huge value to the business, and can be started with very little effort.

Stop the Insanity and start eliminating Problems with Basic Problem Management!

Your turn: Let’s hear your experience with Problem Management