PROBLEM MANAGEMENT PROCESS
Information Services
April 1998
Version 1.02
TABLE OF CONTENTS
i. PREFACE
1.0 INTRODUCTION
1.1 AUDIENCE
1.3 OBJECTIVES
1.4 BENEFITS
1.5 DEFINITIONS
2.0 PROCESS
2.1 PROCESS SUMMARY
2.2 PROBLEM MANAGEMENT PROCESS FLOWCHART
2.3.1 PROCEDURAL FIX
2.3.2 NON-PROCEDURAL FIX
3.0 ROLES AND RESPONSIBILITIES
3.1 INTRODUCTION
3.2 CUSTOMER
3.5.1 PROBLEM FACILITATOR
3.6 PROBLEM MANAGER
4.0 PROBLEM MANAGEMENT STEWARDSHIP
4.1 INTRODUCTION
4.3.1 FUNCTIONAL TEAM STEWARDSHIP (FTS) WEEKLY REVIEW
4.3.2 SERVICE LEVEL AGREEMENT (SLA) MONTHLY REVIEW
6.0 PRIORITIES
7.0 ESCALATION
APPENDIX A
APPENDIX B
APPENDIX C
KEPNER-TREGOE PROBLEM SOLVING PROCESSES
Figure #1 Process Flow
Figure #2 Problem Resolution Roles
Figure #3 Stewardship Roles
Figure #4 Stewardship Meetings
Information Services (IS) is responsible for delivering reliable, high quality service to Syncrude. A critical step in meeting this objective is the use of an effective process to resolve and eliminate problems identified by customers of IS. A problem is defined as "anything that prevents a customer from meeting his or her commitments."
Problem Management is the process that ensures problems will be resolved in a consistent and timely fashion.
Problem Management can only be successful with the 100% commitment of all parties. Therefore, all IS staff, contractors and customer organizations are expected to be active and constructive participants.
This document presents the problem management process flow. While the individual steps and process flow are not optional, the details of each step are guiding principles and are not necessarily an all-inclusive (or exclusive) list of required steps.
The principles of Problem Management involve:
· restoring business functionality
· defining problems and their resolutions
· providing data for review purposes in stewardship meetings
· providing input for design changes to the Problem Management process
This document is intended as a guide for anyone at Syncrude who needs to resolve problems pertaining to computer applications and/or computer technology, where the problems prevent the customers of the application or technology from meeting their business commitments.
This document outlines a process that identifies the accountability and escalation process used to identify and resolve problems. Problem-solving tools that can be used to analyze these concerns are also identified.
The Problem Management process provides a mechanism to help steward to the expectations of all parties involved while not obstructing the continuance of business. It also provides a consistent methodology for problem resolution.
The process will be reviewed on a regular basis to continuously improve it. Other Syncrude systems will be reviewed to determine if Problem Management can be incorporated and/or consolidated with them.
1.3 OBJECTIVES
The overall objective of this process is to ensure that problems are being dealt with by the appropriate individuals and/or teams.
Specific objectives include:
· To accurately record all problems in a timely, responsive manner.
· To quantify the number and nature of problems that exist.
· To shorten problem determination and resolution time by creating a knowledge base to draw on.
· To categorize problems into levels of criticality to help determine the number of resources required to resolve them.
· To strive toward eliminating all problems.
1.4 BENEFITS
The benefits of Syncrude using a consistent methodology to resolve technology related problems are:
· To reduce and minimize the impact of problems on customers and Syncrude.
· To provide a consistent methodology for problem resolution.
· To ensure all parties affected or involved are kept informed of the progress and have input at each stage of problem evaluation and resolution, as required.
· To provide a technical or business resolution whenever possible to support continuance of business.
· To provide a mechanism to:
  1. steward to problem-related expectations
  2. measure the effectiveness of the problem resolution process
  3. provide data to identify areas for improvement
· To provide a descriptive database that can be analyzed to determine trends.
· To identify recurring problems and assign ownership to the appropriate individual and/or team for resolution.
Throughout this document there are certain terms and acronyms used. The following is a list of these terms and their definition as they pertain to the Problem Management process.
Expectations:
Mutual agreement between customer and IS support that identified problems will be resolved in a timely manner. A Service Level Agreement (SLA) between the customer and IS could define customer expectations. Past experience with problem resolution may also influence both the customer expectations and the SLA.
KT Process:
Refers to the Kepner/Tregoe Analytical Trouble Shooting and Problem Analysis & Decision Making methodologies.
Problem:
Anything related to computer applications and/or technology that prevents the customer(s) from meeting their commitments.
Service Level Agreement (SLA):
A Service Level Agreement is one of the mechanisms for defining and stewarding to customer expectations.
Stewardship:
The process of comparing what has been done against what was to be done. It also indicates deviations from the process and allows for monitoring and analyzing trends.
Review the process flowchart (See Figure 1) to see a graphical depiction of the process.
1. The person who experiences the problem reports it to Level One support (Help Desk, LAN Administrators) or any IS person who is called directly.
2. The problem is analyzed by Level One support to determine if it can easily be resolved using procedural resolutions. If it can be, the problem is fixed. Otherwise, it is escalated to Level Two for resolution.
3. The problem is analyzed by Level Two support to determine if it can be resolved within their area of expertise (either individually or with an informal team). If it can be, the problem is fixed. Otherwise, a decision is made whether to pull together a Level 3 team.
Communication to the customer who identified the problem, the Help Desk and the LAN Administrator is very important throughout the process.
The problem can at any time be escalated to the highest level of support if the seriousness, urgency and growth of the problem indicate this is necessary to resolve the problem.
2.2 PROBLEM MANAGEMENT PROCESS FLOWCHART

Customer
1. Identify a computer application or technology infrastructure problem.
2. Contact Level One Support to report problem. Provides:
a) description of problem
b) description of what customer was doing when problem occurred
c) personal customer information
  · cost center
  · employee number
  · phone number
  · mail drop
  · customer's physical location
  · customer's PC system number
  · activity code (if applicable)
d) alternate customer contact for contractor follow-up to Level One Support
Level One Support
1. Open Service Ticket by logging problem to Help Desk Application.
2. Determine nature of problem:
a) If it is procedural (or is easily fixed) and can be fixed:
i) fix problem
ii) communicate resolution to customer.
iii) close out Service Ticket
b) If it is not procedural (but is easily fixed) and can be fixed:
i) do same as in a) above
ii) identify if a procedure is required or enter resolution in the knowledge database
c) If it is not procedural (and is not easily fixed) and cannot be fixed:
i) communicate the status of the problem to the customer
ii) implement business solution (if available)
iii) assign the problem to the appropriate person (Level 2 Support)
iv) contact assigned support person
v) update the problem with the relevant information (Knowledge Base)
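The Level One decision above can be sketched in code. This is a minimal illustration only: the function name `level_one_actions` and its two boolean parameters are hypothetical, and treating every problem Level One cannot fix (procedural or not) as an escalation is an assumption, since the steps above do not cover that case explicitly.

```python
from typing import List


def level_one_actions(is_procedural: bool, can_fix: bool) -> List[str]:
    """Return the Level One Support steps for a newly logged problem."""
    if can_fix:
        # Cases a) and b): fix, tell the customer, close the ticket.
        steps = [
            "fix problem",
            "communicate resolution to customer",
            "close out Service Ticket",
        ]
        if not is_procedural:
            # Case b): also capture the fix so it is available next time.
            steps.append(
                "identify if a procedure is required or enter resolution "
                "in the knowledge database"
            )
        return steps
    # Case c), and (by assumption) any other problem Level One cannot fix.
    return [
        "communicate the status of the problem to the customer",
        "implement business solution (if available)",
        "assign the problem to Level Two Support",
        "contact assigned support person",
        "update the problem with the relevant information",
    ]


for step in level_one_actions(is_procedural=False, can_fix=False):
    print(step)
```

The Service Ticket is assumed to have been opened already (step 1) before this function is consulted.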
Level Two Support
1. Acknowledge ownership of problem from Level One Support.
2. Evaluate problem:
a) If problem can be fixed quickly by one individual:
i) fix problem
ii) update Knowledge Base
iii) communicate resolution to Level One Support (as requested)
iv) communicate resolution to customer
v) close out Service Ticket
b) If problem cannot be fixed quickly:
i) communicate status to Level One Support (as requested)
ii) communicate status to customer
iii) implement interim business solution
iv) resolve problem
v) update Knowledge Base
vi) communicate resolution to Level One Support (as requested)
vii) communicate resolution to customer
viii) close Service Ticket
c) If problem cannot be fixed quickly and requires an informal team to resolve:
i) communicate status to Level One Support (as requested)
ii) communicate status to customer
iii) assemble informal team
Informal Team:
iv) implement interim business solution
v) resolve problem
vi) update Knowledge Base
vii) communicate resolution to Level One Support (as requested)
viii) communicate resolution to customer
ix) close Service Ticket
d) If Level Two Support cannot resolve the problem:
Level Two Support:
i) communicate status to Level One Support (as requested)
ii) communicate status to customer
iii) escalate problem to Level Three Support
iv) update Knowledge Base
Level Three Support:
v) assemble formal team, assign Leader
Formal Team:
vi) implement interim business solution
vii) resolve problem (using formal analytical processes with KT facilitator)
viii) update Knowledge Base
ix) communicate resolution to Level One Support (as requested)
x) communicate resolution to customer
xi) close Service Ticket
Level 2/3 Support
3. Initiate Change Management if applicable. (refer to Change Management Process Document)
Level Three Support
4. Prepare formal report on problem resolution if applicable.
Problem Coordinator
5. File report in appropriate location as identified by Team.
3.0 ROLES AND RESPONSIBILITIES
This section addresses accountability by identifying the various roles involved in the Problem Management Process and the responsibilities of those roles. An individual may fulfill one or more of these roles in each problem resolution process. (See Figure #2)

Figure 2 - Roles
A customer is any individual who identifies a problem and who may also be impacted by the problem, e.g., Syncrude Employees and Contractors. A customer starts the Problem Management process.
Customers are responsible for identifying problems to their LAN Administrator or to the Help Desk. Information that needs to be communicated includes the following:
· Description of the problem including pertinent information such as error messages to assist in the troubleshooting process
· Personal customer information, e.g., employee ID, cost center
· Physical Location of the customer reporting the problem
· Customer's PC System Number when appropriate
· Alternate Contact for follow-up
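The information listed above could be captured in a simple record when a problem is logged. The following is a minimal sketch only: the class name `ProblemReport`, its field names, the `missing_fields` helper and the sample values are all hypothetical and do not describe the Help Desk application.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ProblemReport:
    """Hypothetical record of what a customer reports to Level One Support."""
    description: str            # include error messages where available
    employee_id: str            # personal customer information
    cost_center: str            # personal customer information
    physical_location: str      # where the reporting customer is located
    alternate_contact: str      # for follow-up
    pc_system_number: Optional[str] = None  # only "when appropriate"


def missing_fields(report: ProblemReport) -> List[str]:
    """Return the names of required fields that were left blank."""
    optional = {"pc_system_number"}
    return [
        f.name
        for f in fields(report)
        if f.name not in optional and not str(getattr(report, f.name)).strip()
    ]


# Illustrative values only.
report = ProblemReport(
    description="Printer reports 'out of memory' on every job",
    employee_id="12345",
    cost_center="",
    physical_location="Building 3, Room 12",
    alternate_contact="J. Smith",
)
print(missing_fields(report))  # ['cost_center']
```

A check of this kind would let Level One Support ask for missing details while the customer is still on the line.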
Level One Support is the individual to whom the problem is initially reported, e.g., Help Desk, LAN Administrator, System Administrator, IS personnel.
A Level One support person is responsible for ensuring all appropriate data is captured and logged. This individual determines whether the problem can be resolved within their expertise, using documented procedures or general knowledge, or whether it requires routing to a higher support level. If escalation to Level Two and/or Three is required, this person ensures it is done promptly.
The Level One support person is also responsible for:
· Identifying expectations of the customer.
· Resolving problems using documented procedures.
· Communicating problem resolution to the customer.
· Closing of the Service Ticket upon completion of the problem resolution.
Level Two Support consists of individuals assigned to support and maintain computer applications and technology infrastructure (e.g., Communications, Desktop Services, MultiTech, Wiltel). At this level, support may be an individual or an informal team.
Level Two Support is responsible for:
· Performing a more in-depth problem analysis and attempting to fix the problem within the level of their technical expertise.
· Communicating to the Help Desk and/or customer
· Documenting ongoing update notes on open calls
· Logging resolutions/work done (close the Service Ticket) which will maintain the viability of a knowledge base.
If escalation is required to Level Three, Level Two Support ensures it is done promptly and that the necessary communication takes place (e.g., notification to Supervisor, Help Desk, etc.).
Level Three Support is provided by a formal, multi-skilled/cross technology team dedicated to resolving the problem. This team may also contain resources from the Customer Business area and/or vendors.
This team will ensure:
· A formal KT process is followed to identify and resolve the problem.
· That status updates, based on criticality, are directed to the appropriate management personnel on an ongoing basis.
· That "whatever it takes" to get the problem resolved is done.
· That an appointed member of the team will close the Service Ticket when the problem is resolved and/or a fix implemented.
· That appropriate documentation outlining the resolution is recorded (either on the service ticket or by formal report).
The Problem Facilitator is a resource, external to the problem, asked by the Problem Resolution Team Leader to manage the KT process. This individual is responsible for keeping the Problem Resolution Team on track and should not be involved in discussions of the technical details. (See Appendix A for list)
3.5.2 Problem Resolution Team Leader
The Problem Resolution Team Leader is the individual assigned to lead the problem resolution project from a technical perspective.
This individual is responsible for:
· Escalating any problem or set of problems that the team has determined to be outside their range of expertise.
· Communicating to the appropriate individuals as determined by the seriousness, urgency, and growth of the situation.
· Coordinating the efforts of personnel providing a work around while the Functional Teams carry on with their process.
· Providing status updates.
· Projecting resolution targets.
· Providing a detailed report to the Problem Manager on the entire process.
· Identifying and ensuring that outstanding issues are assigned to the appropriate individuals and/or teams prior to the team disbanding.
The Problem Manager is held accountable for the effectiveness and efficiency of the problem management process. The Problem Manager has the authority to correct process deficiencies when they are recognized or identified in the Problem Management Process semi-annual review. Changes to the process will be communicated by the Problem Manager to the users of this process.
The Problem Manager duties include:
· organizing the semi-annual review meetings
· ensuring minutes are taken and distributed
· updating and follow up of action log items
A Problem Coordinator is an individual nominated by, or selected from each Functional Team. (See Appendix B for list)
This individual will:
· Provide stewardship at a Departmental level on the Functional Team's problems and resolution of those problems.
· Problems that are unresolved or outstanding will be presented for review:
  Functional - Weekly Review
  IS (Departmental) - Monthly Review
· Generate reports for stewardship meetings (Functional/Departmental).
· Report Near Misses and ensure that a Loss Control Report is issued when appropriate.
· Ensure that the results of resolutions and resolution attempts are communicated to the Help Desk and Customer.
· Ensure problem resolution reports from Level Two and Three are reviewed and filed appropriately.
· Participate when requested on Level Three problem reviews.
· Participate in the Problem Management Process Review (Semi-Annually). Log concerns and/or problems in the Process Log for review at these meetings.
4.0 PROBLEM MANAGEMENT STEWARDSHIP
For Problem Management to be successful, continuous stewardship at all levels must take place. Individual teams are responsible to set review meetings and identify areas of concern for further investigation.
The ultimate goal is to manage problems so that they can be eliminated. Teams and/or individuals will manage this process, with escalation (communication) to management when necessary.
4.2 STEWARDSHIP ROLES & RESPONSIBILITIES
4.2.1 Functional Team Stewardship
A Functional Team is a group of individuals who share like technical skills, e.g., LAN Administrators, Radio Team, etc.
Each Functional Team will incorporate into their whole job task a weekly Problem Management Stewardship Review where all problems associated with the team will be reviewed. This review will encompass all outstanding problems as well as those problems that were resolved. Each Functional Team will document outstanding issues and actions and assign a Problem Coordinator who will be a member of the Departmental Stewardship Team.
One individual from each Functional Team needs to receive KT training.
4.2.2 Departmental Stewardship Team
The Departmental Stewardship Team is composed of Problem Management Coordinators from the Functional Stewardship Teams (See Figure #3).
This team is responsible for reviewing and stewarding to problems and their resolution on a set basis for the purpose of sharing information.

Figure 3 - Stewardship Roles
Note that stewardship meetings are not the only method of communication between groups. Stewardship is the formal method, but communication should be ongoing on a daily, informal basis (e.g., morning meetings, one-to-one). (See Figure #4)

Figure 4 - Stewardship Meetings
4.3.1 Functional Team Stewardship (FTS) Weekly Review
The minimum recommended activities for each Functional Stewardship Team to discuss at the weekly meeting are:
· Review all open problems
  · Are the current status and priority level of each correct?
  · Are proper resources assigned?
  · Are customer communications/expectations being met?
· Review all closed problems
  · Are there any re-occurring problems?
  · Was each problem properly documented?
  · Is follow-up work required?
  · Were customer expectations met?
· Review recurring problems (closed or open)
· Maintain action log for outstanding tasks.
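As an illustration of how the weekly agenda might be prepared from logged problems, the sketch below groups tickets into the three review categories. It rests on assumptions: the `Ticket` class and `weekly_review_agenda` function are hypothetical, and a problem is treated as recurring when the same summary text has been logged more than once, which is only one possible definition.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Ticket:
    """Hypothetical, minimal view of a Service Ticket."""
    ticket_id: int
    summary: str
    is_open: bool


def weekly_review_agenda(tickets: List[Ticket]) -> Dict[str, List[int]]:
    """Group ticket ids into open, closed and recurring for the weekly review."""
    # Assumption: identical summaries (ignoring case and padding) mean a repeat.
    counts = Counter(t.summary.strip().lower() for t in tickets)
    return {
        "open": [t.ticket_id for t in tickets if t.is_open],
        "closed": [t.ticket_id for t in tickets if not t.is_open],
        "recurring": [
            t.ticket_id
            for t in tickets
            if counts[t.summary.strip().lower()] > 1
        ],
    }


# Illustrative data only.
tickets = [
    Ticket(1, "Cannot print to floor printer", is_open=False),
    Ticket(2, "Mail client will not start", is_open=True),
    Ticket(3, "cannot print to floor printer", is_open=True),
]
print(weekly_review_agenda(tickets))
```

The review questions themselves (status, resources, customer expectations) still need to be answered by the team; the grouping only decides which tickets are put in front of it.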
4.3.2 Service Level Agreement (SLA) Monthly Review
The members of this team are from both the customer business area and IS. It will be required that all Business System Groups follow the Problem Management process as outlined in the Service Level Agreement. It is expected that each Business Service Group will assign their own Problem Coordinator.
These review discussions should include applications supported in the area, LAN Administration issues and other areas of concern. The minimum recommended activities for each SLA Review are as follows:
· Review all open problems
  · Are the current status and priority level of each correct?
  · Are proper resources assigned?
  · Are customer communications/expectations being met?
· Review all closed problems
  · Are there any recurring problems?
  · Was each problem properly documented?
  · Is follow-up work required?
  · Were customer expectations met?
· Review recurring problems (closed or open)
· Maintain action log for outstanding tasks.
· Review all long term, major and re-occurring problems that have arisen during the past month and what is being done to rectify them from both IS and Customer perspectives.
· Evaluate overall satisfaction of the relationship between IS and the Customer in terms of:
  · Responsiveness
  · Quality of fixes
  · Quality and quantity of communication
· Review of outstanding issues and concerns.
4.3.3 Information Services (IS) Monthly Review
The members of this team are representatives from the various teams within the IS Infrastructure Support Group.
Stewardship reviews of problems are conducted monthly and stewardship activities include:
· Review status of all long term, re-occurring and cross team problems to ensure the following issues are being dealt with properly:
  · Severity level
  · Resource allocation
  · Expectations clearly defined, realistic and being stewarded to by all members working on a problem.
· Review of IS Functional Stewardship Teams for responsiveness.
· Review of all closed priority 'A' (1 hour) and 'B' (4 hour) problems for completeness of documentation and to ensure that follow up actions are being dealt with appropriately.
4.3.4 Problem Management Process Review
The process will be reviewed on a semi-annual basis. The members of this committee will consist of the Problem Manager, Problem Coordinators and representative(s) from the business community.
The review will focus on:
· Is the process meeting its objectives effectively? Is it satisfying the requirements of the process users, IT Management and IT's customers?
· How are we measuring compliance, and what are the results?
· Are the users of the process having difficulties with it? i.e., are there things about it that users do not understand?
· Have previously identified process deficiencies been corrected?
· Does the process need revision?
Tools for use during problem-solving exercises are:
1. Help Desk Application
2. Kepner-Tregoe Problem Solving processes
3. Manuals
4. Existing Knowledge Base
5. Peer Knowledge
6. Individual's personal expertise
7. Vendor Tools
(See Appendix C for a more detailed description of the above tools.)
This is a during-the-fact control applied during the Problem Management process.
Setting priorities applies established criteria for escalation while the problem is still active. When the allotted time for problem resolution has exceeded the standards established by individual teams, the tracking mechanism alerts higher level support and/or management to the fact so alternative actions can be considered.
Escalation allows controlled actions to take place in time to change the effect of the problem.
Priorities:
A: (1-4 hrs)
System/application outages that have an immediate impact on critical company operations or affect many Customers.
B: (4 hrs - 1 day)
System/application outages that impact non-critical company operations or affect productivity of individual Customers.
C: (1 day and over)
System annoyances that may impact productivity of individual Customers but where alternatives are available to the customer. Can also be requests for information.
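A tracking mechanism of the kind described above could compare a problem's age against a limit for its priority. The sketch below is illustrative only: it uses the upper bound of each priority range as the limit, which is an assumption, since the actual standards are established by individual teams; the names `RESOLUTION_LIMIT` and `needs_escalation` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumed limits: the upper bound of each priority range given above.
# Priority C ("1 day and over") has no upper bound, so no time limit is set.
RESOLUTION_LIMIT: Dict[str, Optional[timedelta]] = {
    "A": timedelta(hours=4),
    "B": timedelta(days=1),
    "C": None,
}


def needs_escalation(priority: str, opened: datetime, now: datetime) -> bool:
    """Return True when an open problem has exceeded its assumed time limit."""
    limit = RESOLUTION_LIMIT[priority]
    if limit is None:
        return False
    return now - opened > limit


opened = datetime(1998, 4, 1, 8, 0)
print(needs_escalation("A", opened, opened + timedelta(hours=5)))   # True
print(needs_escalation("B", opened, opened + timedelta(hours=5)))   # False
```

Elapsed time is only one trigger; a problem may still be escalated earlier on the basis of its severity.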
Escalation is meant to enable the problem-solving process, not detract from it.
When a problem is identified, the clock starts ticking; it does not stop until the problem is resolved. At certain predetermined times, action must take place to ensure efficient and effective problem resolution. Note that time is a secondary reason for escalation. The primary reason is problem severity/priority. As well, certain communications must be made to management and customers so they can take whatever action they feel necessary to carry out their business. This reduces the number of inquiries directed to the Problem Solvers and allows them to get on with solving the problem.
Escalation consists of two components:
· Technical Escalation: The problem solver involves more resources, at predetermined times, until resolution is obtained. These resources can be peers, technical people from other disciplines, problem solving facilitators, vendors and consultants.
· Business Escalation: Communication to supervisors, management, the Help Desk and customers, so that the business issues surrounding the problem can be managed with minimal interruption to the technical people trying to solve it. The communication gives an indication of:
  · the length of time it will take to solve the problem
  · the point the problem has reached in the technical escalation
Involvement in the escalation process is handled as follows:
| Level | Business | Technical |
| 1 | Help Desk, Customer, LAN Administrator | Help Desk, LAN Administrator |
| 2 | Help Desk Coordinator, LAN Administrator, Problem Coordinator, System Administrator | Help Desk Coordinator, LAN Administrator, Problem Coordinator, Functional Team |
| 3 | Help Desk Coordinator, Business Rep., LAN Administrator, Management, Functional Teams | Help Desk Coordinator, LAN Administrator, Functional Teams |
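The involvement table above can also be held as a lookup, for example to drive notifications at each escalation level. This is a sketch under assumptions: the names `ESCALATION_CONTACTS` and `contacts_for` are hypothetical, and listing each role as a separate entry is an interpretation of the table's cell contents.

```python
from typing import Dict, List

# Who is involved at each level, for each escalation component.
ESCALATION_CONTACTS: Dict[int, Dict[str, List[str]]] = {
    1: {
        "business": ["Help Desk", "Customer", "LAN Administrator"],
        "technical": ["Help Desk", "LAN Administrator"],
    },
    2: {
        "business": ["Help Desk Coordinator", "LAN Administrator",
                     "Problem Coordinator", "System Administrator"],
        "technical": ["Help Desk Coordinator", "LAN Administrator",
                      "Problem Coordinator", "Functional Team"],
    },
    3: {
        "business": ["Help Desk Coordinator", "Business Rep.",
                     "LAN Administrator", "Management", "Functional Teams"],
        "technical": ["Help Desk Coordinator", "LAN Administrator",
                      "Functional Teams"],
    },
}


def contacts_for(level: int, component: str) -> List[str]:
    """Return the roles involved at a level for 'business' or 'technical'."""
    # Return a copy so callers cannot alter the shared table by accident.
    return list(ESCALATION_CONTACTS[level][component])


print(contacts_for(3, "business"))
```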
APPENDIX A
APPENDIX B
APPENDIX C
The Help Desk database was developed to record all calls that come into the Help Desk. The ARS application was developed to give anyone access to input their problems and to share problem resolution information.
It is our intention to now expand this to include problem management for tracking and escalation purposes.
KEPNER-TREGOE PROBLEM SOLVING PROCESSES
Known as "KT" for short, this approach to problem solving incorporates common-sense ideas everyone uses to resolve concerns. Because this is a systematic, thorough approach to dealing with concerns, KT helps gather and organize information effectively. It guides problem solvers through unfamiliar situations by helping them draw on their own experience and judgment.
Each different KT process fulfills a clearly identified purpose. The choice of which process to use will depend on the questions that need to be answered.
To facilitate this process one individual from each Functional Team needs to receive KT training.
Situation Appraisal: This is a starting place - a method for taking a complex or ill-defined situation and sorting it out. Situation appraisal does not find specific solutions. Its purpose is to clarify a concern that needs to be resolved by the other processes.
Problem Analysis: Effective action cannot be taken unless the cause of the problem is known. This is an efficient way to find the true cause of a problem before committing to a solution.
Decision Analysis: Helps shift the focus from alternatives to objectives, encouraging the individual or team to carefully define the decision to be made before jumping to conclusions. A more carefully reasoned decision, one based on information and rational analysis, is an outcome of Decision Analysis.
Potential Problem Analysis: Helps to anticipate the difficulties that might arise and determine what can be done about them. Using PPA, actions are identified to prevent problems from interfering with the successful implementation of the plan. The end result is an improved plan that can be monitored and implemented successfully.
Root Cause Analysis: Root Cause Analysis is the process of asking what caused the cause until the question can no longer be answered. The process is much like peeling an onion, with the outer layers being technical causes and the inner layers being process causes. The final question that must be asked is what can be done to prevent recurrence of this problem.
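The repeated "what caused the cause" questioning of Root Cause Analysis can be illustrated with a short sketch. It is not part of the KT material: the function `trace_root_cause`, the `caused_by` mapping and the example causes are hypothetical, and a real analysis rarely has a single known cause at every step.

```python
from typing import Dict, List


def trace_root_cause(symptom: str, caused_by: Dict[str, str]) -> List[str]:
    """Follow each cause back until no further cause is known."""
    chain = [symptom]
    while chain[-1] in caused_by:
        next_cause = caused_by[chain[-1]]
        if next_cause in chain:
            # Guard against circular explanations.
            break
        chain.append(next_cause)
    return chain


# Illustrative example: technical causes first, a process cause at the core.
caused_by = {
    "application outage": "server disk full",
    "server disk full": "log files never purged",
    "log files never purged": "no procedure assigned for log cleanup",
}
chain = trace_root_cause("application outage", caused_by)
print(chain[-1])  # no procedure assigned for log cleanup
```

The last entry in the chain is where the final question applies: what can be done to prevent the problem from happening again.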
KT Analytical Troubleshooting (ATS)
Analytical Troubleshooting (ATS): This is a methodology for problem solving and problem determination. It goes much further than the problem analysis module of the Kepner Tregoe Genco methodology that most people at Syncrude are familiar with.
This detailed problem solving/problem determination is usually used to determine the cause of problems that have a major impact and a high risk of reoccurring if not corrected.
Help Desk staff and/or LAN Administrators will have a library of software and printer manuals for troubleshooting and education purposes.
Level Two support will have a library of technical manuals to support the applications and network.
EXISTING KNOWLEDGE BASE (Under Review)
Currently the knowledge base within the Help Desk application is inadequate. All levels of support will be responsible, however, to ensure proper comments are placed in the description when the service ticket is closed. This will make it easier to recognize reoccurring problems and facilitate follow-ups.
Where documenting the resolution is not viable using the Help Desk application, each Team will establish a method for storing this information. The Problem Coordinators and/or Problem Resolution Team Leader will store the detailed resolution reports in this location.
* This is an interim solution until such time as the current Help Desk application is replaced.
Interdepartmental networking will provide individuals with the opportunity to share knowledge among their peers.
Regular stewardship meetings will make peer knowledge more readily available. Also, as the meetings spread from individuals to teams to departments, information and knowledge will become more encompassing.
INDIVIDUAL'S PERSONAL EXPERTISE
It is the responsibility of each individual to maintain an appropriate level of expertise. This can be achieved through formal training (Vendor Certification) and as a result of mentoring.
The following provides a sampling of the vendor tools we have available to us:
· Computer Select
· Microsoft TechNet
· Gartner CD-ROM
· Microsoft Developers Net
· On-line Help Guides
· CBT (Computer Based Training)
· Premier Support for technicians
· Vendor Contact database (to be communicated)