ITIL: Service Support
Problem Management
The Problem Management discipline involves the detection, resolution, and prevention of problems. A problem is the unknown cause of one or more incidents. Once the root cause of a problem has been determined it is a known error. Unlike in the Incident Management discipline, the focus of Problem Management is not on the speed of the resolution, but on the permanent resolution of the underlying problem. The goal of Problem Management is to minimize the negative effects of incidents and problems caused by errors in the IT infrastructure and prevent recurrence of incidents related to those errors.
Root Cause Analysis
PacketShaper's traffic monitoring tools let you drill into problems from the top down for rapid root cause analysis. Application performance and availability issues across the network can be difficult to track to a root cause because problems often occur seemingly randomly and then go away before you can fully trace them. Packeteer provides a full set of tools for finding the root cause, including:
- Historical application behavior records so that incidents such as "the application isn't slow now but it was yesterday and last Tuesday" can still be investigated. PacketShaper stores measurement data on the appliance for up to two months, and for months or years when using ReportCenter. When creating graphs and reports, you can select a period (such as hour, day, week, or month), the number of periods, and an end date and time. So if you want to investigate an application's behavior over the past week or for a specific day last week, it's easy to do so. There are dozens of preconfigured graphs available, including per-application bandwidth utilization, efficiency, and response times.
- Network-wide time-matched data so that seemingly unrelated events can be correlated and then corroborated as the source.
- Highly granular real-time application performance alerts so that if an incident recurs, service staff can be notified immediately and perhaps see the problem as it's happening. Using PacketShaper's adaptive response feature, you can set up an agent that tracks a particular application's performance and if performance dips below a specified level, you can be notified that an incident has occurred. You can then view a report that was automatically generated for the period in which the incident happened. For an example, see Create an Agent that Monitors a Class Variable.
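The period, number of periods, and end date used when creating graphs and reports amount to a simple time-window calculation. The following is a minimal Python sketch of that calculation, not PacketShaper or ReportCenter code: the function names, the sample format, and the 30-day approximation of a month are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical period lengths. A "month" is approximated as 30 days here;
# the product's own definition may differ.
PERIOD_LENGTHS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
}


def report_window(period, count, end):
    """Return (start, end) covering `count` periods that finish at `end`."""
    if period not in PERIOD_LENGTHS:
        raise ValueError(f"unknown period: {period}")
    if count < 1:
        raise ValueError("count must be at least 1")
    return end - PERIOD_LENGTHS[period] * count, end


def samples_in_window(samples, start, end):
    """Keep (timestamp, value) samples with start <= timestamp <= end."""
    return [(ts, value) for ts, value in samples if start <= ts <= end]


if __name__ == "__main__":
    # Investigate the seven days ending at noon on 8 January 2024.
    start, end = report_window("day", 7, datetime(2024, 1, 8, 12, 0))
    print(start, end)
```

Selecting "a specific day last week" is the same calculation with one period of type "day" and an end date set to the end of that day.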
Because PacketShaper sits at the demarcation point where traffic crosses between the WAN/LAN, Service Provider/Enterprise, or Server/Network, it is able to pinpoint the source of problems. For example, suppose an end user reports that application sessions are dropping periodically. In order to route the call, the help desk needs to know whether the problem is with the server or the network, and if it's the server, which server? PacketShaper's response-time measurement (RTM) feature is able to answer these questions. See Analyze Application Response Times for more information on determining the breakdown of server and network delay for a particular application.
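The help desk routing decision described above can be sketched as a comparison of average network delay against per-server delay. This is an illustrative Python sketch only: the `DelaySample` record, its field names, and the rule "blame the slowest server if its average delay exceeds the average network delay" are assumptions, and the code does not read RTM data from a PacketShaper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class DelaySample:
    """One hypothetical response-time measurement for a transaction."""
    server: str         # assumed server identifier, e.g. a host name
    network_ms: float   # portion of the delay spent in the network
    server_ms: float    # portion of the delay spent in the server


def route_incident(samples):
    """Return ("server", name) or ("network", None) for a list of samples."""
    if not samples:
        raise ValueError("at least one sample is required")

    network_avg = mean(s.network_ms for s in samples)

    # Group server delay by server so the slowest one can be named.
    per_server = defaultdict(list)
    for s in samples:
        per_server[s.server].append(s.server_ms)

    worst_server, worst_avg = max(
        ((name, mean(values)) for name, values in per_server.items()),
        key=lambda item: item[1],
    )

    if worst_avg > network_avg:
        return ("server", worst_server)
    return ("network", None)
```

In practice the comparison rule would be tuned to the application; the point of the sketch is only that separating the two delay components answers both "server or network?" and "which server?".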
Problem Resolution
PacketShaper not only has tools for investigating the root cause of a problem, it also has a bundle of tools for resolving it. Once you know what's causing the problem, you can use any of a number of techniques for implementing a permanent solution. Note that these techniques require the Shaping Module.
If your analysis shows that the root cause of an application's performance problems is unsanctioned traffic consuming too much of the link's bandwidth, you need to put controls in place to Protect Critical Application Performance. You may also choose to Block Unwanted Traffic or Contain a Greedy Business Application.
To resolve incidents related to specific applications, see Per Application Strategies. Here you will find strategies for managing the performance of Citrix, ERP, Instant Messaging, P2P, streaming media, etc.
If you discovered a new aggressive application, you can control the application's impact on the performance of other applications by assigning an appropriate policy or partition to the new class. See Limit an Application's Total Bandwidth or Policy and Partition Guidelines for help.
Problem Prevention
If end users are complaining about the speed of FTP downloads and your analysis shows that one user's large download is slowing down all other downloads, you may want to Insulate Users of the Same Application to avoid similar problems in the future.
When many network administrators were caught by surprise by the Napster phenomenon several years ago, they decided they wanted to detect and contain any new applications before those applications had a chance to wreak havoc. The downside is, of course, that new desirable applications also get thrown into a contained, best-effort strategy until you change it.
If root cause analysis indicates that the reason behind performance problems is a single user consuming an excessive amount of bandwidth, you may want to put a system in place that prevents "bandwidth hogs" from taking more than their fair share of bandwidth. Two different techniques are described in Quarantine Bandwidth Abusers.
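One simple way to reason about "fair share" is to compare each host's usage with an equal split of the total. The Python sketch below is a hypothetical illustration of that idea, not one of the two techniques described in Quarantine Bandwidth Abusers: the function name, the input format (bytes transferred per host), and the default multiple of 3 are all assumptions.

```python
def find_bandwidth_hogs(bytes_by_host, multiple=3.0):
    """Return (host, bytes) pairs using more than `multiple` times the fair share.

    The fair share is the total traffic divided equally among all hosts.
    Results are sorted with the heaviest user first.
    """
    if not bytes_by_host:
        return []

    total = sum(bytes_by_host.values())
    fair_share = total / len(bytes_by_host)

    hogs = [
        (host, used)
        for host, used in bytes_by_host.items()
        if used > multiple * fair_share
    ]
    return sorted(hogs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    usage = {"10.0.0.1": 900, "10.0.0.2": 50, "10.0.0.3": 30, "10.0.0.4": 20}
    print(find_bandwidth_hogs(usage))
```

With the sample data, the fair share is 250 bytes per host, so only a host above 750 bytes is flagged.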
To prevent future problems with unsanctioned recreational traffic, such as Internet radio, instant messaging, peer-to-peer, and VoIP, you can apply appropriate policies and partitions to contain this traffic.
Adaptive response is another useful tool for preventing network problems. You can use adaptive response agents to monitor network health and application performance so that support staff members can be alerted just as soon as a critical threshold is crossed. You can even create an action file that automatically takes corrective action after an incident occurs.
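The threshold-and-action behavior described above can be pictured as an edge-triggered monitor: it fires once when a metric crosses the threshold, not on every sample that stays above it. The following Python sketch illustrates that pattern under stated assumptions; `ThresholdAgent`, its parameters, and the callback are hypothetical names and do not represent the adaptive response agent or action file formats.

```python
class ThresholdAgent:
    """Hypothetical monitor that calls `on_cross` when a threshold is crossed."""

    def __init__(self, threshold_ms, on_cross):
        self.threshold_ms = threshold_ms
        self.on_cross = on_cross       # stands in for an alert or corrective action
        self.in_violation = False      # True while the metric stays above threshold

    def observe(self, response_ms):
        """Record one measurement and fire the callback on a new crossing."""
        violating = response_ms > self.threshold_ms
        if violating and not self.in_violation:
            self.on_cross(response_ms)
        self.in_violation = violating
        return violating


if __name__ == "__main__":
    alerts = []
    agent = ThresholdAgent(threshold_ms=500, on_cross=alerts.append)
    for sample in [100, 600, 700, 200, 900]:
        agent.observe(sample)
    print(alerts)
```

Firing only on the crossing keeps support staff from being flooded with repeat notifications while a single incident is still in progress.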
Please see bluecoat.com/support/packeteer for more detailed information.

