troubleshooting1

The Troubleshooting module of DC Scope makes it easy to identify the VMs whose status is degrading, so that anomalies can be anticipated. It also explains recurrent issues and provides guidelines for correcting them.

The analysis period of the Troubleshooting module corresponds to the last 30 days of data collected by DC Scope.

The Troubleshooting module is based on points: a VM receives one point each time it exceeds a best-practice threshold for one of the defined counters. Points are red for critical issues and orange for less critical ones. DC Scope then analyzes the total number of points and represents the result with arrows.

* * *
1 occurrence beyond a best-practice threshold = 1 point, colored according to the threshold exceeded.

Color of the arrow: the color of the circle around the arrow shows whether a VM has exceeded the best-practice thresholds. Red and orange mark the VMs that have exceeded them; green marks the VMs that have not.

Direction of the arrow: the direction of the arrow shows how the number of times a VM has exceeded the best-practice thresholds is evolving. If the arrow points downwards, that number is decreasing, so the status of the VM is improving. If the arrow points upwards, that number is increasing, so the status of the VM is degrading.

Further information on the color and direction of the arrow is provided in the Troubleshooting submenu.

Filtering and sorting

A filter at the top of the Troubleshooting module frame helps to prioritize the issues to resolve in the infrastructure.

troubleshooting2

The Critical filter button displays all the VMs in orange or red whose arrow has a positive inclination angle for at least one resource (CPU, Disk, Net or RAM), meaning that their status is degrading.

The CPU, Disk, Net and RAM filter buttons display the VMs in orange or red whose arrow is positive (degrading) for the selected resource. The VMs can also be sorted by name or by the inclination angle of the arrow for each resource.
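The filter and sort rules described above can be sketched in a few lines of Python. This is only an illustration, not DC Scope code: the record layout (`name`, `resources`, `color`, `angle`) and all function names are hypothetical, and the sketch assumes that color and inclination angle are evaluated per resource.

```python
RESOURCES = ("CPU", "Disk", "Net", "RAM")


def make_vm(name, **overrides):
    """Build a hypothetical VM record: one color and arrow angle per resource."""
    resources = {r: {"color": "green", "angle": 0} for r in RESOURCES}
    for resource, (color, angle) in overrides.items():
        resources[resource] = {"color": color, "angle": angle}
    return {"name": name, "resources": resources}


def is_degrading(vm, resource):
    """Orange or red, and the arrow has a positive inclination for this resource."""
    state = vm["resources"][resource]
    return state["color"] in ("orange", "red") and state["angle"] > 0


def critical_filter(vms):
    """Critical button: degrading for at least one resource."""
    return [vm for vm in vms if any(is_degrading(vm, r) for r in RESOURCES)]


def resource_filter(vms, resource):
    """CPU / Disk / Net / RAM buttons: degrading for the selected resource."""
    return [vm for vm in vms if is_degrading(vm, resource)]


def sort_by_name(vms):
    return sorted(vms, key=lambda vm: vm["name"])


def sort_by_angle(vms, resource):
    """Steepest upward arrow first for the given resource."""
    return sorted(vms, key=lambda vm: vm["resources"][resource]["angle"], reverse=True)


vms = [
    make_vm("web01", CPU=("red", 30)),
    make_vm("db01", RAM=("orange", 10), CPU=("orange", -15)),
    make_vm("app01", CPU=("red", -20)),
    make_vm("cache01"),
]
critical_names = [vm["name"] for vm in critical_filter(vms)]
cpu_names = [vm["name"] for vm in resource_filter(vms, "CPU")]
```

In this sample, `app01` is red on CPU but its arrow points downwards, so it is not returned by either filter.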

List view

It is possible to view the Troubleshooting module as a list. This view provides the total number of points during the whole period of analysis for all the resources.

troubleshooting3

Thresholds of Best Practices

For each resource, different metrics are used to identify issues in the VMs. Each metric has two threshold levels that should not be exceeded and that define the color of the point: red when the most critical threshold has been exceeded, orange when the less critical threshold has been exceeded.

The list of metrics and thresholds is as follows:

CPU resource

| Item | Description | Orange | Red |
| --- | --- | --- | --- |
| Too much CPU activity on host | CPU overload at the hypervisor level (too much ready time on VMs waiting for CPU access) | 5% | 10% |
| Too many vCPUs on VM | CPU overload at the VM level (high COSTOP counter, too many vCPUs allocated) | 1% | 3% |
| Virtual machine overload | Overload inside the VM | 90% | 95% |

DISK resource

| Item | Description | Orange | Red |
| --- | --- | --- | --- |
| Controls failed | Number of SCSI disk drives lost | 1 | 5 |
| Total latency | Average time to read and write on the disk | 20 ms | 30 ms |

RAM resource

| Item | Description | Orange | Red |
| --- | --- | --- | --- |
| Virtual machine overload | Virtual machine overloaded | 70% | 90% |

NET resource

| Item | Description | Orange | Red |
| --- | --- | --- | --- |
| Lost packets | Number of lost network packets | 1 | 5 |
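As an illustration of how a measured value maps to a point color, the thresholds listed above can be put in a lookup table. This is a minimal sketch under stated assumptions: the metric keys are hypothetical names, and a value equal to a threshold is treated as exceeding it, because the documentation does not say whether the comparison is strict.

```python
# (orange, red) thresholds copied from the tables above.
# The metric keys are hypothetical names chosen for this sketch.
THRESHOLDS = {
    ("CPU", "host_cpu_activity_pct"): (5, 10),
    ("CPU", "vcpu_on_vm_pct"): (1, 3),
    ("CPU", "vm_overload_pct"): (90, 95),
    ("DISK", "controls_failed"): (1, 5),
    ("DISK", "total_latency_ms"): (20, 30),
    ("RAM", "vm_overload_pct"): (70, 90),
    ("NET", "lost_packets"): (1, 5),
}


def point_color(resource, metric, value):
    """Return 'red', 'orange', or None when no point is given.

    Assumption: reaching a threshold counts as exceeding it (>=).
    """
    orange, red = THRESHOLDS[(resource, metric)]
    if value >= red:
        return "red"
    if value >= orange:
        return "orange"
    return None


latency_point = point_color("DISK", "total_latency_ms", 25)
```

With these assumptions, a total disk latency of 25 ms gives one orange point, and 30 ms or more gives one red point.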

Direction of the arrow

The analysis period is divided into two equal parts (period A and period B) for each resource (CPU, Disk, RAM and Network). DC Scope then compares the total number of points in period A with the total number of points in period B.

| Arrow | Description |
| --- | --- |
| Horizontal (pointing right) | The number of points is the same in both periods. |
| Downwards | Period A contains more points than period B, which indicates an improvement in the VM. |
| Upwards | Period B contains at least 5% more points than period A, which indicates a degradation in the VM. |
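The comparison between the two periods can be written as a small function. This is a sketch, not the DC Scope implementation: the function name is hypothetical, and the case where period B has more points than period A but by less than 5% is not covered by the rules above, so it is assumed here to give a horizontal arrow.

```python
def arrow_direction(points_a, points_b):
    """Return 'down', 'up' or 'right' from the point totals of periods A and B."""
    if points_a > points_b:
        return "down"   # fewer points in the later period: the VM is improving
    # Integer arithmetic avoids float rounding: B >= 1.05 * A
    if points_b > points_a and points_b * 100 >= points_a * 105:
        return "up"     # at least 5% more points in period B: the VM is degrading
    return "right"      # same totals, or (assumption) an increase below 5%


direction = arrow_direction(12, 8)
```

For example, 12 points in period A and 8 points in period B give a downward arrow, while 100 and 105 points give an upward arrow.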

Color of the arrow

To define the color of the arrow, the analysis period is divided into three parts: period A, period B and period C. Period C represents the last 5% of the analysis period. Periods A and B are two equal parts of the remaining time:

troubleshooting4
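For a 30-day analysis period, this split gives a period C of 1.5 days and periods A and B of 14.25 days each. The sketch below computes the boundaries; the function and parameter names are hypothetical, and it assumes that period A is the oldest part, which is consistent with the arrow direction rules.

```python
from datetime import datetime, timedelta


def split_periods(end, days=30, c_percent=5):
    """Split the analysis window ending at `end` into periods A, B and C.

    Returns a dict of (start, end) pairs. Period C is the last
    `c_percent` of the window; A and B share the rest equally.
    """
    total = timedelta(days=days)
    start = end - total
    c_start = end - total * c_percent / 100
    ab_split = start + (c_start - start) / 2
    return {
        "A": (start, ab_split),
        "B": (ab_split, c_start),
        "C": (c_start, end),
    }


periods = split_periods(datetime(2024, 1, 31))
durations = {name: stop - begin for name, (begin, stop) in periods.items()}
```

Here `durations["C"]` is 36 hours, and `durations["A"]` and `durations["B"]` are both 14 days and 6 hours.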

For each resource (CPU, Disk, RAM and Network), DC Scope takes the points that exceed the red and orange thresholds and places them in the appropriate period. If the points in period C represent more than X% of the total number of points in the complete analysis period, the arrow takes the color of those points. The following examples use a threshold of 5%:

| Color | Description |
| --- | --- |
| Red | 20 points in total. 2 red points in period C are 10% of the total points. |
| Orange | 10 points in total. 2 points in period C are 20% of the total points. |
| Green | 30 points in total. 1 point in period C is 3.3% of the total points. |
| Green | 30 points in total. No points in period C. |
| Red | 50 points in total. 3 red points (6%) and 5 orange points (10%) in period C. The color is red because red has priority over orange. |

By clicking on the key symbol at the top of the frame, you can customize the percentage of points in period C. By default, this rate is 5%.
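The color rule can be checked against the examples above with a short function. This is a sketch under stated assumptions, not DC Scope code: the names are hypothetical, each color is compared separately with the threshold, and the points in the second and third examples are taken to be orange because the documentation does not state their color.

```python
def arrow_color(total_points, red_in_c, orange_in_c, threshold_pct=5):
    """Return the arrow color for one resource.

    total_points:  points over the complete analysis period
    red_in_c:      red points located in period C
    orange_in_c:   orange points located in period C
    threshold_pct: customizable rate (5% by default)
    """
    if total_points == 0:
        return "green"
    # Integer form of: points_in_c / total_points > threshold_pct / 100
    if red_in_c * 100 > threshold_pct * total_points:
        return "red"      # red has priority over orange
    if orange_in_c * 100 > threshold_pct * total_points:
        return "orange"
    return "green"


examples = [
    arrow_color(20, 2, 0),   # 10% red in period C
    arrow_color(10, 0, 2),   # 20% orange in period C
    arrow_color(30, 0, 1),   # 3.3%, below the threshold
    arrow_color(30, 0, 0),   # no points in period C
    arrow_color(50, 3, 5),   # 6% red and 10% orange
]
```

Raising `threshold_pct` makes the arrow less sensitive to recent points: with a rate of 10%, the last example would become orange instead of red, since 6% of red points no longer exceeds the threshold.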

Troubleshooting use case

Clicking the Critical button filters the view to show the VMs in red or orange whose status is degrading.

troubleshooting5

Clicking on one VM will open a new frame which shows a colored dot for each resource.

troubleshooting6

Clicking on each dot shows the criteria and the total number of orange and red points for the resource. Clicking on the number of points displays a graph with the dates and times when the VM exceeded the threshold.

troubleshooting7

Clicking on the points graph opens a pop-up frame that shows the behavior over time, with details of the average, maximum and minimum points for each metric.

troubleshooting8