Troubleshooting - Guide DC Scope English


Troubleshooting has two objectives:

To quickly identify VMs that are deteriorating, in order to anticipate anomalies.
To explain and correct recurring problems.

DC Scope:

analyses all VMs or only the VMs of a selected filter
examines these VMs according to the thresholds exceeded
displays them using 2 visual codes: an arrow with a steeper or shallower slope, and a colour (red, orange or green)

1 time beyond a best-practice threshold = 1 point for the threshold exceeded.

Color of the arrow	Direction of the arrow

The direction of the arrow indicates the evolution of the "health" of the VM:

downwards: improvement
upwards: deterioration

The operating principle is as follows. The analysis period is divided into two (period A and period B). For each resource (cpu, disk and network), the number of points that exceed the thresholds set for the red criterion is counted in period A and then in period B. If both periods contain the same number of points, the arrow is "flat". If period B contains at least 5% more points than period A, the arrow is "ascending". If period A contains more points than period B, the arrow is "descending".
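As a rough illustration, the rule above can be sketched in Python. This is only a sketch under stated assumptions, not DC Scope's actual implementation: the function name `arrow_direction` and the parameter `min_increase_percent` are hypothetical, the 5% is taken relative to the count of period A, and an increase smaller than 5% is treated as "flat" because the guide does not cover that case.

```python
def arrow_direction(points_a: int, points_b: int, min_increase_percent: int = 5) -> str:
    """Return 'flat', 'ascending' or 'descending' from red-threshold point counts.

    points_a / points_b: number of points above the red threshold in
    period A (first half) and period B (second half).
    """
    if points_a < 0 or points_b < 0:
        raise ValueError("point counts must be non-negative")

    # Fewer red points in the second half: the VM is improving.
    if points_b < points_a:
        return "descending"

    # At least 5% more red points in period B (integer arithmetic avoids
    # floating-point rounding at the exact boundary).
    if points_b > points_a and points_b * 100 >= points_a * (100 + min_increase_percent):
        return "ascending"

    # Same count, or (assumption) an increase below 5%.
    return "flat"
```

Under these assumptions, `arrow_direction(100, 105)` gives "ascending", `arrow_direction(10, 10)` gives "flat", and `arrow_direction(10, 4)` gives "descending".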

Filtering and sorting

The filter displays the GENERAL STATUS of the VMs that match a resource (cpu, for example). It displays the colour and the arrow of the general state (most degraded resource and most critical state).

The critical filter displays all VMs whose arrow slope is positive on one of the three resources (cpu, disk or net).
The cpu filter displays all VMs whose colour is orange or red on the cpu resource and whose arrow slope is positive in cpu.
The disk filter displays all VMs whose colour is orange or red on the disk resource and whose arrow slope is positive in disk.
The net filter displays all VMs whose colour is orange or red on the network resource and whose arrow slope is positive in network.

Sorting can be done by VM name or by the degree of slope of the arrow of each resource.
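The filter rules above can be expressed as simple predicates. The sketch below is illustrative only: the `ResourceState` and `VmStatus` types, their field names, and the convention that a positive slope means an upward (deteriorating) arrow are assumptions made for this example, not part of DC Scope.

```python
from dataclasses import dataclass

RESOURCES = ("cpu", "disk", "net")


@dataclass
class ResourceState:
    colour: str   # "green", "orange" or "red"
    slope: float  # arrow slope; positive is assumed to mean deterioration


@dataclass
class VmStatus:
    name: str
    resources: dict  # maps "cpu" / "disk" / "net" to a ResourceState


def matches_critical(vm: VmStatus) -> bool:
    """Critical filter: positive slope on at least one of the three resources."""
    return any(vm.resources[r].slope > 0 for r in RESOURCES)


def matches_resource(vm: VmStatus, resource: str) -> bool:
    """cpu / disk / net filter: orange or red colour AND positive slope."""
    state = vm.resources[resource]
    return state.colour in ("orange", "red") and state.slope > 0


def sort_by_slope(vms: list, resource: str) -> list:
    """Sort VMs by the slope of one resource, steepest deterioration first."""
    return sorted(vms, key=lambda vm: vm.resources[resource].slope, reverse=True)


def sort_by_name(vms: list) -> list:
    """Sort VMs alphabetically by name."""
    return sorted(vms, key=lambda vm: vm.name)
```

Note that with these rules a VM can match the critical filter without matching any resource filter, for instance when its disk arrow points upwards while the disk colour is still green.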

List view

The VMs can also be viewed in list form, where the sorting is different. This view gives all the red points over the whole period and for all metrics.


Thresholds of Best Practices

For each resource there are two thresholds that should not be exceeded (orange and red) which correspond to two different levels of values.

For each resource, one or more metrics are analysed:

CPU resource

Item	Description	Orange	Red
Too much CPU activity on host	CPU overload at the hypervisor level (too much ready time on VMs waiting for CPU access)	5 %	10 %
Too many vCPUs on VM	CPU overload at VM level (high COSTOP counter, too many vCPUs allocated)	1 %	3 %
Virtual machine overload	Overload inside the VM	90 %	95 %

DISK resource

Item	Description	Orange	Red
Failed commands	Number of lost SCSI disk commands	1	5
Total latency	Average time to read and write on the disk	20 ms	30 ms

RAM resource

Item	Description	Orange	Red
Virtual machine overloaded	Virtual machine overloaded	70%	90%

NET resource

Item	Description	Orange	Red
Lost packets	Number of lost network packets	1	5
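To make the tables concrete, the thresholds can be held in a small lookup and used to classify a single measurement. This is a minimal sketch: the metric keys (`cpu_ready_percent`, `disk_latency_ms`, and so on) are hypothetical names chosen for this example, and treating a value equal to a threshold as exceeding it is an assumption, since the guide does not state whether the comparison is strict.

```python
# (orange, red) thresholds copied from the tables above.
# The keys are illustrative names, not DC Scope identifiers.
THRESHOLDS = {
    "cpu_ready_percent": (5, 10),    # Too much CPU activity on host
    "cpu_costop_percent": (1, 3),    # COSTOP counter at VM level
    "cpu_usage_percent": (90, 95),   # Overload inside the VM
    "disk_failed_count": (1, 5),     # First row of the DISK table
    "disk_latency_ms": (20, 30),     # Total latency
    "ram_usage_percent": (70, 90),   # Virtual machine overloaded
    "net_lost_packets": (1, 5),      # Lost packets
}


def classify(metric: str, value: float) -> str:
    """Return 'red', 'orange' or 'green' for one measured value."""
    orange, red = THRESHOLDS[metric]
    # Assumption: reaching the threshold counts as exceeding it.
    if value >= red:
        return "red"
    if value >= orange:
        return "orange"
    return "green"
```

Each measurement classified as orange or red would then count as one point for the corresponding threshold.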

Direction of the arrow

Arrow	Points	Description
		As many points exceed the red threshold in period A as in period B: the arrow is horizontal.
		No points exceed the red threshold: the arrow is horizontal.
		Fewer points exceed the red threshold in period B than in period A: the arrow points downwards.
		At least 5% more points exceed the red threshold in period B than in period A: the arrow points upwards. The greater the difference in the number of points between the 2 periods, the greater the angle of inclination of the arrow.

Color of the arrow

The colour indicates whether the VM has exceeded the best-practice thresholds (see Thresholds of Best Practices) at the end of the analysis period.

red & orange: thresholds exceeded
green: below the thresholds

For each resource (cpu, disk and network), the number of points that exceed the thresholds set for the red and orange criteria is counted for the last 5 percent of the analysis period (period C). If X% of the total points are red over period C, the colour is red; otherwise, if X% of the total points are orange over period C, the colour is orange; otherwise the colour is green. The diagrams below illustrate the colour:

Arrow	Points	Description
		Period C corresponds to the last 5% of the total analysis period (A + B). 20 points in total over the whole analysis period. 2 red points in period C, i.e. 10% of the total points. The colour is therefore red.
		10 points in total, exceeding the orange threshold but not the red threshold, over the whole analysis period. 2 points in period C, i.e. 20% of the total points. The colour is therefore orange, and all the points are in period B, which gives a maximum tilt.
		30 points in total, exceeding the orange threshold, over the whole analysis period. Only 1 point in period C, i.e. 3.3% of the total points. The colour is green.
		30 points in total, exceeding the orange threshold, over the whole analysis period. No points in period C. The colour is green.
		50 points in total. 3 and 5 points in period C exceed the red and orange thresholds respectively, i.e. 6% and 10% of the total points. The colour is therefore red (priority over orange).

Note that the threshold for period C is configurable (5% by default); it can be used to adjust the display of very critical VMs at the end of the period.
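The colour rule can be sketched as follows. This is a hedged reading of the examples above rather than DC Scope's actual code: the function name `arrow_colour` and its parameters are hypothetical, the share is computed as points in period C divided by the total number of points over the whole analysis period, and X is assumed to be the configurable threshold with its default of 5%.

```python
def arrow_colour(total_points: int, red_in_c: int, orange_in_c: int,
                 threshold_percent: float = 5.0) -> str:
    """Return 'red', 'orange' or 'green' for one resource.

    total_points: points counted over the whole analysis period (A + B).
    red_in_c / orange_in_c: points above the red / orange threshold
    that fall in period C (the end of the analysis period).
    """
    if total_points <= 0:
        # No points at all: nothing exceeded a threshold.
        return "green"

    # Red is tested first, so it takes priority over orange.
    if red_in_c * 100 >= total_points * threshold_percent:
        return "red"
    if orange_in_c * 100 >= total_points * threshold_percent:
        return "orange"
    return "green"
```

With the default threshold, the five illustrated cases give: `arrow_colour(20, 2, 0)` red, `arrow_colour(10, 0, 2)` orange, `arrow_colour(30, 0, 1)` green, `arrow_colour(30, 0, 0)` green and `arrow_colour(50, 3, 5)` red.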

Troubleshooting use case

1 - Clicking on degradation highlights the VMs in this state: they are red and deteriorating.

2 - To find out more, click on the desired VM to detect where the problem comes from.

3 - In this case, the problem comes from the CPU. To identify it clearly, click on it.


Click on the curve to get more details, zoom in on the points, and get the MIN, MAX and AVG values, etc.
