Congestion monitoring on egress port scheduler

A typical example of congestion monitoring on an Egress Port Scheduler (EPS) is when the EPS is configured within a Vport. A Vport is a construct in an H-QoS hierarchy that can be used to control the bandwidth associated with an access network element (such as a GPON port, OLT, or DSLAM) or with a retailer that shares an access node with other retailers and has subscribers on it.

The example in Figure: GPON bandwidth control through Vport shows Vports representing GPON ports on an OLT. For capacity planning purposes, it is necessary to know if the GPON ports (Vports) are congested. Frequent and prolonged congestion on the Vport prompts the operator to increase the offered bandwidth to its subscribers by allocating additional GPON ports and subsequently moving the subscribers to the newly allocated GPON ports.

Figure: GPON bandwidth control through Vport

There are no forward/drop counters directly associated with the EPS. Instead, the counters are maintained at the per-queue level. Consequently, any indication of the congestion level on the EPS is derived from the counters of the queues that are associated with that EPS.

The EPS congestion monitoring capabilities rely on a counter that records the number of times that the offered EPS load (measured at the queue level) crossed the predefined bandwidth threshold within an operator-defined timeframe. This counter is called the exceed counter. The rate comparison (offered rate versus threshold) is executed several times per second, and the operator cannot change the calculation interval.
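The counting logic described above can be illustrated with a minimal Python sketch. This is a model for explanation only, not the router's implementation: the class name, the field names, the use of a strict "greater than" comparison, and the sample rates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ExceedCounter:
    """Illustrative model of an EPS exceed counter (hypothetical names)."""
    threshold_bps: float      # monitoring threshold, in bits per second
    exceed_count: int = 0     # samples in which the offered load was above the threshold
    sample_count: int = 0     # total number of rate comparisons performed

    def sample(self, queue_rates_bps):
        """Run one rate comparison using the offered rates of the EPS queues."""
        # The EPS has no counters of its own, so the offered load is
        # derived by summing the rates of the queues attached to it.
        offered = sum(queue_rates_bps)
        self.sample_count += 1
        if offered > self.threshold_bps:  # assumption: strict comparison
            self.exceed_count += 1
        return offered


# Four comparisons against a 900 Mb/s threshold, two queues per sample.
counter = ExceedCounter(threshold_bps=900e6)
for rates in [(300e6, 400e6), (500e6, 450e6), (600e6, 350e6), (200e6, 100e6)]:
    counter.sample(rates)
```

In this example the second and third samples (950 Mb/s each) are above the threshold, so the exceed counter ends at 2 out of 4 samples.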

The monitoring threshold can be configured via CLI per aggregate EPS rate, EPS level or EPS group. The threshold is applicable to PIR rates.

To enable congestion monitoring on an EPS, monitoring must be explicitly enabled under the Vport object itself, or under the physical port when the EPS is attached directly to the physical port. In addition, the monitoring threshold within the EPS must be configured.

Two examples of congestion monitoring on an EPS that is configured under the Vport are shown in Figure: Exceed counts and Figure: Exceed counts (severe congestion). Figure: Exceed counts (severe congestion) shows more severe congestion than Figure: Exceed counts. The EPS exceed counter (the number of dots above the threshold line) can be obtained via a CLI show command or read directly via MIBs.

Figure: Exceed counts
Figure: Exceed counts (severe congestion)

When the exceed counter value is obtained, the counter should be cleared, which resets the exceed counter and the number of samples to zero. This is because the longer the interval between a clear and the next show or read, the more diluted the congestion information becomes. For example, 100 threshold exceeds within a 5-minute interval give a more accurate picture of congestion than 100 threshold exceeds within a 5-hour interval.
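The dilution effect can be made concrete with a short Python sketch. The rate of 10 comparisons per second is a purely hypothetical value chosen for the arithmetic (the text only states "several times per second"), and the dictionary keys and function names are illustrative, not CLI or MIB object names.

```python
ASSUMED_SAMPLES_PER_SECOND = 10  # hypothetical; the real rate is not operator-visible


def read_and_clear(counter):
    """Return (exceeds, samples) and reset both values to zero."""
    snapshot = (counter["exceeds"], counter["samples"])
    counter["exceeds"] = 0
    counter["samples"] = 0
    return snapshot


def congestion_ratio(exceed_count, sample_count):
    """Fraction of rate comparisons in which the threshold was exceeded."""
    if sample_count == 0:
        return 0.0
    return exceed_count / sample_count


def samples_in(seconds):
    """Number of comparisons in an interval, under the assumed sample rate."""
    return seconds * ASSUMED_SAMPLES_PER_SECOND


# The same 100 exceeds, read after 5 minutes and after 5 hours.
five_minutes = {"exceeds": 100, "samples": samples_in(5 * 60)}
five_hours = {"exceeds": 100, "samples": samples_in(5 * 60 * 60)}

ratio_5min = congestion_ratio(*read_and_clear(five_minutes))
ratio_5h = congestion_ratio(*read_and_clear(five_hours))
```

Under this assumption, 100 exceeds correspond to 100 of 3,000 samples in the 5-minute reading but only 100 of 180,000 samples in the 5-hour reading, a ratio 60 times smaller for the same count.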

Figure: Determining the time of congestion (example 1), Figure: Determining the time of congestion (example 2), and Figure: Determining the time of congestion (example 3) show how the ability to determine the time of congestion is reduced when the reading interval is too long. The same reading in the three examples can represent different congestion patterns that occur at different times between two consecutive reads. Neither the congestion pattern nor the exact time of congestion can be determined from the reading itself. The reading only indicates that congestion occurred x number of times between the two consecutive reads. In the three examples, an operator can determine that the link was congested 20% of the time during a one-day period, but cannot pinpoint when within that day the congestion occurred. To determine the time of the congestion more accurately, the operator must collect the information more frequently. For example, if the information is collected every 30 minutes, the operator can determine the part of the day during which congestion occurred to within 30 minutes.
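The benefit of more frequent collection can be sketched in Python as follows. The readings are invented for illustration (100 samples per 30-minute window is an arbitrary unit, and the 50% reporting limit is a hypothetical choice); the function name and parameters are not part of any product interface.

```python
from datetime import datetime, timedelta


def congested_windows(start, interval, readings, min_ratio):
    """Return (begin, end) pairs for read intervals whose exceed ratio
    is at least min_ratio. Each reading is an (exceeds, samples) pair
    obtained by one read-and-clear at the end of the interval."""
    windows = []
    for i, (exceeds, samples) in enumerate(readings):
        if samples and exceeds / samples >= min_ratio:
            begin = start + i * interval
            windows.append((begin, begin + interval))
    return windows


start = datetime(2024, 1, 1)          # arbitrary example day
interval = timedelta(minutes=30)      # collection interval

# 48 half-hour readings: quiet until 18:00, congested until 23:00, quiet again.
readings = [(0, 100)] * 36 + [(96, 100)] * 10 + [(0, 100)] * 2

# A single daily read would only reveal the aggregate ratio.
daily_ratio = sum(e for e, _ in readings) / sum(s for _, s in readings)

# Half-hourly reads localize the congestion to specific windows.
windows = congested_windows(start, interval, readings, min_ratio=0.5)
```

With these invented readings the daily aggregate is 20%, which on its own says nothing about timing, while the half-hourly readings place the congestion in the ten consecutive windows between 18:00 and 23:00.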

Figure: Determining the time of congestion (example 1)
Figure: Determining the time of congestion (example 2)
Figure: Determining the time of congestion (example 3)