This section provides information to configure QoS scheduler and port scheduler policies using the command line interface.
Virtual schedulers are created within the context of a scheduler policy that is used to define the hierarchy and parameters for each scheduler. A scheduler is defined in the context of a tier, which is used to place the scheduler within the hierarchy. Three tiers of virtual schedulers are supported. Root schedulers are defined without a parent scheduler, meaning they are not subject to obtaining bandwidth from a higher tier scheduler. A scheduler has the option of enforcing a maximum rate of operation for all child queues and schedulers associated with it.
Because a scheduler is designed to arbitrate bandwidth between many inputs, a metric must be assigned to each child queue or scheduler vying for transmit bandwidth. This metric indicates whether the child is to be scheduled in a strict or weighted fashion, and the level or weight the child has relative to other children.
In previous releases, HQoS root (top tier) schedulers always assumed that the configured rate was available, regardless of egress port level oversubscription and congestion. This resulted in the possibility that the aggregate bandwidth assigned to queues was not actually available at the port level. When the HQoS algorithm configures queues with more bandwidth than is available on an egress port, the actual bandwidth distribution to queues on the port is based solely on the action of the hardware scheduler. This can result in a forwarding rate at each queue that is very different from the desired rate.
The port-based scheduler feature was introduced to allow HQoS bandwidth allocation based on available bandwidth at the egress port level. The port-based scheduler works at the egress line rate of the port to which it is attached. Port-based scheduling bandwidth allocation automatically includes the Inter-Frame Gap (IFG) and preamble for packets forwarded on queues servicing egress Ethernet ports. However, on PoS and SDH based ports, the HDLC encapsulation overhead and other framing overhead per packet is not known by the system. Instead of automatically determining the encapsulation overhead for SDH or SONET queues, the system provides a configurable frame encapsulation efficiency parameter that allows the user to select the average encapsulation efficiency for all packets forwarded out the egress queue.
A special port scheduler policy can be configured to define the virtual scheduling behavior for an egress port. The port scheduler is a software-based state machine managing a bandwidth allocation algorithm that represents the scheduling hierarchy shown in Port Level Virtual Scheduler Bandwidth Allocation Based on Priority and CIR.
The first tier of the scheduling hierarchy manages the total frame based bandwidth that the port scheduler will allocate to the eight priority levels.
The second tier receives bandwidth from the first tier in two priorities, a “within-cir” loop and an “above-cir” loop. The second tier “within-cir” loop provides bandwidth to the third tier “within-cir” loops, one for each of the eight priority levels. The second tier “above-cir” loop provides bandwidth to the third tier “above-cir” loops for each of the eight priority levels.
The “within-cir” loop for each priority level on the third tier supports an optional rate limiter used to restrict the maximum amount of “within-cir” bandwidth the priority level can receive. A maximum priority level rate limit is also supported that restricts the total amount of bandwidth the level can receive for both “within-cir” and “above-cir”. The amount of bandwidth consumed by each priority level for “within-cir” and “above-cir” is predicated on the rate limits described and the ability for each child queue or scheduler attached to the priority level to use the bandwidth.
The priority 1 “above-cir” scheduling loop has a special two tier strict distribution function. The high priority level 1 “above-cir” distribution is weighted between all queues and schedulers attached to level 1 for “above-cir” bandwidth. The low priority distribution for level 1 “above-cir” is reserved for all orphaned queues and schedulers on the egress port. Orphans are queues and schedulers that are not explicitly or indirectly attached to the port scheduler through normal parenting conventions. By default, all orphans receive bandwidth after all parented queues and schedulers and are allowed to consume whatever bandwidth is remaining. This default behavior for orphans can be overridden on each port scheduler policy by defining explicit orphan port parent association parameters.
Ultimately, any bandwidth allocated by the port scheduler is given to a child queue. The bandwidth allocated to the queue is converted to a value for the queue’s PIR (maximum rate) setting. This way, the hardware schedulers operating at the egress port level will only schedule bandwidth for all queues on the port up to the limits prescribed by the virtual scheduling algorithm.
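The two-pass allocation described above can be illustrated with a minimal Python sketch. It is not the system's algorithm: the `Child` record and the `allocate` function are hypothetical names, level 8 is assumed to be served first (the text implies this but does not state it), children within a level share bandwidth in proportion to demand rather than by configured weight, and the per-level rate limiters and orphan handling are left out.

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    level: int        # above-cir port priority level, 1..8
    cir_level: int    # within-cir port priority level, 1..8 (0 = none)
    frame_cir: float  # frame-based CIR in kbps
    frame_pir: float  # frame-based PIR in kbps
    offered: float    # frame-based offered load in kbps


def allocate(port_rate, children):
    """Return {child name: granted kbps} after a within-cir and an above-cir pass."""
    remaining = float(port_rate)
    grant = {c.name: 0.0 for c in children}

    def serve(level_of, want_of):
        nonlocal remaining
        for level in range(8, 0, -1):  # assumption: level 8 is served first
            wants = {c.name: want_of(c) for c in children if level_of(c) == level}
            total = sum(wants.values())
            if total <= 0:
                continue
            # Simplification: split in proportion to demand when bandwidth is short.
            scale = min(1.0, remaining / total)
            for name, want in wants.items():
                grant[name] += want * scale
            remaining = max(0.0, remaining - total * scale)

    # Pass 1 ("within-cir"): each child may claim up to min(offered, CIR).
    serve(lambda c: c.cir_level, lambda c: min(c.offered, c.frame_cir))
    # Pass 2 ("above-cir"): each child may claim the rest, up to min(offered, PIR).
    serve(lambda c: c.level,
          lambda c: max(0.0, min(c.offered, c.frame_pir) - grant[c.name]))
    return grant


children = [
    Child("voice", level=8, cir_level=8, frame_cir=20, frame_pir=20, offered=20),
    Child("data-a", level=4, cir_level=4, frame_cir=30, frame_pir=100, offered=100),
    Child("data-b", level=4, cir_level=0, frame_cir=0, frame_pir=100, offered=100),
]
print(allocate(100, children))
```

In this example the voice child receives its full CIR first, data-a receives its CIR next, and the two data children then compete for the remaining bandwidth in the above-cir pass. Each grant would then be converted into the queue's operational PIR.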
The following lists the bandwidth allocation sequence for the port virtual scheduler:
When a queue is inactive or has a limited offered load that is below its fair share (fair share is based on the bandwidth allocation a queue would receive if it were registering adequate activity), its operational PIR must be set to some value to handle what would happen if the queue’s offered load increased prior to the next iteration of the port virtual scheduling algorithm. If an inactive queue’s PIR was set to zero (or near zero), the queue would throttle its traffic until the next algorithm iteration. If the operational PIR was set to its configured rate, the result could overrun the expected aggregate rate of the port scheduler.
To accommodate inactive queues, the system calculates a Minimum Information Rate (MIR) for each queue. To calculate each queue’s MIR, the system determines what that queue’s Fair Information Rate (FIR) would be if that queue had actually been active during the latest iteration of the virtual scheduling algorithm. For example, if three queues are active (1, 2, and 3) and two queues are inactive (4 and 5), the system first calculates the FIR for each active queue. Then it recalculates the FIR for queue 4 assuming queue 4 was active with queues 1, 2, and 3 and uses the result as the queue’s MIR. The same is done for queue 5 using queues 1, 2, 3, and 5. The MIR for each inactive queue is used as the operational PIR for each queue.
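The five-queue example can be worked through with a small Python sketch. The text does not specify how the FIR is computed, so the sketch assumes a weighted max-min share of a single rate, positive weights, and that an inactive queue is treated as if it offered its full administrative PIR. The names `fair_shares` and `mir` are hypothetical.

```python
def fair_shares(rate, demand, weight):
    """Weighted max-min share of `rate`; no queue receives more than its demand."""
    share = {q: 0.0 for q in demand}
    unsatisfied = {q for q in demand if demand[q] > 0}
    while unsatisfied and rate > 1e-9:
        total_w = sum(weight[q] for q in unsatisfied)
        # Queues whose demand fits inside their weighted share are fully served.
        done = {q for q in unsatisfied if demand[q] <= rate * weight[q] / total_w}
        if not done:
            for q in unsatisfied:
                share[q] = rate * weight[q] / total_w
            break
        for q in done:
            share[q] = demand[q]
            rate -= demand[q]
        unsatisfied -= done
    return share


def mir(rate, active_demand, weight, queue, admin_pir):
    """FIR the inactive `queue` would receive if it were active with the active queues."""
    trial = dict(active_demand)
    trial[queue] = admin_pir  # assumption: the queue is treated as offering its full PIR
    return fair_shares(rate, trial, weight)[queue]


weights = {q: 1 for q in ("q1", "q2", "q3", "q4", "q5")}
active = {"q1": 50, "q2": 50, "q3": 10}          # offered load of the active queues
print(fair_shares(100, active, weights))          # FIR of the active queues
print(mir(100, active, weights, "q4", admin_pir=100))
print(mir(100, active, weights, "q5", admin_pir=100))
```

Each inactive queue is evaluated on its own against the active set, matching the text: queue 4 is tried with queues 1, 2, and 3, and queue 5 is tried with queues 1, 2, and 3.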
The port-based egress scheduler can be used to allocate bandwidth to each service or subscriber or multi-service site associated with the port. While egress queues on the service can have a child association with a scheduler policy on the SAP or multi-service site, all queues must vie for bandwidth from an egress port. Two methods are supported to allocate bandwidth to each service or subscriber or multi-service site queue:
The service or subscriber or multi-service site scheduler to port scheduler association model allows multiple services, subscribers, or multi-service sites to have independent scheduler policy definitions while the independent schedulers receive bandwidth from the scheduler at the port level. By using two scheduler policies, available egress port bandwidth can be allocated fairly or unfairly depending on the desired behavior. Figure 21 graphically demonstrates this model.
Once a two scheduler policy model is defined, the bandwidth distribution hierarchy allocates the available port bandwidth to the port schedulers based on priority, weights, and rate limits. The service or subscriber or multi-service site level schedulers and the queues they service become an extension of this hierarchy.
Due to the nature of the two scheduler policy, bandwidth is allocated on a per-service or per subscriber or multi-service site basis as opposed to a per-class basis. A common use of the two policy model is for a carrier-of-carriers mode of business. In essence, the goal of a carrier is to provide segments of bandwidth to providers who purchase that bandwidth as services. While the carrier does not concern itself with the interior services of the provider, it does however care how congestion affects the bandwidth allocation to each provider’s service. As an added benefit, the two policy approach provides the carrier with the ability to preferentially allocate bandwidth within a service or subscriber or multi-service site context through the service or subscriber or multi-service site level policy without affecting the overall bandwidth allocation to each service or subscriber or multi-service site. Figure 22 shows a per-service bandwidth allocation using the two scheduler policy model. While the figure shows services grouped by scheduling priority, it is expected that many service models will place the services in a common port priority and use weights to provide a weighted distribution between the service instances. Higher weights provide for relatively higher amounts of bandwidth.
The second model of bandwidth allocation on an egress access port is to directly associate a service or subscriber or multi-service site queue to a port-level scheduler. This model allows the port scheduler hierarchy to allocate bandwidth on a per class or priority basis to each service or subscriber or multi-service site queue. This allows the provider to manage the available egress port bandwidth on a service tier basis ensuring that during egress port congestion, a deterministic behavior is possible from an aggregate perspective. While this provides an aggregate bandwidth allocation model, it does not inhibit per service or per subscriber or multi-service site queuing. Figure 23 demonstrates the single, port scheduler policy model.
Figure 23 also demonstrates the optional aggregate rate limiter at the SAP, subscriber, or multi-service site level. The aggregate rate limiter is used to define a maximum aggregate bandwidth at which the child queues can operate. While the port-level scheduler is allocating bandwidth to each child queue, the current sum of the bandwidth for the service or subscriber or multi-service site is monitored. Once the aggregate rate limit is reached, no more bandwidth is allocated to the children associated with the SAP, subscriber, or multi-service site. Aggregate rate limiting is restricted to the single scheduler policy model and is mutually exclusive with defining SAP, subscriber, or multi-service site scheduling policies.
The benefit of the single scheduler policy model is that the bandwidth is allocated per priority for all queues associated with the egress port. This allows a provider to preferentially allocate bandwidth to higher priority classes of service independent of service or subscriber or multi-service site instance. In many cases on the 7750 SR, a subscriber can purchase multiple services from a single site (VoIP, HSI, Video) and each service can have a higher premium value relative to other service types. If a subscriber has purchased a premium service class, that service class should get bandwidth before another subscriber’s best effort service class. When combined with the aggregate rate limit feature, the single port-level scheduler policy model provides a per-service instance or per-subscriber instance aggregate SLA and a class based port bandwidth allocation function.
A port-based bandwidth allocation mechanism must consider the effect that line encapsulation overhead plays relative to the bandwidth allocated per service or subscriber or multi-service site. The service or subscriber or multi-service site level bandwidth definition (at the queue level) operates on a packet accounting basis. For Ethernet, this includes the DLC header, the payload and the trailing CRC. This does not include the IFG or the preamble. This means that an Ethernet packet will consume 20 bytes more bandwidth on the wire than what the queue accounted for. When considering HDLC encoded PoS or SDH ports on the 7750 SR, the overhead is variable based on ‘7e’ insertions (and other TDM framing issues). The HDLC and SONET/SDH frame overhead is not included for queues forwarding on PoS and SDH links.
The port-based scheduler hierarchy must translate the frame based accounting (on-the-wire bandwidth allocation) it performs to the packet based accounting in the queues. When the port scheduler considers the maximum amount of bandwidth a queue should get, it must first determine how much bandwidth the queue can use. This is based on the offered load the queue is currently experiencing (how many octets are being offered to the queue). The offered load is compared to the queue’s configured CIR and PIR. The CIR value determines how much of the offered load should be considered in the “within-cir” bandwidth allocation pass. The PIR value determines how much of the remaining offered load (after “within-cir”) should be considered for the “above-cir” bandwidth allocation pass.
For Ethernet queues (queues associated with an egress Ethernet port), the packet to frame conversion is relatively easy. The system multiplies the number of offered packets by 20 bytes and adds the result to the offered octets (offeredPackets x 20 + offeredOctets = frameOfferedLoad). This frame-offered-load value represents the amount of line rate bandwidth the queue is requesting. The system computes the ratio of increase between the offered-load and frame-offered-load and calculates the current frame based CIR and PIR. The frame-CIR and frame-PIR values are used as the limiting values in the “within-cir” and “above-cir” port bandwidth distribution passes.
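The Ethernet conversion can be written out as a short Python sketch. The formula for the frame offered load comes from the text; scaling the CIR and PIR by the ratio of frame offered load to packet offered load is one reading of "ratio of increase" and should be treated as an assumption. The function name and argument names are hypothetical.

```python
ETH_OVERHEAD_BYTES = 20  # IFG plus preamble per packet, as stated in the text


def ethernet_frame_rates(offered_packets, offered_octets, cir, pir):
    """Return (frame_offered_load, frame_cir, frame_pir) for one Ethernet queue."""
    frame_offered = offered_packets * ETH_OVERHEAD_BYTES + offered_octets
    # With no offered load there is nothing to scale, so leave the rates unchanged.
    ratio = frame_offered / offered_octets if offered_octets else 1.0
    return frame_offered, cir * ratio, pir * ratio


# 1000 packets averaging 100 bytes each; CIR 10 Mb/s and PIR 50 Mb/s (packet based).
print(ethernet_frame_rates(1000, 100_000, cir=10.0, pir=50.0))
```

With 100-byte packets the ratio is 1.2, so the smaller the average packet, the larger the gap between the packet based and frame based rates.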
For PoS or SDH queues on the 7750 SR, the packet to frame conversion is more difficult to dynamically calculate due to the variable nature of HDLC encoding. Wherever a ‘7e’ bit or byte pattern appears in the data stream, the framer performing the HDLC encoding must place another ‘7e’ within the payload. Since this added HDLC encoding is unknown to the forwarding plane, the system allows for an encapsulation overhead parameter that can be provisioned on a per queue basis. This is provided on a per queue basis to allow for differences in the encapsulation behavior between service flows in different queues. The system multiplies the offered load of the queue by the encapsulation-overhead parameter and adds the result to the offered load of the queue (offeredOctets * configuredEncapsulationOverhead + offeredOctets = frameOfferedLoad). The frame-offered-load value is used by the egress PoS/SDH port scheduler in the same manner as the egress Ethernet port scheduler above.
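The PoS/SDH formula can be sketched the same way in Python. The sketch assumes the encapsulation overhead is expressed as a fraction of the offered load (for example, 0.05 for 5 percent); the text does not state the unit of the provisioned parameter, and the function name is hypothetical.

```python
def pos_frame_offered_load(offered_octets, encap_overhead):
    """offeredOctets * configuredEncapsulationOverhead + offeredOctets."""
    # encap_overhead is assumed to be a fraction, such as 0.05 for 5 percent.
    return offered_octets * encap_overhead + offered_octets


print(pos_frame_offered_load(100_000, 0.05))
```

Because the overhead is provisioned rather than measured, the result is only as accurate as the operator's estimate of the average encoding overhead for the traffic in that queue.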
From a provisioning perspective, queues and service level (and subscriber level) scheduler policies are always provisioned with packet-based parameters. The system will convert these values to frame-based on-the-wire values for the purpose of port bandwidth allocation. However, port-based scheduler policy scheduler maximum rates and CIR values are always interpreted as on-the-wire values and must be provisioned accordingly. Figure 24 and Figure 25 provide a logical view of bandwidth distribution from the port to the queue level and show the packet or frame-based provisioning at each step.
A port-parent command in the sap-egress and network-queue QoS policy queue context defines the direct child/parent association between an egress queue and a port scheduler priority level. The port-parent command is mutually exclusive with the already-existing parent command, which associates a queue with a scheduler at the SAP, subscriber, or multi-service site profile level. It is possible to mix local parented (parent to service or subscriber or multi-service site level scheduler) and port parented queues with schedulers on the same egress port.
The port-parent command only accepts a child/parent association to the eight priority levels on a port scheduler hierarchy. Similar to the local parent command, two associations are supported, one for “within-cir” bandwidth (cir-level) and a second one for “above-cir” bandwidth (level). The “within-cir” association is optional and can be disabled by using the default “within-cir” weight value of 0. In the event that a queue with a defined port parent is on a port without a port scheduler policy applied, that queue will be considered an orphaned queue. If a queue with a parent command is defined on a port and the named scheduler is not found due to a missing scheduler policy or a missing scheduler of that name, the queue will be considered orphaned as well.
A queue can be moved from a local (on the SAP, subscriber, or multi-service site profile) parent to a port parent priority level simply by executing the port-parent command. Once the port-parent command is executed, any local parent information for the queue is lost. The queue can also be moved back to a local parent at any time by executing the local parent command. Lastly, the local parent or port parent association can be removed at any time by using the no version of the appropriate parent command.
The port-parent command in the scheduler-policy scheduler context (at all tier levels) allows a scheduler to be associated with a port scheduler priority level. The port-parent command is mutually exclusive to the parent command for schedulers at tiers 2 and 3 within the scheduler policy. The port-parent command is the only parent command allowed for schedulers in tier 1.
The port-parent command only accepts a child/parent association to the eight priority levels on a port scheduler hierarchy. Similar to the normal local parent command, two associations are supported, one for “within-cir” bandwidth (cir-level) and a second one for “above-cir” bandwidth (level). The “within-cir” association is optional and can be disabled by using the default “within-cir” weight value of 0. In the event that a scheduler with a port parent defined is on a port without a port scheduler policy applied, that scheduler will be considered an orphaned scheduler.
A scheduler in tiers 2 and 3 can be moved from a local (within the policy) parent to a port parent priority level simply by executing the port-parent command. Once the port-parent command is executed, any local parent information for the scheduler is lost. The schedulers at tiers 2 and 3 can also be moved back to a local parent at any time by executing the local parent command. Lastly, the local parent or port parent association can be removed at any time by using the no version of the appropriate parent command. A scheduler in tier 1 can only be associated with a port parent and that port parent definition can be added or removed at any time.
Network queues support port scheduler parent priority-level associations. Using a port scheduler policy definition and mapping network queues to a port parent priority level, HQoS functionality is supported providing eight levels of strict priority and weights within the same priority. A network queue’s bandwidth is allocated using the “within-cir” and “above-cir” scheme normal for port schedulers.
Queue CIR and PIR percentages when port-based schedulers are in effect will be based on frame-offered-load calculations. Figure 26 demonstrates port-based virtual scheduling bandwidth distribution.
A network queue with a port parent association that exists on a port without a port scheduler policy defined is considered to be orphaned.
All queues and schedulers on a port that has a port-based scheduler policy configured will be subject to bandwidth allocation through the port-based schedulers. All queues and schedulers that are not configured with a scheduler parent are considered to be orphaned when port-based scheduling is in effect. This includes access and network queue schedulers at the SAP, multi-service site, subscriber and port level.
By default, orphaned queues and schedulers are allocated bandwidth after all queues and schedulers in the parented hierarchy have had bandwidth allocated “within-cir” and “above-cir”. In essence, an orphaned scheduler or queue can be considered as being foster parented by the port scheduler. Orphaned queues and schedulers have an inherent port scheduler association as shown below:
The above-CIR weight = 0 value is only used for orphaned queues and schedulers on port scheduler enabled egress ports. The system interprets weight=0 as priority level 0 and will only distribute bandwidth to level 0 once all other properly parented queues and schedulers have received bandwidth. Orphaned queues and schedulers all have equal priority to the remaining port bandwidth.
The default orphan behavior can be overridden for each port scheduler policy by using the orphan override command. The orphan override command accepts the same parameters as the port parent command. When the orphan override command is executed, all orphan queues and schedulers are treated in a similar fashion as other properly parented queues and schedulers based on the override parenting parameters.
It is expected that an orphan condition is not the desired state for a queue or scheduler and is the result of a temporary configuration change or configuration error.
A typical example of congestion monitoring on an Egress Port Scheduler (EPS) is when the EPS is configured within a Vport. A Vport is a construct in an HQoS hierarchy that can be used to control the bandwidth associated with an access network element (such as a GPON port, OLT, or DSLAM) or a retailer that has subscribers on an access node (among other retailers).
The example in Figure 27 shows Vports representing GPON ports on an OLT. For capacity planning purposes, it’s necessary to know if the GPON ports (Vports) are congested. Frequent and prolonged congestion on the Vport will prompt the operator to increase the offered bandwidth to its subscribers by allocating additional GPON ports and subsequently moving the subscribers to the newly allocated GPON ports.
There are no forward/drop counters directly associated with the EPS. Instead, the counters are maintained on a per queue level. Consequently, any indication of the congestion level on the EPS is derived from the queue counters that are associated with the given EPS.
The EPS congestion monitoring capabilities rely on a counter that records the number of times that the offered EPS load (measured at the queue level) crossed the predefined bandwidth threshold levels within a given, operator-defined timeframe. This counter is called the exceed counter. The rate comparison calculations (offered rate versus threshold) are executed several times per second, and the calculation interval cannot be influenced externally by the operator.
The monitoring threshold can be configured via CLI per aggregate EPS rate, EPS level or EPS group. The threshold is applicable to PIR rates.
To enable congestion monitoring on EPS, monitoring must be explicitly enabled under the Vport object itself or under the physical port when the EPS is attached directly to the physical port. In addition, the monitoring threshold within the EPS must be configured.
Two examples of congestion monitoring on an EPS that is configured under the Vport are shown in Figure 28 and Figure 29. Figure 29 shows more severe congestion than Figure 28. The EPS exceed counter (the number of dots above the threshold line) can be obtained via a CLI show command or read directly via MIBs.
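The exceed counter can be pictured with a minimal Python sketch that counts the samples falling above the threshold line, which is how the figures are described. The function name and the sample values are hypothetical; on the system the sampling is internal and its interval is not configurable.

```python
def exceed_counter(offered_rates, threshold):
    """Return (exceed_count, sample_count) for one observation window."""
    exceeds = sum(1 for rate in offered_rates if rate > threshold)
    return exceeds, len(offered_rates)


# Five hypothetical offered-rate samples in Mb/s against a 100 Mb/s threshold.
print(exceed_counter([80, 95, 101, 120, 99], threshold=100))
```

Reporting the exceed count together with the number of samples lets a reader express congestion as a fraction of the observation window.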
Once the exceed counter value is obtained, the counter should be cleared, which resets the exceed counter and number of samples to zero. This is because the longer the interval between a clear and a show or read, the more diluted the congestion information becomes. For example, 100 threshold exceeds within a 5 minute interval depicts a more accurate congestion picture compared to 100 threshold exceeds within a 5 hour interval.
The reduced ability to determine the time of congestion if the reading interval is too long is shown in Figure 30, Figure 31, and Figure 32. It can be seen that the same readings (in the 3 examples) can represent different congestion patterns that occur at different times between the two consecutive reads. The congestion pattern, or the exact time of congestion cannot be determined from the reading itself. The reading only indicates that the congestion occurred x number of times between the two consecutive readings. In the example shown in Figure 30, Figure 31, and Figure 32, an operator can decipher that the link was congested 20% of the time during a one day period without being able to pinpoint the exact time of congestion within the one day period. To determine the time of the congestion more accurately, the operator must collect the information more frequently. For example, if the information is collected every 30 minutes, then the operator can determine the part of the day during which congestion occurred within 30 minutes of accuracy.
Scalability and performance are driven by the number of entities for which congestion monitoring is enabled on each line card.
Each statistics gathering operation requires a show or read followed by a clear. The shorter the time between the two, the more accurate the information about the congestion state of the EPS will be.
If the clear operation is not executed after the show or read operation, the external statistics gathering entity (external server) must perform additional operations (such as subtracting the statistics of two consecutive reads) in order to obtain the delta between the two reads.
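A collector that does not clear the counters could derive the per-interval figures with a subtraction like the one in this Python sketch. The tuple layout and the function name are hypothetical; the sketch only assumes that both the exceed count and the number of samples are available from each read and that neither counter was cleared or wrapped between the reads.

```python
def exceed_ratio_between_reads(prev, curr):
    """prev and curr are (exceed_count, sample_count) from two consecutive reads."""
    d_exceed = curr[0] - prev[0]
    d_samples = curr[1] - prev[1]
    # No new samples means nothing can be said about this interval.
    return d_exceed / d_samples if d_samples > 0 else 0.0


# Hypothetical cumulative readings taken at the start and end of an interval.
print(exceed_ratio_between_reads((100, 1000), (150, 1500)))
```

The result is the fraction of samples in the interval that exceeded the threshold, which is the same figure a show followed by a clear would yield directly.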
The recommended minimum polling interval at a higher scale (high number of monitoring entities) is 15 minutes per monitoring entity.
If statistics are obtained via SNMP, the relevant MIB entries corresponding to the show command are:
Clearing of the statistics can also be performed through a common MIB entry, corresponding to a clear command: tmnxClearEntry.
The standard accounting mechanism uses ‘packet based’ rules that account for the DLC header, any existing tags, Ethernet payload and the 4 byte CRC. The Ethernet framing overhead, which includes the Inter-Frame Gap (IFG) and preamble (20 bytes total), is not included in packet based accounting. When frame based accounting is enabled, the 20 byte framing overhead is included in the queue CIR, PIR and scheduling operations, allowing the operations to take into consideration the on-wire bandwidth consumed by each Ethernet packet.
Since the native queue accounting functions (stats, CIR and PIR) are based on packet sizes and do not include Ethernet frame encapsulation overhead, the system must manage the conversion between packet based and frame based accounting. To accomplish this, the system requires that a queue operates in frame based accounting mode, and must be managed by a virtual scheduler policy or by a port virtual scheduler policy. Egress queues can use either port or service schedulers to accomplish frame based accounting, but ingress queues are limited to service based scheduling policies.
Turning on frame based accounting for a queue is accomplished through a frame based accounting command defined on the scheduling policy level associated with the queue, or through a queue frame based accounting parameter on the aggregate rate limit command associated with the queue’s SAP, subscriber, or multi-service site context.
To add frame overhead to the existing QoS Ethernet packet handling functions, the system uses its existing virtual scheduling capability. The system currently monitors each queue included in a virtual scheduler to determine its offered load. This offered load value is interpreted based on the queue’s defined CIR and PIR threshold rates to determine bandwidth offerings from the queue’s virtual scheduler. When egress port based virtual scheduling was added, frame based usage on the wire was added to allow for the port bandwidth to be accurately allocated to each child queue on the port.
The port based virtual scheduling mechanism takes the native packet based accounting results from the queue and adds 20 bytes to each packet to derive the queue’s frame based offered load. The ratio between the frame based offered load and the packet based offered load is then used to determine the effective frame based CIR and frame based PIR thresholds for the queue. Once the port virtual scheduler computes the amount of bandwidth allowed to the queue (in a frame based fashion), the bandwidth is converted back to a packet based value and used as the queue’s operational PIR. The queue’s native packet based mechanisms continue to function, but the maximum operational rate is governed by frame based decisions.
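The conversion from a frame based grant back to a packet based operational PIR can be sketched in Python as follows. The sketch assumes the conversion is a simple scaling by the ratio of packet based to frame based offered load, which mirrors the forward conversion described in the text; the function name is hypothetical.

```python
ETH_OVERHEAD_BYTES = 20  # per-packet framing overhead added by the port scheduler


def operational_pir(frame_grant, offered_packets, offered_octets):
    """Convert a frame based bandwidth grant into a packet based operational PIR."""
    frame_offered = offered_octets + offered_packets * ETH_OVERHEAD_BYTES
    if frame_offered == 0:
        return frame_grant  # assumption: with no offered load there is nothing to scale
    return frame_grant * offered_octets / frame_offered


# A 60 Mb/s frame based grant for a queue offering 1000 packets of 100 bytes each.
print(operational_pir(60.0, offered_packets=1000, offered_octets=100_000))
```

In this example the 60 Mb/s on-the-wire grant corresponds to a 50 Mb/s packet based operational PIR, because one sixth of the wire bandwidth is consumed by framing overhead.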
The frame based accounting feature extends this capability to allow the queue CIR and PIR thresholds to be defined as frame based values as opposed to packet based values. The queue continues to internally use its packet based mechanisms, but the provisioned frame based CIR and PIR values are continuously revalued based on the ratio between the calculated frame based offered load and actual packet based offered load. As a result, the queue’s operational packet based CIR and PIR are accurately modified during each iteration of the virtual scheduler to represent the provisioned frame based CIR and PIR.
Normally, a scheduler policy contains rates that indicate packet based accounting values. When the children queues associated with the policy are operating in frame based accounting mode, the parent schedulers must also be governed by frame based rates. Since either port based or service based virtual scheduling is required for queue frame based operation, enabling frame based operation is configured at either the scheduling policy or aggregate rate limit command level. All queues associated with the policy or the aggregate rate limit command will inherit the frame based accounting setting from the scheduling context.
When frame based accounting is enabled, the queue’s CIR and PIR settings are automatically interpreted as frame based values. If a SAP ingress QoS policy is applied with a queue PIR set to 100Mbps on two different SAPs, one associated with a policy with frame based accounting enabled and the other without frame based accounting enabled, the 100Mbps rate will be interpreted differently for each queue. The frame based accounting queue will add 20 bytes to each packet received by the queue and limit the rate based on the extra overhead. The packet based accounting queue will not add the 20 bytes per packet and thus allow more packets through per second.
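The difference between the two interpretations of the 100Mbps PIR can be shown with a short Python calculation. The 64-byte packet size is an arbitrary choice for illustration, and the function name is hypothetical.

```python
def packets_per_second(rate_bps, packet_bytes, frame_based):
    """Packets per second that fit in rate_bps under each accounting mode."""
    per_packet = packet_bytes + (20 if frame_based else 0)  # 20 bytes of IFG and preamble
    return rate_bps / (per_packet * 8)


rate = 100_000_000  # 100 Mb/s PIR
print(packets_per_second(rate, 64, frame_based=False))  # packet based accounting
print(packets_per_second(rate, 64, frame_based=True))   # frame based accounting
```

For 64-byte packets the packet based queue passes 195312.5 packets per second while the frame based queue passes roughly 148810. The gap narrows as the packet size grows because the fixed 20 bytes becomes a smaller share of each packet.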
Similarly, the rates defined in the scheduling policy with frame based accounting enabled will automatically be interpreted as frame based rates.
The port based scheduler aggregate rate limit command always interprets its configured rate limit value as a frame based rate. Setting the frame based accounting parameter on the aggregate rate limit command only affects the queues managed by the aggregate rate limit and converts them from packet based to frame based accounting mode.
The Hierarchical QoS (HQoS) mechanism is designed to enforce a user definable hierarchical shaping behavior on an arbitrary set of queues. The mechanism accomplishes this by monitoring the offered rate of each queue and using the result as an input to a virtual scheduler hierarchy defined by the user. The hierarchy consists of a number of virtual schedulers with configurable maximum rates per scheduler and attachment parameters between each. The parameters consist of weights and priority levels used to distribute the available bandwidth in a top down fashion through the hierarchy with the queues at the bottom. The resulting bandwidth provided to each member queue by the virtual schedulers is then configured as an operational PIR on the corresponding hardware queue, which prevents that queue from receiving more hardware scheduler bandwidth than dictated by the virtual scheduler.
The default behavior of HQoS is to only throttle active queues that currently exceed the bandwidth allocated to them by the virtual schedulers controlling them. A queue that is currently operating below its share of bandwidth is allowed an operational PIR greater than its current rate; this includes inactive queues. The operational PIR for a queue is capped by its admin PIR and set to the queue’s fair-share of the available bandwidth based on its priority level in the HQoS hierarchy and its weight within that priority level. The result is that between HQoS iterations, a queue below its share of bandwidth may burst to a higher rate and momentarily overrun the prescribed aggregate rate.
This default behavior works well in situations where an aggregate rate is being applied as a customer capping function to limit excessive use of network resources. However, in certain circumstances where an aggregate rate must be maintained due to limited downstream QoS abilities or due to downstream priority unaware aggregate policing, a more conservative behavior is required. The following functions can be used to control the unused bandwidth distribution:
The limit-unused-bandwidth (LUB) command protects against exceeding the aggregated bandwidth by adding a LUB second-pass to the HQoS function, which ensures that the aggregate fair-share bandwidth does not exceed the aggregate rate.
The command can be applied on any tier 1 scheduler within an egress scheduler policy or within any agg-rate node (except when using the HS-MDA) and affects all queues controlled by the object.
When LUB is enabled, the LUB second pass is performed as part of the HQoS algorithm. The order of operation between HQoS and LUB is as follows:
When LUB is enabled on a scheduler rate or aggregate rate, a LUB context is created containing the rate and the associated queues the rate controls. Because a queue may be controlled by multiple LUB enabled rates in a hierarchy, a queue may be associated with multiple LUB contexts.
LUB is applied to the contexts where it is enabled. LUB first considers how much of the aggregate rate is unused by the aggregate rates of each member queue after the first pass of the HQoS algorithm. This represents the current bandwidth that may be distributed between the member queues. LUB then distributes the available bandwidth to its member queues based on each queue’s LUB-weight. A queue’s LUB-weight is determined as follows:
The resulting operational PIRs are then set such that the scheduler or agg-rate rate is not exceeded. To achieve the best precision, queues must be configured to use adaptation-rule pir max cir max to prevent the actual queue rate from exceeding that determined by LUB.
Example
Consider a simple scenario with five egress SAP queues, all without rates configured, but with each queue parented to a different level in a parent scheduler that has a rate of 100 Mb/s (see Figure 33).
The resulting bandwidth distribution is shown in Figure 34, first when no traffic is being sent, with and without LUB applied, and then when 20 Mbps and 40 Mbps are sent on queues 3 and 5, respectively, again with and without LUB applied. As can be seen, the distribution of bandwidth in the case where traffic is sent and LUB is enabled is based upon the LUB-weights described above.
Every port scheduler supports eight strict priority levels with a two pass bandwidth allocation mechanism for each priority level. Priority levels 8 through 1 (level 8 is the highest priority) are available for port-parent association for child queues and schedulers. Each priority level supports a maximum rate limit parameter that limits the amount of bandwidth that may be allocated to that level. A CIR parameter is also supported that limits the amount of bandwidth allocated to the priority level for the child queue’s offered load, within their defined CIR. An overall maximum rate parameter defines the total bandwidth that will be allocated to all priority levels.
When a port scheduler is present on an egress port or channel, the system ensures that all queues and schedulers receive bandwidth from that scheduler to prevent free-running queues, which can cause the aggregate operational PIR of the port or channel to oversubscribe the bandwidth available. When the aggregate maximum rate for the queues on a port or channel operates above the available line rate, the forwarding ratio between the queues will be affected by the hardware schedulers on the port and may not reflect the scheduling defined on the port or intermediate schedulers. Queues and schedulers that are either explicitly attached to the port scheduler using the port-parent command or are attached to an intermediate scheduler hierarchy that is ultimately attached to the port scheduler are managed through the normal eight priority levels. Queues and schedulers that are not attached directly to the port scheduler and are not attached to an intermediate scheduler that itself is attached to the port scheduler are considered orphaned queues and, by default, are tied to priority 1 with a weight of 0. All weight 0 queues and schedulers at priority level 1 are allocated bandwidth after all other children, and each weight 0 child is given an equal share of the remaining bandwidth. This default orphan behavior may be overridden at the port scheduler policy by using the orphan-override command. The orphan-override command accepts the same parameters as the port-parent command. When the orphan-override command is executed, the parameters will be used as the port parent parameters for all orphans associated with a port using the port scheduler policy.
Another difference between the service level scheduler-policy and the port level port-scheduler-policy is in bandwidth allocation behavior. The port scheduler is designed to offer on-the-wire bandwidth. For Ethernet ports, this includes the IFG and the preamble for each frame and represents 20 bytes total per frame. The queues and intermediate service level schedulers (a service level scheduler is a scheduler instance at the SAP, multi-service site or subscriber or multi-service site profile level) operate based on packet overhead which does not include the IFG or preamble on Ethernet packets. In order for the port based virtual scheduling algorithm to function, it must convert the queue and service scheduler packet based required bandwidth and bandwidth limiters (CIR and rate PIR) to frame based values. This is accomplished by adding 20 bytes to each Ethernet frame offered at the queue level to calculate a frame based offered load. Then the algorithm calculates the ratio increase between the packet based offered load and the frame based offered load and uses this ratio to adapt the CIR and rate PIR values for the queue to frame-CIR and frame-PIR values. When a service level scheduler hierarchy is between the queues and the port based schedulers, the ratio between the average frame-offered-load and the average packet-offered-load is used to adapt the scheduler’s packet based CIR and rate PIR to frame based values. The frame based values are then used to distribute the port based bandwidth down to the queue level.
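The packet based to frame based conversion described above can be sketched as follows. This is an illustration under stated assumptions rather than the system's implementation: the function name, argument names, and the sample traffic figures are hypothetical, and only the 20-byte per-frame overhead and the use of the load ratio to scale CIR and PIR come from the text.

```python
# Ethernet IFG plus preamble per frame, as stated in the text.
ETHERNET_OVERHEAD_BYTES = 20


def to_frame_based(packet_count: int, packet_bytes_total: int,
                   cir_kbps: float, pir_kbps: float):
    """Scale packet based CIR/PIR to frame based values.

    The ratio is frame-offered-load / packet-offered-load, where the
    frame-offered-load adds 20 bytes for every offered frame.
    """
    if packet_bytes_total == 0:
        # No offered load: nothing to adapt (assumption for this sketch).
        return 1.0, cir_kbps, pir_kbps
    frame_bytes_total = packet_bytes_total + ETHERNET_OVERHEAD_BYTES * packet_count
    ratio = frame_bytes_total / packet_bytes_total
    return ratio, cir_kbps * ratio, pir_kbps * ratio


# Hypothetical sample: 1000 packets averaging 100 bytes each.
ratio, frame_cir, frame_pir = to_frame_based(1000, 100_000, 10_000, 50_000)
print(ratio, frame_cir, frame_pir)
```

Small packets produce a larger ratio than large packets, because the fixed 20 bytes represents a greater share of each frame.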
Packet over SONET (PoS) and SDH queues on the 7450 ESS and 7750 SR also operate based on packet sizes and do not include on-the-wire frame overhead. Unfortunately, the port based virtual scheduler algorithm does not have access to all the frame encapsulation overhead occurring at the framer level. Instead of automatically calculating the difference between packet-offered-load and frame-offered-load, the system relies on a provisioned value at the queue level. This avg-frame-overhead parameter is used to calculate the difference between the packet-offered-load and the frame-offered-load. This difference is added to the packet-offered-load to derive the frame-offered-load. Proper setting of this percentage value is required for proper bandwidth allocation between queues and service schedulers. If this value is not attainable, another approach is to artificially lower the maximum rate of the port scheduler to represent the average port framing overhead. This, in conjunction with a zero or low value for avg-frame-overhead, will ensure that the allocated queue bandwidth will control forwarding behavior instead of the low level hardware schedulers.
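As a rough illustration of how a provisioned avg-frame-overhead percentage yields a frame-offered-load, consider the sketch below. It assumes the percentage is applied to the packet-offered-load to obtain the difference; the function name and the sample values are hypothetical.

```python
def frame_offered_load(packet_offered_load_kbps: float,
                       avg_frame_overhead_pct: float) -> float:
    """Derive the frame-offered-load from the packet-offered-load.

    Assumption: the overhead (the "difference" in the text) is the
    packet-offered-load multiplied by the avg-frame-overhead percentage.
    """
    overhead_kbps = packet_offered_load_kbps * avg_frame_overhead_pct / 100.0
    return packet_offered_load_kbps + overhead_kbps


# Hypothetical queue offering 80000 kbps with a 5 percent average overhead.
print(frame_offered_load(80_000, 5.0))
```

With a value of zero the frame-offered-load equals the packet-offered-load, which is the case where lowering the port scheduler maximum rate is used to account for framing overhead instead.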
When all queues for a SAP, multi-service site or subscriber or multi-service site instance are attached directly to the port scheduler (using the port-parent command), it is possible to configure an agg-rate-limit for the queues. This is beneficial since the port scheduler does not provide a mechanism to enforce an aggregate SLA for a service or subscriber or multi-service site and the agg-rate-limit provides this ability. Queues may be provisioned directly on the port scheduler when it is desirable to manage the congestion at the egress port based on class priority instead of on a per service object basis.
The agg-rate-limit is not supported when one or more queues on the object are attached to an intermediate service scheduler. In this event, it is expected that the intermediate scheduler hierarchy will be used to enforce the aggregate SLA. Attaching an agg-rate-limit is mutually exclusive with attaching an egress scheduler policy at the SAP or subscriber or multi-service site profile level. Once an aggregate rate limit is in effect, a scheduler policy cannot be assigned. Once a scheduler policy is assigned on the egress side of a SAP or subscriber or multi-service site profile, an agg-rate-limit cannot be assigned.
Since the sap-egress policy defines a queue’s parent association before the policy is associated with a service SAP or subscriber or multi-service site profile, it is possible for the policy to either not define a port-parent association or define an intermediate scheduler parenting that does not exist. As stated above, queues in this state are considered to be orphaned and automatically attached to port scheduler priority 1. Orphaned queues are included in the aggregate rate limiting behavior on the SAP or subscriber or multi-service site instance they are created within.
A sap-egress QoS policy queue may be associated with either a port parent or an intermediate scheduler parent. The validity of the parent definition cannot be checked at the time it is provisioned since the application of the QoS policy is not known until it is applied to an egress SAP or subscriber or multi-service site profile. Port or intermediate parenting may be decided on a queue by queue basis, with some queues tied directly to the port scheduler priorities while other queues are attached to intermediate schedulers.
A network-queue policy only supports direct port parent priority association. Intermediate schedulers are not supported on network ports or channels.
Once a port scheduler has been associated with an egress port, it is possible to override the following parameters:
The orphan priority level (level 1) has no configuration parameters and cannot be overridden.
In order to represent a downstream network aggregation node in the local node scheduling hierarchy, a new scheduling node, referred to as a virtual port (vport in the CLI), has been introduced. The vport operates exactly like a port scheduler except that multiple vport objects can be configured on the egress context of an Ethernet port.
This feature applies to the 7450 ESS and 7750 SR only.
Figure 35 illustrates the use of the vport on an Ethernet port of a Broadband Network Gateway (BNG). In this case, the vport represents a specific downstream DSLAM.
The user adds a vport to an Ethernet port using the following command:
The vport is always configured at the port level even when a port is a member of a LAG. The vport name is local to the port it is applied to but must be the same for all member ports of a LAG. However, it does not need to be unique globally on a chassis.
The user applies a port scheduler policy to a vport using the following command:
A Vport cannot be parented to the port scheduler when it is using a port scheduler policy itself. It is thus important that the user ensure that the sum of the max-rate parameter values in the port scheduler policies of all Vport instances on a given egress Ethernet port does not oversubscribe the port’s rate. If it does, the scheduling behavior degenerates to that of the hardware scheduler on that port. A Vport which uses an agg-rate or a scheduler-policy can be parented to a port scheduler. This is explained in Section Applying Aggregate Rate Limit to a VPORT. The application of the agg-rate rate, port-scheduler-policy and scheduler-policy commands under a VPORT is mutually exclusive.
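The oversubscription condition described above amounts to a simple sum check, which an operator might run against planned values before applying them. The sketch below is only an illustration: the function name, the Vport names, and the rates are hypothetical and do not correspond to any CLI or system API.

```python
def vports_oversubscribe(port_rate_kbps: int, vport_max_rates_kbps: dict) -> bool:
    """Return True if the summed Vport max-rates exceed the port rate."""
    return sum(vport_max_rates_kbps.values()) > port_rate_kbps


# Hypothetical plan: a 1 Gbps Ethernet port with three Vports.
planned = {"dslam-1": 400_000, "dslam-2": 400_000, "dslam-3": 300_000}
print(vports_oversubscribe(1_000_000, planned))
```

When the check returns True, the text above indicates that scheduling on the port falls back to the behavior of the hardware scheduler.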
Each subscriber host queue is port parented to the Vport which corresponds to the destination DSLAM using the existing port-parent command:
This command can parent the queue to either a port or to a Vport. These operations are mutually exclusive in CLI as explained above. When parenting to a Vport, the parent Vport for a subscriber host queue is not explicitly indicated in the above command. It is determined indirectly. The determination of the parent Vport for a given subscriber host queue is described in the 7750 SR OS Triple Play Guide.
Subscriber host queues, SLA profile schedulers, subscriber profile scheduler and PW SAPs (in IES or VPRN services) can be parented to a VPORT.
The user can apply an aggregate rate limit to the Vport and apply a port scheduler policy to the port.
This model allows the user to over subscribe the Ethernet port. The application of the agg-rate option is mutually exclusive with the application of a port scheduler policy, or a scheduler policy to a VPORT.
When using this model, a subscriber host queue with the port-parent option enabled is scheduled within the context of the port’s port scheduler policy. More details are provided in the 7750 SR OS Triple Play Guide.
The user can apply a scheduler policy to the VPORT. This allows scheduling control of subscriber tier 1 schedulers in a scheduler policy applied to the egress of a subscriber or SLA profile, or to a PW SAP in an IES or VPRN service. This feature applies only to the 7450 ESS and 7750 SR.
The advantage of using a scheduler policy under a VPORT, compared to the use of a port scheduler (with or without an agg-rate rate), is that it allows a port parent to be configured at the VPORT level as well as allowing the user to oversubscribe the Ethernet port.
Bandwidth distribution from an egress port scheduler to a VPORT configured with a scheduler policy can be performed based on the level/cir-level and weight/cir-weight configured under the scheduler’s port parent. This allows multiple VPORTs, for example representing different DSLAMs, to share the port bandwidth capacity in a flexible way that is under the control of the user.
The configuration of a scheduler policy under a VPORT is mutually exclusive with the configuration of a port scheduler policy or an aggregate rate limit.
A scheduler policy is configured under a VPORT as follows:
When using this model, a tier 1 scheduler in a scheduling policy applied to a subscriber profile or SLA profiles must be configured as follows:
If the VPORT exists, but the port does not have a port scheduler policy applied, then its schedulers will be orphaned and no port level QoS control can be enforced.
The following show/clear commands are available related to the VPORT scheduler:
HQoS adjustment and host tracking are not supported on schedulers that are configured in a scheduler policy on a VPORT, so the configuration of a scheduler policy under a VPORT is mutually exclusive with the configuration of the egress-rate-modify parameter.
ESM over MPLS pseudowires are not supported when a scheduler policy is configured on a VPORT.
The existing port scheduler policy defines a set of eight priority levels with no ability to group levels within a single priority. In order to allow for the application of a scheduling weight to groups of queues competing at the same priority level of the port scheduler policy applied to the vport, or to the Ethernet port, a new group object is defined under the port scheduler policy:
Up to eight groups can be defined within each port scheduler policy. One or more levels can map to the same group. A group has a rate and optionally a cir-rate and inherits the highest scheduling priority of its member levels. For example, the scheduler group for the 7450 ESS and 7750 SR shown in the vport in Figure 35 consists of level priority 3 and level priority 4. It thus inherits priority 4 when competing for bandwidth with the standalone priority levels 8, 7, and 5.
In essence, a group receives bandwidth from the port or from the vport and distributes it within the member levels of the group according to the weight of each level within the group. Under congestion, each priority level competes for bandwidth within the group based on its weight. If there is no congestion, a priority level can achieve up to its rate (cir-rate) worth of bandwidth.
The mapping of a level to a group is performed as follows:
The CLI enforces that the levels mapped to a group are contiguous. In other words, a user cannot add a priority level to a group unless the resulting set of priority levels is contiguous.
When a level is not explicitly mapped to any group, it maps directly to the root of the port scheduler at its own priority, as in the existing behavior.
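The two group rules described in this section, contiguous member levels and inheritance of the highest member priority, can be expressed compactly. The sketch below is illustrative only; the function names are hypothetical and the code models the rules as stated in the text rather than the CLI's actual validation.

```python
def is_contiguous(levels: set) -> bool:
    """Return True if the priority levels form an unbroken range."""
    return bool(levels) and max(levels) - min(levels) + 1 == len(levels)


def group_priority(levels: set) -> int:
    """Return the priority a group inherits from its member levels.

    Raises ValueError if the member levels are not contiguous, mirroring
    the restriction the CLI is described as enforcing.
    """
    if not is_contiguous(levels):
        raise ValueError(f"levels {sorted(levels)} are not contiguous")
    return max(levels)


# Example from the text: a group made of priority levels 3 and 4.
print(group_priority({3, 4}))      # the group competes at priority 4
print(is_contiguous({3, 4, 6}))    # adding level 6 would be rejected
```

In the example from the text, the group then competes at priority 4 against the standalone levels 8, 7, and 5.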
A basic QoS scheduler policy must conform to the following:
A basic QoS port scheduler policy must conform to the following:
Configuring and applying QoS policies is optional. If no QoS policy is explicitly applied to a SAP or IP interface, a default QoS policy is applied.
To create a scheduler policy, define the following:
The following displays a scheduler policy configuration:
Apply scheduler policies to the following entities:
Use the following CLI syntax to associate a scheduler policy to a customer’s multiservice site:
Use the following CLI syntax to apply QoS policies to ingress and/or egress Epipe SAPs:
The following output displays an Epipe service configuration with SAP scheduler policy SLA2 applied to the SAP ingress and egress.
Use the following CLI syntax to apply scheduler policies to ingress and/or egress IES SAPs:
The following output displays an IES service configuration with scheduler policy SLA2 applied to the SAP ingress and egress.
Use the following CLI syntax to apply scheduler policies to ingress and/or egress VPLS SAPs:
The following output displays a VPLS service configuration with scheduler policy SLA2 applied to the SAP ingress and egress.
Use the following CLI syntax to apply scheduler policies to ingress and/or egress VPRN SAPs on the 7750 SR and 7950 XRS:
The following output displays a VPRN service configuration with the scheduler policy SLA2 applied to the SAP ingress and egress.
Configuring and applying QoS port scheduler policies is optional. If no QoS port scheduler policy is explicitly applied to a SAP or IP interface, a default QoS policy is applied.
To create a port scheduler policy, define the following:
Use the following CLI syntax to create a QoS port scheduler policy.
The create keyword is included in the command syntax upon creation of a policy.
The following displays a scheduler policy configuration example:
The port-parent command defines a child/parent association between an egress queue and a port based scheduler or between an intermediate service scheduler and a port based scheduler. The command may be issued in three distinct contexts: sap-egress>queue queue-id, network-queue>queue queue-id, and scheduler-policy>scheduler scheduler-name. The port-parent command allows for a set of within-cir and above-cir parameters that define the port priority levels and weights for the queue or scheduler. If the port-parent command is executed without any parameters, the default parameters are assumed.
The within-cir parameters define which port priority level the queue or scheduler should be associated with when receiving bandwidth for the queue’s or scheduler’s within-cir offered load. The within-cir offered load is the amount of bandwidth the queue or scheduler could use that is equal to or less than its defined or summed CIR value. The summed value is only valid on schedulers and is the sum of the within-cir offered loads of the children attached to the scheduler. The parameters that control within-cir bandwidth allocation are the port-parent command’s cir-level and cir-weight keywords. The cir-level keyword defines the port priority level that the scheduler or queue uses to receive bandwidth for its within-cir offered load. The cir-weight is used when multiple queues or schedulers exist at the same port priority level for within-cir bandwidth. The weight value defines the relative ratio that is used to distribute bandwidth at the priority level when more within-cir offered load exists than the port priority level has bandwidth.
A cir-weight equal to zero (the default value) has special meaning and informs the system that the queue or scheduler does not receive bandwidth from the within-cir distribution. Instead all bandwidth for the queue or scheduler must be allocated in the port scheduler’s above-cir pass.
The above-cir parameters define which port priority level the queue or scheduler should be associated with when receiving bandwidth for the queue’s or scheduler’s above-cir offered load. The above-cir offered load is the amount of bandwidth the queue or scheduler could use that is equal to or less than its defined PIR value (based on the queue’s or scheduler’s rate command) less any bandwidth that was given to the queue or scheduler during the within-cir scheduler pass. The parameters that control above-cir bandwidth allocation are the port-parent command’s level and weight keywords. The level keyword defines the port priority level that the scheduler or queue uses to receive bandwidth for its above-cir offered load. The weight is used when multiple queues or schedulers exist at the same port priority level for above-cir bandwidth. The weight value defines the relative ratio that is used to distribute bandwidth at the priority level when more above-cir offered load exists than the port priority level has bandwidth.
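To make the role of the weight concrete, the sketch below distributes the bandwidth available at one priority level among children in proportion to their weights, never granting a child more than its offered load. It is a generic weighted fair-share calculation and not the port scheduler's actual algorithm; the function name and sample values are hypothetical, and children with a weight of 0 are simply skipped here.

```python
def distribute_by_weight(available: float, children: dict) -> dict:
    """Split bandwidth among children by relative weight.

    children maps a name to (offered_load, weight). Bandwidth a child
    cannot use is redistributed among the children that still want more.
    """
    alloc = {name: 0.0 for name in children}
    active = {n for n, (load, weight) in children.items() if load > 0 and weight > 0}
    remaining = float(available)
    while active and remaining > 1e-9:
        total_weight = sum(children[n][1] for n in active)
        satisfied = set()
        spent = 0.0
        for name in active:
            load, weight = children[name]
            share = remaining * weight / total_weight
            grant = min(share, load - alloc[name])
            alloc[name] += grant
            spent += grant
            if load - alloc[name] <= 1e-9:
                satisfied.add(name)
        remaining -= spent
        if not satisfied:
            break  # everyone took a full share; nothing left to redistribute
        active -= satisfied
    return alloc


# Congested level: 100 units available, weights 3 and 1, both want 100.
congested = distribute_by_weight(100, {"q1": (100, 3), "q2": (100, 1)})
# One child wants little: its unused share goes to the other child.
uneven = distribute_by_weight(100, {"q1": (20, 1), "q2": (200, 1)})
print(congested, uneven)
```

When the total offered load fits within the available bandwidth, every child simply receives its offered load and the weights have no effect.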
The following output displays a sample configuration and explanation with and without dist-lag-rate-shared.
Before dist-lag-rate-shared is enabled in the port-scheduler-policy psp, the max-rate achieved is twice 413202 kbps (826404 kbps, approximately 826 Mbps). This is because the LAG has members from two different cards.
Two port-scheduler-instances are created, one on each card, each with a max-rate of 413202 kbps. This can be confirmed using the following show output.
Once dist-lag-rate-shared is enabled in port-scheduler-policy, this max-rate is enforced across all members of the LAG.
The following output shows dist-lag-rate-shared enabled.
If one of the member links of the LAG goes down, then the max-rate is divided among the remaining LAG members.
Card 2 is assigned 137734 (1/3 of max-rate 413202)
Card 3 is assigned 275468 (2/3 of max-rate 413202)
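The per-card figures above are consistent with dividing the max-rate in proportion to the number of active LAG member links on each card. The sketch below reproduces that arithmetic under the assumption that one active link remains on card 2 and two on card 3; the function name and the link counts are inferred for illustration and are not taken from command output.

```python
def split_max_rate(max_rate_kbps: int, active_links_per_card: dict) -> dict:
    """Divide a shared max-rate across cards by active member link count."""
    total_links = sum(active_links_per_card.values())
    if total_links == 0:
        return {card: 0 for card in active_links_per_card}
    return {
        card: max_rate_kbps * links // total_links
        for card, links in active_links_per_card.items()
    }


# Assumed topology after one link fails: 1 link on card 2, 2 links on card 3.
shares = split_max_rate(413202, {2: 1, 3: 2})
print(shares)
```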
The following output shows the max-rate percent value.
With max-rate percent, the max-rate is capped to the percent of the active LAG capacity.
When max-rate is configured as a percentage, dist-lag-rate-shared is ignored.
The group rate, level PIR and CIR rate can be entered as a percentage.
Port scheduler-Overrides
Both max-rate and level can be overridden if they are of the same type as in the policy being overridden.
The following are additions to the show command output:
Dist Lag Rate, Lvl and Group PIR and Cir Percent rates
This section discusses the following service management tasks:
There are no scheduler or port-scheduler policies associated with customer or service entities. Removing a scheduler or port-scheduler policy from a multi-service customer site causes the created schedulers to be removed which makes them unavailable for the ingress SAP queues associated with the customer site. Queues that lose their parent scheduler association are deemed to be orphaned and are no longer subject to a virtual scheduler. The SAPs that have ingress queues that rely on the schedulers enter into an orphaned state on one or more queues.
A QoS scheduler policy cannot be deleted until it is removed from all customer multi-service sites or service SAPs where it is applied.
The following syntax and example apply to the 7750 SR and 7950 XRS.
To delete a scheduler policy, enter the following commands:
To delete a port scheduler policy, enter the following commands:
You can copy an existing QoS policy, rename it with a new QoS policy value, or overwrite an existing policy. The overwrite option must be specified or an error occurs if the destination policy exists.
You can change existing policies and entries in the CLI. The changes are applied immediately to all customer multi-service sites and service SAPs where the policy is applied. To prevent configuration errors, use the copy command to make a duplicate of the original policy in a work area, make the edits, and then overwrite the original policy.