This chapter provides information about the Operations, Administration and Management (OAM) and Service Assurance Agent (SAA) commands available in the CLI for troubleshooting services.
Delivery of services requires that a number of operations occur correctly and at different levels in the service delivery model. For example, operations such as the association of packets to a service must be performed correctly in the forwarding plane for the service to function correctly. To verify that a service is operational, a set of in-band, packet-based Operation, Administration, and Maintenance (OAM) tools is required, with the ability to test each of the individual packet operations.
For in-band testing, the OAM packets closely resemble customer packets to effectively test the customer forwarding path, but they are distinguishable from customer packets so they are kept within the service provider network and not forwarded to the customer.
The suite of OAM diagnostics supplements the basic IP ping and traceroute operations with diagnostics specialized for the different levels in the service delivery model. There are diagnostics for services.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
The following guidelines and restrictions apply:
The router LSP diagnostics are implementations of LSP ping and LSP trace based on RFC 4379, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures. LSP ping provides a mechanism to detect data plane failures in MPLS LSPs. LSP ping and LSP trace are modeled after the ICMP echo request/reply used by ping and trace to detect and localize faults in IP networks.
For a specific LDP FEC, RSVP P2P LSP, or BGP IPv4 label route, LSP ping verifies whether the packet reaches the egress label edge router (LER), while in LSP trace mode, the packet is sent to the control plane of each transit label switched router (LSR), which performs various checks to determine whether it is actually a transit LSR for the path.
The downstream mapping TLV is used in lsp-ping and lsp-trace to provide a mechanism for the sender and responder nodes to exchange and validate interface and label stack information for each downstream hop in the path of an LDP FEC or an RSVP LSP.
Two downstream mapping TLVs are supported: the original Downstream Mapping (DSMAP) TLV defined in RFC 4379, and the newer Downstream Detailed Mapping (DDMAP) TLV defined in RFC 6424.
When the responder node has multiple equal cost next-hops for an LDP FEC prefix, the downstream mapping TLV can further be used to exercise a specific path of the ECMP set using the path-destination option. The behavior in this case is described in the following ECMP subsection.
This feature adds support for the target FEC stack TLV of type BGP Labeled IPv4 /32 Prefix, as defined in RFC 4379.
The new TLV structure is shown in the following figure.
The user issues an LSP ping using the existing CLI command and specifying the new type of prefix:
oam lsp-ping bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in|out}]] [size octets] [ttl label-ttl] [send-count send-count] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]
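For example, the following hypothetical invocations (the prefix and path-destination addresses are placeholder values, not taken from this guide) ping a BGP label route and then exercise one specific ECMP path:

oam lsp-ping bgp-label prefix 10.20.1.6/32 detail

oam lsp-ping bgp-label prefix 10.20.1.6/32 path-destination 127.1.1.1 detail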
The path-destination option is used for exercising specific ECMP paths in the network when the LSR performs hashing on the MPLS packet.
Similarly, the user issues an LSP trace using the following command:
oam lsp-trace bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in|out}]] [max-fail no-response-count] [probe-count probes-per-hop] [size octets] [min-ttl min-label-ttl] [max-ttl max-label-ttl] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]
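For example, the following hypothetical invocation (placeholder prefix and TTL values) traces the same BGP label route hop by hop:

oam lsp-trace bgp-label prefix 10.20.1.6/32 min-ttl 1 max-ttl 10 detail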
The following are the procedures for sending and responding to an LSP ping or LSP trace packet. These procedures are valid when the downstream mapping is set to the DSMAP TLV. The detailed procedures with the DDMAP TLV are presented in Using DDMAP TLV in LSP stitching and LSP hierarchy:
Note: Only BGP label IPv4 /32 prefixes are supported because these are usable as tunnels on nodes. BGP label IPv6 /128 prefixes are not currently usable as tunnels on a node and are not supported in LSP ping or trace. |
Note: The following restrictions apply to this section. |
When the responder node has multiple equal cost next-hops for an LDP FEC or a BGP label IPv4 prefix, it replies in the DSMAP TLV with the downstream information of the outgoing interface that is part of the ECMP next-hop set for the prefix.
However, when a BGP label route is resolved to an LDP FEC (of the BGP next-hop of the BGP label route), ECMP can exist at both the BGP and LDP levels. The following next-hop selection is performed in this case:
In the following description of LSP ping and LSP trace behavior, generic references are made to specific terms as follows:
LSP ping operates over a network using unnumbered links without any changes. LSP trace is modified so that the unnumbered interface is correctly encoded in the downstream mapping (DSMAP/DDMAP) TLV.
In an RSVP P2P LSP, the upstream LSR encodes the downstream router ID in the Downstream IP Address field and the local unnumbered interface index value in the Downstream Interface Address field of the DSMAP/DDMAP TLV as per RFC 4379. Both values are taken from the TE database.
In an LDP unicast FEC, the interface index assigned by the peer LSR is not readily available to the LDP control plane. In this case, the alternative method described in RFC 4379 is used. The upstream LSR sets the Address Type to IPv4 Unnumbered, the Downstream IP Address to a value of 127.0.0.1, and the interface index is set to 0. If an LSR receives an echo-request packet with this encoding in the DSMAP/DDMAP TLV, it will bypass interface verification but continue with label validation.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
The DDMAP TLV provides exactly the same features as the existing DSMAP TLV, plus enhancements that allow tracing of the details of LSP stitching and LSP hierarchy. The latter is achieved using a new sub-TLV of the DDMAP TLV called the FEC stack change sub-TLV. Figure 8 and Figure 9 show the structures of these two objects, as defined in RFC 6424.
The DDMAP TLV format is derived from the DSMAP TLV format. The key change is that variable length and optional fields have been converted into sub-TLVs. The fields have the same use and meaning as in RFC 4379.
The operation type specifies the action associated with the FEC stack change. The following operation types are defined.
More details on the processing of the fields of the FEC stack change sub-TLV are provided later in this section.
The following shows the command usage to configure which downstream mapping TLV to use globally on a system.
configure test-oam mpls-echo-request-downstream-map {dsmap | ddmap}
This command specifies which format of the downstream mapping TLV to use in all LSP trace packets and LDP tree trace packets originated on this node. The Downstream Mapping (DSMAP) TLV is the original format in RFC 4379 and is the default value. The Downstream Detailed Mapping (DDMAP) TLV is the new enhanced format specified in RFC 6424.
This command applies to LSP trace of an RSVP P2P LSP, a BGP IPv4 Label Route, or LDP unicast FEC, and to LDP tree trace of a unicast LDP FEC. It does not apply to LSP trace of an RSVP P2MP LSP which always uses the DDMAP TLV.
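For example, the following command (shown as a hedged illustration of the syntax above) selects the DDMAP TLV for all LSP trace and LDP tree trace tests subsequently originated on the node:

configure test-oam mpls-echo-request-downstream-map ddmap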
The global DSMAP/DDMAP setting impacts the behavior of both OAM LSP trace packets and SAA test packets of type lsp-trace and is used by the sender node when one of the following events occurs:
A consequence of the preceding rules is that a change to the value of the mpls-echo-request-downstream-map option does not affect the value inserted in the downstream mapping TLV of existing tests.
The following are the details of the processing of the new DDMAP TLV:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
In addition to performing the same features as the DSMAP TLV, the new DDMAP TLV addresses the following scenarios:
To correctly check a target FEC which is stitched to another FEC (stitching FEC) of the same or a different type, or which is tunneled over another FEC (tunneling FEC), it is necessary for the responding nodes to provide details about the FEC manipulation back to the sender node. This is achieved via the use of the new FEC stack change sub-TLV in the Downstream Detailed Mapping TLV (DDMAP) defined in RFC 6424.
When the user configures the use of the DDMAP TLV on a trace for an LSP that does not undergo stitching or tunneling operation in the network, the procedures at the sender and responder nodes are the same as in the case of the existing DSMAP TLV.
This feature however introduces changes to the target FEC stack validation procedures at the sender and responder nodes in the case of LSP stitching and LSP hierarchy. These changes pertain to the processing of the new FEC stack change sub-TLV in the new DDMAP TLV and the new return code 15 Label switched with FEC change. The following is a description of the main changes which are a superset of the rules described in Section 4 of RFC 6424 to allow greater scope of interoperability with other vendor implementations.
Note the following limitation when a BGP IPv4 label route is resolved to an LDP FEC, which is itself resolved to an RSVP LSP, all on the same node. This two-level LSP hierarchy is not supported as a feature on SR OS, but the user is not prevented from configuring it. In that case, user and OAM packets are forwarded by the sender node using two labels (T-LDP and BGP). The LSP trace fails on the downstream node with return code 1, Malformed echo request received, because there is no label entry for the RSVP label.
Note: This feature is supported only on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
MPLS OAM supports segment routing extensions to lsp-ping and lsp-trace as specified in draft-ietf-mpls-spring-lsp-ping.
When the data plane uses MPLS encapsulation, MPLS OAM tools such as lsp-ping and lsp-trace can be used to check connectivity and trace the path to any midpoint or endpoint of an SR-ISIS or SR-OSPF shortest path tunnel.
The CLI options for lsp-ping and lsp-trace are under OAM and SAA for SR-ISIS and SR-OSPF node SID tunnels.
This section describes how MPLS OAM models the SR tunnel types.
An SR shortest path tunnel, SR-ISIS or SR-OSPF tunnel, uses a single FEC element in the target FEC stack TLV. The FEC corresponds to the prefix of the node SID in a specific IGP instance.
The following figure shows the format of the IPv4 IGP-prefix segment ID.
In this format, the fields are as follows:
Both lsp-ping and lsp-trace apply to the following contexts:
The following operating guidelines apply to lsp-ping and lsp-trace.
The following figure shows a sample topology for an lsp-ping and lsp-trace for SR-ISIS node SID tunnels.
Given this topology, the following output is an example of LSP-PING on DUT-A for target node SID on DUT-F.
The following output is an example of LSP-TRACE on DUT-A for target node SID on DUT-F (DSMAP TLV):
The following output is an example of LSP-TRACE on DUT-A for target node SID on DUT-F (DDMAP TLV).
The following operating guidelines apply to lsp-ping and lsp-trace:
The following is an output example of the lsp-trace command of the DDMAP TLV for LDP-to-SR direction (symmetric topology LDP-SR-LDP).
The following output is an example of the lsp-trace command of the DDMAP TLV for SR-to-LDP direction (symmetric topology LDP-SR-LDP).
The 7210 SAS enhances lsp-ping and lsp-trace of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel or an SR-OSPF IPv4 tunnel. The 7210 SAS enhancement reports the full set of ECMP next-hops for the transport tunnel at both ingress PE and at the ABR or ASBR. The list of downstream next-hops is reported in the DSMAP or DDMAP TLV.
If an lsp-trace of the BGP IPv4 LSP is initiated with the path-destination option specified, the CPM hash code at the responder node selects the outgoing interface to return in the DSMAP or DDMAP TLV. The decision is based on the modulo operation of the hash value on the label stack or the IP headers (where the DST IP is replaced by the specific 127/8 prefix address in the multipath type 8 field of the DSMAP or DDMAP) of the echo request message and the number of outgoing interfaces in the ECMP set.
The following figure shows a sample topology used in the subsequent BGP over SR-OSPF and BGP over SR-ISIS examples.
The following outputs are examples of the lsp-trace command for a hierarchical tunnel consisting of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel or an SR-OSPF IPv4 tunnel.
The following output is an example of BGP over SR-OSPF.
The following output is an example of BGP over SR-ISIS.
Assuming the topology in the following figure includes an eBGP peering between nodes B and C, the BGP IPv4 LSP spans the AS boundary and resolves to an SR-ISIS tunnel within each AS.
The following output is an example of BGP over SR-ISIS using inter-AS option C.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
The 7210 SAS SDP diagnostics are SDP ping and SDP MTU path discovery.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
SDP ping performs in-band unidirectional or round-trip connectivity tests on SDPs. The SDP ping OAM packets are sent in-band, in the tunnel encapsulation, so they follow the same path as traffic within the service. The SDP ping response can be received out-of-band in the control plane, or in-band using the data plane for a round-trip test.
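For example, the following hypothetical invocations (the SDP IDs are placeholders, and the keyword names should be confirmed against the command reference for your release) run a unidirectional test and a round-trip test, respectively; the round-trip form specifies the far-end SDP ID with resp-sdp:

oam sdp-ping 10

oam sdp-ping 10 resp-sdp 20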
For a uni-directional test, SDP ping tests:
For a round-trip test, SDP ping uses a local egress SDP ID and an expected remote SDP ID. Since SDPs are uni-directional tunnels, the remote SDP ID must be specified and must exist as a configured SDP ID on the far-end 7210 SAS. SDP round trip testing is an extension of SDP connectivity testing with the additional ability to test:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
In a large network, network devices can support a variety of packet sizes that are transmitted across their interfaces. This capability is referred to as the Maximum Transmission Unit (MTU) of network interfaces. It is important to understand the MTU of the entire path end-to-end when provisioning services, especially for virtual leased line (VLL) services where the service must support the ability to transmit the largest customer packet.
The path MTU discovery tool is a powerful tool that enables the service provider to determine the exact MTU supported by the network's physical links between the service ingress and service termination points (accurate to one byte).
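For example, the following hypothetical command (placeholder SDP ID and size range) steps the OAM packet size upward to discover the path MTU toward the SDP far end:

oam sdp-mtu 10 size-inc 512 9194 step 256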
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
The Nokia service ping feature provides end-to-end connectivity testing for an individual service. Service ping operates at a higher level than the SDP diagnostics in that it verifies an individual service and not the collection of services carried within an SDP.
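A minimal invocation might look like the following; the far-end system address and service ID are placeholder values:

oam svc-ping 10.20.1.6 service 100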
Service ping is initiated from a 7210 SAS router to verify round-trip connectivity and delay to the far end of the service. The Nokia implementation functions for MPLS tunnels and tests the following from edge-to-edge:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
While the LSP ping, SDP ping and service ping tools enable transport tunnel testing and verify whether the correct transport tunnel is used, they do not provide the means to test the learning and forwarding functions on a per-VPLS-service basis.
It is conceivable that, while tunnels are operational and correctly bound to a service, an incorrect Forwarding Information Base (FIB) table for a service could cause connectivity issues in the service that are not detected by the ping tools. Nokia has developed VPLS OAM functionality to specifically test all the critical functions on a per-service basis. These tools are based primarily on the IETF document draft-stokes-vkompella-ppvpn-hvpls-oam-xx.txt, Testing Hierarchical Virtual Private LAN Services.
The VPLS OAM tools are:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
For a MAC ping test, the destination MAC address (unicast or multicast) to be tested must be specified. A MAC ping packet can be sent through the control plane or the data plane. When sent by the control plane, the ping packet goes directly to the destination IP in a UDP/IP OAM packet. If it is sent by the data plane, the ping packet goes out with the data plane format.
In the control plane, a MAC ping is forwarded along the flooding domain if no MAC address bindings exist. If MAC address bindings exist, then the packet is forwarded along those paths (if they are active). Finally, a response is generated only when there is an egress SAP binding to that MAC address. A control plane request is responded to via a control reply only.
In the data plane, a MAC ping is sent with a VC label TTL of 255. This packet traverses each hop using forwarding plane information for next hop, VC label, etc. The VC label is swapped at each service-aware hop, and the VC TTL is decremented. If the VC TTL is decremented to 0, the packet is passed up to the management plane for processing. If the packet reaches an egress node, and would be forwarded out a customer facing port, it is identified by the OAM label after the VC label and passed to the management plane.
MAC pings are flooded when they are unknown at an intermediate node. They are responded to only by the egress nodes that have mappings for that MAC address.
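For example, a MAC ping might be invoked as follows; the service ID and destination MAC address are placeholder values:

oam mac-ping service 100 destination 00:00:5e:00:53:01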
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
A MAC trace functions like an LSP trace with some variations. Operations in a MAC trace are triggered when the VC TTL is decremented to 0.
Like a MAC ping, a MAC trace can be sent either by the control plane or the data plane.
For MAC trace requests sent by the control plane, the destination IP address is determined from the control plane mapping for the destination MAC. If the destination MAC is known to be at a specific remote site, then the far-end IP address of that SDP is used. If the destination MAC is not known, then the packet is sent unicast, to all SDPs in the service with the appropriate squelching.
A control plane MAC traceroute request is sent via UDP/IP. The destination UDP port is the LSP ping port. The source UDP port is whatever the system gives (note that this source UDP port is really the demultiplexor that identifies the particular instance that sent the request, when correlating the reply). The source IP address is the system IP of the sender.
When a traceroute request is sent via the data plane, the data plane format is used. The reply can be via the data plane or the control plane.
A data plane MAC traceroute request includes the tunnel encapsulation, the VC label, and the OAM label, followed by an Ethernet DLC header, a UDP header, and an IP header. If the mapping for the MAC address is known at the sender, the data plane request is sent down the known SDP with the appropriate tunnel encapsulation and VC label. If it is not known, it is sent down every SDP (with the appropriate tunnel encapsulation per SDP and appropriate egress VC label per SDP binding).
The tunnel encapsulation TTL is set to 255. The VC label TTL is initially set to the min-ttl (default is 1). The OAM label TTL is set to 2. The destination IP address is the all-routers multicast address. The source IP address is the system IP of the sender.
The destination UDP port is the LSP ping port. The source UDP port is whatever the system gives (note that this source UDP port is really the demultiplexor that identifies the particular instance that sent the request, when correlating the reply).
The Reply Mode is either 3 (i.e., reply via the control plane) or 4 (i.e., reply through the data plane), depending on the reply-control option. By default, the data plane request is sent with Reply Mode 3 (control plane reply).
The Ethernet DLC header source MAC address is set to either the system MAC address (if no source MAC is specified) or to the specified source MAC. The destination MAC address is set to the specified destination MAC. The EtherType is set to IP.
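For example, a MAC trace might be invoked as follows; the service ID, destination MAC address, and TTL range are placeholder values:

oam mac-trace service 100 destination 00:00:5e:00:53:01 min-ttl 1 max-ttl 4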
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
The MAC ping OAM tool makes it possible to detect whether a particular MAC address has been learned in a VPLS.
The cpe-ping command extends this capability to detect end-station IP addresses inside a VPLS. A CPE ping for a specific destination IP address within a VPLS is translated to a MAC ping toward a broadcast MAC address. Upon receiving such a MAC ping, each peer PE within the VPLS context triggers an ARP request for the specific IP address. The PE receiving a response to this ARP request reports back to the requesting 7210 SAS. Use of the source IP address 0.0.0.0 is encouraged to prevent the provider IP address from being learned by the CE.
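For example, a CPE ping might be invoked as follows; the service ID and destination IP address are placeholders, and the 0.0.0.0 source address follows the preceding recommendation:

oam cpe-ping service 100 destination-ip 192.0.2.10 source-ip 0.0.0.0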
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
MAC populate is used to send a message through the flooding domain to learn a MAC address as if a customer packet with that source MAC address had flooded the domain from that ingress point in the service. This allows the provider to craft a learning history and engineer packets in a particular way to test forwarding plane correctness.
The MAC populate request is sent with a VC TTL of 1, which means that it is received at the forwarding plane at the first hop and passed directly up to the management plane. The packet is then responded to by populating the MAC address in the forwarding plane, like a conventional learn although the MAC will be an OAM-type MAC in the FIB to distinguish it from customer MAC addresses.
This packet is then taken by the control plane and flooded out the flooding domain (squelching appropriately, the sender and other paths that would be squelched in a typical flood).
This controlled population of the FIB is very important to manage the expected results of an OAM test. The same functions are available by sending the OAM packet as a UDP/IP OAM packet. It is then forwarded to each hop and the management plane has to do the flooding.
Options for MAC populate are to force the MAC in the table to type OAM (in case it already existed as dynamic, static, or OAM-induced learning with some other binding), to prevent new dynamic learning from overwriting the existing OAM MAC entry, and to allow customer packets with this MAC to either ingress or egress the network while still using the OAM MAC entry.
Finally, an option to flood the MAC populate request causes each upstream node to learn the MAC (that is, to populate the local FIB with an OAM MAC entry) and to flood the request along the data plane using the flooding domain.
An age can be provided to age a particular OAM MAC after a different interval than other MACs in a FIB.
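For example, a MAC populate with flooding and a specific age might be invoked as follows; the service ID, MAC address, and age value are placeholders:

oam mac-populate 100 mac 00:00:5e:00:53:01 flood age 600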
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
MAC purge is used to clear the FIBs of any learned information for a particular MAC address. This allows one to do a controlled OAM test without learning induced by customer packets. In addition to clearing the FIB of a particular MAC address, the purge can also indicate to the control plane not to allow further learning from customer packets. This allows the FIB to be clean, and be populated only via a MAC Populate.
MAC purge follows the same flooding mechanism as the MAC populate.
A UDP/IP version of this command is also available that does not follow the forwarding notion of the flooding domain, but the control plane notion of it.
This section describes VLL diagnostics.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
VCCV ping is used to check connectivity of a VLL in-band. It checks that the destination (target) PE is the egress for the Layer 2 FEC. It provides a cross-check between the data plane and the control plane. It is in-band, meaning that the VCCV ping message is sent using the same encapsulation and along the same path as user packets in that VLL. This is equivalent to the LSP ping for a VLL service. VCCV ping reuses an LSP ping message format and can be used to test a VLL configured over an MPLS SDP.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
VCCV effectively creates an IP control channel within the pseudowire between PE1 and PE2. PE2 should be able to distinguish on the receive side VCCV control messages from user packets on that VLL. There are three possible methods of encapsulating a VCCV message in a VLL which translates into three types of control channels:
When sending the label mapping message for the VLL, PE1 and PE2 must indicate which of the preceding OAM packet encapsulation methods (for example, which control channel type) they support. This is accomplished by including an optional VCCV TLV in the pseudowire FEC Interface Parameter field. The format of the VCCV TLV is shown in the following figure.
Note that the absence of the optional VCCV TLV in the Interface parameters field of the pseudowire FEC indicates the PE has no VCCV capability.
The Control Channel (CC) Type field is a bitmask used to indicate if the PE supports none, one, or many control channel types, as follows:
If both PE nodes support more than one of the CC types, then a 7210 SAS PE will make use of the one with the lowest type value. For instance, OAM control word will be used in preference to the MPLS router alert label.
The Connectivity Verification (CV) bit mask field is used to indicate the specific type of VCCV packets to be sent over the VCCV control channel. The valid values are:
0x00 None of the following VCCV packet types are supported.
0x01 ICMP ping. Not applicable to a VLL over a MPLS SDP and as such is not supported by the 7210 SAS.
0x02 LSP ping. This is used in VCCV-Ping application and applies to a VLL over an MPLS SDP. This is supported by the 7210 SAS.
A VCCV ping is an LSP echo request message as defined in RFC 4379. It contains an L2 FEC stack TLV, which must include the sub-TLV of type 10, "FEC 128 Pseudowire". It also contains a field that indicates to the destination PE which reply mode to use. There are four reply modes defined in RFC 4379:
The reply modes and their meanings are as follows:
The reply is an LSP echo reply message as defined in RFC 4379. The message is sent as per the reply mode requested by PE1. The return codes supported are the same as those supported in the 7210 SAS LSP ping capability.
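For example, a VCCV ping on a VLL might be invoked as follows; the sdp-id:vc-id value is a placeholder for the spoke SDP carrying the pseudowire:

oam vccv-ping 10:100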
The VCCV ping feature (Figure 16) is in addition to the service ping OAM feature which can be used to test a service between 7210 SAS nodes. The VCCV ping feature can test connectivity of a VLL with any third party node which is compliant to RFC 5085.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
Pseudo-wire switching is a method for scaling a large network of VLL or VPLS services by removing the need for a full mesh of T-LDP sessions between the PE nodes as the number of these nodes grows over time. Pseudo-wire switching is also used whenever there is a need to deploy a VLL service across two separate routing domains.
In the network, a Termination PE (T-PE) is where the pseudo-wire originates and terminates.
VCCV ping is extended to be able to perform the following OAM functions:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
Although tracing of the MS-pseudo-wire path is possible using the methods described in previous sections, these require multiple manual iterations and that the FEC of the last pseudo-wire segment to the target T-PE/S-PE be known a priori at the node originating the echo request message for each iteration. This mode of operation is referred to as a “ping” mode.
The automated VCCV-trace can trace the entire path of a pseudo-wire with a single command issued at the T-PE or at an S-PE. This is equivalent to LSP-trace and is an iterative process by which the ingress T-PE or S-PE sends successive VCCV-ping messages with incrementing TTL values, starting from TTL=1.
The method is described in draft-hart-pwe3-segmented-pw-vccv, VCCV Extensions for Segmented Pseudo-Wire, and is pending acceptance by the PWE3 working group. In each iteration, the source T-PE or S-PE builds the MPLS echo request message in a way similar to VCCV ping. The first message, with TTL=1, has the next-hop S-PE T-LDP session source address in the Remote PE Address field of the pseudo-wire FEC TLV. Each S-PE that terminates and processes the message includes, in the MPLS echo reply message, the FEC 128 TLV corresponding to the pseudo-wire segment to its downstream node. The inclusion of the FEC TLV in the echo reply message is allowed by RFC 4379, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures. The source T-PE or S-PE can then build the next echo request message, with TTL=2, to test the next-next hop of the MS-pseudo-wire. It copies the FEC TLV it received in the echo reply message into the new echo request message. The process terminates when the reply is from the egress T-PE or when a timeout occurs. If specified, the max-ttl parameter in the vccv-trace command stops the trace at the S-PE reached with that TTL, before the T-PE is reached.
The results of a VCCV-trace can be displayed for a subset of the pseudo-wire segments of the end-to-end MS-pseudo-wire path. In this case, the min-ttl and max-ttl parameters are configured accordingly. However, the T-PE/S-PE node still probes all hops up to min-ttl to correctly build the FEC of the desired subset of segments.
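For example, a VCCV-trace limited to the first two segments might be invoked as follows; the sdp-id:vc-id and TTL values are placeholders:

oam vccv-trace 10:100 min-ttl 1 max-ttl 2 detail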
Note that this method does not require the use of the downstream mapping TLV in the echo request and echo reply messages.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
MS pseudo-wire is supported with a mix of static and signaled pseudo-wire segments. However, VCCV ping and VCCV trace are not allowed when at least one segment of the MS pseudo-wire is static. Users cannot test a static segment, and also cannot test the contiguous signaled segments of the MS-pseudo-wire. VCCV ping and VCCV trace are not supported in static-to-dynamic configurations.
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
A trace can be performed on the MS-pseudo-wire originating from T-PE1 by a single operational command. The following process occurs:
Note: This feature is only supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C. |
When in the ping mode of operation, the sender of the echo request message requires the FEC of the last segment to the target S-PE/T-PE node. This information can either be configured manually or be obtained by inspecting the corresponding sub-TLVs of the pseudo-wire switching point TLV. However, the pseudo-wire switching point TLV is optional and there is no guarantee that all S-PE nodes will populate it with their system address and the pseudo-wire ID of the last pseudo-wire segment traversed by the label mapping message. Therefore, the 7210 SAS implementation always makes use of the user configuration for these parameters.
Upon receiving a VCCV echo request, the control plane on S-PEs (or the target node of each segment of the MS pseudo-wire) validates the request and responds to the request with an echo reply consisting of the FEC 128 of the next downstream segment and a return code of 8 (label switched at stack-depth), indicating that it is an S-PE and not the egress router for the MS-pseudo-wire.
If the node is the T-PE or the egress node of the MS-pseudo-wire, it responds to the echo request with an echo reply with a return code of 3 (egress router) and no FEC 128 is included.
The operation to be taken by the node that receives the echo reply in response to its echo request depends on its current mode of operation such as ping or trace.
In ping mode, the node may choose to ignore the target FEC 128 in the echo reply and report only the return code to the operator.
The 7210 SAS OS supports Two-Way Active Measurement Protocol (TWAMP) and Two-Way Active Measurement Protocol Light (TWAMP Light).
Two-Way Active Measurement Protocol (TWAMP) provides a standards-based method for measuring the round-trip IP performance (packet loss, delay and jitter) between two devices. TWAMP uses the methodology and architecture of One-Way Active Measurement Protocol (OWAMP) to define a way to measure two-way or round-trip metrics.
There are four logical entities in TWAMP:
The control-client and session-sender are typically implemented in one physical device (the "client") and the server and session-reflector in a second physical device (the "server") with which the two-way measurements are being performed. The 7210 SAS acts as the server. The control-client and server establish a TCP connection and exchange TWAMP-Control messages over this connection. When the control-client wants to start testing, the client communicates the test parameters to the server. If the server agrees to conduct the described tests, the test begins as soon as the client sends a Start-Sessions message. As part of a test, the session-sender sends a stream of UDP-based test packets to the session-reflector, and the session-reflector responds to each received packet with a UDP-based response test packet. When the session-sender receives the response packets from the session-reflector, the information is used to calculate two-way delay, packet loss, and packet delay variation between the two devices.
The following are the configuration notes:
Note: This feature is supported on all 7210 SAS platforms as described in this document, except the 7210 SAS-D. |
TWAMP Light is an optional model included in the TWAMP standard, RFC 5357, that uses standard TWAMP test packets but provides a lightweight approach to gathering ongoing IP delay performance data for base router and per-VPRN statistics. Full details are described in Appendix I of RFC 5357, A Two-Way Active Measurement Protocol (TWAMP). The 7210 SAS implementation supports the TWAMP Light model for gathering delay and loss statistics.
For TWAMP Light, the TWAMP Client/Server model is replaced with the Session Controller/Responder model. In general, the Session Controller is the launch point for the test packets and the Responder performs the reflection function.
TWAMP Light maintains the TWAMP test packet exchange but eliminates the TWAMP TCP control connection with local configurations; however, not all negotiated control parameters are replaced with a local configuration. For example, CoS parameters communicated over the TWAMP control channel are replaced with a reply-in-kind approach. The reply-in-kind model reflects back the received CoS parameters, which are influenced by the reflector QoS policies.
The responder function is configured under the config>router>twamp-light command hierarchy for base router reflection, and under the config>service>vprn>twamp-light command hierarchy for per VPRN reflection. The TWAMP Light reflector function is configured per context and must be activated before reflection can occur; the function is not enabled by default for any context. The reflector requires the operator to define the TWAMP Light UDP listening port that identifies the TWAMP Light protocol and the prefixes that the reflector will accept as valid sources for a TWAMP Light request. If the configured TWAMP Light listening UDP port is in use by another application on the system, a Minor OAM message will be presented indicating that the port is unavailable and that the activation of the reflector is not allowed. If the source IP address in the TWAMP Light packet arriving on the responder does not match a configured IP address prefix, the packet is dropped. Multiple prefix entries may be configured per context on the responder. An inactivity timeout under the config>test-oam>twamp>twamp-light hierarchy defines the amount of time the reflector will keep the individual reflector sessions active in the absence of test packets. A responder requires CPM3 or better hardware.
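The following is a minimal sketch of a base router reflector configuration; the UDP port and prefix values are placeholders, and the exact keyword names (for example, reflector and prefix) are assumptions that should be confirmed against the command reference for your release:

config>router>twamp-light
    reflector udp-port 64364 create
        prefix 192.0.2.0/24 create
        no shutdown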
TWAMP Light test packet launching is controlled by the OAM Performance Monitoring (OAM-PM) architecture and adheres to those rules; this includes the assignment of a Test ID. TWAMP Light does not carry the 4-byte test ID in the packet; the test ID remains locally significant, in keeping with the other protocols under the control of the OAM-PM architecture. The OAM-PM construct allows the various test parameters to be defined. These test parameters include the IP session-specific information that allocates the test to the specific routing instance, the source and destination IP addresses, the destination UDP port (which must match the listening UDP port on the reflector), and a number of other options that allow the operator to influence the packet handling. The probe interval and padding size can be configured under the specific session. All-zero padding can be included to ensure that the TWAMP packet is the same size in both directions. The TWAMP PDU definition does not accomplish symmetry by default. A pad size of 27 bytes accomplishes symmetrical TWAMP frame sizing in each direction.
The OAM-PM architecture does not perform any validation of the session information. The test will be allowed to be activated regardless of the validity of this information. For example, if the configured source IP address is not local within the router instance to which the test is allocated, the test will start sending TWAMP Light packets but will not receive any responses.
The OAM-PM section of this guide provides more information describing the integration of TWAMP Light and the OAM-PM architecture, including hardware dependencies.
The following TWAMP Light functions are supported on the 7210 SAS-K 2F1C2T:
The following TWAMP Light functions are supported on the 7210 SAS-K 2F6C4T and 7210 SAS-K 3SFP+ 8C:
The following sample reflector configuration output shows the use of TWAMP Light to monitor two IP endpoints in a VPRN service on the 7210 SAS, including the default TWAMP Light values that were not overridden with configuration entries.
The following is a sample session controller configuration output.
The IEEE and the ITU-T have cooperated to define the protocols, procedures, and managed objects to support service-based fault management. Both the IEEE 802.1ag standard (Ethernet Connectivity Fault Management (ETH-CFM)) and the ITU-T Y.1731 recommendation support a common set of tools that allow operators to deploy the necessary administrative constructs, management entities, and functionality. The ITU-T has also implemented a set of advanced ETH-CFM and performance management functions and features that build on the proactive and on-demand troubleshooting tools.
CFM uses Ethernet frames and is distinguishable by ether-type 0x8902. In certain cases, the different functions use a reserved multicast address that can also be used to identify specific functions at the MAC layer. However, the multicast MAC addressing is not used for every function or in every case. The Operational Code (OpCode) in the common CFM header is used to identify the type of function carried in the CFM packet. CFM frames are only processed by IEEE MAC bridges. With CFM, interoperability can be achieved between different vendor equipment in the service provider network up to and including customer premise bridges.
IEEE 802.1ag and ITU-T Y.1731 functions that are implemented are available on the 7210 SAS platforms.
The following table lists the CFM-related acronyms used in this section.
Acronym | Expansion |
1DM | One way Delay Measurement (Y.1731) |
AIS | Alarm Indication Signal |
BNM | Bandwidth Notification Message (Y.1731 sub OpCode of GNM) |
CCM | Continuity Check Message |
CFM | Connectivity Fault Management |
DMM | Delay Measurement Message (Y.1731) |
DMR | Delay Measurement Reply (Y.1731) |
GNM | Generic Notification Message |
LBM | Loopback Message |
LBR | Loopback Reply |
LTM | Linktrace Message |
LTR | Linktrace Reply |
ME | Maintenance Entity |
MA | Maintenance Association |
MA-ID | Maintenance Association Identifier |
MD | Maintenance Domain |
MEP | Maintenance Association Endpoint |
MEP-ID | Maintenance Association Endpoint Identifier |
MHF | MIP Half Function |
MIP | Maintenance Domain Intermediate Point |
OpCode | Operational Code |
RDI | Remote Defect Indication |
TST | Ethernet Test (Y.1731) |
The IEEE and the ITU-T use their own nomenclature when describing administrative contexts and functions. This introduces a level of complexity to configuration and discussion, because different vendors use different naming conventions. The 7210 SAS OS CLI has standardized on the IEEE 802.1ag naming where overlap exists. ITU-T naming is used when no equivalent is available in the IEEE standard. In the following definitions, both the IEEE and ITU-T names are provided for completeness, using the format IEEE Name/ITU-T Name.
Maintenance Domain (MD)/Maintenance Entity (ME) is the administrative container that defines the scope, reach and boundary for faults. It is typically the area of ownership and management responsibility. The IEEE allows for various formats to name the domain, allowing up to 45 characters, depending on the format selected. ITU-T supports only a format of “none” and does not accept the IEEE naming conventions, as follows:
Maintenance Association (MA)/Maintenance Entity Group (MEG) is the construct in which the different management entities are contained. Each MA is uniquely identified by its MA-ID. The MA-ID comprises the MD level, the MA name, and its associated format. This is another administrative context, where the linkage is made between the domain and the service using the bridging-identifier configuration option. The IEEE and the ITU-T use their own specific formats. The MA short name formats (0-255) have been divided between the IEEE (0-31, 64-255) and the ITU-T (32-63), with five currently defined (1-4, 32). Even though the different standards bodies do not have specific support for each other's formats, a Y.1731 context can be configured using the IEEE format options, as follows:
Note: When a VID is used as the short MA name, 802.1ag will not support VLAN translation because the MA-ID must match all the MEPs. The default format for a short MA name is an integer. Integer value 0 means the MA is not attached to a VID. This is useful for VPLS services on 7210 SAS platforms because the VID is locally significant. |
Maintenance Domain Level (MD Level)/Maintenance Entity Group Level (MEG Level) is the numerical value (0-7) representing the width of the domain. The wider the domain, the higher the numerical value and the farther the ETH-CFM packets can travel. It is important to understand that the level establishes the processing boundary for the packets. Strict rules control the flow of ETH-CFM packets and are used to ensure correct handling, forwarding, processing, and dropping of these packets. Simply stated, ETH-CFM packets with higher numerical level values flow through MEPs and MIPs on SAPs configured with lower level values. This allows the operator to implement different areas of responsibility and nest domains within each other. A maintenance association (MA) includes a set of MEPs, each configured with the same MA-ID and MD level, used to verify the integrity of a single service instance.
Maintenance Endpoint (MEP)/MEG Endpoint (MEP) are the workhorses of ETH-CFM. A MEP is the unique identification within the association (0-8191). Each MEP is uniquely identified by the MA-ID, MEP-ID tuple. This management entity is responsible for initiating, processing, and terminating ETH-CFM functions, following the nesting rules. MEPs form the boundaries which prevent the ETH-CFM packets from flowing beyond their specific scope of responsibility. A MEP has a direction, up or down, which indicates the direction in which packets are generated: up toward the switch fabric, or down toward the SAP, away from the fabric. Each MEP has an active and a passive side. Packets that enter the active side of the MEP are compared to the existing level and processed accordingly. Packets that enter the passive side of the MEP are passed transparently through the MEP. Each MEP contained within the same maintenance association and with the same level (MA-ID) represents points within a single service. MEP creation on a SAP is allowed only for Ethernet ports with null, q-tag, or q-in-q encapsulations. MEPs may also be created on SDP bindings.
Maintenance Intermediate Point (MIP)/MEG Intermediate Point (MIP) are management entities between the terminating MEPs along the service path. These provide insight into the service path connecting the MEPs. MIPs only respond to Loopback Messages (LBM) and Linktrace Messages (LTM). All other CFM functions are transparent to these entities. Only one MIP is allowed per SAP or SDP. The creation of the MIPs can be done when the lower level domain is created (explicit). This is controlled by the use of the mhf-creation mode within the association under the bridge-identifier. MIP creation is supported on a SAP and SDP, not including Mesh SDP bindings. By default, no MIPs are created.
There are two locations in the configuration where ETH-CFM is defined. The domains, associations (including linkage to the service ID), MIP creation method, common ETH-CFM functions, and remote MEPs are defined under the top-level eth-cfm command. It is important to note that when Y.1731 functions are required, the context under which the MEPs are configured must follow the Y.1731-specific formats (domain format of none, MA format of icc-format). When these parameters have been entered, the MEP and possibly the MIP can be defined within the service under the SAP or SDP.
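As a minimal sketch, assuming the classic CLI forms of these commands (the domain, association, service, SAP, and MEP identifiers are placeholders), a down MEP on an Epipe SAP could be configured as follows:

configure eth-cfm domain 1 format none level 3
configure eth-cfm domain 1 association 1 format icc-based name "ICC000MEG0001"
configure eth-cfm domain 1 association 1 bridge-identifier 100
configure service epipe 100 sap 1/1/3:100 eth-cfm mep 101 domain 1 association 1 direction down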
Table 10, Table 11, Table 12, Table 13, and Table 14 are general tables that indicate the ETH-CFM support for the different services and endpoints. They are not meant to indicate the services that are supported or the requirements for those services on the individual platforms.
Service | Ethernet connection type | Down MEP | Up MEP | Ingress MIP | Egress MIP | Primary VLAN
Epipe | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
VPLS | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ 1 | |
R-VPLS | SAP | |||||
IES | IES IPv4 interface | |||||
SAP |
Note:
Service | Ethernet Connection Type | Down MEP | Up MEP | Ingress MIP | Egress MIP | Primary VLAN
Epipe | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
VPLS | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ 1 | |
R-VPLS | SAP | |||||
IES | IES IPv4 interface | |||||
SAP |
Note:
Service | Ethernet Connection Type | Down MEP | Up MEP | Ingress MIP | Egress MIP | Primary VLAN
Epipe | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | ✓ 1 |
VPLS | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
R-VPLS | SAP | |||||
IES | IES IPv4 interface | |||||
SAP |
Note:
Service | Ethernet Connection Type | Down MEP | Up MEP | Ingress MIP | Egress MIP | Primary VLAN
Epipe | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
Spoke-SDP | ✓ | ✓ | ✓ | ✓ | ||
VPLS | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
Spoke-SDP | ✓ | ✓ | ✓ | ✓ | ||
Mesh-SDP | ✓ | ✓ | ✓ | ✓ | ||
R-VPLS | SAP | |||||
IP interface (IES or VPRN) | ||||||
IES | IES IPv4 interface | |||||
SAP |
Service | Ethernet Connection Type | Down MEP | Up MEP | Ingress MIP | Egress MIP | Primary VLAN
Epipe | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | ✓ 1 |
Spoke-SDP | ✓ | ✓ | ✓ | ✓ | ||
VPLS | SAP (access and access-uplink SAP) | ✓ | ✓ | ✓ | ✓ | |
Spoke-SDP | ✓ | ✓ | ✓ | ✓ | ||
Mesh-SDP | ✓ | ✓ | ✓ | ✓ | ||
R-VPLS | SAP | |||||
IP interface (IES or VPRN) | ||||||
IES | IES IPv4 interface | |||||
SAP |
Note:
The following figures show the detailed IEEE representation of MEPs, MIPs, levels and associations, using the standards defined icons.
A loopback message is generated by a MEP to its peer MEP (Figure 19). The functions are similar to an IP ping to verify Ethernet connectivity between the nodes.
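For example, an on-demand loopback toward a peer MEP might be invoked as follows; the target MAC address and the MEP, domain, and association identifiers are placeholders:

oam eth-cfm loopback 00:00:5e:00:53:01 mep 101 domain 1 association 1 send-count 5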
The following loopback-related functions are supported:
A linktrace message is originated by a MEP and targeted to a peer MEP in the same MA and within the same MD level (see Figure 20). Its function is similar to IP traceroute. Linktrace traces a specific MAC address through the service. The peer MEP responds with a linktrace reply message after successful inspection of the linktrace message. The MIPs along the path also process the linktrace message and respond with linktrace replies to the originating MEP if the received linktrace message has a TTL greater than 1; the MIPs also forward the linktrace message if a lookup of the target MAC address in the Layer 2 FIB is successful. The originating MEP will receive multiple linktrace replies and from processing the linktrace replies, it can determine the route to the target bridge.
A traced MAC address (the targeted MAC address) is carried in the payload of the linktrace message. Each MIP and MEP receiving the linktrace message checks whether it has learned the target MAC address. To use linktrace, the target MAC address must have been learned by the nodes in the network. If the address has been learned, a linktrace reply is sent back to the originating MEP. A MIP forwards the linktrace message out of the port where the target MAC address was learned.
The linktrace message has a multicast destination address. On a broadcast LAN, it can be received by multiple nodes connected to that LAN; however, only one node will send a reply.
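For example, a linktrace toward the target MAC address might be invoked as follows; the MAC address, identifiers, and TTL value are placeholders:

oam eth-cfm linktrace 00:00:5e:00:53:01 mep 101 domain 1 association 1 ttl 10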
The following linktrace-related functions are supported:
The following display output includes the Sender ID TLV contents, if they are included in the LBR.
A Continuity Check Message (CCM) is a multicast frame that is generated by a MEP and multicast to all other MEPs in the same MA. The CCM does not require a reply message. To identify faults, the receiving MEP maintains an internal list of remote MEPs it should be receiving CCM messages from.
This list is based on the remote MEP ID configuration within the association the MEP is created in. When the local MEP does not receive a CCM from one of the configured remote MEPs within a preconfigured period, the local MEP raises an alarm.
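As a minimal sketch, assuming the classic CLI command forms (all identifiers and the interval value are placeholders), CCM monitoring of a remote MEP could be enabled as follows:

configure eth-cfm domain 1 association 1
    ccm-interval 10
    remote-mepid 102
configure service epipe 100 sap 1/1/3:100 eth-cfm mep 101 domain 1 association 1
    ccm-enable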
The following figure shows a CFM continuity check.
The following figure shows a CFM CC failure scenario.
The following functions are supported:
Alarm Indication Signal (AIS) provides a Y.1731-capable MEP with the ability to signal a fault condition in the reverse direction of the MEP, out the passive side. When a fault condition is detected, the MEP generates AIS packets at the configured client levels and at the specified AIS interval until the condition is cleared. Currently, a MEP configured to generate AIS must do so at a level higher than its own. The MEP configured on the service receiving the AIS packets is required to have the active side facing the receipt of the AIS packet and must be at the same level as the AIS. The absence of an AIS packet for 3.5 times the AIS interval set by the sending node clears the condition on the receiving MEP.
It is important to note that AIS generation is not supported to an explicitly configured endpoint. An explicitly configured endpoint is an object that contains multiple individual endpoints, as in PW redundancy.
Ethernet test affords operators of a Y.1731-capable MEP the ability to send an in-service, on-demand function to test connectivity between two MEPs. The test is generated on the local MEP and the results are verified on the destination MEP. Any ETH-TST packet generated that exceeds the MTU is silently dropped by the lower-level processing of the node.
Timestamps for different Y.1731 messages are obtained as follows:
Note: Accurate results for one-way and two-way delay measurement tests using Y.1731 messages are obtained if the nodes are capable of time stamping packets in hardware. |
One-way delay measurement allows the operator to check unidirectional delay between MEPs. An ETH-1DM packet is timestamped by the generating MEP and sent to the remote node. The remote node timestamps the packet on receipt and generates the results. The results, available from the receiving MEP, indicate the delay and jitter. Jitter, or delay variation, is the difference in delay between tests. This means the delay variation on the first test will not be valid. It is important that the clocks on both nodes are synchronized so that the results are accurate. NTP can be used to achieve a level of wall-clock synchronization between the nodes.
Two-way delay measurement is similar to one-way delay measurement, except that it measures the round-trip delay from the generating MEP. In this case, wall-clock synchronization issues do not influence the test results because four timestamps are used. This allows the remote node's time to be removed from the calculation and, as a result, clock variances are not included in the results. The same considerations for the first test and hardware-based timestamping stated for one-way delay measurement apply to two-way delay measurement.
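For example, an on-demand two-way delay test toward a peer MEP might be invoked as follows; the MAC address and identifiers are placeholders:

oam eth-cfm two-way-delay-test 00:00:5e:00:53:01 mep 101 domain 1 association 1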
Delay can be measured using one-way and two-way on-demand functions. The two-way test results are available single-ended; the test is initiated, and the calculation and results are viewed, on the same node. There is no specific configuration under the MEP on the SAP to enable this function. An example of an on-demand test and its results follows. The latest test result is stored for viewing. Further tests overwrite the previous results. Delay variation is only valid if more than one test has been executed.
Note: This feature is supported only on the 7210 SAS-K 2F1C2T and 7210 SAS-K 2F6C4T. |
The Ethernet Bandwidth Notification (ETH-BN) function is used by a server MEP to signal link bandwidth changes to a client MEP.
This functionality is for point-to-point microwave radios. When a microwave radio uses adaptive modulation, the capacity of the radio can change based on the condition of the microwave link. For example, in adverse weather conditions that cause link degradation, the radio can change its modulation scheme to a more robust one (which will reduce the link bandwidth) to continue transmitting.
This change in bandwidth is communicated from the server MEP on the radio, using an Ethernet Bandwidth Notification Message (ETH-BNM), to the client MEP on the connected router. The server MEP transmits periodic frames with ETH-BN information, including the interval, the nominal and currently available bandwidth. A port MEP with the ETH-BN feature enabled will process the information contained in the CFM PDU and appropriately adjust the rate of traffic sent to the radio.
A port MEP that is not a LAG member port supports the client side reception and processing of the ETH-BN CFM PDU sent by the server MEP. By default, processing is disabled. The config>port>ethernet>eth-cfm>mep>eth-bn>receive CLI command sets the ETH-BN processing state on the port MEP. A port MEP supports untagged packet processing of ETH-CFM PDUs at domain levels 0 and 1 only. The port client MEP sends the ETH-BN rate information received to be applied to the port egress rate in a QoS update. A pacing mechanism limits the number of QoS updates sent. The config>port>ethernet>eth-cfm>mep>eth-bn>rx-update-pacing CLI command allows the updates to be paced using a configurable range of 1 to 600 seconds (the default is 5 seconds). The pacing timer begins to count down following the most recent QoS update sent to the system for processing. When the timer expires, the most recent update that arrived from the server MEP is compared to the most recent value sent for system processing. If the value of the current bandwidth is different from the previously processed value, the update is sent and the process begins again. Updates with a different current bandwidth that arrive when the pacing timer has already expired are not subject to a timer delay. Refer to the 7210 SAS-D, Dxp, K 2F1C2T, K 2F6C4T, K 3SFP+ 8C Interface Configuration Guide for more information about these CLI commands.
A complementary QoS configuration is required to allow the system to process current bandwidth updates from the CFM engine. The config>port>ethernet>eth-bn-egress-rate-changes CLI command is required to enable the QoS function to update the port egress rates based on the current available bandwidth updates from the CFM engine. By default, the function is disabled.
Both the CFM and QoS functions must be enabled for the changes in current bandwidth to dynamically update the egress rate.
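The following is a minimal configuration sketch with both functions enabled. The port identifier, MEP, domain, and association indexes, and the pacing value are illustrative only, and the exact MEP creation syntax may vary by release.

configure port 1/1/1 ethernet
    eth-bn-egress-rate-changes
    eth-cfm
        mep 1 domain 1 association 1 create
            eth-bn
                receive
                rx-update-pacing 10
            exit
            no shutdown
        exit
    exit
exit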
When the MEP enters a state that prevents it from receiving the ETH-BNM, the current bandwidth last sent for processing is cleared and the egress rate reverts to the configured rate. Under these conditions, the last update cannot be guaranteed as current. Explicit notification is required to dynamically update the port egress rate. The following types of conditions lead to ambiguity:
If the eth-bn-egress-rate-changes command is disabled using the no option, CFM continues to send updates, but the updates are held without affecting the port egress rate.
The ports supporting ETH-BN MEPs can be configured for the network, access, hybrid, and access-uplink modes. When ETH-BN is enabled on a port MEP and the config>port>ethernet>eth-cfm>mep>eth-bn>receive and the QoS config>port>ethernet>eth-bn-egress-rate-changes contexts are configured, the egress rate is dynamically changed based on the current available bandwidth indicated by the ETH-BN server.
Note: For SAPs configured on an access port or hybrid port, changes in port bandwidth on reception of ETH-BNM messages will result in changes to the port egress rate, but the SAP egress aggregate shaper rate and queue egress shaper rate provisioned by the user are unchanged, which may result in an oversubscription of the committed bandwidth. Consequently, Nokia recommends that the user change the SAP egress aggregate shaper rate and queue egress shaper rate for all SAPs configured on the port from an external management station after egress rate changes are detected on the port. |
The port egress rate is capped at the minimum of the configured egress-rate and the maximum port rate. The minimum egress rate using ETH-BN is 1024 kb/s. If a current bandwidth of zero is received, it does not affect the egress port rate, and the previously processed current bandwidth continues to be used.
The client MEP requires explicit notification of changes to update the port egress rate. The system does not time out previously processed current bandwidth rates. The specification does allow a timeout of the current bandwidth if a frame has not been received in 3.5 times the ETH-BNM interval. However, this implicit approach can lead to misrepresented conditions and has not been implemented.
When you start or restart the system, the configured egress rate is used until an ETH-BNM arrives on the port with a new bandwidth request from the ETH-BN server MEP.
An event log is generated each time the egress rate is changed based on reception of a BNM. If a BNM is received that does not result in a bandwidth change, no event log is generated.
The destination MAC address can be a Class 1 multicast MAC address (that is, 01-80-C2-00-00-3x) or the configured MAC address of the port MEP. Standard CFM validation and identification must be successful to process CFM PDUs.
For information on the eth-bn-egress-rate-changes command, refer to the 7210 SAS-D, Dxp, K 2F1C2T, K 2F6C4T, K 3SFP+ 8C Interface Configuration Guide.
The Bandwidth Notification Message (BNM) PDU used for ETH-BN information is a sub-OpCode within the Ethernet Generic Notification Message (ETH-GNM).
The following table shows the BNM PDU format fields.
Label | Description |
MEG Level | Carries the MEG level of the client MEP (0 to 7). This field must be set to either 0 or 1 to be recognized as a port MEP. |
Version | The current version is 0 |
OpCode | The value for this PDU type is GNM (32) |
Flags | Contains one information element: Period (3 bits), which indicates how often ETH-BN messages are transmitted by the server MEP. The following are the valid values:
|
TLV Offset | This value is set to 13 |
Sub-OpCode | The value for this PDU type is BNM (1) |
Nominal Bandwidth | The nominal full bandwidth of the link, in Mb/s. This information is reported in the display but not used to influence QoS egress rates. |
Current Bandwidth | The current bandwidth of the link in Mb/s. The value is used to influence the egress rate. |
Port ID | A non-zero unique identifier for the port associated with the ETH-BN information, or zero if not used. This information is reported in the display, but is not used to influence QoS egress rates. |
End TLV | An all zeros octet value |
On the 7210 SAS, port-level MEPs with level 0 or 1 should be implemented to support this application. A port-level MEP must support CCM, LBM, LTM, RDI, and ETH-BN, but can be used for ETH-BN only.
The show eth-cfm mep eth-bandwidth-notification display output includes the ETH-BN values received and extracted from the PDU, including a last reported value and the pacing timer. If n/a appears in a field, that field has not been processed.
The base show eth-cfm mep output is expanded to include the disposition of the ETH-BN receive function and the configured pacing timer.
The show port port-id detail output is expanded to include an Ethernet Bandwidth Notification Message Information section. This section includes the ETH-BN Egress Rate disposition and the current Egress BN rate being used.
The 7210 SAS supports port-based MEPs for use with CFM ETH-BN. The port MEP must be configured at level 0 and can be used for ETH-BN message reception and processing as described in ITU-T Y.1731 Ethernet Bandwidth Notification. Port-based MEPs only support CFM CC, LT, LS, and RDI message processing. No other CFM and Y.1731 messages are supported on these port-based MEPs.
Note: Port-based MEPs are designed to be used with the ETH-BN application. Nokia recommends not to use port-based MEPs for other applications. |
Note: The ETH-CFM statistics feature is supported on all platforms as described in this document, except the 7210 SAS-D. |
A number of statistics are available to view the current processing requirements for CFM. Any packet that is counted against the CFM resource is included in the statistics counters. The counters do not include sub-second CCMs, ETH-CFM PDUs generated by non-ETH-CFM functions (such as OAM-PM and SAA), or PDUs filtered by a security configuration.
SAA and OAM-PM use standard CFM PDUs. The reception of these packets is included in the receive statistics. However, SAA and OAM-PM launch their own test packets and do not consume ETH-CFM transmission resources.
Per-system and per-MEP statistics are included with a per-OpCode breakdown. These statistics help operators determine the busiest active MEPs on the system and provide a breakdown of per-OpCode processing at the system and MEP level.
Use the show eth-cfm statistics command to view the statistics at the system level. Use the show eth-cfm mep mep-id domain md-index association ma-index statistics command to view the per-MEP statistics. Use the clear eth-cfm mep mep-id domain md-index association ma-index statistics command to clear statistics. The clear command clears the statistics for only the specified function. For example, clearing the system statistics does not clear the individual MEP statistics because each MEP maintains its own unique counters.
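For example, assuming a MEP 1 in domain 10 and association 2 (the identifiers are illustrative), the commands take the following form:

show eth-cfm statistics
show eth-cfm mep 1 domain 10 association 2 statistics
clear eth-cfm statistics
clear eth-cfm mep 1 domain 10 association 2 statistics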
All known OpCodes are listed in the transmit and receive columns. Different versions for the same OpCode are not displayed. This does not imply that the network element supports all functions listed in the table. Unknown OpCodes are dropped.
Use the tools dump eth-cfm top-active-meps command to display the top ten active MEPs in the system. This command provides a nearly real-time view of the busiest active MEPs by displaying the active (not shutdown) MEPs and inactive (shutdown) MEPs in the system. ETH-CFM MEPs that are shutdown continue to consume CPM resources because the main task is syncing the PDUs. The counts begin from the last time that the command was issued using the clear option.
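For example, the following invocations display the current counts and then reset the baseline using the clear option described above (the clear keyword placement is assumed):

tools dump eth-cfm top-active-meps
tools dump eth-cfm top-active-meps clear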
Nokia applied pre-standard OpCodes 53 (Synthetic Loss Reply) and 54 (Synthetic Loss Message) for the purpose of measuring loss using synthetic packets.
Note: These will be changed to the assigned standard values in a future release. This means that Release 4.0R6 is pre-standard and will not interoperate with future releases of SLM or SLR that support the standard OpCode values.
This synthetic loss measurement approach is a single-ended feature that allows the operator to run on-demand and proactive tests to determine “in” loss, “out” loss, and “unacknowledged” packets. This approach can be used between peer MEPs in both point-to-point and multipoint services. Only remote MEP peers within the association and matching the unicast destination will respond to the SLM packet.
The specification uses various sequence numbers to determine in which direction the loss occurred. Nokia has implemented the required counters to determine loss in each direction. To correctly use the information that is gathered, the following terms are defined:
The per-probe loss indicators are available when looking at the on-demand test runs, or the individual probe information stored in the MIB. When tests are scheduled by the Service Assurance Agent (SAA), the per-probe data is summarized and per-probe information is not maintained. Any “unacknowledged” packets are recorded as “in-loss” when summarized.
The on-demand function can be executed from the CLI or SNMP. On-demand tests are meant to provide the carrier a way to perform on-the-spot testing. However, this approach is not meant as a method for storing archived data for later processing. The probe count for on-demand SLM has a range of one to 100, with configurable probe spacing between one second and ten seconds. This means that a single test run can take up to 1000 seconds. Although possible, it is more likely that the majority of on-demand cases will use 100 probes or fewer at a one-second interval. A node may only initiate and maintain a single active on-demand SLM test at any specific time. A maximum of one storage entry per remote MEP is maintained in the results table. Subsequent runs to the same peer overwrite the results for that peer. This means that, when using on-demand testing, the test should be run and the results checked before starting another test.
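The following is a sketch of an on-demand invocation; the command form, peer MAC address, MEP, domain, and association indexes, and probe parameters are assumptions for illustration and may differ by release:

oam eth-cfm two-way-slm-test 00:00:00:11:22:33 mep 1 domain 10 association 2 send-count 100 interval 1 timeout 1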
The proactive measurement functions are linked to SAA. This backend provides the scheduling, storage, and summarization capabilities. Scheduling may be either continuous or periodic. It also allows for the interpretation and representation of data that may enhance the specification. As an example, an optional TLV has been included to allow for the measurement of both loss and delay or jitter with a single test. The implementation does not cause any interoperability issues because the optional TLV is ignored by equipment that does not support it. In mixed-vendor environments, loss measurement continues to be tracked, but delay and jitter can only report round-trip times. It is important to point out that the round-trip times in these mixed-vendor environments include the remote node's processing time because only two timestamps are included in the packet. In an environment where both nodes support the optional TLV to include timestamps, unidirectional and round-trip times are reported. Because all four timestamps are included in the packet, the round-trip time in this case does not include remote node processing time. Operators that want to run delay measurement and loss measurement at different frequencies are free to run both ETH-SL and ETH-DM functions; ETH-SL does not replace ETH-DM. Service Assurance is only briefly described here to provide some background on the basic functionality. For more information about SAA functions, see Service Assurance Agent overview.
The ETH-SL packet format contains a test-id that is internally generated and not configurable. The test-id is visible for the on-demand test in the display summary. It is possible for a remote node processing the SLM frames to receive overlapping test-ids as a result of multiple MEPs measuring loss toward the same remote MEP. For this reason, the uniqueness of the test is based on the remote MEP-ID, test-id, and source MAC of the packet.
ETH-SL is applicable to up and down MEPs and, as per the recommendation, is transparent to MIPs. There is no coordination between various fault conditions that could impact loss measurement. This is also true for conditions where MEPs are placed in a shutdown state as a result of linkage to a redundancy scheme like MC-LAG. Loss measurement is based on ETH-SL and is not coordinated across different functional aspects of the network element. ETH-SL is supported on service-based MEPs.
It is possible that two MEPs may be configured with the same MAC on different remote nodes. This causes various issues in the FDB for multipoint services and is considered a misconfiguration for most services. It is possible to have a valid configuration where multiple MEPs on the same remote node have the same MAC. In fact, this is likely to happen. In this release, only the first responder is used to measure packet loss. The second responder is dropped. Because the same MAC for multiple MEPs is only truly valid on the same remote node, this is an acceptable approach.
There is no way for the responding node to understand when a test is completed. For this reason, a configurable “inactivity-timer” determines the length of time a test is valid. The timer maintains an active test as long as it is receiving packets for that specific test, defined by the test-id, remote MEP ID, and source MAC. When there is a gap between the packets that exceeds the inactivity-timer, the responding node responds with a sequence number of one regardless of the sequence number sent by the instantiating node. This means that the remote MEP accepts that the previous test has expired and these probes are part of a new test. The default for the inactivity timer is 100 seconds, and the range is 10 to 100 seconds.
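On the responding node, the timer is set under the eth-cfm context. The following one-line sketch assumes the general SR OS/SAS context path, which may differ by release:

configure eth-cfm slm inactivity-timer 100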
The responding node is limited to a fixed number of SLM tests per platform. Any test that attempts to involve a node that is already actively processing the system limit of SLM tests shows up as “out loss” or “unacknowledged” packets on the node that instantiated the test, because the packets are silently discarded at the responder. It is important for the operator to understand that this is silent and that no log entries or alarms are raised. It is also important to keep in mind that these packets are ETH-CFM based and the stated ETH-CFM receive rate for the different platforms must not be exceeded. ETH-SL provides a mechanism for operators to proactively trend packet loss for service-based MEPs.
The following figure shows the configuration required for proactive SLM test using SAA.
The following is a sample MIB output of an on-demand test. Node1 is tested for this example. The SAA configuration does not include the accounting policy required to collect the statistics before they are overwritten. NODE2 does not have an SAA configuration. NODE2 includes the configuration to build the MEP in the VPLS service context.
The following sample output is meant to demonstrate the different loss conditions that an operator may see. The total number of attempts is “100” because the final probe in the test was not acknowledged.
The following is an example of an on-demand test and the associated output. Only single test runs are stored and can be viewed after the fact.
Up MEPs and Down MEPs have been aligned as of this release to better emulate service data. When an Up MEP or Down MEP is the source of the ETH-CFM PDU, the priority value configured as part of the configuration of the MEP or specific test is treated as the forwarding class (FC) by the egress QoS policy. If there is no egress QoS policy, the priority value is mapped to the CoS values in the frame. However, the egress QoS policy may overwrite this original value. The Service Assurance Agent (SAA) uses the fc fc-name option to accomplish similar functionality.
Up MEPs and Down MEPs terminating an ETH-CFM PDU use the received FC as the return priority for the appropriate response, again feeding into the egress QoS policy as the FC.
ETH-CFM PDUs received on the MPLS-SDP bindings will now correctly pass the EXP bit values to the ETH-CFM application to be used in the response.
These are default behavioral changes without CLI options.
The following are ETH-CFM configuration guidelines:
Platform | G.8032 down MEP | Service down MEP | Service up MEP |
7210 SAS-D | 100 ms | 100 ms | 1 s |
7210 SAS-Dxp | 10 ms | 1 s | 1 s |
7210 SAS-K 2F1C2T | 10 ms | 1 s | 1 s |
7210 SAS-K 2F6C4T | 10 ms | 1 s | 1 s |
7210 SAS-K 3SFP+ 8C | 10 ms | 1 s | 1 s |
OAM mapping is a mechanism that enables deploying OAM end-to-end in a network where different OAM tools are used in different segments. For instance, an Epipe service could span across the network using Ethernet access (with CFM used for OAM).
In the 7210 SAS implementation, the Service Manager (SMGR) is used as the central point of OAM mapping. It receives and processes the events from different OAM components, then decides the actions to take, including triggering OAM events to remote peers.
Fault propagation for CFM is by default disabled at the MEP level to maintain backward compatibility. When required, it can be explicitly enabled by configuration.
Fault propagation for a MEP can only be enabled when the MA comprises no more than two MEPs (point-to-point).
CFM MEP declares a connectivity fault when its defect flag is equal to or higher than its configured lowest defect priority. The defect can be any of the following depending on configuration:
The following additional fault condition applies to Y.1731 MEPs:
Setting the lowest defect priority to allDef may cause problems when fault propagation is enabled in the MEP. In this scenario, when MEP A sends a CCM to MEP B with interface status down, MEP B responds with a CCM with RDI set. If MEP A is configured to accept RDI as a fault, it gets into a deadlock state, where both MEPs declare a fault and are never able to recover. The default lowest defect priority is DefMACstatus, which is not a problem when the interface status TLV is used. It is also very important that different Ethernet OAM strategies do not overlap each other's span. In some cases, independent functions attempting to perform their normal fault handling can negatively impact each other. This interaction can lead to fault propagation in the direction toward the original fault, a false positive, or worse, a deadlock condition that may require the operator to modify the configuration to escape the condition. For example, overlapping Link Loss Forwarding (LLF) and ETH-CFM fault propagation could cause these issues.
The DefRemoteCCM fault is raised when any remote MEP is down. Therefore, whenever a remote MEP fails and fault propagation is enabled, a fault is propagated to the SMGR.
When CFM is the OAM module at the other end, it is required to use any of the following methods (depending on local configuration) to notify the remote peer:
Note: 7210 platforms expect that a fault notified using the interface status TLV is cleared explicitly by the remote MEP when the fault is no longer present on the remote node. On the 7210 SAS-D and 7210 SAS-Dxp, use of CCM with the interface status TLV set to Down is not recommended with a Down MEP, unless it is known that the remote MEP clears the fault explicitly. |
Users can configure Up MEPs to use the interface status TLV with fault propagation. Special considerations apply only to Down MEPs.
When a fault is propagated by the service manager, if AIS is enabled on the SAP/SDP-binding, then AIS messages are generated for all the MEPs configured on the SAP/SDP-binding using the configured levels.
Note that the existing AIS procedure still applies even when fault propagation is disabled for the service or the MEP. For example, when a MEP loses connectivity to a configured remote MEP, it generates AIS if it is enabled. The new procedure that is defined in this document introduces a new fault condition for AIS generation, fault propagated from SMGR, that is used when fault propagation is enabled for the service and the MEP.
The transmission of CCM with interface status TLV must be done instantly without waiting for the next CCM transmit interval. This rule applies to CFM fault notification for all services.
Notifications from SMGR to the CFM MEPs for fault propagation should include a direction for the propagation (up or down: up means in the direction of coming into the SAP/SDP-binding; down means in the direction of going out of the SAP/SDP-binding), so that the MEP knows what method to use. For instance, an up fault propagation notification to a down MEP will trigger an AIS, while a down fault propagation to the same MEP can trigger a CCM with interface TLV with status down.
For a specific SAP/SDP-binding, CFM and SMGR can only propagate one single fault to each other for each direction (up or down).
When there are multiple MEPs (at different levels) on a single SAP/SDP-binding, the fault reported from CFM to SMGR will be the logical OR of results from all MEPs. Basically, the first fault from any MEP will be reported, and the fault will not be cleared as long as there is a fault in any local MEP on the SAP/SDP-binding.
Down and up MEPs are supported for Epipe services as well as fault propagation. When there are both up and down MEPs configured in the same SAP/SDP-binding and both MEPs have fault propagation enabled, a fault detected by one of them will be propagated to the other, which in turn will propagate fault in its own direction.
When a MEP detects a fault and fault propagation is enabled for the MEP, CFM needs to communicate the fault to the SMGR, so the SMGR will mark the SAP/SDP-binding faulty but still oper-up. CFM traffic can still be transmitted to or received from the SAP/SDP-binding to ensure that, when the fault is cleared, the SAP goes back to the normal operational state. Because the operational status of the SAP/SDP-binding is not affected by the fault, no fault handling is performed. For example, applications relying on the operational status are not affected.
If the MEP is an up MEP, the fault is propagated to the OAM components on the same SAP/SDP-binding; if the MEP is a down MEP, the fault is propagated to the OAM components on the mate SAP/SDP-binding at the other side of the service.
This section describes procedures for the scenario where an Epipe service is down when service is administratively shutdown. When service is administratively shutdown, the fault is propagated to the SAP/SDP-bindings in the service.
LLF and CFM fault propagation are mutually exclusive. CLI protection is in place to prevent enabling both LLF and CFM fault propagation in the same service, on the same node and at the same time. However, there are still instances where irresolvable fault loops can occur when the two schemes are deployed within the same service on different nodes. This is not preventable by the CLI. At no time should these two fault propagation schemes be enabled within the same service.
802.3ah EFM OAM declares a link fault when any of the following occurs:
When 802.3ah EFM OAM declares a fault, the port goes into the operational state down. The SMGR communicates the fault to CFM MEPs in the service. OAM fault propagation in the opposite direction (SMGR to EFM OAM) is not supported.
A fault on the access-uplink port brings down all access ports with services, independent of the encapsulation type of the access port (null, dot1q, or QinQ); that is, Link Loss Forwarding (LLF) is supported. A fault propagated from the access-uplink port to access ports is based on configuration. A fault is propagated only in a single direction, from the access-uplink port to the access port.
A fault on the access-uplink port is detected using Loss of Signal (LoS) and EFM-OAM.
The following figure shows local fault propagation.
The operational group functionality, also referred to as oper-group, is used to detect faults on access-uplink ports and propagate them to all interested access ports regardless of their encapsulation. On the 7210 SAS operating in access-uplink mode, ports can be associated with oper-groups. Perform the following procedure to configure the oper-group functionality for fault detection on a port and the monitor-oper-group functionality to track the oper-group status and propagate the fault based on the operational state of the oper-group:
The following is a sample oper-group system configuration output.
Note: For more details about the CLI, refer to the 7210 SAS-D, Dxp, K 2F1C2T, K 2F6C4T, K 3SFP+ 8C Basic System Configuration Guide. |
Note: This feature is not supported on the 7210 SAS-D. |
Fault propagation allows the user to track an Epipe service link connectivity failure and propagate the failure toward customer devices connected to access ports.
The following figure shows an example Epipe service architecture.
In the preceding figure, the 7210 SAS node is deployed as the network interface device (NID) and the customer premises equipment (CPE) is connected to the 7210 SAS over a dot1q SAP (may be a single VLAN ID dot1q SAP, range dot1q SAP, or default dot1q SAP). The 7210 SAS adds an S-tag to the packet received from the customer, and the packet is transported over the backhaul network to the service edge, which is typically a 7750 SR node acting as an external network-to-network interface (ENNI) where the service provider (SP) is connected. At the ENNI, the 7750 SR hands off the service to the SP over a SAP.
Service availability must be tracked end-to-end between the uplink on the 7210 SAS and the customer hand-off point. If there is a failure or a fault in the service, the access port toward the SP is brought down. For this reason, a Down MEP is configured on the 7210 SAS access-uplink SAP facing the network, or alternatively, an Up MEP can be configured on the access port connected to the CPE. The Down MEP has a CFM session with a remote Up MEP configured on the SAP facing the SP node that is the service hand-off point toward the SP. The customer-facing access port and the access-uplink or access SAP facing the network with the Down MEP are configured in an Epipe service on the 7210 SAS.
Connectivity fault management (CFM) continuity check messages (CCMs) are enabled on the Down MEP and used to track end-to-end availability. On detection of a fault that is higher than the lowest-priority defect of the configured MEP, the access port facing the customer is brought down (with the port tx-off action), which immediately indicates the failure to the CPE device connected to the 7210 SAS and allows the node to switch to another uplink, if available.
To enable fault propagation from the access-uplink SAP to the access port, the user can configure an operational group using the defect-oper-group command. An operational group configured on a service object (the CFM MEP on the Epipe SAP in this use case) inherits the operational status of that service object and, on a change in the operational state, notifies the service object that is monitoring its operational state. The user can configure monitoring using the monitor-oper-group command under the service object (the access port connected to the customer in this use case).
For this use case, the user can configure an operational group under the CFM MEP to track the CFM MEP fault status so that when the CFM MEP reports a fault, the corresponding access ports that are monitoring the CFM MEP state are brought down (by switching off the laser on the port, similar to link loss forwarding (LLF)). When the CFM MEP fault is cleared, the operational group must notify the service object monitoring the MEP (that is, the access port) so that the access port can be brought up (by switching on the laser on the port, similar to LLF). The user can use the low-priority-defect command to configure the CFM session events that cause the MEP to enter the fault state.
If the MEP reports a fault greater than the configured low-priority defect, the software brings down the operational group so that the port configured to monitor that operational group becomes operationally down. After the MEP fault is cleared (that is, the MEP reports a fault lower than the configured low-priority defect), the software brings up the operational group so that the port monitoring the operational group becomes operationally up.
Fault detection is supported in only one direction from the MEP toward the other service endpoint. The fault is propagated in the opposite direction from the MEP toward the access port. Fault detection and propagation must not be configured in the other direction.
Note: This feature does not support two-way fault propagation and must not be used for scenarios that require it. |
Before configuring an operational group, ensure that the low-priority-defect allDef option is configured on the MEP so that a fault is raised for all errors or faults detected by the MEP.
Use the following syntax to configure an operational group for a Down MEP configured in an Epipe service on an uplink SAP.
The following example shows the command usage.
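The following is a minimal configuration sketch; the oper-group name, service ID, SAP, and MEP, domain, and association indexes are illustrative, and the exact context of the oper-group creation command may vary by release:

configure service oper-group "og-cfm-1" create
exit
configure service epipe 100
    sap 1/1/20:100 create
        eth-cfm
            mep 10 domain 1 association 1 direction down
                defect-oper-group "og-cfm-1"
                low-priority-defect allDef
                ccm-enable
                no shutdown
            exit
        exit
    exit
exit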
The access port on the 7210 SAS node to which the enterprise CPE is connected must be configured with the corresponding monitor object using the monitor-oper-group command.
Use the following syntax to configure monitoring of an operational group.
The following example shows the command usage.
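The following sketch shows the monitoring side; the port identifier and oper-group name are illustrative, and the exact context of the monitor-oper-group command is assumed:

configure port 1/1/5
    ethernet
        monitor-oper-group "og-cfm-1"
    exit
exit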
The following restrictions apply for fault propagation:
In the last few years, service delivery to customers has drastically changed. The introduction of Broadband Service Termination Architecture (BSTA) applications such as Voice over IP (VoIP), TV delivery, video, and high-speed Internet services forces carriers to produce services where the health and quality of Service Level Agreement (SLA) commitments are verifiable to the customer and internally within the carrier.
SAA is a feature that monitors network operations using statistics such as jitter, latency, response time, and packet loss. The information can be used for troubleshooting network problems, problem prevention, and network topology planning.
The results are saved in SNMP tables that are queried by either the CLI or a management system. Threshold monitors allow for both rising and falling threshold events to alert the provider if SLA performance statistics deviate from the required parameters.
SAA allows two-way timing for several applications. This provides the carrier and their customers with data to verify that the SLA agreements are being correctly enforced.
In the 7210 SAS, for various applications, such as IP traceroute, the control CPU inserts the timestamp in software.
When interpreting these timestamps, care must be taken because some nodes are not capable of providing timestamps; such timestamps must be associated with the same IP address that is returned to the originator to indicate which hop is being measured.
Because NTP precision can vary (+/- 1.5ms between nodes even under best case conditions), SAA one-way latency measurements might display negative values, especially when testing network segments with very low latencies. The one-way time measurement relies on the accuracy of NTP between the sending and responding nodes.
Loopback (LBM), linktrace (LTR), and two-way delay measurements (Y.1731 ETH-DMM) can be scheduled using SAA. Additional timestamping is required for non-Y.1731 delay-measurement tests, specifically the loopback and linktrace tests. An organization-specific TLV is used on both sender and receiver nodes to carry the timestamp information. Currently, timestamps are only applied by the sender node. This means that any time measurements resulting from loopback and linktrace tests include the packet processing time of the remote node. Because Y.1731 ETH-DMM uses a four-timestamp approach to remove the remote processing time, it should be used for accurate delay measurements.
The SAA versions of the CFM loopback, linktrace, and ETH-DMM tests support send-count, interval, timeout, and FC. The existing CFM OAM commands have not been extended to support send-count and interval natively. The summary of the test results is stored in an accounting file that is specified in the SAA accounting-policy.
SAA statistics enable writing statistics to an accounting file. When results are calculated, an accounting record is generated.
To write the SAA results to an accounting file in a compressed XML format at the termination of every test, the results must be collected, and, in addition to creating the entry in the appropriate MIB table for this SAA test, a record must be generated in the appropriate accounting file.
Because the SAA accounting files have a similar role to existing accounting files that are used for billing purposes, existing file management information is leveraged for these accounting (billing) files.
When an accounting file has been created, accounting information can be specified and is collected by the config>log>acct-policy to file log-file-id context.
When you configure a test, use the config>saa>test>continuous command to make the test run continuously. Use the no continuous command to disable continuous testing and shutdown to disable the test completely. When you have configured a test as continuous, you cannot start or stop it by using the saa test-name [owner test-owner] {start | stop} [no-accounting] command.
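The following sketch shows a continuously scheduled SAA test. The test name, accounting policy ID, MEP identifiers, and destination MAC are illustrative, and the eth-cfm-two-way-delay test-type line is an assumption whose exact syntax is release-dependent:

configure saa
    test "dmm-to-peer" create
        description "continuous two-way delay toward remote MEP"
        accounting-policy 10
        type
            eth-cfm-two-way-delay 00:00:00:11:22:33 mep 1 domain 10 association 2 fc "be"
        exit
        continuous
        no shutdown
    exit
exit

A test that is not configured as continuous is run manually with saa "dmm-to-peer" start and stopped with saa "dmm-to-peer" stop.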
The following is a sample SAA configuration output.
The 7210 SAS provides support for both the original Y.1564 testhead OAM tool implementation and an enhanced implementation of the tool, referred to as the service test testhead OAM tool. The 7210 SAS-D supports only the original Y.1564 testhead OAM tool implementation. The 7210 SAS-K 2F1C2T and 7210 SAS-K 2F6C4T support both the original and the enhanced implementations. The 7210 SAS-K 3SFP+ 8C supports only the enhanced implementations. For information about the enhanced implementation, see Service test testhead OAM Tool for the 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C in this section.
Note:
|
ITU-T Y.1564 defines the out-of-service test methodology to be used and the parameters to be measured to test service SLA conformance during service turn-up. It primarily defines two test phases. The first test phase defines the service configuration test, which consists of validating whether the service is configured correctly. As part of this test, the throughput, frame delay, frame delay variation (FDV), and frame loss ratio (FLR) are measured for each service. This test is typically run for a short duration. The second test phase consists of validating the quality of services delivered to the end customer and is referred to as the service performance test. These tests are typically run for a longer duration. All traffic is generated up to the configured rate for all the services simultaneously, and the service performance parameters are measured for each service.
The 7210 SAS supports the service configuration test for a user-configured rate and measurement of delay, delay variation, and frame loss ratio with the testhead OAM tool. The testhead OAM tool supports bidirectional measurement. On the 7210 SAS-D, the testhead OAM tool can generate test traffic for only one service at a specific time. On the 7210 SAS-K 2F1C2T and 7210 SAS-K 2F6C4T, the testhead OAM tool can generate test traffic for up to four services simultaneously. On the 7210 SAS-K 3SFP+ 8C, the testhead OAM tool can generate test traffic for up to eight streams or services simultaneously. The tool validates if the user-specified rate is available and computes the delay, delay variation, and frame loss ratio for the service under test at the specified rate. The tool is capable of generating traffic at a rate of up to 1 Gb/s on the 7210 SAS-D, 7210 SAS-K 2F1C2T, and 7210 SAS-K 2F6C4T. The tool is capable of generating traffic at a rate of up to approximately 10 Gb/s on the 7210 SAS-Dxp and 7210 SAS-K 3SFP+ 8C. On some 7210 SAS devices, the resources needed for this feature must be configured on the front-panel port; on other 7210 SAS devices, the resources needed for this feature are automatically allocated by software from the internal ports. See Configuration guidelines for information about which 7210 SAS platforms need user configuration.
The following figure shows the remote loopback required and the flow of the frame through the network generated by the testhead tool.
The tool allows the user to specify the frame payload header parameters independent of the test SAP configuration parameters. This capability gives the user flexibility to test different possible frame header encapsulations. The user can specify the appropriate VLAN tags, Ethertype, and Dot1p values independent of the SAP configuration, as with actual service testing. In other words, the software does not use the parameters (for example, SAP ID, source MAC, and destination MAC) during the invocation of the testhead tool to build the test frames. Instead, it uses only the parameters specified in the frame-payload CLI command. The software does not verify that the parameters specified match the service configuration used for testing; for example, the software does not check whether the specified VLAN tags match the SAP tags, whether the specified Ethertype matches the user-configured port Ethertype, and so on. It is expected that the user configures the frame-payload appropriately so that the traffic matches the SAP configuration.
7210 SAS-D and 7210 SAS-Dxp support Y.1564 testhead for performing CIR or PIR tests for both color-blind mode and color-aware mode. In color-aware mode, users can perform service turn-up tests to validate the performance characteristics (delay, jitter, and loss) for committed rate (CIR) and excess rate above CIR (that is, PIR rate). The testhead OAM tool uses the in-profile packet marking value and out-of-profile packet marking value to differentiate between committed traffic and PIR traffic in excess of CIR traffic. Traffic within CIR (that is, committed traffic) is expected to be treated as in-profile traffic in the network and traffic in excess of CIR (that is, PIR traffic) is expected to be treated as out-of-profile traffic in the network, allowing the network to prioritize committed traffic over PIR traffic. The testhead OAM tool allows the user to configure individual thresholds for green or in-profile packets and out-of-profile or yellow packets. It is used by the testhead OAM tool to compare the measured value for green or in-profile packets and out-of-profile or yellow packets against the configured thresholds and report success or failure.
Note: CIR and PIR tests in color-aware mode are only supported on the 7210 SAS-D and 7210 SAS-Dxp. |
The 7210 SAS testhead OAM tool supports the following functionality:
This section describes the prerequisites a user must be aware of before using the testhead OAM tool. It is divided into three sections. First, the generic prerequisites applicable to all 7210 SAS platforms are listed, followed by prerequisites specific to the 7210 SAS-D and 7210 SAS-Dxp platforms, and then followed by prerequisites specific to the 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C.
The following list describes some prerequisites for using the testhead tool on the 7210 SAS-D and 7210 SAS-Dxp:
The following list describes the prerequisites for using Y.1564 testhead OAM functionality on the 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C.
The 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C support the service test framework through the use of the service test testhead OAM tool. This tool allows for the configuration of multiple streams (also called flows) for which service performance metrics can be obtained. Refer to the Platform Scaling Guide for the number of streams supported by the various platforms. With multiple streams, it is possible to configure two service tests to validate two services each with two forwarding classes (FCs), validate a single service with four FCs, or validate a mix of services and FCs, as long as the number of streams is within the limit supported by the platform.
A set of streams under a single service test can be grouped together using the service-stream configuration commands and each stream can be configured with the options listed as follows:
The test results can be stored in an accounting record in XML format. The XML file contains the keywords and MIB references listed in the following table.
XML file keyword | Description | TIMETRA-SAS-OAM-Y1564-MIB |
acceptanceCriteriaId | Provides the ID of the acceptance criteria policy used to compare the measured results | tmnxY1564StreamAccCritId |
accountingPolicy | Provides the ID of the accounting policy, which determines the properties of the accounting record generated, such as how frequently to write records, rollover interval, and so on | tmnxY1564ServTestAccPolicy |
achievedThroughput | The throughput measured by the tool, as observed by measuring the rate of testhead packets received by the tool | tmnxY1564StreamResAchvdThruput |
cirAdaptRule | The adaptation rule to apply to the configured CIR rate to match it to hardware-supported rates | tmnxY1564StreamCIRAdaptation |
cirRate | The user-configured CIR rate | tmnxY1564StreamAdminCIR |
cirTestDur | The duration, in seconds, of the CIR configuration test | tmnxY1564ServTestCirTestDuration |
cirThreshold | The CIR rate threshold to compare with the measured value | tmnxY1564AccCritCirThres |
dataPattern | The data pattern to include in the packet generated by the service testhead tool | tmnxY1564PayLddataPattrn |
description | The user-configured description for the test | tmnxY1564ServTestDescription |
desiredThroughput | The user-configured rate that is the target to achieve. The desired throughput value is either the user-configured CIR rate or PIR rate, based on the test type. | tmnxY1564StreamResDesiredThruput |
dstIp | The destination IP address to use in the packet generated by the tool | tmnxY1564PayLdDstIpv4Addr |
dstMac | The destination MAC address to use in the packet generated by the tool | tmnxY1564PayLdDstMac |
dstPort | The destination TCP/UDP port to use in the packet generated by the tool | tmnxY1564PayLdDstPort |
endTime | The time (wall-clock time) the test was completed | tmnxY1564ServTestCompletionTime |
etherType | The Ethertype value to use in the packet generated by the tool | tmnxY1564PayLdEthertype |
fc | The forwarding class for which the tool is being used to measure the performance metrics | tmnxY1564StreamFc |
fixedFrameSize | The frame size to use for the generated packet; used to specify a single value for all frames generated by the tool | tmnxY1564StreamFrameSize |
flr | The measured frame loss ratio | tmnxY1564StreamResMeasuredFLR |
flrAcceptance | Indicates whether the measured FLR is within the configured loss threshold | tmnxY1564StreamResFLRAcceptanceResult |
frameLossThreshold | The loss threshold configured in the acceptance criteria | tmnxY1564AccCritLossRiseThres |
frameMixId | The ID of the frame-mix policy. The testhead tool generates packets sizes as specified in the frame-mix policy. This is used to specify a mix of frames with different sizes to be generated by the tool. | tmnxY1564StreamFrameMixId |
framePayloadId | The ID of the frame payload. The frame payload defines the format of the payload and provides the frame/packet header values and data pattern to use for the payload. | tmnxY1564StreamPayLdId |
id | Provides information about either the frame mix policy ID, acceptance criteria policy ID, or frame payload ID to use for the service test, depending on the context it appears in | tmnxY1564StreamFrameMixId tmnxY1564StreamAccCritId tmnxY1564StreamPayLdId |
ipDscp | The IP DSCP value used in the frame payload | tmnxY1564PayLdDSCP |
ipProto | The IP protocol value used in the frame payload | tmnxY1564PayLdIpProto |
ipTos | The IP ToS bits value used in the frame payload | tmnxY1564PayLdIpTos |
ipTtl | The IP TTL value used in the frame payload | tmnxY1564PayLdIpTTL |
jitter | The measured jitter value | tmnxY1564StreamResMeasuredJitter |
jitterAcceptance | Indicates whether the measured jitter is within the configured jitter threshold | tmnxY1564StreamResJitterAcceptanceResult |
jitterThreshold | The jitter threshold configured in the acceptance criteria | tmnxY1564AccCritJittrRiseThres |
latencyAcceptance | Indicates whether the measured latency is within the configured latency threshold | tmnxY1564StreamResLatencyAcceptanceResult |
latencyAvg | The average of latency values computed for the test stream | tmnxY1564StreamResMeasuredLatency |
latencyMax | The maximum value of latency measured by the tool | tmnxY1564StreamResMaxLatency |
latencyMin | The minimum value of latency measured by the tool | tmnxY1564StreamResMinLatency |
latencyThreshold | The latency threshold configured in the acceptance criteria | tmnxY1564AccCritLatRiseThres |
measuredCir | The measured CIR rate | tmnxY1564StreamResMeasuredCIR |
measuredpir | The measured PIR rate | tmnxY1564StreamResMeasuredPIR |
measuredThroughput | The measured throughput | tmnxY1564StreamResMeasuredThruput |
mfactor | A factor to use as a margin by which the observed throughput is off from the configured throughput to determine whether a service test passes or fails | tmnxY1564AccCritUseMFactor |
perfTestDur | The duration, in seconds, of the performance test | tmnxY1564ServTestPerformanceTestDuration |
pirAdaptRule | The PIR adaptation rule used | tmnxY1564StreamPIRAdaptation |
pirRate | The PIR rate configured | tmnxY1564StreamAdminPIR |
pirTestDur | The PIR test duration, in seconds | tmnxY1564ServTestCirPirTestDuration |
pirThreshold | The PIR threshold configured in the acceptance criteria | tmnxY1564AccCritPirThres |
pktCountRx | The received packet count | tmnxY1564StreamResRecvCount |
pktCountTx | The transmitted packet count | tmnxY1564StreamResTransCount |
policingTestDur | The policing test duration, in seconds | tmnxY1564ServTestPolicingTestDuration |
resultStatus | Indicates whether the stream has passed or failed | tmnxY1564StreamResStatus |
runningInstance | A counter used to indicate the run instance of the test | tmnxY1564ServTestRunningInstance |
sap | The SAP used as the test endpoint | tmnxY1564StreamSapPortId tmnxY1564StreamSapEncapValue |
sequence | The sequence of payload sizes specified in the frame-mix policy | tmnxY1564StreamFrameMixSeq |
serviceTest | A tag to indicate the start of the service test in the accounting record | None |
sizeA | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeA specifies the frame size for the packet identified with the letter ‘a’ in the frame sequence. | tmnxY1564FrameMixSizeA |
sizeB | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeB specifies the frame size for the packet identified with the letter ‘b’ in the frame sequence. | tmnxY1564FrameMixSizeB |
sizeC | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeC specifies the frame size for the packet identified with the letter ‘c’ in the frame sequence. | tmnxY1564FrameMixSizeC |
sizeD | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeD specifies the frame size for the packet identified with the letter ‘d’ in the frame sequence. | tmnxY1564FrameMixSizeD |
sizeE | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeE specifies the frame size for the packet identified with the letter ‘e’ in the frame sequence. | tmnxY1564FrameMixSizeE |
sizeF | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeF specifies the frame size for the packet identified with the letter ‘f’ in the frame sequence. | tmnxY1564FrameMixSizeF |
sizeG | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeG specifies the frame size for the packet identified with the letter ‘g’ in the frame sequence. | tmnxY1564FrameMixSizeG |
sizeH | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeH specifies the frame size for the packet identified with the letter ‘h’ in the frame sequence. | tmnxY1564FrameMixSizeH |
sizeU | A frame sequence can be configured by the user to indicate the sequence of frame sizes to be generated by the tool. The sequence of frames is specified using letters a to h and u. sizeU specifies the frame size for the packet identified with the letter ‘u’ in the frame sequence. sizeU is the user-defined packet size. | tmnxY1564FrameMixSizeU |
srcIp | The source IP address in the frame payload generated by the tool | tmnxY1564PayLdSrcIpv4Addr |
srcMac | The source MAC address in the frame payload generated by the tool | tmnxY1564PayLdSrcMac |
srcPort | The source TCP/UDP port in the frame payload generated by the tool | tmnxY1564PayLdSrcPort |
startTime | The time at which the test was started (wall clock time) | tmnxY1564ServTestStartTime |
streamId | The stream identifier | tmnxY1564StreamId |
streamOrdered | Indicates if the streams configured for the service test were run one after another or run in parallel | tmnxY1564ServTestStreamOrder |
testCompleted | Indicates if the test was completed or not | tmnxY1564StreamResCompleted |
testCompletion | The execution status of the test (either completed or running) | tmnxY1564ServTestCompletion |
testDuration | The duration of the entire test (including all test types) | tmnxY1564ServTestTime |
testIndex | The service test index configured | tmnxY1564ServTestIndex |
testResult | Indicates the result of the test | tmnxY1564ServTestTestResult |
testStopped | Indicates if the test was stopped and did not complete | tmnxY1564ServTestStopped |
testTime | The time taken for each stream. If the test is stopped, the time given is the execution time of the stream up until it was stopped. | tmnxY1564StreamResTestTime |
testTypeCir | The CIR configuration test | tmnxY1564StreamTests |
testTypeCirPir | The CIR-PIR configuration test | tmnxY1564StreamTests |
testTypePerf | The performance test | tmnxY1564StreamTests |
testTypePolicing | The policing test | tmnxY1564StreamTests |
throughputAcceptance | Indicates whether the measured throughput matches the configured CIR/PIR rate | tmnxY1564StreamResThruputAcceptanceResult |
trapEnabled | Indicates if the trap needs to be sent on completion of the test | tmnxY1564ServTestTrapEnable |
type | The test type configured for the stream ID | tmnxY1564PayLdType |
vlanTag1Dei | The DEI value set for VLAN Tag #1 (outermost VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagOneDei |
vlanTag1Id | The VLAN ID value set for VLAN Tag #1 (outermost VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagOne |
vlanTag1Tpid | The VLAN Tag TPID value set for VLAN Tag #1 (outermost VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagOneTpid |
vlanTag2Dei | The DEI value set for VLAN Tag #2 (inner VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagTwoDei |
vlanTag2Dot1p | The Dot1p value set for VLAN Tag #2 (inner VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagTwoDot1p |
vlanTag2Id | The VLAN ID value set for VLAN Tag #2 (inner VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagTwo |
vlanTag2Tpid | The VLAN tag TPID value set for VLAN Tag #2 (inner VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagTwoTpid |
vlanTag1Dot1p | The Dot1p value set for VLAN Tag #1 (outermost VLAN tag) in the frame payload generated by the tool | tmnxY1564PayLdVTagOneDot1p |
On the 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C, the service test testhead tool can be configured to generate a flow containing a sequence of frames of different sizes. The frame sizes and designations are defined by ITU-T Y.1564 and listed in the following table.
a | b | c | d | e | f | g | h | u |
64 | 128 | 256 | 512 | 1024 | 1280 | 1518 | MTU | User defined |
The test tool can be configured to generate a flow containing frames with sizes ranging from 64 bytes to 9212 bytes using the designated letters to specify the frame size to use. The frame size is configured using the fixed-size command. The configured size applies to all the frames in the test flow.
It is also possible to configure the test tool to generate frames of different sizes in a flow. The frame-mix command creates a template that specifies the sizes of the frames to be generated by the testhead tool. The frame sizes are defined in ITU-T Y.1564. The frame-sequence command configures a string of up to 16 characters that specifies the order in which the frames are to be generated by the testhead tool for a specific template. For example, a frame-sequence configured as aabcdduh indicates that a packet of size-a configured in the specified frame-mix template should be generated first, followed by another packet of size-a, followed by a packet of size-b, and so on until a packet of size-h is generated. The tool then starts again and repeats the sequence until the rate of packets generated matches the required rate.
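The following sketch illustrates the relationship between a frame-mix template and a frame-sequence. The full context path is omitted, and the keyword forms (frame-mix template ID, size-u) are assumptions based on the field names in the preceding table:

# within the service test testhead configuration context (path omitted)
frame-mix 1 create
    size-u 2000                  # user-defined size for the letter "u"
    frame-sequence "aabcdduh"    # order in which sizes a, b, c, d, u, and h are generated
exit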
This section lists the configuration guidelines for the testhead OAM tool. These guidelines apply to all platforms described in this guide unless a specific platform is called out explicitly. As well, these guidelines apply to both the testhead OAM tool and service test testhead OAM tool, unless indicated otherwise:
Epipe service configured with svc-sap-type | Test SAP encapsulations |
null-star | Null, :* , 0.* , Q.* |
Any | Null , :0 , :Q , :Q1.Q2 |
dot1q-preserve | :Q |
Note:
|
The following output is an example of CAM resource allocation for use by the OAM testhead tool.
Note: The user does not need to allocate loopback ports on the 7210 SAS-D. They are allocated by the software and can be displayed using the show system internal-loopback-ports command. |
The following output is an example of internal-loopback-ports information.
The preceding example shows that virtual ports 1/1/11 and 1/1/13 are allocated for the MAC-swap and testhead applications.
The following output is an example of the testhead profile.
Before starting the test, return the testhead packets to the local node by ensuring that loopback with mac-swap is configured on the remote end point.
The following is an example of the CLI command to start the testhead session.
The following is an example of Result and Sample testhead output.
The following CLI command stops a testhead session.
The following CLI command clears testhead session results.
Note: This feature is only supported on the 7210 SAS-K 2F1C2T, 7210 SAS-K 2F6C4T, and 7210 SAS-K 3SFP+ 8C. |
The following output is an example of frame-mix parameters.
Note: In the following example, fixed-sized frames are used. The device provides an option to configure a stream with frames of different frame sizes using the frame-mix option shown in the preceding example. |
The following output is an example of frame-payload parameters.
The following output is an example of acceptance-criteria parameters.
The following output is an example of a service-test configuration.
The following output is an example of the CLI command used to start a service test.
The following output is an example of show service-test results.
The following output is an example of an overview of all the service test results using result-summary.
Note: Port loopback with MAC swap is only supported on the 7210 SAS-D. |
7210 SAS devices support port loopback for Ethernet ports. There are two types of port loopback commands: port loopback without MAC swap and port loopback with MAC swap. Both commands are helpful for testing the service configuration and measuring performance parameters such as throughput, delay, and jitter on service turn-up. Typically, a third-party external test device is used to inject packets at the desired rate into the service at a central office location. For detailed information about port loopback functionality, refer to the 7210 SAS-D, Dxp, K 2F1C2T, K 2F6C4T, K 3SFP+ 8C Interface Configuration Guide.
Note: Per-SAP loopback with MAC swap is supported on all 7210 SAS platforms as described in this document, except the 7210 SAS-D. |
Per-SAP loopback with MAC swap is useful for testing the service configuration and measuring performance parameters such as throughput, delay, and jitter on service turn-up. Typically, the testhead OAM tool is used to inject packets at the desired rate into the service from the remote site. It is also possible to use a third-party external test device to inject packets at the desired rate into the service at a central office location. For detailed information about SAP loopback functionality, refer to the 7210 SAS-D, Dxp, K 2F1C2T, K 2F6C4T, K 3SFP+ 8C Services Guide.
OAM-PM provides an architecture for gathering and computing key performance indicators (KPIs) using standard protocols and a robust collection model. The architecture comprises the following foundational components:
The following figure shows the hierarchy of the architecture. This diagram is only meant to show the relationship between the components. It is not meant to depict all details of the required parameters.
OAM-PM configurations are not dynamic environments. All aspects of the architecture must be carefully considered before configuring the various architectural components, making external references to other related components, or activating the OAM-PM architecture. No modifications are allowed to any components that are active or have any active subcomponents. Any function being referenced by an active OAM-PM function or test cannot be modified or shut down. For example, to change any configuration element of a session, all active tests must be in a shutdown state. To change any bin group configuration (described later in this section), all sessions that reference the bin group must have every test shut down. The description parameter is the only exception to this rule.
Session source and destination configuration parameters are not validated by the test that makes use of that information. When the test is activated with a no shutdown command, the test engine will attempt to send the test packets even if the session source and destination information does not accurately represent the entity that must exist to successfully transmit packets. If the entity does not exist, the transmit count for the test will be zero.
OAM-PM is not a hitless operation. If a high availability event occurs that causes the backup CPM to become the active CPM, or when ISSU functions are performed, the test data will not be correctly reported. There is no synchronization of state between the active and the backup control modules. All OAM-PM statistics stored in volatile memory will be lost. When the reload or high availability event is completed and all services are operational, the OAM-PM functions will commence.
It is possible that during times of network convergence, high CPU utilizations, or contention for resources, OAM-PM may not be able to detect changes to an egress connection or allocate the necessary resources to perform its tasks.
Note: OAM-PM is supported on all 7210 SAS platforms as described in this document, except the 7210 SAS-D. |
This is the overall collection of different tests, the test parameters, measurement intervals, and mapping to configured storage models. It is the overall container that defines the attributes of the session:
The session can be viewed as the single container that brings all aspects of individual tests and the various OAM-PM components under a single umbrella. If any aspects of the session are incomplete, the individual test cannot be activated with a no shutdown command, and an “Invalid ethernet session parameters” error will occur.
A number of standards bodies define performance monitoring packets that can be sent from a source, processed, and responded to by a reflector. The protocols available to carry out the measurements are based on the test family type configured for the session.
Ethernet PM delay measurements are carried out using the Two Way Delay Measurement Protocol version 1 (DMMv1) defined in Y.1731 by the ITU-T. This allows for the collection of Frame Delay (FD), InterFrame Delay Variation (IFDV), Frame Delay Range (FDR), and Mean Frame Delay (MFD) measurements for round trip, forward, and backward directions.
DMMv1 adds the following to the original DMM definition:
DMMv1 and DMM are backwards compatible; the interaction is defined in ITU-T Y.1731 (2011), Section 11, "OAM PDU validation and versioning".
Ethernet PM loss measurements are carried out using Synthetic Loss Measurement (SLM), which is defined in Y.1731 by the ITU-T. This allows for the calculation of Frame Loss Ratio (FLR) and availability.
A session can be configured with one or more tests. Depending on the session test type family, one or more test configurations may need to be included in the session to gather both delay and loss performance information. Each test that is configured shares the common session parameters and the common measurement intervals. However, each test can be configured with unique per-test parameters. Using Ethernet as an example, both DMM and SLM would be required to capture both delay and loss performance data.
Each test must be configured with a TestID as part of the test parameters, which uniquely identifies the test within the specific protocol. A TestID must be unique within the same test protocol. Again using Ethernet as an example, DMM and SLM tests within the same session can use the same TestID because they are different protocols. However, if a TestID is applied to a test protocol (like DMM or SLM) in any session, it cannot be used for the same protocol in any other session. When a TestID is carried in the protocol, as it is with DMM and SLM, this value does not have global significance. When a responding entity must index for the purpose of maintaining sequence numbers, as in the case of SLM, the TestID, Source MAC, and Destination MAC are used to maintain the uniqueness of the responder. This means that the TestID has only local, and not global, significance.
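The indexing rule for SLM responders can be illustrated with a short sketch. The data structure and names below are assumptions used only to show why a TestID combined with the source and destination MACs, rather than the TestID alone, keeps reflector state unique; this is not the device implementation.

```python
# Conceptual sketch: keying responder sequence-number state by
# (TestID, source MAC, destination MAC) gives the TestID local significance only.
responder_state = {}

def responder_key(test_id, src_mac, dst_mac):
    return (test_id, src_mac.lower(), dst_mac.lower())

key = responder_key(100, "00:AA:BB:CC:DD:01", "00:AA:BB:CC:DD:02")
responder_state.setdefault(key, {"next_expected_seq": 1})
print(responder_state)
```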
A measurement interval is a window of time that compartmentalizes the gathered measurements for an individual test that have occurred during that time. Allocation of measurement intervals, which equates to system memory, is based on the metrics being collected. This means that when both delay and loss metrics are being collected, they allocate their own set of measurement intervals. If the operator is executing multiple delay and loss tests under a single session, then multiple measurement intervals will be allocated, with one interval allocated per criteria per test.
Measurement intervals can be 15 minutes (15-min), 1 hour (1-hour), or 1 day (1-day) in duration. The boundary-type defines the start of the measurement interval and can be aligned to the local time of day clock, with or without an optional offset. The boundary-type can also be aligned using the test-aligned option, which means that the start of the measurement interval coincides with the activation of the test. By default, the start boundary is clock-aligned without an offset. When this configuration is deployed, the measurement interval starts at zero, in relation to its length. When a boundary is clock-aligned and an offset is configured, the specified amount of time is applied to the measurement interval. Offsets are configured on a per-measurement-interval basis and are only applicable to clock-aligned measurement intervals. Only offsets less than the measurement interval duration are allowed. The following table describes some examples of the start times of each measurement interval; a short sketch after the table illustrates the same computation.
Offset | 15-min | 1-hour | 1-day |
0 (default) | 00, 15, 30, 45 | 00 (top of the hour) | midnight |
10 minutes | 10, 25, 40, 55 | 10 min after the hour | 10 min after midnight |
30 minutes | rejected | 30 min after the hour | 30 min after midnight |
60 minutes | rejected | rejected | 01:00 AM |
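The clock-aligned start times in the preceding table can be reproduced with a short calculation. The sketch below is illustrative only; the function name and the rejection handling are assumptions made for this example.

```python
# Compute the daily start times of clock-aligned measurement intervals
# for a given offset, matching the rows of the preceding table.
from datetime import timedelta

def start_times(interval_minutes, offset_minutes, span_minutes=24 * 60):
    if offset_minutes >= interval_minutes:
        return "rejected"            # offsets must be shorter than the interval
    times, t = [], offset_minutes
    while t < span_minutes:
        times.append(str(timedelta(minutes=t))[:-3].zfill(5))  # HH:MM
        t += interval_minutes
    return times

print(start_times(15, 10)[:4])   # ['00:10', '00:25', '00:40', '00:55']
print(start_times(60, 30)[:2])   # ['00:30', '01:30']
print(start_times(15, 30))       # rejected
```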
Although test-aligned approaches may seem simpler, there are drawbacks that need to be considered. Time-based, well-defined collection windows allow measurements to be compared across common windows of time throughout the network and allow different tests or sessions to be related. It is suggested that proactive sessions use the default clock-aligned boundary type. On-demand sessions may make use of test-aligned boundaries. On-demand tests are typically used for troubleshooting or short-term monitoring that does not require alignment or comparison to other PM data.
The statistical data collected and the computed results from each measurement interval are maintained in volatile system memory by default. The number of intervals stored is configurable per measurement interval. Different measurement intervals will have different defaults and ranges. The interval-stored parameter defines the number of completed individual test runs to store in volatile memory. There is an additional allocation to account for the active measurement interval. To look at the statistical information for the individual tests and a specific measurement interval stored in volatile memory, the show oam-pm statistics … interval-number command can be used. If there is an active test, it can be viewed by using the interval number 1. In this case, the first completed record would be interval number 2, and previously completed records would increment up to the maximum intervals stored value plus one.
As new tests for the measurement interval are completed, the older entries are renumbered to maintain their relative position to the current test. If the retained test data for a measurement interval consumes the final entry, any subsequent entries cause the removal of the oldest data.
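A simplified model of this numbering scheme is sketched below. The class and attribute names are assumptions for illustration; the sketch only shows how interval number 1 maps to the active run, how completed runs shift to higher numbers, and how the oldest run is dropped once the configured depth is exceeded.

```python
# Simplified model of the volatile-memory interval numbering described above.
from collections import deque

class MeasurementIntervalStore:
    def __init__(self, intervals_stored):
        # Completed runs, newest first; the oldest is evicted automatically.
        self.completed = deque(maxlen=intervals_stored)
        self.active = {"run": 1}

    def complete_interval(self):
        self.completed.appendleft(self.active)
        self.active = {"run": self.active["run"] + 1}

    def interval(self, number):
        # interval-number 1 is the active run; 2 is the most recently completed run.
        return self.active if number == 1 else self.completed[number - 2]

store = MeasurementIntervalStore(intervals_stored=4)
for _ in range(6):
    store.complete_interval()
print(store.interval(1)["run"], store.interval(2)["run"])   # 7 6
```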
There are drawbacks to this storage model. Any high availability function that causes an active CPM switch will flush the results that are in volatile memory. Another consideration is the large amount of system memory consumed using this type of model. Due to the risks and resource consumption this model incurs, an alternate method of storage is supported. An accounting policy can be applied to each measurement interval to write the completed data in system memory to non-volatile flash memory in an XML format. The amount of system memory consumed by historically completed test data must be balanced with an appropriate accounting policy. It is recommended that only necessary data be stored in non-volatile memory to avoid unacceptable risk and unnecessary resource consumption. It is further suggested that a large overlap between the data written to flash memory and the data retained in volatile memory is unnecessary.
The statistical information in system memory is also available through SNMP. If this method is chosen, a balance must be struck between the intervals retained and the times at which the SNMP queries collect the data. Determining the collection times through SNMP must be done with caution. If a file is completed while another file is being retrieved through SNMP, then the indexing will change to maintain the relative position to the current run. Correct spacing of the collection is key to ensuring data integrity.
The OAM-PM XML file contains the keywords and MIB references listed in the following table.
XML file keyword | Description | TIMETRA-OAM-PM-MIB object |
oampm | — | None - header only |
Keywords shared by all OAM-PM protocols | ||
sna | OAM-PM session name | tmnxOamPmCfgSessName |
mi | Measurement interval record | None - header only |
dur | Measurement interval duration (minutes) | tmnxOamPmCfgMeasIntvlDuration (enumerated) |
ivl | Measurement interval number | tmnxOamPmStsIntvlNum |
sta | Start timestamp | tmnxOamPmStsBaseStartTime |
ela | Elapsed time (seconds) | tmnxOamPmStsBaseElapsedTime |
ftx | Frames sent | tmnxOamPmStsBaseTestFramesTx |
frx | Frames received | tmnxOamPmStsBaseTestFramesRx |
sus | Suspect flag | tmnxOamPmStsBaseSuspect |
dmm | Delay record | None - header only |
mdr | Minimum frame delay, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xdr | Maximum frame delay, round-trip | tmnxOamPmStsDelayDmm2wyMax |
adr | Average frame delay, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mdf | Minimum frame delay, forward | tmnxOamPmStsDelayDmmFwdMin |
xdf | Maximum frame delay, forward | tmnxOamPmStsDelayDmmFwdMax |
adf | Average frame delay, forward | tmnxOamPmStsDelayDmmFwdAvg |
mdb | Minimum frame delay, backward | tmnxOamPmStsDelayDmmBwdMin |
xdb | Maximum frame delay, backward | tmnxOamPmStsDelayDmmBwdMax |
adb | Average frame delay, backward | tmnxOamPmStsDelayDmmBwdAvg |
mvr | Minimum inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xvr | Maximum inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyMax |
avr | Average inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mvf | Minimum inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdMin |
xvf | Maximum inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdMax |
avf | Average inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdAvg |
mvb | Minimum inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdMin |
xvb | Maximum inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdMax |
avb | Average inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdAvg |
mrr | Minimum frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xrr | Maximum frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyMax |
arr | Average frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mrf | Minimum frame delay range, forward | tmnxOamPmStsDelayDmmFwdMin |
xrf | Maximum frame delay range, forward | tmnxOamPmStsDelayDmmFwdMax |
arf | Average frame delay range, forward | tmnxOamPmStsDelayDmmFwdAvg |
mrb | Minimum frame delay range, backward | tmnxOamPmStsDelayDmmBwdMin |
xrb | Maximum frame delay range, backward | tmnxOamPmStsDelayDmmBwdMax |
arb | Average frame delay range, backward | tmnxOamPmStsDelayDmmBwdAvg |
fdr | Frame delay bin record, round-trip | None - header only |
fdf | Frame delay bin record, forward | None - header only |
fdb | Frame delay bin record, backward | None - header only |
fvr | Inter-frame delay variation bin record, round-trip | None - header only |
fvf | Inter-frame delay variation bin record, forward | None - header only |
fvb | Inter-frame delay variation bin record, backward | None - header only |
frr | Frame delay range bin record, round-trip | None - header only |
frf | Frame delay range bin record, forward | None - header only |
frb | Frame delay range bin record, backward | None - header only |
lbo | Configured lower bound of the bin | tmnxOamPmCfgBinLowerBound |
cnt | Number of measurements within the configured delay range Note: The session_name, interval_duration, interval_number, {fd, fdr, ifdv}, bin_number, and {forward, backward, round-trip} indices are provided by the surrounding XML context | tmnxOamPmStsDelayDmmBinFwdCount tmnxOamPmStsDelayDmmBinBwdCount tmnxOamPmStsDelayDmmBin2wyCount |
slm | Synthetic loss measurement record | None - header only |
txf | Transmitted frames in the forward direction | tmnxOamPmStsLossSlmTxFwd |
rxf | Received frames in the forward direction | tmnxOamPmStsLossSlmRxFwd |
txb | Transmitted frames in the backward direction | tmnxOamPmStsLossSlmTxBwd |
rxb | Received frames in the backward direction | tmnxOamPmStsLossSlmRxBwd |
avf | Available count in the forward direction | tmnxOamPmStsLossSlmAvailIndFwd |
avb | Available count in the backward direction | tmnxOamPmStsLossSlmAvailIndBwd |
uvf | Unavailable count in the forward direction | tmnxOamPmStsLossSlmUnavlIndFwd |
uvb | Unavailable count in the backward direction | tmnxOamPmStsLossSlmUnavlIndBwd |
uaf | Undetermined available count in the forward direction | tmnxOamPmStsLossSlmUndtAvlFwd |
uab | Undetermined available count in the backward direction | tmnxOamPmStsLossSlmUndtAvlBwd |
uuf | Undetermined unavailable count in the forward direction | tmnxOamPmStsLossSlmUndtUnavlFwd |
uub | Undetermined unavailable count in the backward direction | tmnxOamPmStsLossSlmUndtUnavlBwd |
hlf | Count of HLIs in the forward direction | tmnxOamPmStsLossSlmHliFwd |
hlb | Count of HLIs in the backward direction | tmnxOamPmStsLossSlmHliBwd |
chf | Count of CHLIs in the forward direction | tmnxOamPmStsLossSlmChliFwd |
chb | Count of CHLIs in the backward direction | tmnxOamPmStsLossSlmChliBwd |
mff | Minimum FLR in the forward direction | tmnxOamPmStsLossSlmMinFlrFwd |
xff | Maximum FLR in the forward direction | tmnxOamPmStsLossSlmMaxFlrFwd |
aff | Average FLR in the forward direction | tmnxOamPmStsLossSlmAvgFlrFwd |
mfb | Minimum FLR in the backward direction | tmnxOamPmStsLossSlmMinFlrBwd |
xfb | Maximum FLR in the backward direction | tmnxOamPmStsLossSlmMaxFlrBwd |
afb | Average FLR in the backward direction | tmnxOamPmStsLossSlmAvgFlrBwd |
By default, the 15-min measurement interval stores 33 test runs (32+1) with a configurable range of 1 to 96, and the 1-hour measurement interval stores 9 test runs (8+1) with a configurable range of 1 to 24. The only storage for the 1-day measurement interval is 2 (1+1). This value for the 1-day measurement interval cannot be changed.
All three measurement intervals may be added to a single session if required. Each measurement interval that is included in a session is updated simultaneously for each test that is executing. If a measurement interval length is not required, it should not be configured. In addition to the three predetermined-length measurement intervals, a fourth “always on” raw measurement interval is allocated at test creation. Data collection for the raw measurement interval commences immediately following the execution of a no shutdown command. It is a valuable tool for assisting in real-time troubleshooting because it maintains the same performance information and relates to the same bins as the fixed-length collection windows. The operator may clear the contents of the raw measurement interval to flush stale statistical data and look at current conditions. This measurement interval has no configuration options, cannot be written to flash memory, and cannot be disabled; it is a single never-ending window.
Memory allocation for the measurement intervals is performed when the test is configured. Volatile memory is not flushed until the test is deleted from the configuration, a high availability event causes the backup CPM to become the newly active CPM, or some other event clears the active CPM system memory. Shutting down a test does not release the allocated memory for the test.
Measurement intervals also include a suspect flag. The suspect flag is used to indicate that data collected in the measurement interval may not be representative. The flag will be set to true only under the following conditions:
The suspect flag is not set during times of service disruption, maintenance windows, discontinuity, low packet counts, or other such events. Higher-level systems are required to interpret and correlate those types of events for the measurement intervals that executed during the time of the specific interruption or condition. Because each measurement interval contains a start and stop time, the information is readily available for higher-level systems to discount the specific windows of time.
There are two main metrics that are the focus of OAM-PM: delay and loss. The different metrics have two unique storage structures and will allocate their own measurement intervals for these structures. This occurs regardless of whether the performance data is gathered with a single packet or multiple packet types.
Delay metrics include Frame Delay (FD), InterFrame Delay Variation (IFDV), Frame Delay Range (FDR) and Mean Frame Delay (MFD). Unidirectional and round trip results are stored for each metric:
FD, IFDV, and FDR statistics are binnable results. FD, IFDV, FDR, and MFD all include minimum, maximum, and average values.
Unidirectional frame delay and frame delay range measurements require exceptional time of day clock synchronization. If the time of day clock does not exhibit extremely tight synchronization, unidirectional measurements will not be representative. In one direction, the measurement will be artificially increased by the difference in the clocks. In the other direction, the measurement will be artificially decreased by the difference in the clocks. This level of clocking accuracy is not available with NTP. To achieve this level of time of day clock synchronization, Precision Time Protocol (PTP) 1588v2 should be considered.
Round trip metrics do not require clock synchronization between peers, since the four timestamps allow for accurate representation of the round trip delay. The mathematical computation removes remote processing and any difference in time of day clocking. Round trip measurements do require stable local time of day clocks.
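The round-trip computation can be shown with the standard four-timestamp arithmetic. The variable names below are illustrative; the point is that the reflector's residence time and any clock offset between the two nodes cancel out of the result.

```python
# Two-way delay from four timestamps (Y.1731-style DMM), in microseconds:
# t1 = sent by source, t2 = received by reflector,
# t3 = sent by reflector, t4 = received back at the source.
def round_trip_delay(t1, t2, t3, t4):
    return (t4 - t1) - (t3 - t2)

# Even with the reflector clock offset by a full second (1 000 000 us),
# the round-trip result is unaffected.
print(round_trip_delay(100, 1_000_600, 1_000_650, 1_250))   # 1100 us
```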
Any delay metric that is negative will be treated as zero and placed in bin 0, the lowest bin which has a lower boundary of 0 microseconds.
Delay results are mapped to the measurement interval that is active when the result arrives back at the source.
There are no supported log events based on delay metrics.
Loss metrics are only unidirectional and will report frame loss ratio (FLR) and availability information. Frame loss ratio is the computation of loss (lost/sent) over time. Loss measurements during periods of unavailability are not included in the FLR calculation as they are counted against the unavailability metric.
Availability requires relating three different functions. First, the individual probes are marked as available or unavailable based on sequence numbers in the protocol. A number of probes are rolled up into a small measurement window, typically 1 s. Frame loss ratio is computed over all the probes in a small window. If the resulting percentage is higher than the configured threshold, the small window is marked as unavailable. If the resulting percentage is lower than the threshold, the small window is marked as available. A sliding window is defined as some number of small windows, typically 10. The sliding window is used to determine availability and unavailability events. Switching from one state to the other requires every small window in the sliding window to be the same state and different from the current state.
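The following sketch restates that logic in code. It is a conceptual model only, not the device algorithm: the 50% threshold and 10-window sliding window are example values, and the function names are assumptions.

```python
# Conceptual model of availability determination from synthetic loss probes.
def small_window_unavailable(sent, lost, flr_threshold=0.5):
    """A small window (typically 1 s of probes) is unavailable when its FLR
    meets or exceeds the configured threshold."""
    return sent > 0 and (lost / sent) >= flr_threshold

def update_state(current_state, sliding_window):
    """State changes only when every small window in the sliding window agrees
    and differs from the current state ('available' or 'unavailable')."""
    if all(sliding_window) and current_state == "available":
        return "unavailable"
    if not any(sliding_window) and current_state == "unavailable":
        return "available"
    return current_state

state = "available"
window = [small_window_unavailable(10, lost) for lost in [6] * 10]  # ten lossy seconds
print(update_state(state, window))   # unavailable
```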
Availability and unavailability counters are incremented based on the number of small windows that have occurred in all available and unavailable windows.
Availability and unavailability using synthetic loss measurements is meant to capture the loss behavior for the service. It is not meant to capture and report on service outages or communication failures. Communication failures of a bidirectional or unidirectional nature must be captured using some other means of connectivity verification, alarming, or continuity checking. During complete or extended failure periods, it becomes necessary to time out individual test probes. It is not possible to determine the direction of the loss because no response packets are received back at the source. In this case, the statistics calculation engine maintains the previous state, updating the appropriate directional availability or unavailability counter. At the same time, an additional per-direction undetermined counter is updated. This undetermined counter is used to indicate that the availability or unavailability statistics could not be determined for a number of small windows.
During connectivity outages, the higher level systems can be used to discount the loss measurement interval, which covers the same span as the outage.
Availability and unavailability computations may delay the completion of a measurement interval. The declaration of a state change, or the delay in closing a measurement interval, could be equal to the length of the sliding window plus the timeout of the last packet. Closing of a measurement interval cannot occur until the sliding window has determined availability or unavailability. If the availability state is changing and the determination is crossing two measurement intervals, the measurement interval will not complete until the declaration has occurred. Typically, standards bodies indicate the timeout per packet. In the case of Ethernet, DMMv1, and SLM, timeout values are set at 5 s and cannot be configured.
There are no log events based on availability or unavailability state changes.
During times of availability, there can be high loss intervals (HLI) or consecutive high loss intervals (CHLI). These are indicators that the service was available but individual small windows or consecutive small windows experienced frame loss ratios exceeding the configured acceptable limit. An HLI is any single small window that exceeds the configured frame loss ratio. This could equate to a severely errored second, assuming the small window is one second. A CHLI is a run of high loss intervals that exceeds a consecutive threshold within the sliding window. Only one HLI will be counted for a window.
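A small illustration of HLI and CHLI counting follows. The threshold and the consecutive-window count are placeholder assumptions, not values mandated by this guide, and the counting convention (one CHLI per qualifying run) is a simplification for the example.

```python
# Count high loss intervals (HLI) and consecutive high loss intervals (CHLI)
# over per-small-window frame loss ratios observed while the service is available.
def count_hli_chli(window_flrs, flr_threshold=0.5, chli_consecutive=5):
    hli = sum(1 for flr in window_flrs if flr >= flr_threshold)
    chli, run = 0, 0
    for flr in window_flrs:
        run = run + 1 if flr >= flr_threshold else 0
        if run == chli_consecutive:      # count the run once when it reaches the limit
            chli += 1
    return hli, chli

print(count_hli_chli([0.6, 0.7, 0.0, 0.6, 0.6, 0.6, 0.6, 0.6, 0.1]))   # (7, 1)
```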
Availability can only be reasonably determined with synthetic packets. This is because the synthetic packet is the packet being counted and provides a uniform packet flow that can be used for the computation. Transmit and receive counter-based approaches cannot reliably be used to determine availability because there is no guarantee that service data is on the wire, or that the service data on the wire is uniform enough to allow a valid declaration.
Figure 28 shows loss in a single direction using synthetic packets, and demonstrates what happens when a possible unavailability event crosses a measurement interval boundary. In the diagram, the first 13 small windows are all marked available (1), which means that the loss probes that fit into each of those small windows did not equal or exceed a frame loss ratio of 50%. The next 11 small windows are marked as unavailable, which means that the loss probes that fit into each of those small windows were equal to or above a frame loss ratio of 50%. After the 10th consecutive small window of unavailability, the state transitions from available to unavailable. The 25th small window is the start of the new available state which is declared following the 10th consecutive available small window. Notice that the frame loss ratio is 00.00%; this is because all the small windows that are marked as unavailable are counted toward unavailability, and as such are excluded from impacting the FLR. If there were any small windows of unavailability that were outside of an unavailability event, they would be marked as HLI or CHLI and be counted as part of the frame loss ratio.
Bin groups are templates that are referenced by the session. Three types of binnable statistics are available: FD, IFDV, and FDR, all of which are available in forward, backward, and round trip directions. Each of these metrics can have up to ten bins configured to group the results. Bins are configured by indicating a lower boundary. Bin 0 has a lower boundary that is always zero and is not configurable. The microsecond range of a bin is the difference between adjacent lower boundaries. For example, bin-type fd bin 1 configured with lower-bound 1000 means that bin 0 will capture all frame delay results between 0 and 1 ms. Bin 1 will capture all results above 1 ms and below the bin 2 lower boundary. The last bin to be configured represents the bin that collects all results at and above that value. Not all ten bins must be configured.
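The mapping of a delay result to a bin can be sketched as follows. The lower-bound values are example assumptions in microseconds; only the rule matters: a result falls into the highest bin whose lower boundary it meets or exceeds, and negative results go into bin 0.

```python
# Map a delay result to a bin given configured lower boundaries (microseconds).
import bisect

lower_bounds = [0, 1000, 5000, 10000]    # bin 0 always starts at 0 us

def bin_index(delay_us, bounds=lower_bounds):
    delay_us = max(delay_us, 0)          # negative results are treated as 0 (bin 0)
    return bisect.bisect_right(bounds, delay_us) - 1

print(bin_index(750))      # 0: below the bin 1 lower bound of 1000 us
print(bin_index(1000))     # 1: at the bin 1 lower boundary
print(bin_index(250000))   # 3: the last bin collects everything at and above its bound
print(bin_index(-40))      # 0
```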
Each binnable delay metric type requires its own values for the bins in the group. Each bin in a type is configurable for one value. It is not possible to configure a bin with different values for round trip, forward, and backward. Consideration must be taken when configuring the boundaries that represent the important statistics for that specific service.
As stated earlier in this section, this is not a dynamic environment. If a bin group is being referenced by any active test, the bin group cannot be shut down; to modify the bin group, it must be shut down. If the configuration of a bin group must be changed, and a large number of sessions reference the bin group, migrating the existing sessions to a new bin group with the new parameters can be considered to reduce the maintenance window. To modify any session parameter, every test in the session must be shut down.
Bin group 1 is the default bin group. Every session requires a bin group to be assigned. By default, bin group 1 is assigned to every OAM-PM session that does not have a bin group explicitly configured. Bin group 1 cannot be modified. The bin group 1 configuration parameters are as follows:
The following table shows the architecture of all of the OAM-PM concepts previously described. It shows a more detailed hierarchy than shown in the introduction, including the relationship between the tests, the measurement intervals, and the storage of the results.
The following configuration examples are used to demonstrate the different show and monitoring commands available to check OAM-PM.
The monitor command can be used to automatically update the statistics for the raw measurement interval.