Delivery of services requires that a number of operations occur properly and at different levels in the service delivery model. For example, operations such as the association of packets to a service, VC-labels to a service, and each service to a service tunnel must be performed properly in the forwarding plane for the service to function properly. To verify that a service is operational, a set of in-band, packet-based Operation, Administration, and Maintenance (OAM) tools is required, with the ability to test each of the individual packet operations.
For in-band testing, the OAM packets closely resemble customer packets to effectively test the customer's forwarding path, but they are distinguishable from customer packets so they are kept within the service provider's network and not forwarded to the customer.
The suite of OAM diagnostics supplements the basic IP ping and traceroute operations with diagnostics specialized for the different levels in the service delivery model. There are diagnostics for MPLS LSPs, SDPs, services, and VPLS MACs within a service.
The router LSP diagnostics are implementations of LSP ping and LSP trace based on RFC 8029, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures. LSP ping provides a mechanism to detect data plane failures in MPLS LSPs. LSP ping and LSP trace are modeled after the ICMP echo request or reply used by ping and trace to detect and localize faults in IP networks.
For a given LDP FEC, RSVP P2P LSP, or BGP IPv4 or IPv6 label route, LSP ping verifies whether the packet reaches the egress label edge router (LER), while in LSP trace mode, the packet is sent to the control plane of each transit label switched router (LSR) which performs various checks to see if it is actually a transit LSR for the path.
The downstream mapping TLV is used in lsp-ping and lsp-trace to provide a mechanism for the sender and responder nodes to exchange and validate interface and label stack information for each downstream of an LDP FEC or an RSVP LSP and at each hop in the path of the LDP FEC or RSVP LSP.
Two downstream mapping TLVs are supported: the original Downstream Mapping (DSMAP) TLV defined in RFC 4379, Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures (obsoleted by RFC 8029), and the new Downstream Detailed Mapping (DDMAP) TLV defined in RFC 6424, Mechanism for Performing Label Switched Path Ping (LSP Ping) over MPLS Tunnels, and RFC 8029.
When the responder node has multiple equal cost next-hops for an LDP FEC prefix, the downstream mapping TLV can further be used to exercise a specific path of the ECMP set using the path-destination option. The behavior in this case is described in the ECMP sub-section below.
This feature adds support of the Target FEC Stack TLV of type BGP Labeled IPv4 /32 Prefix as defined in RFC 8029.
The new TLV is structured as shown in Figure 23.
The user issues an LSP ping using the existing CLI command and specifying a new type of prefix:
oam lsp-ping bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in | out}]] [size octets] [ttl label-ttl] [send-count send-count] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]
This feature supports BGP label IPv4 prefixes with a prefix length of 32 bits only and supports IPv6 prefixes with a prefix length of 128 bits only.
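The prefix-length restriction above can be illustrated with a minimal sketch. The function name is hypothetical and is not part of the router CLI or any SR OS API; the sketch only shows the check that a prefix is a host prefix (/32 for IPv4, /128 for IPv6).

```python
import ipaddress


def is_supported_bgp_label_prefix(prefix: str) -> bool:
    """Return True only for an IPv4 /32 or an IPv6 /128 prefix.

    Hypothetical helper: it mirrors the documented restriction and is
    not the router's own validation code.
    """
    try:
        network = ipaddress.ip_network(prefix, strict=True)
    except ValueError:
        # Not a parsable prefix, or host bits set for the given mask.
        return False
    # A host prefix has a length equal to the address width (32 or 128).
    return network.prefixlen == network.max_prefixlen


if __name__ == "__main__":
    for candidate in ("10.20.1.6/32", "10.20.1.0/24", "2001:db8::6/128"):
        print(candidate, is_supported_bgp_label_prefix(candidate))
```

A check of this kind could be run on the ip-prefix/mask argument before the probe is launched, so that an unsupported prefix length is rejected early.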
The path-destination option is used to exercise specific ECMP paths in the network when the LSR performs hashing on the MPLS packet.
Similarly, the user issues an LSP trace using the following command:
oam lsp-trace bgp-label prefix ip-prefix/mask [src-ip-address ip-address] [fc fc-name [profile {in | out}]] [max-fail no-response-count] [probe-count probes-per-hop] [size octets] [min-ttl min-label-ttl] [max-ttl max-label-ttl] [timeout timeout] [interval interval] [path-destination ip-address [interface if-name | next-hop ip-address]] [detail]
The following are the procedures for sending and responding to an LSP ping or LSP trace packet. These procedures are valid when the downstream mapping is set to the DSMAP TLV. The detailed procedures with the DDMAP TLV are presented in Using DDMAP TLV in LSP Stitching and LSP Hierarchy.
Note that only BGP label IPv4 /32 prefixes and BGP IPv6 /128 prefixes are supported since only these are usable as tunnels on the Nokia router platforms. The BGP IPv4 or IPv6 label prefix is also supported with the prefix SID attribute if BGP segment routing is enabled on the routers participating in the path of the tunnel.
The responder node must have an IPv4 address to use as the source address of the IPv4 echo reply packet. SR OS uses the system interface IPv4 address. When an IPv4 BGP label route resolves to an IPv6 next-hop and uses an IPv6 transport tunnel, any LSR or LER node which responds to an lsp-ping or lsp-trace message must have an IPv4 address assigned to the system interface or the reply is not sent. In the latter case, the lsp-ping or lsp-trace probe times out at the sender node.
Similarly, the responder node must have an IPv6 address assigned to the system interface so that it gets used in the IPv6 echo reply packet in the case of a BGP-LU IPv6 label route when resolved to an IPv4 or an IPv4-mapped IPv6 next-hop which itself is resolved to an IPv4 transport tunnel.
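The source address rule for the echo reply can be summarized in a short sketch. The function and parameter names are assumptions made for illustration; the sketch only captures the rule that the reply uses the system interface address of the same family as the reply packet, and that no reply is sent when that address is missing.

```python
from typing import Optional


def echo_reply_source(reply_family: str,
                      system_ipv4: Optional[str],
                      system_ipv6: Optional[str]) -> Optional[str]:
    """Pick the source address of an MPLS echo reply.

    Returns None when the system interface has no address of the
    required family, in which case no reply is sent and the probe
    times out at the sender node.
    """
    if reply_family == "ipv4":
        return system_ipv4
    if reply_family == "ipv6":
        return system_ipv6
    raise ValueError(f"unknown address family: {reply_family}")


if __name__ == "__main__":
    # IPv4 BGP label route over an IPv6 transport tunnel: an IPv4
    # system address is still needed for the IPv4 echo reply.
    print(echo_reply_source("ipv4", None, "2001:db8::1"))
```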
When the responder node has multiple equal cost next-hops for an LDP FEC or a BGP label prefix, it replies in the Downstream Mapping TLV with the downstream information of the outgoing interface which is part of the ECMP next-hop set for the prefix.
Note that when a BGP label route is resolved to an LDP FEC (of the BGP next-hop of the BGP label route), ECMP can exist at both the BGP and LDP levels. The following selection of next-hop is performed in this case:
The following description of the behavior of LSP ping and LSP trace refers to a FEC in a generic way, which can represent an LDP FEC or a BGP label route. In addition, the reference to a downstream mapping TLV means either the DSMAP TLV or the DDMAP TLV.
Lsp-ping and p2mp-lsp-ping operate over a network using unnumbered links without any changes. Lsp-trace, p2mp-lsp-trace and ldp-treetrace are modified such that the unnumbered interface is properly encoded in the downstream mapping (DSMAP/DDMAP) TLV.
In an RSVP P2P or P2MP LSP, the upstream LSR encodes the downstream router-id in the “Downstream IP Address” field and the local unnumbered interface index value in the “Downstream Interface Address” field of the DSMAP/DDMAP TLV as per RFC 8029. Both values are taken from the TE database.
In an LDP unicast FEC or mLDP P2MP FEC, the interface index assigned by the peer LSR is not readily available to the LDP control plane. In this case, the alternative method described in RFC 8029 is used. The upstream LSR sets the Address Type to IPv4 Unnumbered, the Downstream IP Address to a value of 127.0.0.1, and the interface index to 0. If an LSR receives an echo-request packet with this encoding in the DSMAP/DDMAP TLV, it bypasses interface verification but continues with label validation.
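The LDP unnumbered-interface encoding and the receiver-side bypass can be sketched as follows. The class and function names are hypothetical; the Address Type code points are the ones listed for the downstream mapping TLV in RFC 8029. This is an illustration of the rule, not the router implementation.

```python
from dataclasses import dataclass

# Address Type code points of the downstream mapping TLV (RFC 8029).
IPV4_NUMBERED = 1
IPV4_UNNUMBERED = 2


@dataclass(frozen=True)
class DownstreamMapping:
    """Subset of DSMAP/DDMAP fields relevant here (field names assumed)."""
    address_type: int
    downstream_ip: str
    interface_index: int


def ldp_unnumbered_mapping() -> DownstreamMapping:
    """Encoding used by the upstream LSR for an LDP or mLDP FEC."""
    return DownstreamMapping(IPV4_UNNUMBERED, "127.0.0.1", 0)


def must_verify_interface(mapping: DownstreamMapping) -> bool:
    """Return False when the receiver bypasses interface verification.

    Label validation still takes place in either case.
    """
    bypass = (mapping.address_type == IPV4_UNNUMBERED
              and mapping.downstream_ip == "127.0.0.1"
              and mapping.interface_index == 0)
    return not bypass


if __name__ == "__main__":
    print(must_verify_interface(ldp_unnumbered_mapping()))
```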
The DDMAP TLV provides the same features as the existing DSMAP TLV, plus the enhancement to trace the details of LSP stitching and LSP hierarchy. The latter is achieved using a new sub-TLV of the DDMAP TLV called the FEC stack change sub-TLV. Figure 24 shows the structures of these two objects as defined in RFC 6424.
The DDMAP TLV format is derived from the DSMAP TLV format. The key change is that variable length and optional fields have been converted into sub-TLVs. The fields have the same use and meaning as in RFC 8029 as shown in Figure 25.
The operation type specifies the action associated with the FEC stack change. The following operation types are defined.
More details on the processing of the fields of the FEC stack change sub-TLV are provided later in this section.
The user can configure which downstream mapping TLV to use globally on a system by using the following command:
configure test-oam mpls-echo-request-downstream-map {dsmap | ddmap}
This command specifies which format of the downstream mapping TLV to use in all LSP trace packets and LDP tree trace packets originated on this node. The Downstream Mapping (DSMAP) TLV is the original format in RFC 4379 (obsoleted by RFC 8029) and is the default value. The Downstream Detailed Mapping (DDMAP) TLV is the new enhanced format specified in RFC 6424 and RFC 8029.
This command applies to LSP trace of an RSVP P2P LSP, an MPLS-TP LSP, a BGP label route, or an LDP unicast FEC, and to LDP tree trace of a unicast LDP FEC. It does not apply to LSP trace of an RSVP P2MP LSP, which always uses the DDMAP TLV.
The global Downstream Mapping TLV setting impacts the behavior of both OAM LSP trace packets and SAA test packets of type lsp-trace and is used by the sender node when one of the following events occurs:
A consequence of the rules above is that a change to the value of mpls-echo-request-downstream-map option does not affect the value inserted in the downstream mapping TLV of existing tests.
The following are the details of the processing of the new DDMAP TLV:
In addition to performing the same features as the DSMAP TLV, the DDMAP TLV addresses the following scenarios:
In order to properly check a target FEC which is stitched to another FEC (stitching FEC) of the same or a different type, or which is tunneled over another FEC (tunneling FEC), it is necessary for the responding nodes to provide details about the FEC manipulation back to the sender node. This is achieved via the use of the new FEC stack change sub-TLV in the Downstream Detailed Mapping TLV (DDMAP) defined in RFC 6424.
When the user configures the use of the DDMAP TLV on a trace for an LSP that does not undergo stitching or tunneling operation in the network, the procedures at the sender and responder nodes are the same as in the case of the existing DSMAP TLV.
This feature however introduces changes to the target FEC stack validation procedures at the sender and responder nodes in the case of LSP stitching and LSP hierarchy. These changes pertain to the processing of the new FEC stack change sub-TLV in the new DDMAP TLV and the new return code 15 Label switched with FEC change. The following is a description of the main changes which are a superset of the rules described in Section 4 of RFC 6424 to allow greater scope of interoperability with other vendor implementations.
MPLS OAM supports Segment Routing extensions to lsp-ping and lsp-trace as specified in draft-ietf-mpls-spring-lsp-ping.
Segment Routing (SR) performs both shortest path and source-based routing. When the data plane uses MPLS encapsulation, MPLS OAM tools such as lsp-ping and lsp-trace can be used to check connectivity and trace the path to any mid-point or endpoint of an SR-ISIS or SR-OSPF shortest path tunnel, or an SR-TE LSP.
The CLI options for lsp-ping and lsp-trace are under OAM and SAA for the following types of Segment Routing tunnels:
This section describes how MPLS OAM models the SR tunnel types.
An SR shortest path tunnel (SR-ISIS or SR-OSPF) uses a single FEC element in the Target FEC Stack TLV. The FEC corresponds to the prefix of the node SID in a specific IGP instance.
Figure 26 illustrates the format for the IPv4 IGP-prefix segment ID:
In this format, the fields are as follows:
Figure 27 illustrates the format for the IPv6 IGP prefix segment ID:
In this format, the fields are as follows:
An SR-TE LSP, as a hierarchical LSP, uses the Target FEC Stack TLV, which contains a FEC element for each node SID and for each adjacency SID in the path of the SR-TE LSP. Because the SR-TE LSP does not instantiate state in the LSR other than the ingress LSR, MPLS OAM is just testing a hierarchy of node SID and adjacency SID segments towards the destination of the SR-TE LSP. The format of the node-SID is as illustrated above. Figure 28 illustrates the format for the IGP-Adjacency segment ID:
In this format, the fields are as follows:
Both lsp-ping and lsp-trace apply to the following contexts:
The following operations apply to lsp-ping and lsp-trace.
Figure 29 shows a sample topology for an lsp-ping and lsp-trace for SR-ISIS node SID tunnel.
Given this topology, the following is an output example for LSP-PING on DUT-A for target Node SID of DUT-F:
The following is an output example for LSP-TRACE on DUT-A for target node SID of DUT-F (DSMAP TLV):
The following is an output example for LSP-TRACE on DUT-A for target node SID of DUT-F (DDMAP TLV):
The following operations apply to lsp-ping and lsp-trace.
The following are sample outputs for lsp-ping and lsp-trace for some SR-TE LSPs. The first one uses a path with strict hops, each corresponding to an adjacency SID, while the second one uses a path with loose hops, each corresponding to a node SID. Assume the topology shown in Figure 30.
Example 1
The following is an output example for LSP-PING and LSP-TRACE on DUT-A for strict-hop adjacency SID SR-TE LSP, where:
Example 2
The following is an output example for LSP-PING and LSP-TRACE on DUT-A for loose-hop Node SID SR-TE LSP, where:
The following operations apply to lsp-ping and lsp-trace:
Example 1
The following is an output example of the lsp-trace command with the DDMAP TLV for LDP-to-SR direction (symmetric topology LDP-SR-LDP):
Example 2
The following is an output example of the lsp-trace command with the DDMAP TLV for SR-to-LDP direction (symmetric topology LDP-SR-LDP):
SR OS enhances lsp-ping and lsp-trace of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, or an SR-TE IPv4 LSP. The SR OS enhancement reports the full set of ECMP next-hops for the transport tunnel at both ingress PE and at the ABR or ASBR. The list of downstream next-hops is reported in the DSMAP or DDMAP TLV.
When the user initiates an lsp-trace of the BGP IPv4 LSP with the path-destination option specified, the CPM hash code, at the responder node, selects the outgoing interface to be returned in DSMAP or DDMAP. This decision is based on the modulo operation of the hash value on the label stack or the IP headers (where the DST IP is replaced by the specific 127/8 prefix address in the multipath type 8 field of the DSMAP or DDMAP) of the echo request message and the number of outgoing interfaces in the ECMP set.
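The modulo-based selection described above can be sketched as follows. The actual CPM hash routine is not described in this section, so `zlib.crc32` is used purely as a stand-in, and the function names and interface names are hypothetical.

```python
import ipaddress
import zlib
from typing import Sequence


def ip_hash_input(src_ip: str, path_destination: str) -> bytes:
    """Build a hash key from IP header fields, with the destination
    replaced by the 127/8 address given in the path-destination option."""
    dst = ipaddress.ip_address(path_destination)
    if dst not in ipaddress.ip_network("127.0.0.0/8"):
        raise ValueError("path-destination must be a 127/8 address")
    return ipaddress.ip_address(src_ip).packed + dst.packed


def select_outgoing_interface(hash_input: bytes,
                              ecmp_interfaces: Sequence[str]) -> str:
    """Pick one member of the ECMP set: hash value modulo set size."""
    if not ecmp_interfaces:
        raise ValueError("ECMP set is empty")
    hash_value = zlib.crc32(hash_input)  # stand-in for the CPM hash code
    return ecmp_interfaces[hash_value % len(ecmp_interfaces)]


if __name__ == "__main__":
    ecmp_set = ["to-B-1", "to-B-2", "to-C-1"]  # hypothetical interface names
    key = ip_hash_input("10.20.1.1", "127.1.1.1")
    print(select_outgoing_interface(key, ecmp_set))
```

Because the result depends only on the hash input and the size of the ECMP set, repeating the trace with the same path-destination value returns the same outgoing interface as long as the ECMP set does not change.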
Figure 31 depicts a sample topology used in the subsequent BGP over SR-OSPF, BGP over SR-TE (OSPF), BGP over SR-ISIS, and BGP over SR-TE (ISIS) examples.
The following are sample outputs of the lsp-trace command for a hierarchical tunnel consisting of a BGP IPv4 LSP resolved over an SR-ISIS IPv4 tunnel, an SR-OSPF IPv4 tunnel, or an SR-TE IPv4 LSP.
BGP over SR-OSPF example output:
BGP over SR-TE (OSPF) example output:
BGP over SR-ISIS example output:
BGP over SR-TE (ISIS) example output:
Assuming the topology in Figure 32 has the addition of an External Border Gateway Protocol (eBGP) peering between nodes B and C, the BGP IPv4 LSP spans the AS boundary and resolves to an SR-ISIS tunnel or an SR-TE LSP within each AS.
BGP over SR-ISIS in inter-AS option C example output:
BGP over SR-TE (ISIS) in inter-AS option C example output:
When IGP shortcut is enabled in an IS-IS or an OSPF instance and the family SRv4 or SRv6 is set to resolve over RSVP-TE LSPs, a hierarchical tunnel is created whereby an SR-ISIS IPv4 tunnel, an SR-ISIS IPv6 tunnel, or an SR-OSPF tunnel resolves over the IGP IPv4 shortcuts using RSVP-TE LSPs.
The following sample outputs are of the lsp-trace command for a hierarchical tunnel consisting of an SR-ISIS IPv4 tunnel or an SR-OSPF IPv4 tunnel, resolving over an IGP IPv4 shortcut using an RSVP-TE LSP.
The topology, as shown in Figure 33, is used for the following SR-ISIS over RSVP-TE and SR-OSPF over RSVP-TE example outputs.
SR-ISIS over RSVP-TE example output:
SR-OSPF over RSVP-TE example output:
When IGP shortcut is enabled in an IS-IS or an OSPF instance and the family IPv4 is set to resolve over SR-TE LSPs, a hierarchical tunnel is created whereby an LDP IPv4 FEC resolves over the IGP IPv4 shortcuts using SR-TE LSPs.
The following are sample outputs of the lsp-trace command for a hierarchical tunnel consisting of an LDP IPv4 FEC resolving over an IGP IPv4 shortcut using an SR-TE LSP.
The topology, as shown in Figure 34, is used for the following LDP over SR-TE (ISIS) and LDP over SR-TE (OSPF) example outputs.
LDP over SR-TE (ISIS) example output:
LDP over SR-TE (OSPF) example output:
This feature extends the support of lsp-ping, lsp-trace, and ICMP tunneling probes to IPv4 and IPv6 SR policies.
This feature describes the CLI options for lsp-ping and lsp-trace commands under the OAM and SAA contexts for the following type of Segment Routing tunnel: sr-policy.
The CLI does not require entry of the SR policy head-end parameter that corresponds to the IPv4 address of the router where the static SR policy is configured or where the BGP NLRI of the SR policy is sent by a controller or another BGP speaker. SR OS expects its IPv4 system address in the head-end parameter of both the IPv4 and IPv6 SR policy NLRIs; otherwise, SR OS does not import the NLRI.
The source IPv4 or IPv6 address can be specified to encode in the Echo Request message of the LSP ping or LSP trace packet.
The endpoint command specifies the endpoint of the policy, which can be an IPv4 address, therefore matching an SR policy in the IPv4 tunnel-table, or an IPv6 address, therefore matching an SR policy in the IPv6 tunnel-table.
The color command must correspond to the SR policy color attribute that is configured locally in the case of a static policy instance or signaled in the NLRI of the BGP signaled SR policy instance.
The endpoint and color commands test the active path (or instance) of the identified SR policy only.
The lsp-ping and lsp-trace commands can test one segment list at a time by specifying one segment list of the active instance of the policy or active candidate path. In this case, the segment-list id command is configured or segment list 1 is tested by default. The segment-list ID corresponds to the same index that was used to save the SR policy instance in the SR policy database. In the case of a static SR policy, the segment-list ID matches the segment list index entered in the configuration. In both the static and the BGP SR policies, the segment-list ID matches the index displayed for the segment list in the output of the show command of the policies.
The exercised segment list corresponds to a single SR-TE path with its own NHLFE or super NHLFE in the data path.
The ICMP tunneling feature support with SR policy is described in ICMP-Tunneling Operation and does not require additional CLI commands.
The following operations are supported with both lsp-ping and lsp-trace.
The ICMP tunneling feature operates in the same way as in an SR-TE LSP. When the label TTL of a traceroute packet of a core IPv4 or IPv6 route or a vpn-ipv4 or vpn-ipv6 route expires at an LSR, the latter generates an ICMP reply packet of type=11 (time exceeded) and injects it in the forward direction of the SR policy. When the packet is received by the egress LER or a BGP border router, SR OS performs a regular user packet route lookup in the data path in the GRT context or in a VPRN context and forwards the packet to the destination. The destination of the packet is the sender of the original packet whose TTL expired at the LSR.
The network setup illustrated in Figure 35 shows an example configuration of segment routing using SRv6 that is discussed in this section. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for further information about the SRv6 feature.
As shown in Figure 35, the network administrator originates an ICMP ping or a UDP traceroute probe on node R1 to test the path of an SRv6 locator of node D, an SRv6 segment identifier (SID) owned by node D, or an IP prefix resolved to an SRv6 tunnel towards node D. R1 is referred to as the sender node. Node D is referred to as the target node because it owns the target locator or SID that is being tested. In general, a target node can be any router in the SRv6 network domain which either owns the target locator or SID, or a router in which the OAM probe was extracted due to matching a local route or due to the value of the hop-limit field setting in the packet.
The primary path to D is through R2 and R4. The link-protect TI-LFA backup path is through R3 as a PQ node and then R2 and R4.
The classic ping and traceroute OAM CLI commands are used to test an IPv4 or IPv6 prefix in a virtual routing and forwarding (VRF) table or in the base router table when resolved to an SRv6 tunnel, for example:
ping address [detail] [source ip-address]
traceroute address [detail] [source ip-address]
The same CLI commands are used to test the address of an SRv6 locator or a SID. In this case, the user enters the IPv6 address of the target locator prefix or the target SID.
The source address encoded in the outer IPv6 header of the ping or traceroute packet is derived from the following steps, in ascending order:
The features in this section comply with draft-ietf-6man-spring-srv6-oam, Operations, Administration, and Maintenance (OAM) in Segment Routing Networks with IPv6 Data plane (SRv6).
The packet is encoded with a destination address set to the remote locator prefix or the specific remote SID, and the next-header field is ICMPv6 (for ping’s Echo request message) or UDP (for traceroute). The packet is encapsulated as shown in Figure 36. When the Topology-Independent Loop-Free Alternate (TI-LFA) or Remote LFA repair tunnel is activated, the LFA segment routing header (LFA SRH) is also pushed on the encapsulation of the SRv6 tunnel to the node D.
The outer IPv6 header hop-limit field is set according to the operation of the probe. For ping, the hop limit uses the default 254 or a user-entered value.
For traceroute, the hop limit is incrementally increased using one of the following:
The ingress PE looks up the prefix of the locator or SID in the routing table and if a route exists, it forwards the packet to the next hop. The ingress PE does not check if the target SID or locator has been received in ISIS or BGP.
ICMP ping and UDP traceroute operate similarly to any data or OAM IPv6 packet when expiring (the value in the hop-limit field is equal to or less than 1) at a transit SRv6 node, whether or not this node is a SID termination.
The data path at the ingress network interface where the packet is received extracts the packet to the CPM.
The CPM originates a TTL expiry ICMP reply message Type: "Time Exceeded", Code: "Time to Live exceeded in Transit".
The CPM sends the reply to the SRv6 router whose address is encoded in the SA field of the outer IPv6 header in the received packet. The source address is set to the system IPv6 address if configured or the address of the interface used to forward the packet to the next hop.
When the expired packet is a UDP traceroute, meaning that the next-header field is set to protocol UDP and the UDP port range matches that of the UDP traceroute probe, the target node copies into the payload of the reply message the leading bytes (up to 128 bytes) of the received packet encapsulation, including the outer IPv6 header and the SRH, if any.
When the hop-limit field value is higher than 1, the packet is processed in the data path similar to how any SRv6 user data packet is processed in the transit router. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for further information.
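The transit-node handling described above can be condensed into a small sketch. The function names and the returned action strings are hypothetical; the sketch assumes only what the preceding paragraphs state: a packet with a hop limit of 1 or less is extracted to the CPM, and a reply to an expired UDP traceroute quotes at most the first 128 bytes of the received encapsulation.

```python
MAX_QUOTED_BYTES = 128  # leading bytes of the received packet copied into the reply


def transit_action(hop_limit: int) -> str:
    """Decide what a transit SRv6 node does with a received probe."""
    if hop_limit <= 1:
        # Expired: extracted to the CPM, which originates the
        # "Time Exceeded" reply towards the outer source address.
        return "extract-to-cpm"
    # Not expired: processed like any SRv6 user data packet.
    return "forward"


def quoted_payload(received_packet: bytes) -> bytes:
    """Leading bytes of the received encapsulation (outer IPv6 header
    and SRH, if any, come first) placed in the reply payload."""
    return received_packet[:MAX_QUOTED_BYTES]


if __name__ == "__main__":
    print(transit_action(1), transit_action(64))
    print(len(quoted_payload(bytes(200))))
```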
The data path in the ingress network port in the destination router that owns the target SID extracts the packet to CPM.
A UDP traceroute packet is extracted based on the hop-limit field value of 1 before the route lookup.
An ICMP ping packet is extracted after the route lookup matches a FIB entry of a local locator, End, or End.X SID.
The CPM checks that the target locator or SID address matches a local entry. This means that the locator or SID has either been configured manually by the user or it has been auto-allocated by the locator module for use by IS-IS or BGP.
A match on the locator requires an exact match on the locator field and that both the function and argument fields be zero.
A match on a SID requires both the locator and function fields to match. The argument field is not checked.
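The two matching rules can be sketched by splitting the 128-bit address into locator, function, and argument fields. The field lengths are passed as parameters because they depend on the locator configuration; the 64-bit locator and 16-bit function used in the example, the addresses, and all function names are assumptions made for illustration.

```python
import ipaddress
from typing import Tuple


def split_sid(address: str, locator_len: int,
              function_len: int) -> Tuple[int, int, int]:
    """Split an SRv6 address into (locator, function, argument) integers."""
    value = int(ipaddress.IPv6Address(address))
    argument_len = 128 - locator_len - function_len
    locator = value >> (128 - locator_len)
    function = (value >> argument_len) & ((1 << function_len) - 1)
    argument = value & ((1 << argument_len) - 1)
    return locator, function, argument


def matches_locator(target: str, local_locator: str,
                    locator_len: int, function_len: int) -> bool:
    """Locator match: exact locator, function and argument both zero."""
    loc, func, arg = split_sid(target, locator_len, function_len)
    local_loc, _, _ = split_sid(local_locator, locator_len, function_len)
    return loc == local_loc and func == 0 and arg == 0


def matches_sid(target: str, local_sid: str,
                locator_len: int, function_len: int) -> bool:
    """SID match: locator and function must match, argument is ignored."""
    loc, func, _ = split_sid(target, locator_len, function_len)
    local_loc, local_func, _ = split_sid(local_sid, locator_len, function_len)
    return loc == local_loc and func == local_func


if __name__ == "__main__":
    # Hypothetical locator 2001:db8:0:d::/64 with a 16-bit function field.
    print(matches_locator("2001:db8:0:d::", "2001:db8:0:d::", 64, 16))
    print(matches_sid("2001:db8:0:d:1:0:0:5", "2001:db8:0:d:1::", 64, 16))
```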
When a match on a local locator or SID exists, the CPM replies with the following:
The responder node copies into the payload of the reply message the leading bytes (up to 128 bytes) of the received packet encapsulation, including the outer IPv6 header and the SRH, if any.
Figure 37 and Figure 38 illustrate the packet encapsulation from ingress PE to egress PE for both the primary path and the backup path. For the backup path, both the PSP and USP types of the LFA SRH are shown.
This feature implements the existing behavior of a ping or a traceroute packet, originated at the ingress PE node, for a prefix resolved to an SRv6 tunnel. If the OAM ping or traceroute packet is received from the CE router and expires (hop-limit field value equal to or less than 1), the ingress PE node responds as per current behavior. If the packet does not expire (hop-limit field value greater than 1), it is forwarded over the SRv6 tunnel as a data path packet.
The CPM originated ping/traceroute packet is encoded with DA set to End.DT4, End.DT6, or End.DT46 SID and the next-header field set to IPv6 (IPv6 VRF prefix) or to IPv4 (IPv4 VRF prefix). The next-header field of the inner IPv6 header is set to ICMPv6 (for ping) or to UDP (for traceroute).
The packet is encapsulated as shown in Figure 39. The LFA SRH is also shown in the packet encapsulation of the SRv6 tunnel to node D when the TI-LFA or Remote LFA repair tunnel is activated.
The outer IPv6 header hop-limit field is set to the default value 254.
The packet is processed in the data path, like any SRv6 user-data packet, by the transit router. See the 7750 SR and 7950 XRS Segment Routing and PCE User Guide for further information.
At the target node, if the packet matches a local locator prefix entry in the FIB and the payload type is either IPv4 or IPv6, then it is handed to the SRv6 Forwarding Path Extension (FPE).
The egress data path of the SRv6 FPE removes the SRv6 headers and passes the inner IPv6 packet to the ingress data path which performs regular exception handling for an ICMP ping or a UDP traceroute packet.
This behavior acts the same as described in Ping or Traceroute of an IPv4 or IPv6 VRF Prefix Resolved to an SRv6 Tunnel.
Figure 40 shows an IP/MPLS network which uses LDP ECMP for network resilience. Faults that are detected through IGP or LDP are corrected as soon as IGP and LDP re-converge. The impacted traffic is forwarded on the next available ECMP path as determined by the hash routine at the node that had a link failure.
However, there are faults which the IGP/LDP control planes may not detect. These faults may be due to a corruption of the control plane state or of the data plane state in a node. Although these faults are very rare and mostly due to misconfiguration, the LDP Tree Trace OAM feature is intended to detect these “silent” data plane and control plane faults. For example, it is possible that the forwarding plane of a node has a corrupt Next Hop Label Forwarding Entry (NHLFE) and keeps forwarding packets over an ECMP path only to have the downstream node discard them. This data plane fault can only be detected by an OAM tool that can test all possible end-to-end paths between the ingress LER and the egress LER. A corruption of the NHLFE entry can also result from a corruption in the control plane at that node.
When the LDP tree trace feature is enabled, the ingress LER builds the ECMP tree for a given FEC (egress LER) by sending LSP trace messages that include the LDP IPv4 Prefix FEC TLV as well as the downstream mapping TLV. To build the ECMP tree, the router LER inserts an IP address range drawn from the 127/8 space. When the downstream LSR receives the message, it uses this range to determine which ECMP path is exercised by any IP address or sub-range of addresses within that range, based on its internal hash routine. When the router LER receives the MPLS echo reply, it records this information and proceeds with the next echo request message, targeted at a node downstream of the first LSR node along one of the ECMP paths. The sub-range of IP addresses indicated in the initial reply is used because the objective is to have the LSR downstream of the router LER pass this message to its downstream node along the first ECMP path.
The following figure illustrates the behavior through the following example adapted from RFC 8029:
LSR A has two downstream LSRs, B and F, for PE2 FEC. PE1 receives an echo reply from A with the Multipath Type set to 4, with low/high IP addresses of 127.1.1.1->127.1.1.255 for downstream LSR B and 127.2.1.1->127.2.1.255 for downstream LSR F. PE1 reflects this information to LSR B. B, which has three downstream LSRs, C, D, and E, computes that 127.1.1.1->127.1.1.127 would go to C and 127.1.1.128->127.1.1.255 would go to D. B would then respond with 3 Downstream Mappings: to C, with Multipath Type 4 (127.1.1.1->127.1.1.127); to D, with Multipath Type 4 (127.1.1.128->127.1.1.255); and to E, with Multipath Type 0.
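The way the sender narrows the address range hop by hop in this example can be sketched as a range intersection. The helper names are hypothetical, and the sketch models only the low/high ranges of the example, not the TLV encoding itself.

```python
import ipaddress
from typing import Optional, Tuple

Range = Tuple[ipaddress.IPv4Address, ipaddress.IPv4Address]


def addr_range(low: str, high: str) -> Range:
    """Build a (low, high) address range from dotted-decimal strings."""
    return ipaddress.IPv4Address(low), ipaddress.IPv4Address(high)


def intersect(first: Range, second: Range) -> Optional[Range]:
    """Addresses that follow both hops, or None if the hops never chain."""
    low = max(first[0], second[0])
    high = min(first[1], second[1])
    return (low, high) if low <= high else None


if __name__ == "__main__":
    a_to_b = addr_range("127.1.1.1", "127.1.1.255")   # reported by LSR A
    a_to_f = addr_range("127.2.1.1", "127.2.1.255")   # reported by LSR A
    b_to_c = addr_range("127.1.1.1", "127.1.1.127")   # reported by LSR B

    # Destination addresses that exercise the path PE1 -> A -> B -> C.
    print(intersect(a_to_b, b_to_c))
    # No address sent towards F at LSR A can also be hashed by B towards C.
    print(intersect(a_to_f, b_to_c))
```

Any address inside the resulting range could then be used as the destination of a probe intended to exercise that specific ECMP path.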
The router supports multipath type 0 and 8, and up to a maximum of 36 bytes for the multipath length and supports the LER part of the LDP ECMP tree building feature.
A user configurable parameter sets the frequency of running the tree trace capability. The minimum and default value is 60 minutes and the increment is 1 hour.
The router LER gets the list of FECs from the LDP FEC database. New FECs are added to the discovery list at the next tree trace and not when they are learned and added into the FEC database. The maximum number of FECs to be discovered with the tree building feature is limited to 500. The user can exclude FECs from discovery by using a policy profile.
The periodic path exercising capability of the LDP tree trace feature runs in the background to test the LDP ECMP paths discovered by the tree building capability. The probe used is an LSP ping message with an IP address drawn from the sub-range of 127/8 addresses indicated by the output of the tree trace for this FEC.
The periodic LSP ping messages continuously probe an ECMP path at a user configurable rate of at least 1 message per minute. This is the minimum and default value. The increment is 1 minute. If an interface is down on a router LER, then LSP ping probes that normally go out this interface are not sent.
The LSP ping routine updates the content of the MPLS echo request message, specifically the IP address, as soon as the LDP ECMP tree trace has output the results of a new computation for the path in question.
The P2MP LSP ping complies with RFC 6425, Detecting Data Plane Failures in Point-to-Multipoint Multiprotocol Label Switching (MPLS) - Extensions to LSP Ping.
An LSP ping can be generated by entering the following OAM command:
The echo request message is sent on the active P2MP instance and is replicated in the data path over all branches of the P2MP LSP instance. By default, all egress LER nodes which are leaves of the P2MP LSP instance reply to the echo request message.
The user can reduce the scope of the echo reply messages by explicitly entering a list of addresses for the egress LER nodes that are required to reply. A maximum of 5 addresses can be specified in a single execution of the p2mp-lsp-ping command. If all 5 egress LER nodes are router nodes, they can parse the list of egress LER addresses and reply. RFC 6425 specifies that only the top address in the P2MP egress identifier TLV must be inspected by an egress LER. When interoperating with other implementations, the router egress LER responds if its address is anywhere in the list. Furthermore, if another vendor implementation is the egress LER, only the egress LER matching the top address in the TLV may respond.
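The difference between the two responder behaviors can be sketched as follows. The function name and the `inspect_whole_list` flag are hypothetical; the flag distinguishes a router egress LER, which looks for its address anywhere in the list, from an implementation that inspects only the top address as RFC 6425 requires.

```python
from typing import Sequence

MAX_EGRESS_ADDRESSES = 5  # limit per p2mp-lsp-ping execution


def should_reply(local_address: str,
                 egress_list: Sequence[str],
                 inspect_whole_list: bool) -> bool:
    """Decide whether an egress LER replies to a P2MP echo request."""
    if len(egress_list) > MAX_EGRESS_ADDRESSES:
        raise ValueError("at most 5 egress LER addresses can be listed")
    if not egress_list:
        # No list given: by default every leaf replies.
        return True
    if inspect_whole_list:
        return local_address in egress_list
    # Strict RFC 6425 behavior: only the top address is inspected.
    return egress_list[0] == local_address


if __name__ == "__main__":
    leaves = ["10.20.1.4", "10.20.1.5", "10.20.1.6"]  # hypothetical leaves
    print(should_reply("10.20.1.5", leaves, inspect_whole_list=True))
    print(should_reply("10.20.1.5", leaves, inspect_whole_list=False))
```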
If the user enters the same egress LER address more than once in a single p2mp-lsp-ping command, the head-end node displays a response to a single one and displays a single error warning message for the duplicate ones. When queried over SNMP, the head-end node issues a single response trap and issues no trap for the duplicates.
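The handling of the egress LER list described above can be sketched in a few lines. This is only an illustration, not the router's implementation: the function name `normalize_egress_list` is hypothetical, and applying the 5-address limit to the raw list (before duplicates are removed) is an assumption.

```python
from ipaddress import ip_address

MAX_EGRESS_ADDRESSES = 5  # per-command limit stated in the text


def normalize_egress_list(addresses):
    """Split a user-supplied egress LER list into unique and duplicate entries.

    Hypothetical helper: first-seen order is preserved, and the limit is
    checked against the raw list (an assumption, not confirmed by the text).
    """
    if len(addresses) > MAX_EGRESS_ADDRESSES:
        raise ValueError("at most 5 egress LER addresses per command")
    unique, duplicates, seen = [], [], set()
    for text in addresses:
        addr = ip_address(text)  # also validates the address literal
        if addr in seen:
            duplicates.append(addr)  # would trigger the duplicate warning
        else:
            seen.add(addr)
            unique.append(addr)  # one response is expected per unique leaf
    return unique, duplicates
```

A head-end following the text would report one response per entry in `unique` and a warning for the entries in `duplicates`.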
The timeout parameter should be set to the time it would take to get a response from all probed leaves under no failure conditions. For that purpose, its range extends to 120 seconds for p2mp-lsp-ping, up from the 10 seconds available to lsp-ping for a P2P LSP. The default value is 10 seconds.
The router head-end node displays a “Send_Fail” error when a specific S2L path is down only if the user explicitly listed the address of the egress LER for this S2L in the ping command.
Similarly, the router head-end node displays the timeout error when no response is received for an S2L after the expiry of the timeout timer only if the user explicitly listed the address of the egress LER for this S2L in the ping command.
The user can configure a specific value of the ttl parameter to force the echo request message to expire on a router branch node or a bud LSR node. The latter replies with a downstream mapping TLV for each branch of the P2MP LSP in the echo reply message, sets the multipath type to zero in each downstream mapping TLV, and does not include any egress address information for the reachable egress LER nodes for this P2MP LSP. Note that a maximum of 16 downstream mapping TLVs can be included in a single echo reply message.
If the router ingress LER node receives the new multipath type field with the list of egress LER addresses in an echo reply message from another vendor implementation, it ignores the field and does not raise an error when processing the downstream mapping TLV.
If the ping expires at an LSR node which is performing a re-merge or cross-over operation in the data path between two or more ILMs of the same P2MP LSP, there is an echo reply message for each copy of the echo request message received by this node.
The output of the command without the detail parameter specified provides a high-level summary of error codes and/or success codes received.
The output of the command with the detail parameter specified shows a line for each replying node as in the output of the LSP ping for a P2P LSP.
The display is delayed until all responses are received or the timer configured in the timeout parameter expires. No other CLI commands can be entered while waiting for the display. A control-C (^C) command aborts the ping operation.
For more information about P2MP refer to the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide.
The P2MP LSP trace complies with RFC 6425. An LSP trace can be generated by entering the following OAM command:
The LSP trace capability allows the user to trace the path of a single S2L path of a P2MP LSP. Its operation is similar to that of the p2mp-lsp-ping command, but the sender of the echo request message includes the downstream mapping TLV to request the downstream branch information from a branch LSR or bud LSR. The branch LSR or bud LSR then also includes the downstream mapping TLV to report the information about the downstream branches of the P2MP LSP. An egress LER does not include this TLV in the echo response message.
The probe-count parameter operates in the same way as in LSP trace on a P2P LSP. It represents the maximum number of probes sent per TTL value before giving up on receiving the echo reply message. If a response is received from the traced node before the maximum number of probes is reached, no more probes are sent for the same TTL. The sender of the echo request then increments the TTL and uses the information it received in the downstream mapping TLV to start sending probes to the node downstream of the last node which replied. This continues until the egress LER for the traced S2L path replies.
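The probe-count and TTL behavior described above can be sketched as a simple loop. This is a simplified model under stated assumptions: `send_probe` is a hypothetical callback that returns the name of the replying node or `None` on timeout, the default values of `probe_count` and `max_ttl` are placeholders, and the downstream mapping TLV exchange is not modeled.

```python
def trace_s2l(send_probe, egress, probe_count=3, max_ttl=30):
    """Trace one S2L path hop by hop.

    send_probe(ttl) is a hypothetical callback returning the replying
    node's name, or None if the probe timed out.
    """
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = None
        for _ in range(probe_count):
            reply = send_probe(ttl)
            if reply is not None:
                break  # a response arrived: no more probes for this TTL
        hops.append((ttl, reply))  # reply stays None if every probe timed out
        if reply == egress:
            break  # the traced egress LER replied, so the trace is complete
    return hops
```

In this sketch a TTL whose probes all time out is recorded as `None` and the loop moves on to the next TTL.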
Since the command traces a single S2L path, the timeout and interval parameters keep the same value range as in LSP trace for a P2P LSP.
The P2MP LSP trace makes use of the Downstream Detailed Mapping (DDMAP) TLV. The following excerpt from RFC 6424 details the format of the new DDMAP TLV.
The Downstream Detailed Mapping TLV format is derived from the Downstream Mapping (DSMAP) TLV format. The key change is that variable length and optional fields have been converted into sub-TLVs. The fields have the same use and meaning as in RFC 8029.
Similar to p2mp-lsp-ping, an LSP trace probe results in all egress LER nodes eventually receiving the echo request message, but only the traced egress LER node replies to the last probe.
Also, any branch LSR node or bud LSR node in the P2MP LSP tree may receive a copy of the echo request message with the TTL in the outer label expiring at this node. However, only a branch LSR or bud LSR which has a downstream branch over which the traced egress LER is reachable must respond.
When a branch LSR or BUD LSR node responds to the sender of the echo request message, it sets the global return code in the echo response message to RC=14 - "See DDMAP TLV for Return Code and Return Sub-Code" and the return code in the DDMAP TLV corresponding to the outgoing interface of the branch used by the traced S2L path to RC=8 - "Label switched at stack-depth <RSC>".
Since only a single egress LER address, that is, a single S2L path, can be traced, the branch LSR or bud LSR node sets the multipath type to zero in the downstream mapping TLV in the echo response message because no egress LER address needs to be included.
When a 7450 ESS, 7750 SR or 7950 XRS LSR performs a re-merge of one or more ILMs of the P2MP LSP to which the traced S2L sub-LSP belongs, it may block the ILM over which the traced S2L resides. This causes the trace to either fail or to succeed with a missing hop.
The following is an example of this behavior.
S2L1 and S2L2 use ILMs which re-merge at node B. Depending on which ILM is blocked at B, the TTL=2 probe either yields two responses or times out.
The router ingress LER detects a re-merge condition when it receives two or more replies to the same probe, that is, to the same TTL value. It displays the following message to the user regardless of whether the trace operation successfully reached the egress LER or was aborted earlier:
This warning message indicates to the user the potential of a re-merge scenario and that a p2mp-lsp-ping command for this S2L should be used to verify that the S2L path is not defective.
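The re-merge detection rule above (two or more replies to the same TTL) can be illustrated with a short sketch. The function name `remerge_ttls` and the `(ttl, responder)` tuple representation are assumptions made for the example.

```python
from collections import Counter


def remerge_ttls(replies):
    """Return the TTL values that drew two or more echo replies.

    replies is an iterable of (ttl, responder) tuples collected during
    one trace; any TTL listed in the result hints at a re-merge.
    """
    counts = Counter(ttl for ttl, _responder in replies)
    return sorted(ttl for ttl, seen in counts.items() if seen >= 2)
```

A non-empty result would correspond to the condition under which the warning is displayed.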
The router ingress LER behavior is to always proceed to the next TTL probe when it receives an OK response to a probe or when it times out on a probe. If, however, it receives replies with an error return code, it must wait until it receives an OK response or it times out. If it times out without receiving an OK reply, the LSP trace must be aborted.
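The decision rule in the preceding paragraph can be written as a small function. This is a sketch under stated assumptions: the event labels `"ok"`, `"error"`, and `"timeout"` and the function name `next_action` are hypothetical, and each probe is modeled as an ordered list of events that ends with an OK reply or a timeout.

```python
def next_action(events):
    """Decide what the ingress LER does after one TTL probe.

    events is the ordered list of outcomes seen for the probe, using the
    hypothetical labels "ok", "error", and "timeout".
    """
    saw_error = False
    for event in events:
        if event == "ok":
            return "proceed"  # an OK reply always moves to the next TTL
        if event == "error":
            saw_error = True  # keep waiting for an OK reply or a timeout
        elif event == "timeout":
            # A plain timeout proceeds; a timeout after errors aborts.
            return "abort" if saw_error else "proceed"
    raise ValueError("probe must end with an OK reply or a timeout")
```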
The following are possible echo reply messages received and corresponding ingress LER behavior:
This feature enables the tunneling of ICMP reply packets over MPLS LSP at an LSR node as per RFC 3032. At an LSR node, including an ABR, ASBR, or data path Route Reflector (RR) node, the user enables the ICMP tunneling feature globally on the system using the config>router>icmp-tunneling command.
This feature supports tunneling ICMP replies to a UDP traceroute message. It does not support tunneling replies to an ICMP ping message. The LSR part of this feature consists of crafting the reply ICMP packet of type=11 ('time exceeded'), with a source address set to a local address of the LSR node, and appending the IP header and leading payload octets of the original datagram. The system skips the lookup of the source address of the sender of the label TTL expiry packet, which becomes the destination address of the ICMP reply packet. Instead, CPM injects the ICMP reply packet in the forward direction of the MPLS LSP the label TTL expiry packet was received from. The TTL of pushed labels should be set to 255.
The source address of the ICMP reply packet is determined as follows:
When the packet is received by the egress LER, it performs a regular user packet lookup in the data path in the GRT context for BGP shortcut, 6PE, and BGP label route prefixes, or in VPRN context for VPRN and 6VPE prefixes. It then forwards it to the destination, which is the sender of the original packet whose TTL expired at the LSR.
If the egress LER does not have a route to the destination of the ICMP packet, it drops the packet.
The rate of the tunneled ICMP replies at the LSR can be directly or indirectly controlled by the existing IOM-level and CPM-level mechanisms. Specifically, the rate of the incoming UDP traceroute packets received with a label stack can be controlled at ingress IOM using the distributed CPU protection feature. The rate of the ICMP replies by CPM can also be directly controlled by configuring a system-wide rate limit for ICMP replies to MPLS expired packets which are successfully forwarded to CPM, using the command 'configure system security vprn-network-exceptions'. Note that while this command's name refers to the VPRN service, this feature rate limits ICMP replies for packets received with any label stack, including VPRN and shortcuts.
The 7450 ESS, 7750 SR and 7950 XRS router implementation supports appending to the ICMP reply of type Time Exceeded the MPLS label stack object defined in RFC 4950. It does not include it in the ICMP reply type of Destination unreachable.
The new MPLS Label Stack object permits an LSR to include label stack information including label value, EXP, and TTL field values, from the encapsulation header of the packet that expired at the LSR node. The ICMP message continues to include the IP header and leading payload octets of the original datagram.
In order to include the MPLS Label Stack object, the SR OS implementation adds support of RFC 4884, Extended ICMP to Support Multi-Part Messages, which defines extensions for a multi-part ICMPv4/v6 message of type Time Exceeded. Section 5 of RFC 4884 defines backward compatibility of the new ICMP message with extension header with prior standard and proprietary extension headers.
In order to guarantee interoperability with third-party implementations deployed in customer networks, the router implementation is able to parse, on the receive side, all possible encapsulation formats as defined in Section 5 of RFC 4884. Specifically:
On the transmit side, when the MPLS Label Stack object is added as an extension to the ICMP reply message, it is appended to the message immediately following the "original datagram" field taken from the payload of the received traceroute packet. The appended "original datagram" field contains exactly 128 octets. If the original datagram did not contain 128 octets, the "original datagram" field is zero padded to 128 octets.
For sample output of the traceroute OAM tool when the ICMP tunneling feature is enabled, see Traceroute with ICMP Tunneling In Common Applications.
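The transmit-side layout can be illustrated with a minimal encoding sketch. It assumes the field layouts of RFC 4884 (extension header with version 2 and a 16-bit checksum) and RFC 4950 (MPLS Label Stack object, Class-Num 1, C-Type 1). The function names are hypothetical, and the sketch builds only the padded "original datagram" field and the extension structure, not the full ICMP message.

```python
import struct

ORIGINAL_DATAGRAM_LEN = 128  # fixed size stated in the text


def pad_original_datagram(datagram: bytes) -> bytes:
    """Truncate or zero-pad the original datagram to exactly 128 octets."""
    return datagram[:ORIGINAL_DATAGRAM_LEN].ljust(ORIGINAL_DATAGRAM_LEN, b"\x00")


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def label_stack_entry(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    """Pack label (20 bits), EXP (3 bits), S (1 bit), and TTL (8 bits)."""
    return struct.pack("!I", (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl)


def mpls_extension(entries) -> bytes:
    """Build the extension header plus one MPLS Label Stack object."""
    payload = b"".join(label_stack_entry(*entry) for entry in entries)
    obj = struct.pack("!HBB", 4 + len(payload), 1, 1) + payload
    header = struct.pack("!HH", 2 << 12, 0)  # version 2, checksum zeroed
    checksum = internet_checksum(header + obj)
    return struct.pack("!HH", 2 << 12, checksum) + obj
```

The extension returned by `mpls_extension` would follow the 128-octet field produced by `pad_original_datagram`.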
When the ICMP reply packet is generated in CPM, its FC is set by default to NC1 with the corresponding default ToS byte value of 0xC0. The DSCP value can be changed by configuring a different value for an ICMP application under the config>router>sgt-qos icmp context.
When the packet is forwarded to the outgoing interface, the packet is queued in the egress network queue corresponding to its CPM assigned FC and profile parameter values. The marking of the packet's EXP is dictated by the {FC, profile}-to-EXP mapping in the network QoS policy configured on the outgoing network interface. The ToS byte, and DSCP value for that matter, assigned by CPM are not modified by the IOM.
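The relationship between the default ToS byte and its DSCP value is simple bit arithmetic: the DSCP is the upper six bits of the ToS byte. The helper names below are hypothetical.

```python
def tos_to_dscp(tos: int) -> int:
    """Extract the 6-bit DSCP from an 8-bit ToS byte."""
    return (tos & 0xFF) >> 2


def dscp_to_tos(dscp: int, ecn: int = 0) -> int:
    """Rebuild the ToS byte from a DSCP value and the 2-bit ECN field."""
    return ((dscp & 0x3F) << 2) | (ecn & 0x3)


print(tos_to_dscp(0xC0))  # prints 48
```

The default ToS byte of 0xC0 therefore corresponds to DSCP 48.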
At a high level, the major difference in the behavior of the UDP traceroute when ICMP tunneling is enabled at an LSR node is that the LSR node tunnels the ICMP reply packet towards the egress of the LSP without looking up the traceroute sender's address. When ICMP tunneling is disabled, the LSR looks it up and replies if the sender is reachable. However, there are additional differences between the two behaviors, which are summarized in the following.
In the presence of ECMP, CPM-generated UDP traceroute packets are not sprayed over multiple ECMP next-hops. The first outgoing interface is selected. In addition, an LSR ICMP reply to a UDP traceroute is also forwarded over the first outgoing interface regardless of whether ICMP tunneling is enabled. When ICMP tunneling is enabled, the packet is tunneled over the first downstream interface for the LSP when multiple next-hops exist (LDP FEC or BGP label route). In all cases, the ICMP reply packet uses the outgoing interface address as the source address of the reply packet.
The router SDP diagnostics are SDP ping and SDP MTU path discovery.
SDP ping performs in-band unidirectional or round-trip connectivity tests on SDPs. The SDP ping OAM packets are sent in-band, in the tunnel encapsulation, so they follow the same path as traffic within the service. The SDP ping response can be received out-of-band in the control plane, or in-band using the data plane for a round-trip test.
For a unidirectional test, SDP ping tests:
For a round-trip test, SDP ping uses a local egress SDP ID and an expected remote SDP ID. Since SDPs are unidirectional tunnels, the remote SDP ID must be specified and must exist as a configured SDP ID on the far-end router. SDP round-trip testing is an extension of SDP connectivity testing with the additional ability to test:
In a large network, network devices can support a variety of packet sizes that are transmitted across their interfaces. This capability is referred to as the Maximum Transmission Unit (MTU) of network interfaces. It is important to understand the MTU of the entire path end-to-end when provisioning services, especially for virtual leased line (VLL) services where the service must support the ability to transmit the largest customer packet.
The path MTU discovery tool enables service providers to get the exact MTU supported by the network's physical links between the service ingress and service termination points (accurate to one byte).
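One way to reach byte-level accuracy with few probes is a binary search over the packet size. The sketch below shows that general technique and is not a description of how the router tool steps through sizes. The function name `discover_path_mtu` and the `probe` callback, which returns `True` when a packet of the given size is delivered, are assumptions.

```python
def discover_path_mtu(probe, low: int, high: int):
    """Return the largest size in [low, high] that probe() accepts.

    probe(size) is a hypothetical callback that sends one test packet and
    returns True if it was delivered. Returns None if even `low` fails.
    Assumes that if a given size passes, every smaller size passes too.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the range always shrinks
        if probe(mid):
            low = mid        # mid fits: the answer is mid or larger
        else:
            high = mid - 1   # mid is too big: the answer is smaller
    return low
```

With a path that carries at most 1514 bytes, `discover_path_mtu(lambda size: size <= 1514, 512, 9000)` returns 1514.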
Nokia’s Service ping feature provides end-to-end connectivity testing for an individual service. Service ping operates at a higher level than the SDP diagnostics in that it verifies an individual service and not the collection of services carried within an SDP.
Service ping is initiated from a router to verify round-trip connectivity and delay to the far-end of the service. Nokia’s implementation functions for both GRE and MPLS tunnels and tests the following from edge-to-edge:
While the LSP ping, SDP ping and service ping tools enable transport tunnel testing and verify whether the correct transport tunnel is used, they do not provide the means to test the learning and forwarding functions on a per-VPLS-service basis.
It is conceivable that, while tunnels are operational and correctly bound to a service, an incorrect Forwarding Database (FDB) table for a service could cause connectivity issues in the service and not be detected by the ping tools. Nokia has developed VPLS OAM functionality to specifically test all the critical functions on a per-service basis. These tools are based primarily on the IETF document draft-stokes-vkompella-ppvpn-hvpls-oam-xx.txt, Testing Hierarchical Virtual Private LAN Services.
The VPLS OAM tools are:
For a MAC ping test, the destination MAC address (unicast or multicast) to be tested must be specified. A MAC ping packet is sent through the data plane. The ping packet goes out with the data plane format.
In the data plane, a MAC ping is sent with a VC label TTL of 255. This packet traverses each hop using forwarding plane information for next hop, VC label, and so on. The VC label is swapped at each service-aware hop, and the VC TTL is decremented. If the VC TTL is decremented to 0, the packet is passed up to the management plane for processing. If the packet reaches an egress node, and would be forwarded out a customer facing port, it is identified by the OAM label below the VC label and passed to the management plane.
MAC pings are flooded when the destination MAC address is unknown at an intermediate node. They are responded to only by the egress nodes that have mappings for that MAC address.
A MAC trace functions like an LSP trace with some variations. Operations in a MAC trace are triggered when the VC TTL is decremented to 0.
Like a MAC ping, a MAC trace is sent using the data plane.
When a traceroute request is sent via the data plane, the data plane format is used. The reply can be via the data plane or the control plane.
A data plane MAC traceroute request includes the tunnel encapsulation, the VC label, and the OAM label, followed by Ethernet DLC, UDP, and IP headers. If the mapping for the MAC address is known at the sender, the data plane request is sent down the known SDP with the appropriate tunnel encapsulation and VC label. If the mapping is not known, it is sent down every SDP (with the appropriate tunnel encapsulation per SDP and appropriate egress VC label per SDP binding).
The tunnel encapsulation TTL is set to 255. The VC label TTL is initially set to the min-ttl (default is 1). The OAM label TTL is set to 2. The destination IP address is the all-routers multicast address. The source IP address is the system IP of the sender.
The destination UDP port is the LSP ping port. The source UDP port is whatever the system provides (this source UDP port is the demultiplexer that identifies the particular instance that sent the request, when correlating the reply).
The Reply Mode is either 3 (that is, reply using the control plane) or 4 (that is, reply through the data plane), depending on the reply-control option. By default, the data plane request is sent with Reply Mode 3 (control plane reply).
The Ethernet DLC header source MAC address is set to either the system MAC address (if no source MAC is specified) or to the specified source MAC. The destination MAC address is set to the specified destination MAC. The EtherType is set to IP.
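The header values listed in the preceding paragraphs can be collected in one place as a sketch. The class and field names are hypothetical. The example also assumes IPv4, so the all-routers multicast address is 224.0.0.2, and it uses the IANA-assigned LSP ping UDP port 3503.

```python
from dataclasses import dataclass
from typing import Optional

ALL_ROUTERS_MULTICAST = "224.0.0.2"  # IPv4 all-routers group (assumes IPv4)
LSP_PING_UDP_PORT = 3503             # IANA-assigned LSP ping port
ETHERTYPE_IPV4 = 0x0800


@dataclass
class MacTraceRequest:
    """Hypothetical container for a data plane MAC trace request."""
    dest_mac: str
    system_ip: str
    system_mac: str
    source_mac: Optional[str] = None    # falls back to the system MAC
    min_ttl: int = 1                    # initial VC label TTL
    reply_via_data_plane: bool = False  # default is a control plane reply

    def header_fields(self) -> dict:
        return {
            "tunnel_ttl": 255,
            "vc_label_ttl": self.min_ttl,
            "oam_label_ttl": 2,
            "ip_dst": ALL_ROUTERS_MULTICAST,
            "ip_src": self.system_ip,
            "udp_dst_port": LSP_PING_UDP_PORT,
            "reply_mode": 4 if self.reply_via_data_plane else 3,
            "eth_src": self.source_mac or self.system_mac,
            "eth_dst": self.dest_mac,
            "ethertype": ETHERTYPE_IPV4,
        }
```

The source UDP port is left out because the text says only that the system provides it.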
The Nokia-specific CPE ping function provides a common approach to determine if a destination IPv4 address can be resolved to a MAC address beyond the Layer 2 PE, in the direction of the CPE. The function is supported for both VPLS and Epipe services and on a number of different connection types. The service type determines the packet format for network connection transmissions. The transmission of the packet from a PE egressing an access connection is a standard ARP packet. This allows for next-hop resolution for even unmanaged service elements. In many cases, responses to ICMP echo requests are restricted to trusted network segments only; however, ARP packets are typically processed.
If the ARP response is processed on a local SAP connection on the same node from which the command was executed, the detailed SAP information is returned as part of the display function. If the response is not local, the format of the display depends on the service type.
The VPLS service construct is multipoint by nature, and simply returning a positive response to a reachability request would not supply enough information. For this reason, VPLS service CPE ping requests use the Nokia-specific MAC ping packet format. Execution of the CPE ping command generates a MAC ping packet using a broadcast Layer 2 address on all non-access ports. This packet allows for more information about the location of the target. A positive result displays the IP address of the Layer 2 PE and SAP information for the target location.
Each PE, including the local PE, that receives a MAC ping proxies an ARP request on behalf of the original source, as part of the CPE ping function. If a response is received for the ARP request, the Layer 2 PE processes the request, translates the ARP response, and responds back to the initial source with the appropriate MAC ping response and fields.
The MAC ping OAM tool makes it possible to detect whether a particular IPv4 address and MAC address have been learned in a VPLS, and on which SAP the target was found.
The Epipe service construction is that of cross-connection, and returning a positive response to a reachability request is an acceptable approach. For this reason, Epipe service CPE ping requests use standard ARP requests and proxy ARP processing. A positive result displays remote-SAP for any non-local responses. Since Epipe services are point-to-point, the path towards the remote SAP for the service should already be understood.
Nokia recommends that a source IP address of all zeros (0.0.0.0) is used, which prevents the exposure of the provider IP address to the CPE.
The CPE ping function requires symmetrical data paths for proper functionality. Issues may arise when the request egresses a PE and the response arrives on a related but different PE. When dealing with asymmetrical paths, the return-control option may be used to bypass some of the asymmetrical path issues. Asymmetrical paths can be common in all active multi-homing solutions.
For all applications except basic VPLS services (SAP and SDP bindings without a PBB context), CPE ping functionality requires minimum FP2-based hardware for all connections that may be involved in the transmission or processing of the proxy function.
This approach should only be considered for unmanaged solutions where standard Ethernet CFM (ETH-CFM) functions cannot be deployed. ETH-CFM has a robust set of fault and performance functions that are purpose-built for Ethernet services and transport.
Connection types used to support VPLS and Epipes include SAPs, SDP bindings, B-VPLS, BGP-AD, BGP-VPWS, BGP-VPLS, and MPLS-EVPN.
CPE ping has been supported for VPLS services since Release 3.0 of SR OS. It enables the connectivity of the access circuit between a VPLS PE and a CPE to be tested, even if the CPE is unmanaged and, therefore, the service provider cannot run standardized Ethernet OAM to the CPE. The command cpe-ping for a specific destination IP address within a VPLS is translated into a mac-ping towards a broadcast MAC address. All destinations within the VPLS context are reached by this ping to the broadcast MAC address. At all these destinations, an ARP is triggered for the specific IP address (with the IP destination address equal to the address from the request, mac-da equal to all ones, mac-sa equal to the CPM-mac-address, and the IP source address equal to the source address found in the request). The destination receiving a response replies back to the requester.
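The ARP request triggered at each destination can be sketched as a standard Ethernet/ARP frame with the field values listed above. The function name `build_arp_request` is hypothetical, and the addresses in the usage note are documentation examples.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request for target_ip.

    src_mac stands in for the CPM MAC address; src_ip is the source
    address taken from the CPE ping request (0.0.0.0 hides the provider
    address from the CPE).
    """
    ethernet = BROADCAST_MAC + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp = (
        struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)  # Ethernet, IPv4, request
        + src_mac
        + socket.inet_aton(src_ip)
        + b"\x00" * 6                               # target MAC is unknown
        + socket.inet_aton(target_ip)
    )
    return ethernet + arp
```

For example, `build_arp_request(bytes.fromhex("00005e0053aa"), "0.0.0.0", "192.0.2.10")` yields a 42-byte frame addressed to the broadcast MAC.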
Release 10.0 extended the CPE ping command for local, distributed, and PBB Epipe services provisioned over a PBB VPLS. CPE ping for Epipe implements an alternative behavior to CPE ping for VPLS that enables fate sharing of the CPE ping request with the Epipe service. Any PE within the Epipe service (the source PE) can launch the CPE ping. The source PE builds an ARP request and encapsulates it to be sent in the Epipe as if it came from a customer device by using its chassis MAC as the source MAC address. The ARP request then egresses the remote PE device as any other packets on the Epipe. The remote CPE device responds to the ARP and the reply is transparently sent on the Epipe towards the source PE. The source PE then looks for a match on its chassis MAC in the inner customer DA. If a match is found, the source PE device intercepts this response packet.
This method is supported regardless of whether the network uses SDPs or SAPs. It is configured using the existing oam>cpe-ping CLI command.
Note: This feature does not support IPv6 CPEs.
This feature supports FP2 and later and applies only to the 7450 ESS and 7750 SR.
To launch cpe-ping on an Epipe, all of the following must be true:
MAC populate is used to send a message through the flooding domain to learn a MAC address as if a customer packet with that source MAC address had flooded the domain from that ingress point in the service. This allows the provider to craft a learning history and engineer packets in a particular way to test forwarding plane correctness.
The MAC populate request is sent with a VC TTL of 1, which means that it is received at the forwarding plane at the first hop and passed directly up to the management plane. The packet is then responded to by populating the MAC address in the forwarding plane, similar to a conventional learn, although the MAC is an OAM-type MAC in the FDB to distinguish it from customer MAC addresses.
This packet is then taken by the control plane and flooded out the flooding domain (squelching, appropriately, the sender and other paths that would be squelched in a typical flood).
This controlled population of the FDB is very important to manage the expected results of an OAM test. The same functions are available by sending the OAM packet as a UDP/IP OAM packet. It is then forwarded to each hop and the management plane has to do the flooding.
Options for MAC populate are to force the MAC in the table to type OAM (in case it already existed as dynamic or static or an OAM-induced learning with some other binding). This prevents new dynamic learning from overwriting the existing OAM MAC entry, to allow customer packets with this MAC to either ingress or egress the network, while still using the OAM MAC entry.
Finally, an option to flood the MAC populate request causes each upstream node to learn the MAC, populate the local FDB with an OAM MAC entry, and to flood the request along the data plane using the flooding domain.
An age can be provided to age a particular OAM MAC after a different interval than other MACs in an FDB.
MAC purge is used to clear the FDBs of any learned information for a particular MAC address. This allows one to do a controlled OAM test without learning induced by customer packets. In addition to clearing the FDB of a particular MAC address, the purge can also indicate to the control plane not to allow further learning from customer packets. This allows the FDB to be clean, and be populated only via a MAC Populate.
MAC purge follows the same flooding mechanism as the MAC populate.
VCCV ping is used to check connectivity of a VLL in-band. It checks that the destination (target) PE is the egress for the Layer 2 FEC. It provides a cross-check between the data plane and the control plane. It is in-band, meaning that the VCCV ping message is sent using the same encapsulation and along the same path as user packets in that VLL. This is equivalent to the LSP ping for a VLL service. VCCV ping reuses an LSP ping message format and can be used to test a VLL configured over an MPLS or a GRE SDP.
VCCV effectively creates an IP control channel within the pseudowire between PE1 and PE2. PE2 should be able to distinguish, on the receive side, VCCV control messages from user packets on that VLL. There are three possible methods of encapsulating a VCCV message in a VLL, which translate into three types of control channels:
The first nibble is set to 0x1. The Format ID and the reserved fields are set to 0 and the channel type is the code point associated with the VCCV IP control channel as specified in the PWE3 IANA registry (RFC 4446). The channel type value of 0x21 indicates that the Associated Channel carries an IPv4 packet.
The use of the OAM control word assumes that the draft-martini control word is also used on the user packets. This means that if the control word is optional for a VLL and is not configured, the PE node only advertises the router alert label as the CC capability in the Label Mapping message. This method is supported by the 7450 ESS, 7750 SR and 7950 XRS routers.
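The 4-octet OAM control word described above can be packed as follows. This sketch assumes the layout implied by the text: a first nibble of 0x1, the following 12 bits (Format ID and reserved fields) set to 0, and a 16-bit channel type. The function name is hypothetical.

```python
import struct

CHANNEL_TYPE_IPV4 = 0x0021  # Associated Channel carries an IPv4 packet


def vccv_control_word(channel_type: int = CHANNEL_TYPE_IPV4) -> bytes:
    """Pack the 4-octet control word that precedes a VCCV message."""
    first_octet = 0x1 << 4  # first nibble 0x1, next nibble zero
    return struct.pack("!BBH", first_octet, 0, channel_type)


print(vccv_control_word().hex())  # prints 10000021
```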
When sending the label mapping message for the VLL, PE1 and PE2 must indicate which of the above OAM packet encapsulation methods (for example, which control channel type) they support. This is accomplished by including an optional VCCV TLV in the pseudowire FEC Interface Parameter field. The format of the VCCV TLV is shown below:
Note that the absence of the optional VCCV TLV in the Interface parameters field of the pseudowire FEC indicates the PE has no VCCV capability.
The Control Channel (CC) Type field is a bitmask used to indicate if the PE supports none, one, or many control channel types.
If both PE nodes support more than one of the CC types, then the router PE uses the one with the lowest type value. For instance, the OAM control word is used in preference to the MPLS router alert label.
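The selection rule can be sketched as a bitmask operation. The bit assignments below follow RFC 5085 (0x01 control word, 0x02 MPLS router alert label, 0x04 TTL expiry). The function name `select_cc_type` is hypothetical.

```python
CC_CONTROL_WORD = 0x01   # Type 1: PWE3 control word
CC_ROUTER_ALERT = 0x02   # Type 2: MPLS router alert label
CC_TTL_EXPIRY = 0x04     # Type 3: PW label with TTL expiry


def select_cc_type(local_mask: int, remote_mask: int):
    """Return the lowest-valued CC type both PEs advertise, or None."""
    common = local_mask & remote_mask & 0x07
    if common == 0:
        return None             # no control channel type in common
    return common & -common     # isolate the lowest set bit
```

If both PEs advertise the control word and the router alert label, the result is `CC_CONTROL_WORD`, matching the preference stated in the text.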
The Connectivity Verification (CV) bitmask field is used to indicate the specific type of VCCV packets to be sent over the VCCV control channel. The valid values are:
0x00 None of the below VCCV packet types are supported.
0x01 ICMP ping. Not applicable to a VLL over an MPLS or GRE SDP and as such is not supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.
0x02 LSP ping. This is used in VCCV ping application and applies to a VLL over an MPLS or a GRE SDP. This is supported by the 7450 ESS, 7750 SR, and 7950 XRS routers.
A VCCV ping is an LSP echo request message as defined in RFC 8029. It contains an L2 FEC stack TLV which must include the sub-TLV type 10 “FEC 128 Pseudowire”. It also contains a field which indicates to the destination PE which reply mode to use. There are four reply modes defined in RFC 8029:
Reply mode, meaning:
The reply is an LSP echo reply message as defined in RFC 8029. The message is sent as per the reply mode requested by PE1. The return codes supported are the same as those supported in the router LSP ping capability.
The VCCV ping feature is in addition to the service ping OAM feature which can be used to test a service between router nodes. The VCCV ping feature can test connectivity of a VLL with any third-party node which is compliant with RFC 5085.
Figure 42 shows an example of an application of VCCV ping over a multi-segment pseudowire.
Pseudowire switching is a method for scaling a large network of VLL or VPLS services by removing the need for a full mesh of T-LDP sessions between the PE nodes as the number of these nodes grows over time. Pseudowire switching is also used whenever there is a need to deploy a VLL service across two separate routing domains.
In the network, a Termination PE (T-PE) is where the pseudowire originates and terminates. The Switching PE (S-PE) is the node which performs pseudowire switching by cross-connecting two spoke SDPs.
VCCV ping is extended to be able to perform the following OAM function:
Note that the originator of the VCCV ping message does not need to be a T-PE node; it can be an S-PE node. The destination of the VCCV ping message can also be an S-PE node.
VCCV trace to trace the entire path of a pseudowire with a single command issued at the T-PE. This is equivalent to LSP trace and is an iterative process by which T-PE1 sends successive VCCV ping messages while incrementing the TTL value, starting from TTL=1. The procedure for each iteration is the same as above and each node in which the VC label TTL expires checks the FEC and replies with the FEC to the downstream S-PE or T-PE node. The process is terminated when the reply is from T-PE2 or when a timeout occurs.
Although tracing of the MS-pseudowire path is possible using the methods explained in previous sections, these require multiple manual iterations and that the FEC of the last pseudowire segment to the target T-PE/S-PE be known a priori at the node originating the echo request message for each iteration. This mode of operation is referred to as a “ping” mode.
The automated VCCV-trace can trace the entire path of a pseudowire with a single command issued at the T-PE or at an S-PE. This is equivalent to LSP-trace and is an iterative process by which the source T-PE or S-PE sends successive VCCV-ping messages while incrementing the TTL value, starting from TTL=1.
The method is described in draft-hart-pwe3-segmented-pw-vccv, VCCV Extensions for Segmented Pseudo-Wire, and is pending acceptance by the PWE3 working group. In each iteration, the source T-PE or S-PE builds the MPLS echo request message in a way similar to VCCV ping. The first message with TTL=1 has the next-hop S-PE T-LDP session source address in the Remote PE Address field in the pseudowire FEC TLV. Each S-PE which terminates and processes the message includes in the MPLS echo reply message the FEC 128 TLV corresponding to the pseudowire segment to its downstream node. The inclusion of the FEC TLV in the echo reply message is allowed in RFC 8029. The source T-PE or S-PE can then build the next echo request message with TTL=2 to test the next-next hop for the MS-pseudowire. It copies the FEC TLV it received in the echo reply message into the new echo request message. The process is terminated when the reply is from the egress T-PE or when a timeout occurs. If specified, the max-ttl parameter in the vccv-trace command stops the trace at an S-PE before it reaches the T-PE.
The results of VCCV-trace can be displayed for a subset of the pseudowire segments of the end-to-end MS-pseudowire path. In this case, the min-ttl and max-ttl parameters are configured accordingly. However, the T-PE/S-PE node still probes all hops up to min-ttl in order to correctly build the FEC of the desired subset of segments.
Note that this method does not require the use of the downstream mapping TLV in the echo request and echo reply messages.
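The iterative trace described above, including the min-ttl and max-ttl window, can be sketched as follows. This is a minimal simulation under stated assumptions, not the router implementation: the Reply class, the probe() function, and the FEC strings are hypothetical stand-ins for real echo request and echo reply handling. Return codes 3 and 8 are the egress and S-PE codes used by the replies.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

RC_EGRESS = 3          # return code 3: egress router (T-PE)
RC_LABEL_SWITCHED = 8  # return code 8: label switched at stack-depth (S-PE)

@dataclass
class Reply:
    return_code: int
    next_fec: Optional[str]  # FEC 128 of the downstream segment; None at egress

def probe(path: List[Reply], ttl: int) -> Optional[Reply]:
    """Hypothetical stand-in for one echo request; None models a timeout."""
    return path[ttl - 1] if ttl <= len(path) else None

def vccv_trace(path: List[Reply], first_fec: str,
               min_ttl: int = 1, max_ttl: int = 255) -> List[Tuple[int, str, int]]:
    """Probe hop by hop from TTL=1; report only hops inside [min_ttl, max_ttl]."""
    fec = first_fec
    shown = []
    for ttl in range(1, max_ttl + 1):   # hops below min_ttl are still probed
        reply = probe(path, ttl)
        if reply is None:               # timeout ends the trace
            break
        if ttl >= min_ttl:
            shown.append((ttl, fec, reply.return_code))
        if reply.return_code == RC_EGRESS:
            break
        fec = reply.next_fec            # copied into the next echo request
    return shown
```

With a path of two S-PEs and one T-PE, calling vccv_trace with min_ttl=2 and max_ttl=2 still sends the TTL=1 probe, because the FEC for the second segment is only learned from the first reply.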
MS pseudowire is supported with a mix of static and signaled pseudowire segments. However, VCCV ping and VCCV-trace are allowed until at least one segment of the MS pseudowire is static. Users cannot test a static segment, nor can they test contiguous signaled segments of the MS-pseudowire. VCCV ping and VCCV trace are not supported in static-to-dynamic configurations.
Figure 42 shows how a trace can be performed on the MS-pseudowire originating from T-PE1 by a single operational command. The following process occurs:
When in the ping mode of operation, the sender of the echo request message requires the FEC of the last segment to the target S-PE/T-PE node. This information can either be configured manually or be obtained by inspecting the corresponding sub-TLVs of the pseudowire switching point TLV. However, the pseudowire switching point TLV is optional and there is no guarantee that all S-PE nodes populate it with their system address and the pseudowire ID of the last pseudowire segment traversed by the label mapping message. Thus, the router implementation always makes use of the user configuration for these parameters.
When in the trace mode of operation, the T-PE automatically learns the target FEC by probing one by one the hops of the MS-pseudowire path. Each S-PE node includes the FEC to the downstream node in the echo reply message in a similar way that LSP trace causes the probed node to return the downstream interface and label stack in the echo reply message.
Upon receiving a VCCV echo request the control plane on S-PEs (or the target node of each segment of the MS pseudowire) validates the request and responds to the request with an echo reply consisting of the FEC 128 of the next downstream segment and a return code of 8 (label switched at stack-depth) indicating that it is an S-PE and not the egress router for the MS-pseudowire.
If the node is the T-PE or the egress node of the MS-pseudowire, it responds to the echo request with an echo reply with a return code of 3 (egress router) and no FEC 128 is included.
The operation to be taken by the node that receives the echo reply in response to its echo request depends on its current mode of operation such as ping or trace.
In ping mode, the node may choose to ignore the target FEC 128 in the echo reply and report only the return code to the operator.
However, in trace mode, the node builds and sends the subsequent VCCV echo request with an incremented TTL and the information (such as the downstream FEC 128) it received in the echo reply to the next downstream pseudowire segment.
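The responder and originator behavior described above can be summarized in a short sketch. The function names, the EchoReply tuple, and the action tuples are hypothetical and only illustrate the decision logic. The return codes 3 and 8 are the ones named in the text.

```python
from typing import NamedTuple, Optional, Tuple

class EchoReply(NamedTuple):
    return_code: int
    downstream_fec: Optional[str]  # FEC 128 of the next segment, if any

def build_reply(is_egress: bool, downstream_fec: Optional[str]) -> EchoReply:
    """Responder side: a T-PE answers 3 with no FEC, an S-PE answers 8 with the FEC."""
    if is_egress:
        return EchoReply(3, None)
    return EchoReply(8, downstream_fec)

def handle_reply(mode: str, ttl: int, reply: EchoReply) -> Tuple:
    """Originator side: decide the next action for ping or trace mode."""
    if mode == "ping":
        # Ping mode may ignore the returned FEC and report only the code.
        return ("report", reply.return_code)
    if reply.return_code == 3:
        return ("done", reply.return_code)
    # Trace mode: probe the next segment using the FEC from the echo reply.
    return ("probe", ttl + 1, reply.downstream_fec)
```

For example, a reply built by an S-PE with build_reply(False, "fec-2") leads a tracing originator at TTL=1 to send the next probe with TTL=2 and target FEC "fec-2".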
The multicast forwarding information base (MFIB) ping OAM tool allows operators to easily verify, inside a VPLS, which SAPs would normally egress a certain multicast stream. The multicast stream is identified by a source unicast and destination multicast IP address, which are mandatory when issuing an MFIB ping command.
An MFIB ping packet is sent through the data plane and goes out with the data plane format containing a configurable VC label TTL. This packet traverses each hop using forwarding plane information for next hop, VC label, and so on. The VC label is swapped at each service-aware hop, and the VC TTL is decremented. If the VC TTL is decremented to 0, the packet is passed up to the management plane for processing. If the packet reaches an egress node, and would be forwarded out a customer facing port (SAP), it is identified by the OAM label below the VC label and passed to the management plane.
ATM Diagnostics applies to the 7450 ESS and 7750 SR only.
The ATM OAM ping allows operators to test VC-integrity and endpoint connectivity for existing PVCCs using OAM loopback capabilities.
If the portId:vpi/vci PVCC does not exist, the PVCC is administratively disabled, or there is already a ping executing on this PVCC, then this command returns an error.
Because oam atm-ping is a dynamic operation, the configuration is not preserved. The number of oam atm-ping operations that can be performed simultaneously on a 7750 SR or 7450 ESS is configurable as part of the general OAM MIB configuration.
An operator can specify the following options when performing an oam atm-ping:
The result of ATM ping shows if the ping to a given location was successful. It also shows the round-trip time the ping took to complete (from the time the ping was injected in the ATM SAR device until the time the ping response was given to S/W by the ATM SAR device) and the average ping time for successful attempts up to the given ping response.
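As an illustration of the reported statistics, the following sketch computes, after each attempt, the average round-trip time over the successful attempts so far. The function name and the list-based input are hypothetical; a failed attempt is represented as None.

```python
from typing import List, Optional, Tuple

def ping_summary(rtts_ms: List[Optional[float]]) -> List[Tuple[Optional[float], Optional[float]]]:
    """Return (rtt, running average of successful attempts) for each attempt."""
    results = []
    total = 0.0
    good = 0
    for rtt in rtts_ms:
        if rtt is not None:          # only successful pings contribute
            total += rtt
            good += 1
        avg = total / good if good else None
        results.append((rtt, avg))
    return results
```

A failed attempt leaves the average unchanged, so the value shown alongside a given response reflects only the successful pings up to that point.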
An OAM ATM ping in progress times out if a PVCC goes to the operational status down as a result of a network failure, an administrative action, or if a PVCC gets deleted. Any subsequent ping attempts fail until the VC’s operational state changes to up.
To stop a ping in progress, an operator can enter “CTRL – C”. This stops any outstanding ping requests and returns a ping result up to the point of interruption (a ping in progress during the above stop request fails).
Ping and Trace tools for PWs and LSPs are supported with both IP encapsulation and the MPLS-TP on demand CV channel for non-IP encapsulation (0x0025).
The 7450 ESS, 7750 SR, and 7950 XRS routers support VCCV Ping and VCCV Trace on single segment PWs and multi-segment PWs where every segment has static labels and a configured MPLS-TP PW Path ID. They also support VCCV Ping and Trace on MS-PWs where a static MPLS-TP PW segment is switched to a dynamic T-LDP signaled segment.
Static MS-PW PWs are referred to with the sub-type static in the vccv-ping and vccv-trace command. This indicates to the system that the rest of the command contains parameters that are applied to a static PW with a static PW FEC.
Two ACH channel types are supported: the IPv4 ACH channel type, and the non-IP ACH channel type (0x0025), which is known as the non-ip associated channel. The non-ip associated channel is the default for type static. The Generic ACH Label (GAL) is not supported for PWs.
If the IPv4 associated channel is specified, then the IPv4 channel type is used (0x0021). In this case, a destination IP address in the 127/8 range is used, while the source address in the UDP/IP packet is set to the system IP address, or may be explicitly configured by the user with the src-ip-address option. This option is only valid if the ipv4 control-channel is specified.
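To make the two channel types concrete, the sketch below packs a 4-byte associated channel header. The layout used here (first nibble 0001, 4-bit version, 8 reserved bits, 16-bit channel type) is an assumption taken from the generic ACH definition in RFC 5586, not from this guide, and the helper name is hypothetical. The channel type values are the ones given in the text.

```python
import struct

ACH_IPV4 = 0x0021    # IPv4 associated channel
ACH_NON_IP = 0x0025  # non-IP associated channel (on-demand CV)

def ach_header(channel_type: int, version: int = 0) -> bytes:
    """Pack an ACH: 0001 | version | reserved (8 bits) | channel type (16 bits)."""
    first_byte = (0x1 << 4) | (version & 0xF)
    return struct.pack("!BBH", first_byte, 0, channel_type)

print(ach_header(ACH_NON_IP).hex())  # 10000025
print(ach_header(ACH_IPV4).hex())    # 10000021
```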
The reply mode is always assumed to be the same application level control channel type for type static.
As with other PW types, the downstream mapping and detailed downstream mapping TLVs (DSMAP/DDMAP TLVs) are not supported on static MPLS-TP PWs.
The following CLI command descriptions show the options that are only allowed if the type static option is configured. All other options are blocked.
vccv-ping static sdp-id:vc-id [target-fec-type pw-id-fec sender-src-address ip-addr remote-dst-address ip-address pw-id pw-id pw-type pw-type] [dest-global-id global-id dest-node-id node-id] [assoc-channel ipv4 | non-ip] [fc fc-name [profile {in | out}]] [size octets] [count send-count] [timeout timeout] [interval interval] [ttl vc-label-ttl] [src-ip-address ip-addr]
vccv-trace static sdp-id:vc-id [assoc-channel ipv4 | non-ip] [src-ip-address ipv4-address] [target-fec-type pw-id sender-src-address ip-address remote-dst-address ip-address pw-id pw-id pw-type pw-type] [detail] [fc fc-name [profile in | out]] [interval interval-value] [max-fail no-response-count] [max-ttl max-vc-label-ttl] [min-ttl min-vc-label-ttl] [probe-count probe-count] [size octets] [timeout timeout-value]
If the spoke SDP referred to by the sdp-id:vc-id has an MPLS-TP PW-Path-ID defined, then those parameters are used to populate the static PW TLV in the target FEC stack of the vccv-ping or vccv-trace packet. If a Global-ID and Node-ID are specified in the command, then these values are used to populate the destination node TLV in the vccv-ping or vccv-trace packet.
The global-id/node-id are only used as the target node identifiers if the vccv-ping is not end-to-end (for example, a TTL is specified in the vccv-ping or trace command and it is < 255), otherwise the value in the PW Path ID is used. For vccv-ping, the dest-node-id may be entered as a 4-octet IP address <a.b.c.d> or 32-bit integer <1 to 4294967295>. For vccv-trace, the destination node-id and global-id are taken from the spoke SDP context.
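The two accepted input forms for dest-node-id can be normalized to a single 32-bit value, as in this minimal sketch. The function name is hypothetical, and applying the 1 to 4294967295 range check to the dotted form as well is an assumption.

```python
import ipaddress

def parse_node_id(text: str) -> int:
    """Accept a.b.c.d or a decimal integer and return the 32-bit node ID."""
    if "." in text:
        value = int(ipaddress.IPv4Address(text))  # raises ValueError if malformed
    else:
        value = int(text)
    if not 1 <= value <= 4294967295:
        raise ValueError(f"node-id out of range: {text}")
    return value

print(parse_node_id("10.0.0.1"))  # 167772161
print(parse_node_id("167772161")) # 167772161
```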
The same command syntax is applicable for SAA tests configured under configure saa test a type.
The 7450 ESS, 7750 SR, and 7950 XRS routers support end to end VCCV Ping and VCCV trace between a segment with a static MPLS-TP PW and a dynamic T-LDP segment by allowing the user to specify a target FEC type for the VCCV echo request message that is different from the local segment FEC type. That is, it is possible to send a VCCV Ping / Trace echo request containing a static PW FEC in the target stack TLV at a T-PE where the local egress PW segment is signaled, or a VCCV Ping or Trace echo request containing a PW ID FEC (FEC128) in the target stack TLV at a T-PE where the egress PW segment is a static MPLS-TP PW.
Note that all signaled T-LDP segments and the static MPLS-TP segments along the path of the MS-PW must use a common associated channel type. Since only the IPv4 associated channel is supported in common between the two segments, this must be used. If a user selects a non-IP associated channel on the static MPLS-TP spoke SDP, then vccv-ping and vccv-trace packets are dropped by the S-PE.
The target-fec-type option of the vccv-ping and vccv-trace commands is used to indicate that the remote FEC type is different from the local FEC type. For a vccv-ping initiated from a T-PE with a static PW segment with MPLS-TP parameters that attempts to ping a downstream FEC128 segment, a target-fec-type of pw-id is configured with a static PW type. In this case, an assoc-channel type of non-ip is blocked, and vice-versa. Likewise, the reply-mode must be set to control-channel. For a vccv-ping initiated from a T-PE with a FEC128 PW segment that attempts to ping a downstream static PW FEC segment, a target-fec-type of static is configured with a pw-id PW type. In this case, a control-channel type of non-ip is blocked, and vice-versa. Likewise, the reply-mode must also be set to control-channel.
When using VCCV Trace, where the first node to be probed is not the first-hop S-PE, the initial TTL must be set to >1. In this case, the target-fec-type refers to the FEC at the first S-PE that is probed.
The same rules apply to the control-channel type and reply-mode as for the vccv-ping case.
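The option constraints for mixed static and signaled segments can be expressed as a small validation routine. This is a sketch of the rules as stated above, not the CLI's actual checks. The function name, argument names, and error strings are hypothetical.

```python
from typing import List

def check_vccv_options(local_fec: str, target_fec: str,
                       assoc_channel: str, reply_mode: str) -> List[str]:
    """Validate options when the target FEC type differs from the local one.

    local_fec and target_fec are "static" or "pw-id";
    assoc_channel is "ipv4" or "non-ip".
    """
    errors = []
    if target_fec != local_fec:
        # Only the IPv4 associated channel is common to both segment types.
        if assoc_channel != "ipv4":
            errors.append("assoc-channel must be ipv4")
        if reply_mode != "control-channel":
            errors.append("reply-mode must be control-channel")
    return errors
```

For example, check_vccv_options("static", "pw-id", "non-ip", "control-channel") returns one error for the blocked non-ip channel, while the same call with "ipv4" returns an empty list.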
For lsp-ping and lsp-trace commands:
The following commands are only valid if the sub-type static option is configured, implying that the lsp-name refers to an MPLS-TP tunnel LSP:
path-type. Values: active, working, protect. Default: active.
dest-global-id <global-id> dest-node-id <node-id>: Default: to global-id:node-id from the LSP ID.
assoc-channel: If this is set to none, then IP encapsulation over an LSP is used with a destination address in the 127/8 range. If this is set to ipv4, then IPv4 encapsulation in a G-ACh over an LSP is used with a destination address in the 127/8 range. The source address is set to the system IP address, unless the user specifies a source address using the src-ip-address option. If this is set to non-ip, then non-IP encapsulation over a G-ACh with channel type 0x0025 is used. This is the default for sub-type static. Note that the encapsulation used for the echo reply is the same as the encapsulation used for the echo request.
downstream-map-tlv: LSP Trace commands with this option can only be executed if the control-channel is set to none. The DSMAP/DDMAP TLV is only included in the echo request message if the egress interface is either a numbered IP interface, or an unnumbered IP interface. The TLV is not included if the egress interface is of type unnumbered-mpls-tp.
For lsp-ping, the dest-node-id may be entered as a 4-octet IP address in the format a.b.c.d, or as a 32-bit integer in the range of 1 to 4294967295. For lsp-trace, the destination node-id and global-id are taken from the spoke-sdp context.
The send mode and reply mode are always taken to be an application level control channel for MPLS-TP.
The force parameter causes an LSP ping echo request to be sent on an LSP that has been brought oper-down by Bi-directional Forwarding Detection (BFD) (LSP-Ping echo requests would normally be dropped on oper-down LSPs). This parameter is not applicable to SAA.
The LSP ID used in the LSP ping packet is derived from a context lookup based on lsp-name and path-type (active/working/protect).
Dest-global-id and dest-node-id refer to the target global/node id. They do not need to be entered for end-to-end ping and trace, and the system uses the destination global ID and node ID from the LSP ID.
The same command syntax is applicable for SAA tests configured under config>saa>test.
EVPN is an IETF technology per RFC 7432 that uses a new BGP address family and allows VPLS services to be operated as IP-VPNs, where the MAC addresses and the information to set up the flooding trees are distributed by BGP. EVPN VXLAN connections between VXLAN Tunnel Endpoints (VTEPs) use a connection-specific OAM protocol for on-demand connectivity verification. This connection-specific OAM tool, VXLAN Ping, is described in the 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 2 Services and EVPN Guide: VLL, VPLS, PBB, and EVPN, within the VXLAN Section.
The sample outputs in the following section are examples only; actual displays may differ depending on supported functionality and user configuration.
The existing show>router>bfd is enhanced for MPLS-TP, as follows:
show>router>bfd>mpls-tp-lsp
This command displays the MPLS-TP paths for which BFD is enabled.
show>router>bfd>session [src ip-address [dest ip-address | detail]] | [mpls-tp-path lsp-id… [detail]]
This command shows the details of the BFD session on a particular MPLS-TP path, where lsp-id is the fully qualified lsp-id with which the BFD session is associated.
A sample output is shown below:
RFC 6374, Packet Loss and Delay Measurement for MPLS Networks, provides a standard packet format and process for measuring delay of a unidirectional LSP or an MPLS-TP LSP using the General Associated Channel (G-ACh), channel type 0x000C. Unidirectional LSPs, such as RSVP-TE, require an additional TLV to return a response to the querier (the launch point). RFC 7876, UDP Return Path for Packet Loss and Delay Measurement for MPLS Networks, defines the source IP information to include in the UDP Path Return TLV so the responding node can reach the querier using an IP network. The MPLS DM PDU does not natively include any IP header information. With MPLS-TP there is no requirement for the TLV defined in RFC 7876.
The function of MPLS delay measurement is similar regardless of LSP type. The querier sends the MPLS DM query message toward the responder, transported in an MPLS LSP. The responder extracts the required PDU information to respond appropriately.
MPLS DM tests are configured and launched in the config>oam-pm>session session-name test-family mpls context. Basic architectural OAM PM components are required to be completed along with the MPLS specific configuration. The test PDU includes the following PDU settings:
TLVs can also be included, based on the configuration.
The maximum pad size of 257 is a result of the structure of the defined TLV. The length field is one byte, limiting the overall value field to 255 bytes. The one-byte type field and the one-byte length field account for the remaining two bytes.
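The size limit follows directly from a one-byte type, one-byte length TLV layout, as this small sketch shows. The helper name and the type value 0 used in the example are illustrative assumptions; the arithmetic is the point.

```python
import struct

MAX_VALUE_LEN = 255  # largest value a one-byte length field can describe

def build_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode type (1 byte), length (1 byte), then the value bytes."""
    if len(value) > MAX_VALUE_LEN:
        raise ValueError("value does not fit a one-byte length field")
    return struct.pack("!BB", tlv_type, len(value)) + value

# Largest possible padding TLV: 1 (type) + 1 (length) + 255 (value) bytes.
print(len(build_tlv(0, bytes(MAX_VALUE_LEN))))  # 257
```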
The reflector processes the inbound MPLS DM PDU and responds back to the querier based on the received information, using the response flag setting. Specific to the timestamp, the responder responds using the Query Timestamp Format, filling in the Timestamp 2 and Timestamp 3 values.
When the response arrives back at the querier, the delay metrics are computed. The common OAM-PM computation model and naming are used to simplify and rationalize the different technologies that leverage the OAM-PM infrastructure. The common methodology reports unidirectional and round trip delay metrics for Frame Delay (FD), InterFrame Delay Variation (IFDV), and Frame Delay Range (FDR). The term frame is not indicative of the underlying technology being measured. It represents a normal cross technology naming with applicability to the appropriate naming for the measured technology. The common normal naming requires a mapping to the supported delay measurements included in RFC 6374.
Description | RFC 6374 | OAM-PM |
A to B Delay | Forward | Forward |
B to A Delay | Reverse | Backward |
Two Way Delay (regardless of processing delays within the remote endpoint B) | Channel | Round-Trip |
Two Way Delay (includes processing delay at the remote endpoint B) | Round-Trip | Not supported |
Since OAM-PM uses a common reporting model, unidirectional (forward and backward) and round-trip metrics are always reported. With unidirectional LSPs, the T4 and T3 timestamps are zeroed but the backward and round-trip directions are still reported. In the case of unidirectional LSPs, the backward and round-trip values are of no significance for the measured MPLS network.
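The mapping in the table above can be illustrated with the four timestamps of a query and response exchange. The formulas below follow the RFC 6374 definitions of forward, reverse, and two-way channel delay. Treating them as the exact OAM-PM computation is an assumption, and the function name and integer microsecond inputs are hypothetical.

```python
from typing import Tuple

def delay_metrics(t1: int, t2: int, t3: int, t4: int) -> Tuple[int, int, int]:
    """Compute (forward, backward, round_trip) delay from four timestamps.

    t1: query sent at A      t2: query received at B
    t3: response sent at B   t4: response received at A
    """
    forward = t2 - t1                   # A to B delay
    backward = t4 - t3                  # B to A delay
    round_trip = (t4 - t1) - (t3 - t2)  # excludes processing time at B
    return forward, backward, round_trip

print(delay_metrics(0, 5, 7, 11))  # (5, 4, 9)
```

In this example the responder holds the packet for 2 units between t2 and t3, which is excluded from the reported round-trip value of 9.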
An MPLS DM test may measure the endpoints of the LSP when the TTL is set equal to or higher than the termination point distance. Midpoints along the path that support MPLS DM response functions can be queried by a test setting a TTL to expire along the path. The MPLS DM launch and reflection capability, including on mid-path transit nodes, is disabled by default. To launch and reflect MPLS DM test packets, config>test-oam>mpls-dm must be enabled.
The SR OS implementation supports the MPLS DM Channel Type 0x000C function from RFC 6374 for the following.
The following functions are not supported.
The following configuration provides an example that is comprised of the different MPLS OAM PM elements for the various LSPs. This example only includes configuration on the querier, excluding the basic MPLS and IP configurations. The equivalent MPLS configuration must be completed on all responders. Enabling MPLS DM is required on all queriers and responders.
The following describes the accounting policy configuration.
The following configuration enables MPLS DM.
The following shows the RSVP LSP configuration.
The following shows the RSVP-Auto LSP configuration.
The following shows the MPLS-TP LSP configuration.
The following shows the MPLS OAM-PM configuration.
BIER supports the bier-ping and bier-trace OAM tools. FP4 hardware is required and only IPv4 is supported. The tools are not supported:
A bier-ping packet is sent in a certain subdomain. The user can specify the subdomain in which the BIER OAM packet must be generated. In addition, the BIER OAM packet has to be destined to a BFER or a set of BFERs. Multiple BFERs can be specified for bier-ping. A single BFER can be specified for bier-trace. The BFER can be specified in one of the following ways:
ECMP is not supported for BIER in 7x50. For BIER OAM, Multipath Entropy data sub-TLV of Downstream Detailed Mapping TLV is used for ECMP discovery. SR OS does not support the Multipath Entropy data sub-TLV type. If SR OS receives Multipath Entropy data sub-TLV in the BIER OAM packet, it responds with the return code "One or more of the TLVs was not understood".
BIER Ping and BIER Trace only support outbound time. Round-trip time is not supported, because multicast is unidirectional and BIER ping is in-band downstream, but out-of-band for echo reply. The outbound time is calculated from the network processor (NP) of the root to the NP of the leaf nodes, where the packet is timestamped.
If negative outbound times display for BIER OAM, the cause is usually that the root and the leaf nodes are not synchronized. In this case, the user must ensure that the root and the leaf nodes are synchronized.
In certain network configurations, it is not always possible to deploy preferred standards-based purpose-built robust connectivity verification tools such as Bidirectional Forwarding Detection (BFD), or Ethernet Connectivity Fault Management Continuity Check Message (ETH-CCM), or other more suited tools. When circumstances prevent the preferred connectivity validation methods, ICMP echo request and response ping connectivity check (icmp ping check) using ping templates can be used as an alternate connectivity checking method. Before deploying this approach, an understanding of the treatment of these types of packets on the involved network elements is required. The ping check affects the operational state of the VPRN or IES service IPv4 interface (the service IP interface) being verified.
Deployment of this feature requires the following:
The ping template defines timers and thresholds that determine the basis for connectivity verification and influence the service IP interface operational state. The configuration of the ping template is located in the config>test-oam>icmp context. The configuration options are separated to allow different failure detection and recovery behaviors.
The transmission frequency (interval), loss detection (timeout) and threshold (failure-threshold) are used to check connectivity when the service IP interface is steady and operationally up, or steady and operationally down. When these values are monitoring connectivity and the service IP interface is operationally up, consecutive failures that reach the failure threshold transition the interface to operationally down. When these values are monitoring connectivity and the service IP interface is operationally down, a first success triggers the recovery values to complete the validation. For example, if the interval is 10 seconds, the timeout is 5 seconds, and the failure-threshold is 3, failure detection takes 30 seconds (three consecutive failed intervals of 10 seconds each).
When a service IP interface transitions from operationally up to operationally down because of icmp ping check the log event “UTC WARNING: SNMP #2004 vprn1000999 int-PE-CE-999. Interface int-PE-CE-999 is not operational” is generated.
When a service IP interface has transitioned from operationally up to the operationally down state because of icmp ping check, the transmission continues at the specified interval until there is a successful ICMP echo response related to the ICMP echo request. When the first success is received, there is a possible transition from operationally down to operationally up and the function moves to the recovering phase. The icmp ping check packets for the affected service IP interface start to transmit at frequency (reactivation-interval), invoking loss detection (reactivation-timeout) and consecutive success count (reactivation-threshold). If the reactivation threshold is reached, the service IP interface transitions from operationally down to operationally up. The transmission frequency (interval), loss detection (timeout) and threshold (failure-threshold) are then used again to monitor the service IP interface.
When a service IP interface transitions from operationally down to operationally up because of icmp ping check the log event “UTC WARNING: SNMP #2005 vprn1000999 int-PE-CE-999. Interface int-PE-CE-999 is operational” is generated.
If a failure occurs in the recovering phase, the reactivation-failure-threshold is consulted to determine the number of retries that should be attempted in this phase. This option allows a service IP interface a specified number of retries in this phase before returning to transmitting at interval and those associated values. The reactivation-failure-threshold parameter is bypassed if there was a previous success for the service IP interface in the recovery phase for the latest transition. This parameter determines the number of consecutive failures, without a previous success, before declaring that the recovery is not proceeding and returning to the interval values. In larger scale environments this value may need to be increased.
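The steady and recovering phases described above can be modeled as a small state machine. This is an illustrative sketch only. The class names and phase labels are hypothetical, timers are left out (each call represents one probe result), and two behaviors are assumptions rather than documented facts: the first success that moves the check into the recovering phase is not counted toward the reactivation-threshold, and a failure after a success in the recovering phase returns directly to the down phase.

```python
from dataclasses import dataclass

@dataclass
class PingTemplate:
    failure_threshold: int = 3
    reactivation_threshold: int = 3
    reactivation_failure_threshold: int = 2

class PingCheck:
    def __init__(self, tmpl: PingTemplate, phase: str = "up"):
        self.t = tmpl
        self._enter(phase)  # phase is "up", "down", or "recovering"

    def _enter(self, phase: str) -> None:
        # Counters are cleared whenever the phase (and interval in use) changes.
        self.phase = phase
        self.passes = 0
        self.fails = 0
        self.seen_success = False

    def result(self, ok: bool) -> str:
        """Feed one probe result and return the resulting phase."""
        if self.phase == "up":
            self.fails = 0 if ok else self.fails + 1
            if self.fails >= self.t.failure_threshold:
                self._enter("down")
        elif self.phase == "down":
            if ok:                       # first success starts recovery
                self._enter("recovering")
        elif ok:                         # recovering, probe succeeded
            self.seen_success = True
            self.passes += 1
            if self.passes >= self.t.reactivation_threshold:
                self._enter("up")
        elif self.seen_success:          # assumption: threshold bypassed
            self._enter("down")
        else:                            # recovering, no success seen yet
            self.fails += 1
            if self.fails >= self.t.reactivation_failure_threshold:
                self._enter("down")
        return self.phase
```

Starting a PingCheck in the recovering phase models the reboot and cleared-underlying-condition cases, where no success has yet been seen and the reactivation-failure-threshold governs how many retries are allowed.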
Only packets related to the icmp ping check, ICMP echo request and ARP packets specifically associated with the assigned local ping template, can be sent when the interface is operationally down because of an icmp ping check failure. Only packets related to the icmp ping check, ICMP echo response and ARP packets specifically associated with the assigned local ping template, can be received when the interface is operationally down because of the ping check failure.
A ping check function should never be configured on both peers. This leads to deadlock conditions that can only be resolved by manually disabling the ping template under the interface. As previously stated, only packets associated with the local ping template can be transmitted and received on a service IP interface when the interface is operationally down because of icmp ping check.
The configured ping template values can be updated without having to change the administrative state or existing references. However, the service IP interfaces that reference a specific ping template configuration import the values when the ping-template is administratively enabled under the service IP interface. There is no automatic updating of modified ping template values on service IP interfaces referencing a ping-template. In order to push the changes to the referencing service IP interfaces, the command tools>perform>test-oam>icmp>ping-template-sync template-name is available. This command updates all interfaces that reference the specified ping-template. Executing this command updates all the referencing service IP interfaces in the background after the command is accepted. If there is an HA event and the tools command has not completed updating, all the interfaces that were not updated at the time of the HA event do not receive the new values. If an HA event occurs and there is a concern that all interfaces may not have received the update, the command should be executed again on the newly active. The command does not survive an HA event.
For a service IP interface to import and start using the icmp ping check, the ping-template template-name must be enabled and the destination-address ip-address must be configured. When the ping-template is added to the service IP interface, the values associated with that ping-template are imported. When the ping-template’s administrative state under the service IP interface is enabled, the values are checked again to ensure the latest values associated with the ping-template are being used. The source IP address of the packet is the primary IPv4 address of the service IP interface. This is not a configurable parameter.
When the ping-template command is administratively enabled under a service IP interface that is operationally up, the interface is assumed to have connectivity until proven otherwise. This means the interface state is not affected unless the ping template determines that there are connectivity issues based on the interval, timeout, and failure-threshold commands. If the desired behavior is for the ping-template to validate service IP interface connectivity before allowing the service IP interface to become operational, the service IP interface can be administratively disabled, the ping-template enabled under that interface, and then the interface administratively enabled. This is considered to be operationally down due to underlying conditions.
When the ping-template command is administratively enabled under a service IP interface that is operationally down because of an underlying condition unrelated to icmp ping check, when the underlying condition is cleared, the icmp ping check prevents the interface from entering the operationally up state until it can verify the connectivity. When the underlying condition is cleared the icmp ping check function enters the recovering phase using the reactivation-interval, reactivation-timeout, reactivation-threshold and the reactivation-failure-threshold values.
When a node is rebooted, service IP interfaces, with administratively enabled ping templates, must verify the interface connectivity before allowing it to progress to an operationally up state. This ensures that the interface does not bounce from operationally up to operationally down after a reboot and the service IP interface state is properly reflected when the reboot is complete. Service IP interfaces that have an administratively enabled ping-template enter the recovering phase using the reactivation-interval, reactivation-timeout, reactivation-threshold and the reactivation-failure-threshold values following a reboot.
When a soft reset condition is raised, the icmp ping check state for the service IP interface is held in the state it was in when the soft reset began until the soft reset is complete. The interfaces exit the soft reset in the same phase they entered, but all counters are cleared. The service IP interfaces that have an administratively enabled ping template enter this held state if they are in any way related to any hardware that is undergoing a soft reset. The following two examples demonstrate the expected behavior. When a service IP interface is related to a LAG, if a single port member in that LAG is affected by the soft reset, the interface enters this held state. Similarly, if the service IP interface is connected using an R-VPLS configuration, it enters the held state.
The protocol used to determine the icmp ping check function has been added to the distributed CPU protection list of protocols, icmp-ping-check. The distributed CPU protection function can be used to limit the amount of icmp ping check packets received on a service IP interface with an enabled ping template. This is an optional configuration that would prevent crossover impact on unrelated service IP interfaces using icmp ping check because of a rogue interface.
The show>service>id>interface ip-int-name detail command has been updated with the ping-template values and operational information. The most effective way to view the output is to use a match criterion for “Ping Template Values in Use”. The “Ping Template Values in Use” section of the output reports the current values that were imported from the referenced config>test-oam>icmp>ping-template. The “Operational Data” section of the output includes the administrative state (Up or Down) and destination address being tested (IP address or notConfigured). It also includes the current interval in use (interval or reactivation-interval) and the current state being reported, (operational, notRunning, failed). There are also pass and fail counters reporting, while in the current state, the number of consecutive passes or fails that have occurred. This provides a stability indicator. If these values are low, it may indicate that even though no operational state transitions have occurred there are intermittent but frequent failures. If neither of these counter are incrementing it is likely an underlying condition has been detected and the icmp ping check is not attempting to send and cannot receive connectivity packets. These counters are cleared when moving between different intervals, and for a soft reset.
The show>test-oam>icmp>ping-template and show>test-oam>icmp>ping-template-using have been added to display the various config>test-oam>icmp>ping-template configurations and services referencing the ping templates.
Using icmp ping check enabled on service IP interfaces incur longer recovery delays on failure and reboot because of the additional validations required to validate those interfaces.
The icmp ping check function supports IPv4 interfaces created on SAPs in VPRN and IES services and R-VPLS services, as well as Ethernet satellite (esat) connections. When the service IP interface makes use of an R-VPLS configuration, the interface between the VPRN or IES service and the VPLS service is a virtual connection. For the icmp ping check to function properly in R-VPLS environments, the connection being used to validate the peer must be reachable over a SAP.
The icmp ping check should only be used when other purpose-built connectivity checking is not a deployable solution. Interaction with contending protocols may be unexpected.
The icmp ping check interacts with the service IP interface hold-time configuration. In general, the hold-time up option delays the deactivation of the associated IP interface by the specified number of seconds. The hold-time down option delays the activation of the associated IP interface by the specified number of seconds.
With the hold-time up option, if a service IP interface is about to transition from operationally up to down because the port transitioned from operationally up to down (loss of signal, administrative down, and so on), the hold-time up timer is started. The interface remains operationally up until the timer expires. The icmp ping check runs in parallel because the underlying operational state change has been delayed. If the hold-time up timer lasts longer than the failure detection time of the icmp ping check, the check could fail while the timer is counting down. If the hold-time up timer expires, the interface transitions to operationally down, and the icmp ping check recognizes the underlying issue and stops trying to transmit. The normal underlying condition recovery noted earlier in this section follows.
If, however, the hold-time up is short-circuited because the port returns to an operationally up state before the expiration of the hold-time up timer, the following interactions are noted.
With the hold-time down option, if a service IP interface is about to transition from operationally down to up because the port transitioned from operationally down to up, the interface remains down until the expiration of the down timer. When the timer expires, the icmp ping check follows the normal underlying condition recovery noted earlier in this section.
These validations do not support or impact IPv6 interfaces.
There is no support for config>system>enable-icmp-vse Nokia-specific ICMP packets on interfaces that are using ping templates.
ICMP ping check connectivity is only supported on FP3-based and above platforms and should not be configured on any service IP interfaces that are configured over hardware that does not meet this requirement.
The SR OS supports Two-Way Active Measurement Protocol (TWAMP) and Two-Way Active Measurement Protocol Light (TWAMP Light).
Two-Way Active Measurement Protocol (TWAMP) provides a standards-based method for measuring the IP performance (packet loss, delay, and jitter) between two devices. TWAMP leverages the methodology and architecture of One-Way Active Measurement Protocol (OWAMP) to define a way to measure two-way or round-trip metrics.
There are four logical entities in TWAMP: the control-client, the session-sender, the server, and the session-reflector. The control-client and session-sender are typically implemented in one physical device (the “client”) and the server and session-reflector in a second physical device (the “server”). The router acts as the “server”.
The control-client and server establish a TCP connection and exchange TWAMP-Control messages over this connection. When a server accepts the TCP control session from the control-client, it responds with a server greeting message. This greeting includes the various modes supported by the server. The modes are a bit mask. Each bit in the mask represents a functionality supported on the server. When the control-client wants to start testing, the client communicates the test parameters to the server, requesting any of the modes that the server supports. If the server agrees to conduct the described tests, the tests begin as soon as the client sends a Start-Sessions or Start-N-Sessions message. As part of a test, the session-sender sends a stream of UDP-based test packets to the session-reflector, and the session-reflector responds to each received packet with a UDP-based response test packet. When the session-sender receives the response packets from the session-reflector, the information is used to calculate two-way delay, packet loss, and packet delay variation between the two devices. The exchange of test PDUs is referred to as TWAMP Test.
The TWAMP test PDU does not achieve symmetrical packet size in both directions unless the frame is padded with a minimum of 27 bytes. The session-sender is responsible for applying the required padding. After the frame is appropriately padded, the session-reflector reduces the padding by the number of bytes needed to provide symmetry.
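The 27-byte figure can be checked with simple arithmetic. The following Python sketch assumes the unauthenticated-mode test packet layouts from RFC 5357, in which the session-sender packet carries 14 bytes of fixed fields and the session-reflector packet carries 41 bytes; the function names are illustrative only and are not part of any SR OS interface.

```python
# Fixed (non-padding) field sizes of unauthenticated TWAMP test packets,
# taken from the RFC 5357 packet layouts (an assumption of this sketch).
SENDER_FIXED_BYTES = 14      # sequence number (4) + timestamp (8) + error estimate (2)
REFLECTOR_FIXED_BYTES = 41   # reflector fields plus the copied sender fields
SYMMETRY_PAD_BYTES = REFLECTOR_FIXED_BYTES - SENDER_FIXED_BYTES  # 27


def reflected_padding(sender_padding: int) -> int:
    """Padding left in the reply after the reflector trims it for symmetry."""
    if sender_padding < SYMMETRY_PAD_BYTES:
        raise ValueError("sender padding is too small to allow symmetric sizes")
    return sender_padding - SYMMETRY_PAD_BYTES


def packet_sizes(sender_padding: int) -> tuple[int, int]:
    """Return (sent size, reflected size) in bytes of the TWAMP test payload."""
    sent = SENDER_FIXED_BYTES + sender_padding
    reflected = REFLECTOR_FIXED_BYTES + reflected_padding(sender_padding)
    return sent, reflected


if __name__ == "__main__":
    print(packet_sizes(27))
    print(packet_sizes(100))
```

With at least 27 bytes of sender padding, both directions carry the same payload size; with less, the reflector has nothing left to trim and the reply is larger than the request.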
Server mode support includes:
TWAMP Light is an optional model included in the TWAMP standard, RFC 5357, that uses standard TWAMP test packets but provides a lightweight approach to gathering ongoing IP delay and synthetic loss performance data for base router and per-VPRN statistics. Full details are described in Appendix I of RFC 5357, A Two-Way Active Measurement Protocol (TWAMP). The SR OS implementation supports the TWAMP Light model for gathering delay and loss statistics.
For TWAMP Light, the complete TWAMP model is replaced with a simple session-sender and session-reflector pair.
TWAMP Light maintains the TWAMP test packet exchange but replaces the TWAMP TCP control connection with local configuration; however, not all negotiated control parameters are replaced with local configuration. For example, CoS parameters communicated over the TWAMP control channel are replaced with a reply-in-kind approach. The reply-in-kind model reflects back the received CoS parameters, which are influenced by the reflector’s QoS policies.
The reflector function is configured under the config>router>twamp-light command hierarchy for base router reflection, and under the config>service>vprn>twamp-light command hierarchy for per VPRN reflection. The TWAMP Light reflector function is configured per context and must be activated before reflection can occur; the function is not enabled by default for any context. The reflector requires the operator to define the TWAMP Light UDP listening port that identifies the TWAMP Light protocol and the prefixes that the reflector accepts as valid sources for a TWAMP Light request. Prior to release 13.0r4, if the configured TWAMP Light reflector UDP listening port was in use by another application on the system, a minor OAM message was presented indicating the UDP port was unavailable and that activation of the reflector is not allowed.
Notes: The udp-port udp-port-number value of the TWAMP Light Reflector, configured as part of the config>service | router>twamp-light create command, is restricted to a reserved UDP port range and must adhere to the range [862, 64364..64373] before an upgrade or reboot. A configuration outside this range results in a failure of the TWAMP Light Reflector or prevents the upgrade operation. If an In Service Software Upgrade (ISSU) is invoked while the udp-port udp-port-number is outside the allowable range and the TWAMP Light Reflector is in a no shutdown state, the ISSU operation is not allowed to proceed until, at a minimum, the TWAMP Light Reflector is shut down. If the TWAMP Light Reflector is shut down, the ISSU is allowed to proceed, but the TWAMP Light Reflector cannot be activated with a no shutdown until the port is brought in line with the allowable range. A non-ISSU upgrade is allowed to proceed regardless of the state (shutdown or no shutdown) of the TWAMP Light Reflector. The configuration is allowed to load, but the TWAMP Light Reflector remains inactive following the reload when the port is outside the allowable range. When the udp-port udp-port-number for a TWAMP Light Reflector is modified, all tests that were using the services of that reflector must update the dest-udp-port udp-port-number configuration parameter to match the new reflector listening port.
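As a pre-upgrade sanity check, the port restriction and the ISSU rule described in this note can be expressed in a few lines. The Python sketch below is an offline illustration only; the function names are hypothetical and do not correspond to any SR OS command or API.

```python
# Allowed TWAMP Light reflector listening ports: 862 plus 64364..64373 inclusive.
ALLOWED_REFLECTOR_PORTS = {862} | set(range(64364, 64374))


def reflector_port_allowed(udp_port: int) -> bool:
    """True when the configured reflector UDP port is inside the reserved range."""
    return udp_port in ALLOWED_REFLECTOR_PORTS


def issu_may_proceed(udp_port: int, reflector_shutdown: bool) -> bool:
    """ISSU is blocked only when the port is out of range AND the reflector is active."""
    return reflector_port_allowed(udp_port) or reflector_shutdown


def reflector_may_activate(udp_port: int) -> bool:
    """A no shutdown is accepted only when the port is inside the reserved range."""
    return reflector_port_allowed(udp_port)


if __name__ == "__main__":
    for port, is_shutdown in [(862, False), (64370, False), (15000, False), (15000, True)]:
        print(port, is_shutdown, issu_may_proceed(port, is_shutdown), reflector_may_activate(port))
```

Running such a check against the planned configuration before an ISSU shows which reflectors would need to be shut down or renumbered first.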
If the source IP address in the TWAMP Light packet arriving on the responder does not match a configured IP address prefix, the packet is dropped. Multiple prefix entries may be configured per context on the responder. Configured prefixes can be modified without shutting down the reflector function. An inactivity timeout under the config>oam-test>twamp>twamp-light command hierarchy defines the amount of time the reflector keeps the individual reflector sessions active in the absence of test packets. A responder requires CPM3 or later hardware.
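The source-prefix check performed by the reflector can be modeled with the Python standard library. This is a minimal sketch of the matching logic only, using documentation address ranges as example prefixes; it is not the SR OS implementation, and the function name is an assumption.

```python
import ipaddress


def accept_source(src_ip: str, prefixes: list[str]) -> bool:
    """Return True when src_ip falls inside at least one configured prefix.

    A packet whose source matches no prefix would be dropped by the reflector.
    """
    addr = ipaddress.ip_address(src_ip)
    # Membership is False when the address family differs from the prefix family.
    return any(addr in ipaddress.ip_network(prefix) for prefix in prefixes)


if __name__ == "__main__":
    configured = ["192.0.2.0/24", "2001:db8::/32"]  # example prefixes only
    for source in ["192.0.2.10", "198.51.100.1", "2001:db8::1"]:
        print(source, "accepted" if accept_source(source, configured) else "dropped")
```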
Launching TWAMP Light test packets is under the control of the OAM Performance Monitoring (OAM-PM) architecture and as such adheres to those rules. This functionality is not available through interactive CLI or interactive SNMP; it is only available under the OAM-PM configuration construct. OAM-PM reports TWAMP Light delay and loss metrics. The OAM-PM architecture includes the assignment of a Test-ID. The TWAMP Light protocol does not carry the 4-byte test ID in the packet; the test ID has local significance only and provides uniformity with other protocols under the control of the OAM-PM architecture.
The OAM-PM construct allows various test parameters to be defined. These test parameters include the IP session-specific information that allocates the test to the specific routing instance, the source and destination IP address, the destination UDP port (which must match the UDP listening port on the reflector), the source UDP port, and a number of other parameters that allow the operator to influence the packet handling. The source UDP port should only be configured when TWAMP distributed mode is being deployed. The probe interval and TWAMP Light packet padding size can be configured under the specific session. The pad size, which is the size of the all-zeros pad, can be configured to ensure that the TWAMP packet is the same size in both directions. The session-sender role facilitated by the OAM-PM TWAMP Light testing only sets the multiplier bits in the Error Estimate field contained in the TWAMP test packet. The 8-bit multiplier field is set to 00000001. The preceding 8 bits of the Error Estimate field, composed of S (1 bit, Time Sync), Z (1 bit, MBZ), and Scale (6 bits), are set to 0.
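The Error Estimate layout described above (S, Z, Scale, then an 8-bit multiplier) can be illustrated by packing the 16-bit field in network byte order. The following Python sketch is for illustration only; the function name and argument defaults are assumptions chosen to mirror the values stated in the text.

```python
import struct


def pack_error_estimate(s: int = 0, z: int = 0, scale: int = 0, multiplier: int = 1) -> bytes:
    """Pack the 16-bit Error Estimate: S (1 bit), Z (1 bit), Scale (6 bits), Multiplier (8 bits)."""
    if s not in (0, 1) or z not in (0, 1):
        raise ValueError("S and Z are single-bit fields")
    if not 0 <= scale <= 0x3F:
        raise ValueError("Scale is a 6-bit field")
    if not 0 <= multiplier <= 0xFF:
        raise ValueError("Multiplier is an 8-bit field")
    value = (s << 15) | (z << 14) | (scale << 8) | multiplier
    return struct.pack("!H", value)  # network byte order, unsigned 16-bit


if __name__ == "__main__":
    # Defaults mirror the text: S, Z, and Scale are 0, and the multiplier is 00000001.
    print(pack_error_estimate().hex())
```

With the defaults, the field is encoded as the two bytes 0x00 0x01.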
TWAMP Test uses a single packet to gather both delay and loss metrics. This requires special consideration compared with approaches that use a specific tool per metric type.
In the TWAMP Light case, the interval parameter, which defines the probe spacing, is a common option applicable to all metrics collected under a single session. This requires the parameter to be removed from any test-specific configurations, such as the timing parameter associated with loss, specifically availability. Packet processing marks all fields in the PDU to report both delay and loss. The record-stats option can be used to refine which fields to process as part of the OAM-PM architecture. The default collection routine includes delay field processing only (record-stats delay). This ensures backward compatibility with previous releases that only supported the processing of delay fields in the PDU. Enabling the processing of loss information requires the modification of the record-stats parameter. Adding loss to an active test requires the active test to be shut down, modified, and activated with the no shutdown command. It is critical to remember that the no shutdown action clears all previously allocated system memory for every test. Any results not written to flash or collected through SNMP are lost.
The record-stats setting does not change the configuration validation logic when a test is activated with the no shutdown command. Even if the loss metrics are not being processed and reported, the configuration logic must ensure that the TWAMP test parameters are within the acceptable configuration limits; this includes default loss configuration statements. An operator has the ability to configure a TWAMP Light interval of 10s (10000ms) and record only delay statistics. The default timing parameter, used to compute and report availability and reliability, should allow for the activation of the test without a configuration violation. This requires the frames-per-delta-t frames default value of 1. An availability window cannot exceed 100s regardless of the record-stats setting. The size of the availability window is the product interval * frames-per-delta-t * consec-delta-t.
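The availability window limit lends itself to a quick arithmetic check before a test is activated. The Python sketch below assumes the interval is expressed in milliseconds; the consec-delta-t value of 10 used in the example is chosen for illustration and is not stated in this section.

```python
MAX_AVAILABILITY_WINDOW_MS = 100_000  # the 100s limit, in milliseconds


def availability_window_ms(interval_ms: int, frames_per_delta_t: int, consec_delta_t: int) -> int:
    """Availability window = interval * frames-per-delta-t * consec-delta-t."""
    return interval_ms * frames_per_delta_t * consec_delta_t


def window_is_valid(interval_ms: int, frames_per_delta_t: int, consec_delta_t: int) -> bool:
    """True when the window does not exceed the 100s limit."""
    window = availability_window_ms(interval_ms, frames_per_delta_t, consec_delta_t)
    return window <= MAX_AVAILABILITY_WINDOW_MS


if __name__ == "__main__":
    # 10s interval with frames-per-delta-t of 1; consec-delta-t of 10 is an assumed value.
    print(availability_window_ms(10_000, 1, 10), window_is_valid(10_000, 1, 10))
    # Doubling frames-per-delta-t pushes the same test over the limit.
    print(availability_window_ms(10_000, 2, 10), window_is_valid(10_000, 2, 10))
```

Under these assumed values, a 10s interval fits within the limit only when frames-per-delta-t is 1, which is consistent with the default described above.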
The statistics display for the session shows all statistics that are being collected based on the record-stats configuration. If either of the metrics is not being recorded, the statistics display “NONE” for the excluded metric.
Multiple test sessions between peers are allowed. These test sessions are unique entities and may have different properties. Each test generates TWAMP packets specific to its configuration.
TWAMP Light is supported on deployments that use IPv4 or IPv6 addressing, which may each have their own hardware requirements. All IP addressing must be unicast. IPv6 addresses cannot be reserved or link-local addresses. Multiple test sessions may be configured between the same source and destination IP endpoints. The tuple of Source IP, Destination IP, Source UDP, and Destination UDP provides a unique index for each test point.
The OAM-PM architecture does not validate any of the TWAMP Light test session information. A test session is allowed to be activated regardless of the validity of the session information. For example, if the configured source IP address is not local within the router instance to which the test is allocated, the session controller starts sending TWAMP Light test packets but does not receive any responses.
See OAM Performance Monitoring (OAM-PM) for more information about the integration of TWAMP Light and the OAM-PM architecture, including hardware dependencies.
RFC 6374, Packet Loss and Delay Measurement for MPLS Networks, provides a standard packet format and process for measuring delay of a unidirectional or bi-directional label switched path (LSP) using the General Associated Channel (G-ACh), channel type 0x000C. Unidirectional LSPs, such as RSVP-TE, require an additional TLV to return a response to the querier (the launch point). RFC 7876, UDP Return Path for Packet Loss and Delay Measurement for MPLS Networks, defines the source IP information to include in the UDP Path Return TLV so the responding node can reach the querier using an IP network. The MPLS DM PDU does not natively include any IP source information. With MPLS TP there is no requirement for the TLV defined in RFC 7876.
The function of MPLS delay measurement is similar regardless of LSP type. The querier sends the MPLS DM query message toward the responder, transported in an MPLS LSP. The responder extracts the required PDU information to respond appropriately.
Launching MPLS DM tests is configured in the config>oam-pm>session session-name test-family mpls context. The basic architectural OAM-PM components must be completed along with the MPLS-specific configuration. The test PDU includes the following PDU settings:
For the base PDU:
TLVs can also be included, based on the configuration.
The maximum pad size of 257 is a result of the structure of the defined TLV. The length field is one byte, limiting the value to 255 bytes; the one-byte type field and the one-byte length field account for the remaining two bytes.
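The relationship between the one-byte length field and the 257-byte maximum can be shown by building a padding TLV. This Python sketch assumes the RFC 6374 TLV layout of a one-byte type, a one-byte length, and the value, and assumes type 0 for a padding TLV; the helper name is hypothetical.

```python
MAX_TLV_VALUE_BYTES = 255  # largest value length a one-byte length field can describe


def build_padding_tlv(value_len: int, tlv_type: int = 0) -> bytes:
    """Build a padding TLV: type (1 byte) + length (1 byte) + all-zero value."""
    if not 0 <= value_len <= MAX_TLV_VALUE_BYTES:
        raise ValueError("value length must fit in the one-byte length field")
    if not 0 <= tlv_type <= 0xFF:
        raise ValueError("type must fit in one byte")
    return bytes([tlv_type, value_len]) + bytes(value_len)


if __name__ == "__main__":
    tlv = build_padding_tlv(MAX_TLV_VALUE_BYTES)
    print(len(tlv))  # total size on the wire, including the 2-byte TLV header
```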
The reflector processes the inbound MPLS DM PDU and responds to the querier based on the received information, using the response flag setting. Specific to the timestamp, the responder responds to the Query Timestamp Format, filling in the Timestamp 2 and Timestamp 3 values.
When the response arrives back at the querier, the delay metrics are computed. The common OAM-PM computation model and naming is used to simplify and rationalize the different technologies that leverage the OAM-PM infrastructure. The common methodology reports unidirectional and round-trip delay metrics for Frame Delay (FD), InterFrame Delay Variation (IFDV), and Frame Delay Range (FDR). The term frame is not indicative of the underlying technology being measured. It is a normalized, cross-technology name that maps to the appropriate term for the measured technology. The normalized naming requires a mapping to the supported delay measurements included in RFC 6374.
Description | RFC 6374 | OAM-PM |
A to B Delay | Forward | Forward |
B to A Delay | Reverse | Backward |
Two Way Delay (regardless of processing delays within the remote endpoint B) | Channel | Round-Trip |
Two Way Delay (includes processing delay at the remote endpoint B) | Round-Trip | — |
Because OAM-PM uses a common reporting model, unidirectional (forward and backward) and round-trip metrics are always reported. With unidirectional measurements, the T4 and T3 timestamps are zeroed, but the round-trip and backward directions are still reported; in that case, the backward and round-trip values are not of any significance.
An MPLS DM test may measure the endpoints of the LSP when the TTL is set to a value that reaches or exceeds the termination point. Midpoints along the path that support MPLS DM response functions can be targeted by setting a TTL that expires along the path. The MPLS DM launch and reflection capability, including reflection on mid-path transit nodes, is disabled by default. To launch and reflect MPLS DM test packets, config>test-oam>mpls-dm must be enabled.
The SR OS implementation supports the following MPLS DM Channel Type 0x000C functions from RFC 6374:
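The mapping in the preceding table can be made concrete with the four timestamps of a query and response exchange. The Python sketch below assumes T1 is the query transmit time, T2 the responder receive time, T3 the responder transmit time, and T4 the querier receive time, all in the same integer unit; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmTimestamps:
    t1: int  # query transmitted by the querier (A)
    t2: int  # query received by the responder (B)
    t3: int  # response transmitted by the responder (B)
    t4: int  # response received by the querier (A)


def oam_pm_delays(ts: DmTimestamps) -> dict[str, int]:
    """Delay metrics using the OAM-PM names from the mapping table."""
    return {
        "forward": ts.t2 - ts.t1,                          # A to B
        "backward": ts.t4 - ts.t3,                         # B to A
        "round-trip": (ts.t4 - ts.t1) - (ts.t3 - ts.t2),   # excludes processing at B
    }


def rfc6374_round_trip(ts: DmTimestamps) -> int:
    """RFC 6374 round-trip delay, which includes the processing delay at B."""
    return ts.t4 - ts.t1


if __name__ == "__main__":
    sample = DmTimestamps(t1=1000, t2=1400, t3=1450, t4=1900)  # example values only
    print(oam_pm_delays(sample))
    print(rfc6374_round_trip(sample))
```

The forward and backward values are only meaningful when the clocks at both endpoints are synchronized, whereas the round-trip value depends only on differences taken on the same clock.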
The following functions are not supported:
The IEEE and the ITU-T have cooperated to define the protocols, procedures and managed objects to support service-based fault management. Both the IEEE 802.1ag standard and the ITU-T Y.1731 recommendation support a common set of tools that allow operators to deploy the necessary administrative constructs, management entities and functionality that make up Ethernet Connectivity Fault Management (ETH-CFM). The ITU-T has also implemented a set of advanced ETH-CFM and performance management functions and features that build on the proactive and on-demand troubleshooting tools.
CFM uses Ethernet frames and is distinguishable by ether-type 0x8902. In certain cases the different functions use a reserved multicast Layer 2 MAC address that could also be used to identify specific functions at the MAC layer. The multicast MAC addressing is not used for every function or in every case. The Operational Code (OpCode) in the common CFM header is used to identify the PDU type carried in the CFM packet. CFM frames are only processed by IEEE MAC bridges.
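To show how the OpCode identifies the PDU type, the following Python sketch decodes the first four bytes that follow the 0x8902 EtherType. It assumes the common CFM header layout of IEEE 802.1ag (3-bit MD level, 5-bit version, then one byte each for OpCode, flags, and first TLV offset) and lists only a handful of well-known OpCode values; the function name is hypothetical.

```python
import struct

CFM_ETHERTYPE = 0x8902

# A small subset of OpCode values, for illustration only.
OPCODE_NAMES = {1: "CCM", 2: "LBR", 3: "LBM", 4: "LTR", 5: "LTM"}


def parse_cfm_common_header(payload: bytes) -> dict:
    """Decode the 4-byte common CFM header that follows the EtherType."""
    if len(payload) < 4:
        raise ValueError("payload too short for a CFM common header")
    level_version, opcode, flags, first_tlv_offset = struct.unpack("!BBBB", payload[:4])
    return {
        "md_level": level_version >> 5,      # upper 3 bits
        "version": level_version & 0x1F,     # lower 5 bits
        "opcode": opcode,
        "opcode_name": OPCODE_NAMES.get(opcode, "unknown"),
        "flags": flags,
        "first_tlv_offset": first_tlv_offset,
    }


if __name__ == "__main__":
    # Example bytes: MD level 3, version 0, OpCode 3, no flags, first TLV offset 4.
    print(parse_cfm_common_header(bytes([0x60, 0x03, 0x00, 0x04])))
```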
IEEE 802.1ag and ITU-T Y.1731 functions that are implemented are available on the SR and ESS platforms.
This section of the guide provides configuration examples for each of the functions. It also provides the various OAM command line options and show commands to operate the network. The individual service guides provide the complete CLI configuration and description of the commands required to build the necessary constructs and management points.
Table 10 lists and expands the acronyms used in this section.
Acronym | Expansion | Supported Platform |
1DM | One way Delay Measurement (Y.1731) | All |
AIS | Alarm Indication Signal | All |
BNM | Bandwidth Notification Message (Y.1731 sub OpCode of GNM) | All |
CCM | Continuity check message | All |
CFM | Connectivity fault management | All |
CSF | Client Signal Fail (Receive) | All |
DMM | Delay Measurement Message (Y.1731) | All |
DMR | Delay Measurement Reply (Y.1731) | All |
ED | Ethernet Defect (Y.1731 sub OpCode of MCC) | All |
GNM | Generic Notification Message | All |
LBM | Loopback message | All |
LBR | Loopback reply | All |
LMM | (Frame) Loss Measurement Message | Platform specific |
LMR | (Frame) Loss Measurement Response | Platform specific |
LTM | Linktrace message | All |
LTR | Linktrace reply | All |
MCC | Maintenance Communication Channel (Y.1731) | All |
ME | Maintenance entity | All |
MA | Maintenance association | All |
MD | Maintenance domain | All |
MEP | Maintenance association end point | All |
MEP-ID | Maintenance association end point identifier | All |
MHF | MIP half function | All |
MIP | Maintenance domain intermediate point | All |
OpCode | Operational Code | All |
RDI | Remote Defect Indication | All |
TST | Ethernet Test (Y.1731) | All |
SLM | Synthetic Loss Message | All |
SLR | Synthetic Loss Reply (Y.1731) | All |
VSM | Vendor Specific Message (Y.1731) | All |
VSR | Vendor Specific Reply (Y.1731) | All |
The IEEE and the ITU-T use their own nomenclature when describing administrative contexts and functions. This introduces a level of complexity to configuration, discussion and different vendors' naming conventions. The SR OS CLI has chosen to standardize on the IEEE 802.1ag naming where overlap exists. ITU-T naming is used when no equivalent is available in the IEEE standard. In the following definitions, both the IEEE name and ITU-T names are provided for completeness, using the format IEEE Name/ITU-T Name.
Maintenance Domain (MD)/Maintenance Entity (ME) is the administrative container that defines the scope, reach and boundary for testing and faults. It is typically the area of ownership and management responsibility. The IEEE allows for various formats to name the domain, allowing up to 45 characters, depending on the format selected. ITU-T supports only a format of “none” and does not accept the IEEE naming conventions.
Maintenance Association (MA)/Maintenance Entity Group (MEG) is the construct where the different management entities are contained. Each MA is uniquely identified by its MA-ID. The MA-ID is composed of the MD level, the MA name, and the associated format. This is another administrative context where the linkage is made between the domain and the service using the bridging-identifier configuration option. The IEEE and the ITU-T use their own specific formats. The MA short name formats (0 to 255) have been divided between the IEEE (0 to 31, 64 to 255) and the ITU-T (32 to 63), with five currently defined (1 to 4, 32). Even though the different standards bodies do not have specific support for each other's formats, a Y.1731 context can be configured using the IEEE format options.
Note: When a VID is used as the short MA name, 802.1ag does not support VLAN translation because the MA-ID must match on all the MEPs. The default format for a short MA name is an integer. Integer value 0 means the MA is not attached to a VID. This is useful for VPLS services on SR OS platforms because the VID is locally significant.
Note: The double quote character (“) included as part of the ITU-T recommendation T.50 is not a supported character on the SR OS.
Maintenance Domain Level (MD Level)/Maintenance Entity Group Level (MEG Level) is the numerical value (0-7) representing the width of the domain. The wider the domain (the higher the numerical value), the farther the ETH-CFM packets can travel. It is important to understand that the level establishes the processing boundary for the packets. Strict rules control the flow of ETH-CFM packets and are used to ensure proper handling, forwarding, processing and dropping of these packets. ETH-CFM packets with higher numerical level values flow through MEPs and MIPs on endpoints configured with lower level values. This allows the operator to implement different areas of responsibility and nest domains within each other. A maintenance association (MA) includes a set of MEPs, each configured with the same MA-ID and MD level, used to verify the integrity of a single service instance.
Note: The domain format and the requirements that match that format, the association format and its associated requirements, and the level must all match on peer MEPs.
Maintenance Endpoints/MEG Endpoints (MEP) are the workhorses of ETH-CFM. The MEP-ID is the unique identification of the MEP within the association (1-8191). Each MEP is uniquely identified by the MA-ID, MEP-ID tuple. This management entity is responsible for initiating, processing and terminating ETH-CFM functions, following the nesting rules. MEPs form the boundaries that prevent ETH-CFM packets from flowing beyond the specific scope of responsibility. A MEP has a direction, up or down. The direction indicates where packets are generated: up toward the switch fabric, or down toward the SAP and away from the fabric. Each MEP has an active and a passive side. Packets that enter the active side of the MEP are compared to the existing level and processed accordingly. Packets that enter the passive side of the MEP are passed transparently through the MEP. Each MEP contained within the same maintenance association and with the same level (MA-ID) represents a point within a single service. MEP creation on a SAP is allowed only for Ethernet ports with NULL, q-tag, or q-in-q encapsulations. MEPs may also be created on SDP bindings. A vMEP is a service-level MEP configuration that installs ingress (down MEP-like) extraction on the supported ETH-CFM termination points within a VPLS configuration.
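The level comparison on the active side of a MEP can be summarized as a three-way decision. The Python sketch below assumes the conventional IEEE 802.1ag behavior, in which a PDU at a higher level passes through, a PDU at the same level is processed, and a PDU at a lower level is dropped; the text above states only the first of these explicitly, so the other two outcomes are assumptions of this sketch.

```python
def mep_active_side_action(pdu_level: int, mep_level: int) -> str:
    """Decide what a MEP does with an ETH-CFM PDU arriving on its active side."""
    for level in (pdu_level, mep_level):
        if not 0 <= level <= 7:
            raise ValueError("MD levels range from 0 to 7")
    if pdu_level > mep_level:
        return "pass"      # belongs to a wider (higher-level) domain
    if pdu_level == mep_level:
        return "process"   # addressed to this domain
    return "drop"          # lower level must not leak past the boundary (assumed)


if __name__ == "__main__":
    for pdu_level in (2, 3, 4):
        print(pdu_level, mep_active_side_action(pdu_level, mep_level=3))
```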
Maintenance Intermediate Points/MEG Intermediate Points (MIPs) are management entities between the terminating MEPs along the service path. MIPs provide insight into the service path connecting the MEPs. MIPs only respond to Loopback Messages (LBM) and Linktrace Messages (LTM). All other CFM functions are transparent to these entities.
MIP creation is the result of the mhf-creation mode and interaction with related MEPs, and with the direction of the MEP. Two different authorities can be used to determine the MIPs that should be considered and instantiated. The domain and association or the default-domain hierarchies match the configured bridge identifier and VLAN to the service ID and any configured primary VLAN. When a primary VLAN MIP is not configured, the VLAN is either ignored or configured as none.
The domain and association MIP creation function triggers a search for all ETH-CFM domain association bridge identifier matches to the service it is linked to. A MIP candidate is then evaluated using the mhf-creation mode and the rules that govern the algorithm. The domain association mhf-creation modes and their uses are listed below:
For all modes except static mode, only a single MIP can be created. All candidates are collected and the lowest-level valid MIP is created. In static mode, all valid MIPs are created for the bridge identifier VLAN pair. A MIP is considered invalid if the level of the MIP is equal to or below a downward-facing MEP, or below the level of an upward-facing MEP and the MIP shares the same service component as the Up MEP.
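The validity rule and the lowest-level selection can be sketched as follows. This Python example is one reading of the rule above: a candidate is rejected when its level is equal to or below that of a Down MEP, or strictly below that of an Up MEP on the same service component. The data model, the component label, and the function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mep:
    level: int
    direction: str   # "up" or "down"
    component: str   # hypothetical label for the SAP or SDP binding


def mip_is_valid(mip_level: int, mip_component: str, meps: list[Mep]) -> bool:
    """Apply the MIP validity rule against the MEPs relevant to the candidate."""
    for mep in meps:
        if mep.direction == "down" and mip_level <= mep.level:
            return False
        if mep.direction == "up" and mep.component == mip_component and mip_level < mep.level:
            return False
    return True


def select_mips(candidate_levels: list[int], mip_component: str,
                meps: list[Mep], static_mode: bool = False) -> list[int]:
    """Static mode keeps every valid MIP; all other modes keep only the lowest level."""
    valid = sorted(level for level in set(candidate_levels)
                   if mip_is_valid(level, mip_component, meps))
    return valid if static_mode else valid[:1]


if __name__ == "__main__":
    meps = [Mep(level=3, direction="down", component="sap-1")]
    print(select_mips([3, 4, 5], "sap-1", meps))
    print(select_mips([3, 4, 5], "sap-1", meps, static_mode=True))
```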
Not all creation modes require the mip creation statement within the service. The explicit and default mhf-creation modes may instantiate a MIP without the mip creation statement under the service if a lower-level MEP exists for the domain association bridge identifier. If a lower-level MEP does not exist, the default and static mhf-creation modes require the mip creation statement on the service connection.
MEPs require the domain and association configurations to ensure that all ETH-CFM PDUs can be supported. MIPs have restricted ETH-CFM PDU support: ETH-LB and ETH-LT. These two protocols do not require the configuration of a domain and association. MIPs may be created outside of the association context using the default-domain table.
The default-domain table is an object table populated with values that are used for MIP creation. The table is indexed by the bridge identifier and VLAN. An index entry is automatically added when the mip creation statement is added under a SAP or SDP binding. When an index entry is added, the bridge identifier is set to the service ID and the VLAN is set to the primary-vlan-enable vlan-id. If the MIP does not use primary VLAN functionality, the VLAN is configured as none. When the entry has been added to the default-domain table, the default values can be configured. The default-domain table defers to the system-wide, read-only values.
Because there are two different locations able to process the MIP creation logic, a per-bridge identifier VLAN authority must be determined. The authority is a component, table, or configuration that is responsible for executing the MIP creation algorithm. In general, any domain association bridge identifier that could be used to create a specific MIP is authoritative. Other configurations influence the authority, such as the type of MIP (primary VLAN or non-primary VLAN), the different mhf-creation modes, the interaction of those modes with MEPs, and the direction of the MEP.
The following rules provide some high-level guidelines to determine the authority.
When the authority for MIP creation is determined, the MIP attributes are derived from that creation table. The default domain table defers to the read-only, system-wide MIP values and inherits those defaults. Some of the objects under the default-domain hierarchy must be configured using the same statement to avoid transient and unexpected MIP creation while the configuration is being completed. To this end, the mhf-creation mode and level have been combined in the same configuration statement.
The standard mhf-creation modes (none, default, explicit) are configurable as part of the default-domain table. Static mode can only be configured under the domain association bridge identifier. This is because default domain table indexing precludes multiple MIPs at different levels.
MIP creation requires configuration. The default values in both the domain association and the default domain table prevent MIP instantiation.
The show eth-cfm mip-instantiation command can be used to check the authority for each MIP.
There are two locations in the configuration where ETH-CFM is defined. The first location is under the top-level eth-cfm command, where the domains, associations (including links to the service), MIP creation method, common ETH-CFM functions, and remote MEPs are defined. The second location is within the service or facility.
Table 11 is a general table that indicates ETH-CFM support for the different services and SAP or SDP binding. It is not meant to indicate the services that are supported or the requirements for those services on the individual platforms.
Service | Ethernet Connection | Down MEP | Up MEP | MIP | Virtual MEP |
Epipe | — | — | — | — | No |
Epipe | SAP | Yes | Yes | Yes | — |
Epipe | Spoke-SDP | Yes | Yes | Yes | — |
Epipe | PW-SAP | No | No | Yes | — |
VPLS | — | — | — | — | Yes |
VPLS | SAP | Yes | Yes | Yes | — |
VPLS | Spoke-SDP | Yes | Yes | Yes | — |
VPLS | Mesh-SDP | Yes | Yes | Yes | — |
B-VPLS | — | — | — | — | Yes |
B-VPLS | SAP | Yes | Yes | Yes | — |
B-VPLS | Spoke-SDP | Yes | Yes | Yes | — |
B-VPLS | Mesh-SDP | Yes | Yes | Yes | — |
I-VPLS | — | — | — | — | No |
I-VPLS | SAP | Yes | Yes | Yes | — |
I-VPLS | Spoke-SDP | Yes | Yes | Yes | — |
M-VPLS | — | — | — | — | No |
M-VPLS | SAP | Yes | Yes | Yes | — |
M-VPLS | Spoke-SDP | Yes | Yes | Yes | — |
M-VPLS | Mesh-SDP | Yes | Yes | Yes | — |
PBB EPIPE | — | — | — | — | No |
PBB EPIPE | SAP | Yes | Yes | Yes | — |
PBB EPIPE | Spoke-SDP | Yes | Yes | Yes | — |
IPIPE | — | — | — | — | No |
IPIPE | SAP | Yes | No | No | — |
IPIPE | Ethernet-Tunnel SAP | Yes | No | No | — |
IES | — | — | — | — | No |
IES | SAP | Yes | No | No | — |
IES | Spoke-SDP (Interface) | Yes | No | No | — |
IES | Subscriber Group-int SAP | Yes | No | No | — |
VPRN | — | — | — | — | No |
VPRN | SAP | Yes | No | No | — |
VPRN | Spoke-SDP (Interface) | Yes | No | No | — |
VPRN | Subscriber Group-int SAP | Yes | No | No | — |
Figure 44 illustrates the usage of an Epipe on two different nodes that are connected using Ethernet SAP 1/1/2:100.31. SAP 1/1/10:100.31 is an access port that is not used to connect the two nodes.
Examining the configuration from NODE1, MEP 101 is configured with a direction of UP, causing all ETH-CFM traffic originating from this MEP to be sent into the switch fabric and out the mate SAP 1/1/2:100.31. MEP 111 uses the default direction of DOWN, causing all ETH-CFM traffic generated from this MEP to be sent away from the fabric and to egress only the SAP on which it is configured, SAP 1/1/2:100.31.
Further examination of the domain constructs reveals that the configuration properly uses domain nesting rules. In this case, the Level 3 domain is completely contained in a Level 4 domain.
Figure 45 illustrates the creation of an explicit MIP using the association MIP construct.
An addition of association 2 under domain 4 includes the mhf-creation explicit statement. This means that when the level 3 MEP is assigned to SAP 1/1/2:100.31 using the definition in domain 3 association 1, the higher-level MIP is created on the same SAP. Since a MIP does not have directionality, “Both” sides are active. The service configuration and MEP configuration within the service did not change.
Figure 46 illustrates a simpler method that does not require the creation of the lower level MEP. The operator simply defines the association parameters and uses the mhf-creation default setting, then places the MIP on the SAP of their choice.
NODE1:
NODE2:
Figure 47 shows the detailed IEEE representation of MEPs, MIPs, levels and associations, using the standards-defined icons.
SAPs support a comprehensive set of rules, including wildcards, to map packets to services. For example, a SAP mapping packets to a service with a port encapsulation of QinQ may choose to look only at the outer VLAN and wildcard the inner VLAN. SAP 1/1/1:100.* would map all packets arriving on port 1/1/1 with an outer VLAN 100 and any inner VLAN to the service the SAP belongs to. These powerful abstractions extract inbound ETH-CFM PDUs only when there is an exact match to the SAP construct. In this example, extraction occurs only when an ETH-CFM PDU arrives on port 1/1/1 with a single VLAN with a value of 100 followed immediately by the e-type (0x8902, ETH-CFM). Furthermore, ETH-CFM PDUs generated to egress this specific SAP are sent with only a single tag of 100. The primary VLAN is required if the operator needs to extract ETH-CFM PDUs or generate ETH-CFM PDUs on wildcard SAPs and the offset includes an additional VLAN that was not part of the SAP configuration.
Table 12 shows how packets that would normally bypass the ETH-CFM extraction would be extracted when the primary VLAN is configured. This assumes that the processing rules for MEPs and MIPs are met: E-type 0x8902, Levels, and OpCodes.
Port Encapsulation | E-type | Ingress Tag(s) | Ingress SAP | No Primary VLAN: MEP Extraction | No Primary VLAN: MIP Extraction | With Primary VLAN (10): MEP Extraction | With Primary VLAN (10): MIP Extraction |
Dot1q | 0x8902 | 10 | x/y/z:* | No | No | Yes | Yes |
Dot1q | 0x8902 | 10.10 | x/y/z:10 | No | No | Yes | Yes |
QinQ | 0x8902 | 10.10 | x/y/z:10.* | No | No | Yes | Yes |
QinQ (Default Behavior) | 0x8902 | 10.10 | x/y/z:10.0 | No | No | Yes | Yes |
Null | 0x8902 | 10 | x/y/z | No | No | Yes | Yes |
The mapping of the service data remains unchanged. The primary VLAN function allows for one additional VLAN offset beyond the SAP configuration, up to a maximum of two VLANs in the frame. If a fully qualified SAP specifies two VLANs (SAP 1/1/1:10.10) and a primary VLAN of 12 is configured for the MEP, there is no extraction of ETH-CFM for packets arriving tagged 10.10.12, because that exceeds the maximum of two tags.
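The two-tag limit can be illustrated with a minimal Python sketch. It is not the router's implementation: the function names are hypothetical, and it assumes that wildcard ("*") and "0" positions in the SAP string do not count as qualified tags. The sketch counts the tags that the SAP pins to a value, adds one for the primary VLAN, and checks the result against the maximum of two.

```python
MAX_TAGS = 2  # maximum number of VLAN tags in the frame, per the text


def explicit_tag_count(sap: str) -> int:
    """Count the VLAN tags that a SAP string pins to a specific value.

    Assumption: wildcard ("*") and "0" positions are treated as unqualified.
    """
    if ":" not in sap:
        return 0  # null-encapsulated port, for example "1/1/1"
    tags = sap.split(":", 1)[1].split(".")
    return sum(1 for tag in tags if tag not in ("*", "0"))


def primary_vlan_can_extract(sap: str) -> bool:
    """Return True if one extra (primary) VLAN still fits within MAX_TAGS."""
    return explicit_tag_count(sap) + 1 <= MAX_TAGS


for sap in ("1/1/1", "1/1/1:*", "1/1/1:10", "1/1/1:10.*", "1/1/1:10.0", "1/1/1:10.10"):
    print(sap, primary_vlan_can_extract(sap))
```

Only the fully qualified SAP 1/1/1:10.10 fails the check, which matches the 10.10.12 case described above.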
The mapping of service data based on SAPs has not changed. ETH-CFM MP functionality remains SAP specific. In instances where a service includes a specific SAP with a specified VLAN (1/1/1:50) and a wildcard SAP on the same port (1/1/1:*), it is important to understand how the ETH-CFM packets are handled. Any ETH-CFM packet with etype 0x8902 arriving with a single tag of 50 would be mapped to a classic MEP configured under SAP 1/1/1:50. Any packet arriving with an outer VLAN of 50 and second VLAN of 10 would be extracted by the 1/1/1:50 SAP and would require a primary VLAN enabled MEP with a value of 10, assuming the operator would like to extract the ETH-CFM PDU. An inbound packet on 1/1/1 with an outer VLAN tag of 10 would be mapped to the SAP 1/1/1:*. If ETH-CFM extraction is required under SAP 1/1/1:*, a primary VLAN enabled MEP with a value of 10 would be required.
The packet that is generated from a MEP or MIP with the primary VLAN enabled includes that VLAN. The SAP encapsulates the primary VLAN using the SAP encapsulation.
Primary VLAN support includes UP MEPs, DOWN MEPs and MIPs on Ethernet SAPs, including LAG, as well as SDP bindings for Epipe and VPLS services. Classic MEPs, those without a primary VLAN enabled, and primary VLAN-enabled MEPs can coexist under the same SAP or SDP binding. Classic MIPs and primary VLAN-enabled MIPs may also coexist. The limit of a single classic MIP per SAP or SDP binding continues to be enforced. However, the operator may configure multiple primary VLAN-enabled MIPs on the same SAP or SDP binding. MIPs in the primary VLAN space must include the mhf-creation static configuration under the association and must also include the specific VLAN on the MIP creation statement under the SAP. The no version of the mip command must include the entire statement including the VLAN information.
The eight MD Levels (0 to 7) are specific to the context in which the Management Point (MP) is configured. This means the classic MPs have a set of levels that is discrete from the primary VLAN-enabled space. Each primary VLAN space has its own eight-level MD space for the specified primary VLAN. Consideration must be given before allowing overlapping levels between customers and operators should the operator provision a customer-facing MP, like a MIP on a UNI. CPU Protection extensions for ETH-CFM are VLAN unaware and based on MD Level and the OpCode. Any configured rates are applied to the Level and OpCode as a group.
There are two configuration steps to enable the primary VLAN. Under the bridging instance, contained within the association context (config>eth-cfm>domain>assoc>bridge), the VLAN information must be configured. Until this is enabled using the primary-vlan-enable option as part of the MEP creation step or the MIP statement (config>service>…>{sap | mesh-sdp | spoke-sdp}>eth-cfm) the VLAN specified under the bridging instance remains inactive. This is to ensure backward interoperability.
Primary VLAN functions require an FP2-based card or better. Primary VLAN is not supported for vpls-sap-templates, sub-second CCM intervals, or vMEPs.
An operator may see the following INFO message (during configuration reload), or MINOR (error) message (during configuration creation) when upgrading to 11.0r4 or later if two MEPs are in a previously undetected conflicting configuration. The messaging is an indication that a MEP, the one stated in the message using format (domain md-index/association ma-index/mep mep-id), is already configured and has allocated that context. During a reload (INFO) a MEP that encounters this condition is created but its state machine is disabled. If the MINOR error occurs during a configuration creation this MEP fails the creation step. The indicated MEP must be correctly re-configured.
A loopback message is generated by a MEP to its peer MEP or a MIP (Figure 48). The functions are similar to an IP ping to verify Ethernet connectivity between the nodes.
The following loopback-related functions are supported:
The ETH-LBM (loopback) function includes parameters for sub second intervals, timeouts, and new padding parameters.
When an ETH-LBM command is issued using a sub second interval (100ms), the output success is represented with a “!” character, and a failure is represented with a “.” The updating of the display waits for the completion of the previous request before producing the next result. However, the packets maintain the transmission spacing based on the interval option specified in the command.
When the interval is one second or higher, the output provides detailed information that includes the number of bytes (from the LBR), the source MEP ID (format md-index/ma-index/mepid), and the sequence number as it relates to this test and the result.
Since ETH-LB does not support standard timestamps, no indication of delay is produced as these times are not representative of network delay.
If no interval is included in the command, the default is back-to-back LBM transmissions. The maximum count for such a test is 5.
Multicast loopback also supports the new intervals (see 3.4.2). However, the operator must be careful when using this approach. Every MEP in the association responds to this request, which multiplies the impact on system resources for large-scale tests. If the multicast option is used with an interval of 1 (100 ms) and there are 50 MEPs in the association, this results in a 50-fold increase in the receive rate (500 pps) compared to a unicast approach. Multicast displays are not updated until the test is completed. There is no packet loss percentage calculated for multicast loopback commands.
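The arithmetic behind the 500 pps figure can be checked with a short Python sketch. The function name is illustrative only, and the sketch assumes that every responding MEP answers each LBM with exactly one LBR.

```python
def lbr_receive_rate_pps(interval_ms: float, responding_meps: int) -> float:
    """Approximate LBR receive rate when every responder answers each LBM."""
    lbm_rate_pps = 1000.0 / interval_ms  # LBMs transmitted per second
    return lbm_rate_pps * responding_meps


unicast_pps = lbr_receive_rate_pps(100, 1)     # one peer at a 100 ms interval
multicast_pps = lbr_receive_rate_pps(100, 50)  # 50 responders at the same interval
print(unicast_pps, multicast_pps)
```

At a 100 ms interval the sender transmits 10 LBMs per second, so 50 responders produce roughly 50 times the unicast receive rate.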
This on-demand operation tool is used to quickly check the reachability of all MEPs within an Association. A multicast address can be coded as the destination of an oam eth-cfm loopback command. The specific class 1 multicast MAC address or the keyword “multicast” can be used as the destination for the loopback command. The class 1 ETH-CFM multicast address is in the format 01:80:C2:00:00:3x (where x = 0 - 7 and is the number of the domain level for the source MEP). When the “multicast” option is used, the class 1 multicast destination is built according to the local MEP level initiating the test.
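As a small illustration of the address format, the following Python sketch builds the class 1 destination from the MD level of the source MEP. The helper name is hypothetical and not part of any CLI or API.

```python
def class1_multicast_mac(md_level: int) -> str:
    """Build the class 1 ETH-CFM multicast MAC (01:80:C2:00:00:3x) for an MD level."""
    if not 0 <= md_level <= 7:
        raise ValueError("MD level must be in the range 0 to 7")
    # The last nibble of the address carries the domain level.
    return f"01:80:C2:00:00:3{md_level}"


print(class1_multicast_mac(3))
```

A MEP at level 3 would therefore use 01:80:C2:00:00:33 as the destination when the “multicast” keyword is given.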
Remote MEPs configured at the equivalent level that receive the multicast loopback message terminate and process it, responding with the appropriate unicast loopback response (ETH-LBR). Regardless of whether a multicast or unicast ETH-LBM is used, there is no provision in the standard LBR PDU to carry the MEP-ID of the responder. This means only the remote MEP MAC Address is reported and subsequently displayed. MIPs do not extract a multicast LBM request. The LBM multicast is transparent to the MIP.
MEP loopback stats are not updated as a result of this test being run. That means the received, out-of-order and bad-msdu counts are not affected by multicast loopback tests. The multicast loopback command is meant to provide immediate connectivity troubleshooting feedback for remote MEP reachability only.
A linktrace message is originated by a MEP and targeted to a peer MEP in the same MA and within the same MD level (Figure 49). Linktrace traces a specific MAC address through the service. The peer MEP responds with a linktrace reply message after successful inspection of the linktrace message. The MIPs along the path also process the linktrace message and respond with linktrace replies to the originating MEP if the received linktrace message has a TTL greater than 1, and forward the linktrace message if a lookup of the target MAC address in the Layer 2 FDB is successful. The originating MEP expects to receive multiple linktrace replies and, by processing them, can put together the route to the target bridge.
A traced MAC address, the target MAC, is carried in the payload of the linktrace message. To use linktrace, the target MAC address must have been learned by the nodes in the network. Each MIP and MEP receiving the linktrace message checks whether it has learned the target MAC address. If so, a linktrace reply message is sent back to the originating MEP. Also, a MIP forwards the linktrace message out of the port where the target MAC address was learned.
The linktrace message itself has a multicast destination address. On a broadcast LAN, it can be received by multiple nodes connected to that LAN. But, at most, one node sends a reply.
The following linktrace related functions are supported:
The following output includes the SenderID TLV contents if it is included in the LBR.
A Continuity Check Message (CCM) is a multicast frame that is generated by a MEP and multicast to all other MEPs in the same MA. The CCM does not require a reply message. To identify faults, the receiving MEP maintains an internal list of remote MEPs it should be receiving CCM messages from.
This list is based on the remote-mepid configuration within the association the MEP is created in. When the local MEP does not receive a CCM from one of the configured remote MEPs within a pre-configured period, the local MEP raises an alarm.
A MEP may be configured to generate ETH-CC packets using a unicast destination Layer 2 MAC address. This may help reduce the overhead in some operational models where Down MEPs per peer are not available. For example, mapping an I-VPLS to a PBB core where a hub is responsible for multiple spokes is one of the applicable models. When ETH-CFM packets are generated from an I-context toward a remote I-context, the packets traverse the B-VPLS context. Since many B-contexts are multipoint, any broadcast, unknown or multicast packet is flooded to all appropriate nodes in the B-context. When ETH-CC multicast packets are generated, all the I-VPLS contexts in the association must be configured with all the appropriate remote MEP IDs. If direct spoke-to-spoke connectivity is not part of the validation requirement, the operational complexity can be reduced by configuring unicast DA addressing on the “spokes” and continuing to use multicast CCM from the “hub”. When the unicast MAC is learned in the forwarding DB, traffic is scoped to a single node.
Defect condition, reception, and processing remain unchanged for both hub and spokes. When an ETH-CC defect condition is raised on the hub or spoke, the appropriate defect condition is set and distributed throughout the association from the multicasting MEP. For example, should a spoke raise a defect condition or timeout, the hub sets the RDI bit in the multicast ETH-CC packet which is received on all spokes. Any local hub MEP defect condition continues to be propagated in the multicast ETH-CC packet. Defect conditions are cleared as per normal behavior.
The forwarding plane must be considered before deploying this type of ETH-CC model. A unicast packet is handled as unknown when the destination MAC does not exist in the local forwarding table. If a unicast ETH-CC packet is flooded in a multipoint context, it reaches all the appropriate I-contexts. This causes the spoke MEPs to raise the “DefErrorCCM” condition because an ETH-CC packet was received from a MEP that has not been configured as part of the receiving MEP's database.
The remote unicast MAC address must be configured and is not automatically learned. A MEP cannot send both unicast and multicast ETH-CC packets. Unicast ETH-CC is only applicable to a local association with a single configured remote peer. There is no validation of MAC addresses for ETH-CC packets. The configured unicast destination MAC address of the peer MEP only replaces the multicast class 1 destination MAC address with a unicast destination.
Unicast CCM is not supported on any MEPs that are configured with sub second CCM-intervals.
The following functions are supported:
The optional ccm-tlv-ignore command ignores the reception of interface-status and port-status TLVs in the ETH-CCM PDU on Facility MEPs (port, LAG, QinQ, tunnel and router). No processing is performed on the ignored ETH-CCM TLV values.
Any TLV that is ignored is reported as absent for that remote peer and the values in the TLV do not have an impact on the ETH-CFM state machine. This is the same behavior as if the remote MEP never included the ignored TLVs in the ETH-CCM PDU. If the TLV is not properly formed, the CCM PDU fails the packet parsing process, which causes it to be discarded and a defect condition is raised.
There are various display commands that are available to show the status of the MEP and the list of remote peers.
As specified in the section “Continuity Checking (CC),” all remote MEP-IDs must be configured under the association using the remote-mepid command in order to accept them as peers. When a CCM is received from a MEP-ID that has not been configured, the “unexpected MEP” causes the defErrorCCM condition to be raised. The defErrorCCM is raised for all invalid CC reception conditions.
The auto-mep-discovery option allows for the automatic adding of remote MEP-IDs contained in the received CCM. Once learned, the automatically discovered MEPs behave the same as a manually configured entry. This includes the handling and reporting of defect conditions. For example, if an auto-discovered MEP is deleted from its host node, it experiences the standard timeout on the node which auto discovered it.
When this function is enabled, the “unexpected MEP” condition no longer exists. That is because all MEPs are accepted as peers and automatically added to the MEP database upon reception. There is an exception to this statement. If the maintenance association has reached its maximum MEP count, and no new MEPs can be added, the “unexpected MEP” condition raises the defErrorCCM defect condition. This is because the MEP was not added to the association and the remote MEP is still transmitting CCM.
The clear eth-cfm auto-discovered-meps [mep-id] domain md-index association ma-index command is available to remove auto-discovered MEPs from the association. When the optional mep-id is included as part of the clear command, only that specific MEP-ID within the domain and association is cleared. If the optional mep-id is omitted when the clear command is issued, all auto-discovered MEPs that match the domain and association are cleared. The clear command is only applicable to auto-discovered MEPs.
If there is a failure to add a MEP to the MEP database and the action was manual addition using the “remote-mepid” configuration statement, the error “MINOR: ETH_CFM #1203 Reached maximum number of local and remote endpoints configured for this association” is produced. When a MEP cannot be added to the database through auto discovery, no event is created. The CCM Last Failure indicator tracks the last CCM error condition. The decode can be viewed using the show eth-cfm mep mep-id domain md-index association ma-index command. An association may include both the manual addition of remote peers using the remote-mepid and the auto-mep-discovery option.
The all-remote-mepid display includes an additional column AD to indicate where a MEP has been auto discovered, using the indicator T.
Auto discovered MEPs do not survive a system reboot. These are not permanent additions to the MEP database and are not reloaded after a reboot. The entries are relearned when the CCM is received. Auto discovered MEPs can be changed to manually created entries simply by adding the appropriate remote-mepid statement to the proper association. At that point, the MEP is no longer considered auto discovered and can no longer be cleared.
If a remote-mepid statement is removed from the association context and auto-mep-discovery is configured and a CC message arrives from that remote MEP, it is added to the MEP database, this time as an auto discovered MEP.
The individual MEP database for an association must not exceed the maximum number of MEPs allowed. A MEP database consists of all local MEPs plus all configured remote-mepids and all auto-discovered MEPs. If the number of MEPs in the association has reached capacity, no new MEPs may be added. The number of MEPs must be brought below the maximum value before MEPs can be added. Also, the number of MEPs across all MEP databases must not exceed the system maximum. The number of MEPs supported per association and the total number of MEPs across all associations is platform dependent.
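The interaction between configured peers, auto discovery, and the per-association capacity can be modeled with a toy Python class. This is only a sketch of the behavior described above: the class, field names, and the capacity value used in the example are hypothetical, since the real limits are platform dependent.

```python
from dataclasses import dataclass, field


@dataclass
class MepDatabase:
    """Toy model of one association's MEP database (names are hypothetical)."""
    max_meps: int
    auto_discovery: bool = False
    local: set = field(default_factory=set)
    configured: set = field(default_factory=set)   # remote-mepid entries
    discovered: set = field(default_factory=set)   # auto-discovered entries

    def size(self) -> int:
        # Local MEPs, configured remote MEPs, and auto-discovered MEPs all count.
        return len(self.local) + len(self.configured) + len(self.discovered)

    def receive_ccm(self, mep_id: int) -> str:
        """Return "ok" for an accepted peer, or "defErrorCCM" otherwise."""
        if mep_id in self.configured or mep_id in self.discovered:
            return "ok"
        if self.auto_discovery and self.size() < self.max_meps:
            self.discovered.add(mep_id)  # learned from the received CCM
            return "ok"
        return "defErrorCCM"  # unexpected MEP: unknown peer or database full

    def clear_discovered(self, mep_id=None) -> None:
        """Remove one auto-discovered MEP, or all of them if mep_id is omitted."""
        if mep_id is None:
            self.discovered.clear()
        else:
            self.discovered.discard(mep_id)


db = MepDatabase(max_meps=3, auto_discovery=True, local={1}, configured={2})
print(db.receive_ccm(3))  # learned as an auto-discovered peer
print(db.receive_ccm(4))  # database is full, so the unexpected MEP defect applies
```

Clearing the auto-discovered entry frees capacity, after which a CCM from a new MEP-ID can be learned again.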
ETH-CFM grace is an indication that MEPs on a node undergoing a maintenance operation may be expected to be unable to transmit or receive ETH-CC PDUs, failing to satisfy the peers' requirements. Without the use of a supporting grace function, CCM-enabled MEPs time out after an interval of 3.5 × ccm-interval. During planned maintenance operations, the use of grace can extend the timeout condition to a longer interval.
The Ethernet CFM system-wide configuration eth-cfm>system>[no] grace-tx-enable command controls the transmission of ETH-CFM grace. The ETH-CFM grace function is enabled by the Soft Reset notification by default. The ETH-CFM grace function determines the individual MEP actions based on their configured parameters.
To transmit a grace PDU, the MEP must be administratively enabled and ETH-CC must also be enabled. The ETH-CC interval is ignored. Grace transmission uses the class 1 DA, with the last nibble (4 bits) indicating the domain level, for all grace-enabled MEPs. When a grace event occurs, all MEPs on a node that are configured for grace actively participate in the grace function until the grace event has completed. When a soft reset occurs, ETH-CFM does not determine which peers are directly affected by a soft reset of a specific IOM or line card. This means that all MEPs enter a grace state, regardless of their location on the local node.
The grace process prevents the local MEP from presenting a new timeout condition, and prevents its peer, also supporting a complementary grace process, from declaring a new timeout defect (DefRemoteCCM). Other defects, unrelated to timeout conditions, are processed as during normal operation. This includes the setting, transmission, and reception processing of the RDI flag in the CCM PDU. Since the timeout condition has been prevented, it can be assumed that the RDI is caused by some other unrelated CCM defect condition. Entering the grace period does not clear existing defect conditions, and any defect condition that exists at the start of the grace period is maintained and cleared using normal operation.
Two approaches are supported for ETH-CFM grace:
Both approaches use the same triggering infrastructure but have unique PDU formats and processing behaviors. Only one grace transmission function can be active under an individual MEP. MEPs can be configured to receive and process both grace PDU formats. If a MEP receives both types of grace PDUs, the last grace PDU received becomes the authority for the grace period, using its procedures. If the operator needs to clear a grace window or expected defect window on a receiving peer, the appropriate authoritative reception function can be disabled.
Active AIS server transmissions include a vendor-specific TLV that instructs the client to extend the timeout of AIS during times of grace. When the grace period is completed, the server MEP removes the TLV and the client reverts to standard timeout processing based on the interval in the AIS PDU.
The ETH-VSM Multicast Class 1 DA announcement includes the start of a grace period, the new remote timeout value of 90 s, and the completion of the grace process.
At the start of the maintenance operation, a burst of three packets is sent over a 3-second window to reduce the chance that a remote peer may miss the grace announcement. Following the initial burst, evenly spaced ETH-VSM packets are sent at intervals of one third of the ETH-VSM grace window; this means that the ETH-VSM packets are sent every 30 seconds to all appropriate remote peers. Reception of an ETH-VSM grace packet refreshes the timeout calculation. The local node that is undergoing the maintenance operation also delays the CCM timeout of the local MEP during the grace window using the announced ETH-VSM interval. MEPs restart their timeout countdown when any ETH-CC PDU is received.
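The 30-second refresh spacing follows directly from the 90-second grace window. The Python sketch below works through that arithmetic; the function names are hypothetical, and the exact timing of the initial three-packet burst is not modeled because the text only states that it spans a 3-second window.

```python
GRACE_WINDOW_S = 90  # remote timeout value announced in the ETH-VSM grace PDU


def refresh_interval_s(window_s: int = GRACE_WINDOW_S) -> float:
    """Periodic ETH-VSM spacing: one third of the grace window."""
    return window_s / 3


def refresh_offsets(maintenance_s: int, window_s: int = GRACE_WINDOW_S) -> list:
    """Offsets (seconds from start) of periodic refreshes after the initial burst."""
    step = int(refresh_interval_s(window_s))
    return list(range(step, maintenance_s, step))


print(refresh_interval_s())
print(refresh_offsets(100))
```

Because each refresh arrives well inside the 90-second window, a peer can miss up to two consecutive refreshes before its extended timeout expires.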
At the end of the maintenance operation, there is a burst of three ETH-VSM grace packets to signal that the maintenance operation has been completed. Once the first of these packets has been received, the receiving peer transitions back to the ETH-CCM message and associated interval as the indication for the remote timeout (3.5 × ccm-interval + hold (where applicable)).
CCM packets continue to be sent during this process, but loss of the CCM packets during the advertised grace window does not affect the peer timeout. The only change to the CCM processing is the timeout value used during the grace operation. During the operation, the value that is announced as part of the ETH-VSM packet is used. If the grace value is lower than the configured CCM interval standard timeout computation (3.5 × ccm-interval + hold (where applicable)), the grace value is not installed as the new timeout metric.
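The comparison between the announced grace value and the standard timeout can be sketched in Python as follows. The function names are illustrative, and all values are assumed to be expressed in seconds.

```python
def standard_timeout_s(ccm_interval_s: float, hold_s: float = 0.0) -> float:
    """Standard remote peer timeout: 3.5 x ccm-interval plus hold, where applicable."""
    return 3.5 * ccm_interval_s + hold_s


def timeout_during_grace_s(ccm_interval_s: float, grace_value_s: float,
                           hold_s: float = 0.0) -> float:
    """Timeout in effect during grace; a lower grace value is not installed."""
    return max(grace_value_s, standard_timeout_s(ccm_interval_s, hold_s))


print(timeout_during_grace_s(1, 90))   # 1 s CCM: the 90 s grace value applies
print(timeout_during_grace_s(60, 90))  # 60 s CCM: the standard 210 s timeout stays
```

With a 60-second CCM interval the standard timeout already exceeds the announced 90 seconds, so the grace value has no effect.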
This is a value-added function that is applicable only to nodes that implement support for Nokia’s approach for announcing grace using ETH-VSM. This pre-dates the introduction of the ITU-T Y.1731 Ethernet-Expected Defect (ETH-ED) standard. As specified in the standards, when a node does not support a specific optional function such as ETH-VSM, the message is ignored and no processing is performed.
The ETH-VSM function is enabled by default for reception and transmission. The per-MEP configuration statements under the grace>eth-vsm-grace context can affect the transmission, reception, and processing of the ETH-VSM grace function.
The ETH-ED PDU is used to announce the expected defect window to peer MEPs. The peer MEPs use the expected defect window value to prevent ETH-CC timeout (DefRemoteCCM) conditions for the announcing MEP. The MEP announcing ETH-ED does not time out any remote peers during the expected defect window. The expected defect window is not a configurable value.
At the start of the operation, a burst of three packets is sent over a 3-second window in order to reduce the chance that a remote peer may miss the expected defect window announcement.
It is possible to restrict the value that is installed for the expected defect timer by configuring the max-rx-defect-window command for the receiving MEP. A comparison is used to determine the expected defect timer to be installed during grace. The lower of the expected defect timer value received in the ETH-ED PDU and the configured maximum is installed, provided that it is larger than the standard computation for ETH-CC timeout. The no max-rx-defect-window command is configured by default; therefore, the maximum received expected defect window is disabled, and it is not considered in determining the installed expected defect timer.
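The selection of the installed expected defect timer can be expressed as a short Python sketch. The function and parameter names are hypothetical, values are in seconds, and the sketch assumes that the standard ETH-CC timeout remains in effect when the candidate value is not larger than it.

```python
from typing import Optional


def installed_defect_timer_s(received_s: float, standard_timeout_s: float,
                             max_rx_defect_window_s: Optional[float] = None) -> float:
    """Pick the timer installed on the receiving MEP during the expected defect window."""
    candidate = received_s
    if max_rx_defect_window_s is not None:
        # max-rx-defect-window caps the value received in the ETH-ED PDU.
        candidate = min(received_s, max_rx_defect_window_s)
    # Assumption: the standard timeout stays in place unless the candidate is larger.
    return candidate if candidate > standard_timeout_s else standard_timeout_s


print(installed_defect_timer_s(120, 3.5))      # no maximum configured (default)
print(installed_defect_timer_s(120, 3.5, 60))  # capped by the configured maximum
print(installed_defect_timer_s(5, 35.0))       # received value below the standard timeout
```

With the default no max-rx-defect-window setting, only the received value and the standard computation take part in the comparison.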
Subsequent ETH-ED packets are only transmitted at the completion of the Soft Reset function that triggered the grace function. The three-packet burst at the completion of the Soft Reset function contains an expected defect window size of 5 seconds. Receiving peers should use this new advertisement to reset the expected window to 5 seconds.
The termination of the grace window occurs when the expected defect window timer reaches zero, or when the receive function is manually disabled.
In some cases, the requirement exists to prevent a MEP from entering the defRemoteCCM defect, remote peer timeout, for more time than the standard 3.5 times the ccm-interval. Both the IEEE 802.1ag standard and ITU-T Y.1731 recommendation provide a non-configurable 3.5 times the CCM interval to determine a peer time out. However, when sub-second CCM timers (10 ms/100 ms) are enabled, the carrier may want to provide additional time for different network segments to converge before declaring a peer lost because of a timeout. To maintain compliance with the specifications, the ccm-hold-timer down delay-down option artificially increases the amount of time it takes for a MEP to enter a failed state if the peer times out. This timer is only additive to CCM timeout conditions. All other CCM defect conditions, like defMACStatus, defXconCCM, and so on, maintain their existing behavior of transitioning the MEP to a failed state and raising the proper defect condition without delay.
When the ccm-hold-timer down delay-down option is configured, the following calculation is used to determine the remote peer time out: 3.5 × ccm-interval + ccm-hold-timer down delay-down.
This command is configured under the association. Only sub-second CCM-enabled MEPs support this hold timer. Ethernet tunnel paths use a similar but slightly different approach and continue to use the existing method. Ethernet tunnels are blocked from using this new hold timer.
It is possible to change this command on the fly without deleting it first. Entering the command with the new values changes the values without having to first delete the command.
It is possible to change the ccm-interval of a MEP on the fly without first deleting it. This means it is possible to change a sub-second CCM-enabled MEP to 1 second or more. The operator is prevented from changing an association from a sub second CCM interval to a non-sub second CCM interval when a ccm-hold-timer is configured in that association. The ccm-hold-timer must be removed using the no option prior to allowing the transition from sub second to non-sub second CCM interval.
Alarm Indication Signal (AIS) provides a MEP the ability to signal a fault condition in the reverse direction of the MEP, out the passive side. When a fault condition is detected, the MEP generates AIS packets at the configured client levels and at the specified AIS interval until the condition is cleared. Currently a MEP that is configured to generate AIS must do so at a level higher than its own. The MEP configured on the service receiving the AIS packets is required to have the active side facing the receipt of the AIS packet and must be at the same level as the AIS. The absence of an AIS packet for 3.5 times the AIS interval set by the sending node clears the condition on the receiving MEP.
AIS generation is not subject to the CCM low-priority-defect parameter setting. When enabled, AIS is generated if the MEP enters any defect condition; by default, this includes the CCM RDI condition.
To prevent the generation of AIS for the CCM RDI condition, the AIS version of the low-priority-defect parameter (under the ais-enable command) can be configured to ignore RDI by setting the parameter value to macRemErrXcon. The low-priority-defect parameter is specific and influences the protocol under which it is configured. When the low-priority-defect parameter is configured under CCM, it only influences CCM and not AIS. When the low-priority-defect parameter is configured under AIS, it only influences AIS and not CCM. Each protocol can make use of this parameter using different values.
AIS configuration has two components: receive and transmit. AIS reception is enabled when the command ais-enable is configured under the MEP. The transmit function is enabled when the client-meg-level is configured.
The Alarm Indication Signal function is used to suppress alarms at the client (sub) layer following detection of defect conditions at the server (sub) layer. Due to independent restoration capabilities provided within the Spanning Tree Protocol (STP) environments, ETH-AIS is not expected to be applied in the STP environment.
Transmission of frames with ETH-AIS information can be enabled or disabled on a MEP. Frames with ETH-AIS information can be issued at the client MEG Level by a MEP, including a Server MEP, upon detecting the following conditions:
For a point-to-point ETH connection at the client (sub) layer, a client layer MEP can determine that the server (sub) layer entity providing connectivity to its peer MEP has encountered a defect condition upon receiving a frame with ETH-AIS information. Alarm suppression is straightforward since a MEP is expected to suppress defect conditions associated only with its peer MEP.
For multipoint ETH connectivity at the client (sub) layer, a client (sub) layer MEP cannot determine the specific server (sub) layer entity that has encountered defect conditions upon receiving a frame with ETH-AIS information. More importantly, it cannot determine the associated subset of its peer MEPs for which it should suppress alarms since the received ETH-AIS information does not contain that information. Therefore, upon receiving a frame with ETH-AIS information, the MEP suppresses alarms for all peer MEPs whether or not there is still connectivity.
Only a MEP, including a server MEP, is configured to issue frames with ETH-AIS information. Upon detecting a defect condition the MEP can immediately start transmitting periodic frames with ETH-AIS information at a configured client MEG Level. A MEP continues to transmit periodic frames with ETH-AIS information until the defect condition is removed. Upon receiving a frame with ETH-AIS information from its server (sub) layer, a client (sub) layer MEP detects AIS condition and suppresses alarms associated with all its peer MEPs. A MEP resumes alarm generation upon detecting defect conditions once AIS condition is cleared.
AIS may also be triggered or cleared based on the state of the entity over which it has been enabled. Including the optional command interface-support-enable under the ais-enable command tracks the state of the entity and invokes the appropriate AIS action. This means that operators are not required to enable CCM on a MEP in order to generate AIS if the only requirement is to track the local entity. If CCM is enabled on the MEP in addition to this function, then both are used to act upon the AIS function. When both CCM and interface support are enabled, a fault in either triggers AIS. In order to clear the AIS state, the entity must be in an UP operational state and there must be no defects associated with the MEP. The interface support function is available on both service MEPs and facility MEPs, both in the Down direction only, with the following exception. An Ethernet QinQ Tunnel Facility MEP does not support interface-support-enable. Many operational models for Ethernet QinQ Tunnel Facility MEPs are deployed with the SAP in the shutdown state.
The following specific configuration information is used by a MEP to support ETH-AIS:
A MIP is transparent to frames with ETH-AIS information and therefore does not require any information to support ETH-AIS functionality.
It is important to note that Facility MEPs do not support the generation of AIS to an explicitly configured endpoint. An explicitly configured endpoint is an object that contains multiple individual endpoints, as in pseudowire redundancy.
AIS is enabled under the service and has two parts, receive and transmit. Each component has its own configuration option. The ais-enable command under the SAP allows for the processing of received AIS packets at the MEP level. The client-meg-level command is the transmit portion that generates AIS if the MEP enters a fault state.
When MEP 101 enters a defect state, it starts to generate AIS out the passive side of the MEP, away from the fault. In this case, the AIS generates out sap 1/1/10:100.31 since MEP 101 is an up MEP on that SAP. The Defect Flag indicates that an RDI error state has been encountered. The Eth-Ais Tx Counted value is increasing, indicating that AIS is actively being sent.
A single network event may, in turn, cause the number of AIS transmissions to exceed the AIS transmit rate of the network element. A pacing mechanism is in place to assist the network element to gracefully handle this overload condition. Should an event occur that causes the AIS transmit requirements to exceed the AIS transmit resources, a credit system is used to grant access to the resources. Once all the credits have been used, any remaining MEPs attempting to allocate a transmit resource are placed on a wait list, unable to transmit AIS. If a credit is released because the condition that caused a MEP to transmit AIS has cleared, a MEP on the wait list consumes the newly available credit. If it is critical that AIS transmit resources be available for every potential event, consideration must be given to the worst-case scenario, and the configuration should never exceed that potential. Access to the resources and the wait list is ordered and maintained on a first-come, first-served basis.
A MEP that is on the wait list only increments the “Eth-Ais Tx Fail” counter and not the “Eth-Ais TxCount” for every failed attempt while the MEP is on the wait list.
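The credit and wait-list behavior described above can be sketched as a small Python model. This is an illustration under stated assumptions, not the actual implementation: the class, method names, and the idea of one "transmit attempt" per AIS interval are hypothetical, and the two dictionaries only mimic the "Eth-Ais TxCount" and "Eth-Ais Tx Fail" counters.

```python
from collections import deque


class AisTransmitPacer:
    """Hypothetical model of credit-based AIS transmit pacing."""

    def __init__(self, credits: int) -> None:
        self.free_credits = credits
        self.transmitting = set()  # MEPs that currently hold a credit
        self.wait_list = deque()   # waiting MEPs, first come, first served
        self.tx_count = {}         # mimics "Eth-Ais TxCount"
        self.tx_fail = {}          # mimics "Eth-Ais Tx Fail"

    def request(self, mep: str) -> bool:
        """A MEP enters a defect state and asks for a transmit resource."""
        if mep in self.transmitting:
            return True
        if mep in self.wait_list:
            return False
        if self.free_credits > 0:
            self.free_credits -= 1
            self.transmitting.add(mep)
            return True
        self.wait_list.append(mep)
        return False

    def transmit_attempt(self, mep: str) -> None:
        """One AIS interval elapses for a MEP that is in the defect state."""
        if mep in self.transmitting:
            self.tx_count[mep] = self.tx_count.get(mep, 0) + 1
        elif mep in self.wait_list:
            # A wait-listed MEP only increments the failure counter.
            self.tx_fail[mep] = self.tx_fail.get(mep, 0) + 1

    def clear(self, mep: str) -> None:
        """The defect clears: release the credit or leave the wait list."""
        if mep in self.transmitting:
            self.transmitting.remove(mep)
            if self.wait_list:
                # The oldest waiter consumes the newly available credit.
                self.transmitting.add(self.wait_list.popleft())
            else:
                self.free_credits += 1
        elif mep in self.wait_list:
            self.wait_list.remove(mep)
```

With a single credit, the first MEP to request it transmits, later MEPs queue in arrival order, and the head of the queue starts transmitting only when the first MEP's defect clears.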
There is no synchronization of AIS transmission state between peer nodes. This is particularly important when AIS is used to propagate fault in ETH-CFM MC-LAG linked designs.
Client signal fail (CSF) is a method that allows for the propagation of a fault condition to a MEP peer, without requiring ETH-CC or ETH-AIS. The message is sent when a MEP detects an issue with the entity in the direction from the MEP to its peer MEP. A typical deployment model is an UP MEP configured on an entity that is not executing ETH-CC with its peer. When the entity over which the MEP is configured fails, the MEP can send the ETH-CSF fault message.
In order to process the reception of the ETH-CSF message, the csf-enable function must be enabled under the MEP. When processing of the received CSF message is enabled, the CSF is used as another method to trigger fault propagation, assuming fault propagation is enabled. If CSF is enabled but fault propagation is not enabled, the MEP shows the state of CSF being received from the peer. And lastly, when there is no fault condition, the CSF Rx State displays DCI (Client defect clear) indicating there are no existing failures, even if no CSF has been received. The CSF Rx State indicates the various fault and clear conditions received from the peer during the event.
CSF carries the type of defect that has been detected by the local MEP generating the CSF message.
Clearing the CSF state can be either implicit, time out, or explicit, requiring the client to send the PDU with the clear indicator (011 – DCI – Client defect clear indication). The receiving node uses the multiplier option to determine how to clear the CSF condition. When the multiplier is configured as non-zero (in increments of half seconds between 2 and 30) the CSF is cleared when CSF PDUs have not been received for that duration. A multiplier value of 0 means that the peer that has generated the CSF must send the 011 – DCI flags. There is no timeout condition.
Service-based MEPs support the reception of ETH-CSF as an additional trigger for the fault propagation process. Primary VLAN and Virtual MEPs do not support the processing of the CSF PDU. CSF is transparent to MIPs. There is no support for the transmission of ETH-CSF packets on any MEP.
Ethernet test provides a MEP with the ability to send an in-service on-demand function to test connectivity between two MEPs. The test is generated on the local MEP and the results are verified on the destination MEP. Any ETH-TST packet generated that exceeds the MTU is silently dropped by the lower level processing of the node.
Specific configuration information required by a MEP to support ETH-test is the following:
A MIP is transparent to the frames with ETH-Test information and does not require any configuration information to support ETH-Test functionality.
Both nodes require the eth-test function to be enabled in order to successfully execute the test. Because this is a dual-ended test (initiated on the sender, with results calculated on the receiver), both nodes need to be checked to see the results.
One-way delay measurement provides a MEP with the ability to check unidirectional delay between MEPs. An ETH-1DM packet is timestamped by the generating MEP and sent to the remote node. The remote node timestamps the packet on receipt and generates the results. The results, available from the receiving MEP, indicate the delay and jitter. Jitter, or delay variation, is the difference in delay between tests. This means the delay variation on the first test is not valid. It is important to ensure that the clocks are synchronized on both nodes to ensure the results are accurate. NTP can be used to achieve a level of clock synchronization between the nodes.
Note: Accuracy relies on the node's ability to timestamp the packet in hardware, and on the support of PTP for clock synchronization.
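As an illustration of the one-way calculation, the following Python sketch derives delay and delay variation from pairs of transmit and receive timestamps. The function name and the tuple layout are assumptions made for the example; the sketch also assumes that both clocks are already synchronized, as the text requires.

```python
from typing import List, Optional, Tuple


def one_way_results(
    samples: List[Tuple[float, float]]
) -> List[Tuple[float, Optional[float]]]:
    """Return (delay, delay_variation) for each (tx_timestamp, rx_timestamp).

    Delay variation is the difference in delay between consecutive tests,
    so it is reported as None for the first test.
    """
    results = []
    previous_delay = None
    for tx_ts, rx_ts in samples:
        delay = rx_ts - tx_ts
        if previous_delay is None:
            variation = None  # no earlier test to compare against
        else:
            variation = abs(delay - previous_delay)
        results.append((delay, variation))
        previous_delay = delay
    return results
```

For example, timestamps in microseconds of (1000, 1250), (2000, 2300), and (3000, 3240) give delays of 250, 300, and 240, with variations of None, 50, and 60.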
Two-way delay measurement is similar to one-way delay measurement except it measures the round trip delay from the generating MEP. In this case, clock synchronization issues do not influence the round-trip test results because four timestamps are used. This allows the time it takes for the remote node to process the frame to be removed from the calculation, and as a result, clock variances are not included in the results. The same considerations for the first test and hardware-based timestamping stated for one-way delay measurement apply to two-way delay measurement.
Delay can be measured using one-way and two-way on-demand functions. The two-way test results are available single-ended: the test is initiated, and the calculation and results are viewed, on the same node. There is no specific configuration under the MEP on the SAP in order to enable this function. An example of an on-demand test and its results is shown below. The latest test result is stored for viewing. Further tests overwrite the previous results. Delay Variation is only valid if more than one test has been executed.
Note: Release 9.0 R1 uses pre-standard OpCodes and does not interoperate with any other release or future release.
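The four-timestamp calculation can be sketched in Python as follows. The labels t1 to t4 are names chosen for this example: t1 and t4 are taken from the initiating node's clock, and t2 and t3 from the remote node's clock.

```python
def two_way_delay(t1: float, t2: float, t3: float, t4: float) -> float:
    """Round-trip delay with the remote processing time removed.

    t1: request transmitted by the initiating MEP (initiator clock)
    t2: request received by the remote MEP (remote clock)
    t3: reply transmitted by the remote MEP (remote clock)
    t4: reply received by the initiating MEP (initiator clock)
    """
    # Each difference uses timestamps from a single clock, so an offset
    # between the two clocks cancels out.
    return (t4 - t1) - (t3 - t2)
```

Because each subtraction stays within one clock domain, a remote clock that is offset by several seconds yields the same result as a perfectly synchronized one.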
This synthetic loss measurement approach is a single-ended feature that allows the operator to run on-demand and proactive tests to determine “in”, “out” loss and “unacknowledged” packets. This approach can be used between peer MEPs in both point to point and multipoint services. Only remote MEP peers within the association and matching the unicast destination respond to the SLM packet.
The specification uses various sequence numbers in order to determine in which direction the loss occurred. Nokia has implemented the required counters to determine loss in each direction. In order to properly use the information that is gathered the following terms are defined:
The per probe specific loss indicators are available when looking at the on-demand test runs, or the individual probe information stored in the MIB. When tests are scheduled by Service Assurance Application (SAA) the per probe data is summarized and per probe information is not maintained. Any “unacknowledged” packets are recorded as “in-loss” when summarized.
The on-demand function can be executed from CLI or SNMP. The on-demand tests are meant to provide the carrier a means to perform on-the-spot testing. However, this approach is not meant as a method for storing archived data for later processing. The probe count for on-demand SLM has a range of one to 100 with configurable probe spacing between one second and ten seconds. This means it is possible that a single test run can be up to 1000 seconds in length. Although possible, it is more likely that the majority of on-demand cases are run with 100 probes or fewer at a one-second interval. A node may only initiate and maintain a single active on-demand SLM test at any given time. A maximum of one storage entry per remote MEP is maintained in the results table. Subsequent runs to the same peer overwrite the results for that peer. This means that when using on-demand testing, the test should be run and the results checked prior to starting another test.
The proactive measurement functions are linked to SAA. This backend provides the scheduling, storage and summarization capabilities. Scheduling may be either continuous or periodic. It also allows for the interpretation and representation of data that may enhance the specification. As an example, an optional TLV has been included to allow for the measurement of both loss and delay/jitter with a single test. The implementation does not cause any interoperability issues because the optional TLV is ignored by equipment that does not support it. In mixed-vendor environments, loss measurement continues to be tracked, but delay and jitter report round-trip times only. It is important to point out that the round-trip times in this mixed-vendor environment include the remote node's processing time because only two timestamps are included in the packet. In an environment where both nodes support the optional TLV to include timestamps, unidirectional and round-trip times are reported. Since all four timestamps are included in the packet, the round-trip time in this case does not include remote node processing time. Operators that wish to run delay measurement and loss measurement at different frequencies are free to run both ETH-SL and ETH-DM functions. ETH-SL is not replacing ETH-DM. Service Assurance is only briefly discussed here to provide some background on the basic functionality.
The ETH-SL packet format contains a test-id that is internally generated and not configurable. The test-id is visible for the on-demand test in the display summary. It is possible that a remote node processing the SLM frames receives overlapping test-ids as a result of multiple MEPs measuring loss between the same remote MEP. For this reason, the uniqueness of the test is based on remote MEP-ID, test-id and source MAC of the packet.
ETH-SL is applicable to up and down MEPs and, as per the recommendation, is transparent to MIPs. There is no coordination between various fault conditions that could impact loss measurement. This is also true for conditions where MEPs are placed in shutdown state as a result of linkage to a redundancy scheme like MC-LAG. Loss measurement is based on the ETH-SL and not coordinated across different functional aspects on the network element. ETH-SL is supported on service based MEPs.
It is possible that two MEPs may be configured with the same MAC on different remote nodes. This causes various issues in the FDB for multipoint services and is considered a misconfiguration for most services. It is possible to have a valid configuration where multiple MEPs on the same remote node have the same MAC. In fact, this is somewhat likely. Only the first responder is used to measure packet loss. The second responder is dropped. This is acceptable because the same MAC for multiple MEPs is only truly valid on the same remote node.
There is no way for the responding node to understand when a test is completed. For this reason, a configurable inactivity-timer determines the length of time a test is valid. The timer maintains an active test as long as it is receiving packets for that specific test, defined by the test-id, remote MEP ID and source MAC. When there is a gap between the packets that exceeds the inactivity timer value, the responding node releases the index in the table and responds with a sequence number of 1, regardless of the sequence number sent by the instantiating node. Expiration of this timer causes the reflecting peer to expire the previous test. Packets that follow the expiration of a test are viewed as a new test. The default for the inactivity-timer is 100 seconds, and it has a range of ten to 100 seconds.
Only the configuration is supported by HA. There is no synchronization of data between active and standby. Any unwritten or active tests are lost during a switchover, and the data is not recoverable.
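The responder-side behavior can be sketched in Python as shown here. This is a simplified model with hypothetical class and method names: a real node would release the table index when the timer fires, whereas this sketch only checks the gap when the next packet arrives.

```python
class SlmResponder:
    """Hypothetical model of a reflecting peer's SLM test table."""

    def __init__(self, inactivity_timer: float = 100.0) -> None:
        if not 10.0 <= inactivity_timer <= 100.0:
            raise ValueError("inactivity-timer range is 10 to 100 seconds")
        self.inactivity_timer = inactivity_timer
        # (test_id, remote_mep_id, src_mac) -> (last_rx_time, sequence)
        self.tests = {}

    def on_slm(self, now: float, test_id: int, remote_mep_id: int,
               src_mac: str) -> int:
        """Process one received SLM and return the sequence number to use."""
        key = (test_id, remote_mep_id, src_mac)
        entry = self.tests.get(key)
        if entry is None or now - entry[0] > self.inactivity_timer:
            # Unknown or expired test: treat the packet as a new test.
            sequence = 1
        else:
            sequence = entry[1] + 1
        self.tests[key] = (now, sequence)
        return sequence
```

A gap longer than the timer restarts the responder's sequence at 1, while a packet with the same test-id but a different source MAC is tracked as a separate test.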
ETH-SL provides a mechanism for operators to pro-actively trend packet loss.
The Ethernet Frame Loss Measurement allows the collection of frame counters in order to determine the unidirectional frame loss between point-to-point ETH-CFM MEP peers. This measurement does not count its own PDU in order to determine frame loss. The ETH-LMM protocol PDU includes four counters which represent the data sent and received in each direction: Transmit Forward (TxFCf), Receive Forward (RxFCf), Transmit Backward (TxFCb) and the Receive Backward (RxFCb).
The ETH-LMM protocol is designed specifically for point-to-point connections. It is impossible for the protocol to accurately report loss if the point-to-point relationship is broken; for example, if a SAP or MPLS binding receives data from multiple peers, as can be the case in VPLS deployments, this protocol would not be a reliable indicator of frame loss.
The loss differential between transmit and receive is determined the first time an LMM PDU is sent. Each subsequent PDU for a specific test performs a computation of differential loss from that epoch. Each processing cycle for an LMR PDU determines if there is a new maximum or minimum loss window, adds any new loss to the frame loss ratio computation, and updates the four raw transmit and receive counters. The individual probe results are not maintained; these results are only used to determine a new minimum or maximum. A running total of all transmit and receive values is used to determine the average Frame Loss Ratio (FLR) at the completion of the measurement interval. The data set includes the protocol information in the opening header, followed by the frame counts in each direction, and finally the FLR percentages.
The user must understand the caveats of service before selecting this method of loss measurement. Statistics are maintained per forwarding complex. Multiple path environments may spread frames between the same two peers across different forwarding complexes (for example, link aggregation groups). The ETH-LMM protocol has no method to rationalize different transmit and receive statistics when there are complex changes or when any statistics are cleared on either of the peer entities. The protocol resynchronizes but the data collected for that measurement interval is invalid. The protocol has no method to determine if the loss is true loss or whether some type of complex switch has occurred or statistics were cleared. Consequently, the protocol cannot use any suspect flag to mark the data as invalid. Higher level systems must coordinate network events and administrative actions that can cause the counters to become non-representative of the service data loss.
Packet reordering also affects frame loss and gain reporting. If there is queuing contention on the local node or if path differences in the network cause interleaved or delayed frames, the counter stamped into the LMM PDU can introduce frame gain or loss in either direction. For example, if the LMM PDU is stamped with the TxFCf counter and the LMM PDU traffic is interleaved, the interleaving cannot be accounted for in the counter and a potential gain is realized in the forward direction. This is because the original counter included as the TxFCf value does not include the interleaved packets and the RxFCf counter on the remote peer includes them. Gains and losses even out over the life of the measurement interval. Absolute values are used for any negative values, per interval or at the end of the measurement interval.
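The counter arithmetic described above can be sketched in Python as follows. This is a simplified illustration using the four counter names from the text; the data class, function names, and the pairing of counters per direction are assumptions made for the example, and counter wrap, forwarding complex changes, and cleared statistics are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LmCounters:
    """One snapshot of the four counters carried for a test."""
    tx_fcf: int  # Transmit Forward
    rx_fcf: int  # Receive Forward
    tx_fcb: int  # Transmit Backward
    rx_fcb: int  # Receive Backward


def interval_loss(prev: LmCounters, curr: LmCounters) -> dict:
    """Return (frames_sent, frames_lost) per direction between snapshots."""
    fwd_tx = curr.tx_fcf - prev.tx_fcf
    fwd_rx = curr.rx_fcf - prev.rx_fcf
    bwd_tx = curr.tx_fcb - prev.tx_fcb
    bwd_rx = curr.rx_fcb - prev.rx_fcb
    # Absolute values are used so that an apparent gain is not negative.
    return {
        "forward": (fwd_tx, abs(fwd_tx - fwd_rx)),
        "backward": (bwd_tx, abs(bwd_tx - bwd_rx)),
    }


def frame_loss_ratio(frames_sent: int, frames_lost: int) -> float:
    """Frame Loss Ratio as a percentage; 0.0 when nothing was sent."""
    if frames_sent == 0:
        return 0.0
    return 100.0 * frames_lost / frames_sent
```

For instance, if 1000 frames are transmitted in the forward direction between two snapshots and 990 are received, the sketch reports 10 lost frames and an FLR of 1.0 percent.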
Launching a single-ended test is under the control of the OAM Performance Monitoring (OAM-PM) architecture, and the test adheres to the rules of OAM-PM. The ETH-LMM functionality is only available under the OAM-PM configuration. This feature is not available through interactive CLI or SAA. OAM-PM requires the configuration of a test ID for all OAM-PM tests. The ETH-LMM protocol does not define the necessity for this ID, nor does it carry the 4-byte test ID in the packet. This is for local significance and uniformity with other protocols under the control of the OAM-PM architecture.
Support is included for point-to-point Up and Down Service MEPs and Down Facility MEPs (port, LAG, and base router interfaces). Base router interface accuracy may be affected by the Layer 2 or Layer 3 inter-working functions, routing protocol, ACLs, QoS policies, and other Layer 3 functions that were never meant to be accounted for by an Ethernet frame loss measurement tool. Launch functions require IOM/IMM or later, as well as a SF/CPM3 or later.
Resource contention extends beyond the sharing of common LMM resources used for packet counting and extraction. There is also protocol-level contention. For example, Cflowd cannot be counted or sampled on an entity that is collecting LMM statistics. Collection of statistics per Ethernet SAP, per MPLS SDP binding, or per facility is not enabled by default.
ETH-LMM is not supported in the following models:
QinQ tunnel collection will be the aggregate of all outer VLANs that share the VLAN with the tunnel. If the QinQ is configured to collect LMM statistics, then any service MEP that shares the same VLAN as the QinQ tunnel will be blocked from configuring the respective collect-lmm-stats command. The reverse is also true; if a fully qualified SAP is configured to collect LMM statistics, the QinQ tunnel that shares the outer VLAN will be blocked from configuring the respective collect-lmm-stats command.
QoS models contribute significantly to the accuracy of the LMM counters. If the QoS function is beyond the LMM counting function, it can lead to mismatches in the counter and transmit and receive information.
A single LMM counter per SAP or per MPLS SDP binding or per facility counter is the most common option for deployment of the LMM frame-based counting model. This single counter model requires careful consideration for the counter location. Counter integrity is lost when counting incurs entity conflicts, as is typical in facility MEP and service MEP overlap. The operator must choose one type of facility MEP or the service MEP. If a facility MEP is chosen (Port, LAG, QinQ Tunnel or Base Router Interface) care must be taken to ensure the highest configured MEP performs the loss collection routine.
Configuring loss collection on a lower level MEP will lead to additive gain introduced in both directions. Although the collection statement is not blocked by CLI or SNMP when there are potential conflicts, only one can produce accurate results. The operator must be aware of lower level resource conflicts. For example, a null based service SAP, any default SAP context or SAP that covers the entire port or facility resource, such as sap 1/1/1, will always count the frame-based loss counter against the SAP and never the port, regardless of the presence of a MEP or the collect-lmm-stats configuration on the SAP. Resource contention extends beyond the sharing of common resources used for packet counting and extraction.
In order for this feature to function with accurate measurements, the collect-lmm-stats is required under the ETH-CFM context for the Ethernet SAP or MPLS SDP binding or under the MEP in the case of the facility MEP. If this command is not enabled on the launch and reflector, the data in the ETH-LMM and ETH-LMR PDU will not be representative and the data captured will be invalid.
The show>service>sdp-using eth-cfm and show>service>sap-using eth-cfm commands have been expanded to include the collect-lmm-stats option for service based MEPs. The show>eth-cfm>cfm-stack-table facility command has been expanded to include collect-lmm-stats to view all facility MEPs. Using these commands with this new option displays the entities that are currently collecting LMM counters.
The counter will include all frames that are transmitted or received regardless of class of service or discard eligibility markings. Locally transmitted and locally terminated ETH-CFM frames on the peer collecting the statistics will not be included in the counter. However, there are deployment models that will introduce artificial frame loss or gain when the ETH-CFM launch node and the terminating node for some ETH-CFM packets are not the same peers. Figure 53 demonstrates this issue.
Frame loss measurement can be deployed per forwarding class (FC) counter. The config>oam-pm>session>ethernet>lmm>enable-fc-collection command in the related oam-pm session enables frames to be counted on an FC basis, either in or out of profile. This counting method alleviates some of the ordering and interleaving issues that arise when using a single counter, but does not improve on the base protocol concerns derived from multiple paths and complex based counting.
This approach requires the operator to configure the individual FCs of interest and the profile status of the frames under the collect-lmm-fc-stats context. The command allows for the addition or removal of an individual FC by using a differential. The entire command with the desired FC statements must be included. The system will determine the new, deleted, and unchanged FCs. New FCs will be allocated a counter. Deleted FCs will stop counting. Unchanged FCs will continue counting.
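The differential that the system derives from a re-entered command can be illustrated with a short Python sketch. The function name is hypothetical, and the FC names in the usage note are sample values only.

```python
from typing import Dict, Iterable, Set


def fc_differential(current: Iterable[str],
                    desired: Iterable[str]) -> Dict[str, Set[str]]:
    """Compare the configured FC set with the newly entered FC set."""
    current_set, desired_set = set(current), set(desired)
    return {
        "new": desired_set - current_set,        # allocated a counter
        "deleted": current_set - desired_set,    # stop counting
        "unchanged": current_set & desired_set,  # continue counting
    }
```

If the entity currently counts be and ef, and the command is re-entered with ef and af, the sketch classifies af as new, be as deleted, and ef as unchanged.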
Support for per-FC collection includes SAPs, MPLS SDP bindings, and router interfaces.
The enable-fc-collection command must be coordinated between the ETH-LMM test and counting model in order to configure either single per SAP or MPLS SDP binding counter, or per FC counter. The command is disabled by default, and single per SAP or MPLS SDP binding counter is used.
Symmetrical QoS is required for proper collection of frame counters. The FC must match the priority of the OAM-PM ETH-LMM test. The ETH-LM PDUs must ensure that they are mapped to the proper FC on ingress and egress so that the appropriate counters are collected. Mismatches between the ETH-LMM PDUs and the collected FC will cause incorrect or no data to be reported.
The show>eth-cfm>collect-lmm-fc-stats command will display the SAPs, MPLS SDP bindings, and router interfaces that are configured for per-FC collection, and whether the collection is priority aware or unaware. It also includes the base mapping of OAM-PM ETH-LMM priority to FC.
Entities that support LMM collection may only use one of the following collection models:
The collect-lmm-stats and collect-lmm-fc-stats commands are mutually exclusive.
OAM-PM will reject ETH-LMM test configurations from same source MEPs that have different enable-fc-collection configurations.
Ensure that the LMM collection model that is configured on the entity (collect-lmm-stats or collect-lmm-fc-stats) matches the configuration of the enable-fc-collection command within the OAM-PM session, and that the priority of the test maps to the required FC.
ETH-CFM relies on Ethernet addressing and reachability. ETH-CFM destination addressing may be derived from the Ethernet encapsulation, or may be a target address within the ETH-CFM PDU. Addressing is the key to identifying both the source and the destination management points (MPs).
The SR OS implementation dynamically assigns the MP MAC address using the appropriate pool of available hardware addresses on the network element, which simplifies the configuration and maintenance of the MP. The MP MAC address is tied to the specific hardware element, and its addressing can change when the associated hardware is changed.
The optional mac-address mac-address configuration command can be used to eliminate the dynamic nature of the MEP MAC addressing. This optional configuration associates a configured MAC address with the MEP in place of dynamic hardware addressing. The optional mac-address configuration is not supported for all service types.
ETH-CFM tests can adapt to changing destination MAC addressing by using the remote-mepid mep-id command in place of the unicast statically-configured MAC address. SR OS maintains a learned remote MAC table (visible by using the show>eth-cfm>learned-remote-mac command) for all MEPs that are configured to use ETH-CC messaging. Usually, when the remote-mepid mep-id command is used as part of a supported test function, the test will search the learned remote MAC table for a unicast address that associates the local MEP and the requested remote MEP ID. If a unicast destination address is found for that relationship, it will be used as the unicast destination MAC address.
The learned remote MAC table is updated and maintained by the ETH-CC messaging process. Once an address is learned and recorded in the table, it is maintained even if the remote peer times out or the local MEP is shut down. The address will not be maintained in the table if the remote-mepid statement is removed from the associated context by using the no remote-mepid mep-id command for a peer. The CCM database will clear the peer MAC address and enter an all-0 MAC address for the entry when the peer times out. The learned remote MAC table will maintain the previously learned peer MAC address. If an entry must be deleted from the learned remote MAC table, the clear>learned-remote-mac [mep mep-id [remote-mepid mep-id]] domain md-index association ma-index command can be used. Deleting a local MEP will remove the local MEP and all remote peer relationships, including the addresses previously stored in the learned remote MAC table.
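The different retention behavior of the CCM database and the learned remote MAC table can be sketched as a small Python model. The class, method names, and dictionary layout are hypothetical; only the behaviors stated above are modeled.

```python
from typing import Optional

ALL_ZERO_MAC = "00:00:00:00:00:00"


class RemoteMacTables:
    """Hypothetical model of the CCM database and learned remote MAC table."""

    def __init__(self) -> None:
        self.ccm_db = {}   # (local_mep, remote_mep) -> MAC
        self.learned = {}  # (local_mep, remote_mep) -> MAC

    def on_ccm(self, local_mep: int, remote_mep: int, src_mac: str) -> None:
        """ETH-CC messaging updates both tables."""
        self.ccm_db[(local_mep, remote_mep)] = src_mac
        self.learned[(local_mep, remote_mep)] = src_mac

    def on_peer_timeout(self, local_mep: int, remote_mep: int) -> None:
        """The CCM database is zeroed; the learned entry is kept."""
        self.ccm_db[(local_mep, remote_mep)] = ALL_ZERO_MAC

    def resolve(self, local_mep: int, remote_mep: int) -> Optional[str]:
        """Look up the unicast destination for a remote-mepid based test."""
        return self.learned.get((local_mep, remote_mep))

    def delete_local_mep(self, local_mep: int) -> None:
        """Deleting a local MEP removes all of its peer relationships."""
        for table in (self.ccm_db, self.learned):
            for key in [k for k in table if k[0] == local_mep]:
                del table[key]
```

After a peer times out, a lookup by remote MEP ID in this model still returns the previously learned address, while the CCM database shows the all-0 MAC address.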
The individual ETH-CFM test scheduling functions that use the remote-mepid mep-id option have slightly different operational behaviors.
Global interactive CFM tests support the remote-mepid mep-id option as an alternative to mac-address. A test will only start if a learned remote MAC table contains a unicast MAC address for the remote peer, and will run to completion with that MAC address. If the table does not contain the required unicast entry associated with the specified remote MEP ID, the test will fail to start.
SAA ETH-CFM test types support the remote-mepid mep-id option as an alternative to mac-address. If, at the scheduled start of the individual run, the learned remote MAC table contains a unicast learned remote MAC address for the remote peer, the test will run to completion with the initial MAC address. If the table does not contain the required entry, the test will terminate after the lesser window of either the full test run or 300 s. A run that cannot successfully determine a unicast MAC address will designate the last test result as “failed”. If a test is configured with the continuous configuration option, it will be rescheduled; otherwise, the test will not be rescheduled.
OAM-PM Ethernet test families, specifically DMM, SLM, and LMM, support the remote-mepid mep-id option as an alternative to the dest-mac ieee-address configuration. If the learned remote MAC table contains a unicast learned remote MAC address for the remote peer, the test will use this MAC address as the destination. OAM-PM will adapt to changes for MAC addressing during the measurement interval when the remote-mepid mep-id option is configured. It should be expected that the measurement interval will include update-induced PM errors during the transition. If the table does not contain the required entry, the test will not attempt to transmit test PDUs, and will present the “Dest Remote MEP Unknown” detectable transmission error.
The Ethernet Bandwidth Notification (ETH-BN) function is used by a server MEP to signal changes in link bandwidth to a client MEP.
This functionality is for point-to-point microwave radios to modify the downstream traffic rate toward the microwave radio to match its microwave link rate. When a microwave radio uses adaptive modulation, the capacity of the radio can change based on the condition of the microwave link. For example, in adverse weather conditions that cause link degradation, the radio can change its modulation scheme to a more robust one (which will reduce the link bandwidth) to continue transmitting. This change in bandwidth is communicated from the server MEP on the radio, using ETH-BNM (Ethernet Bandwidth Notification Message), to the client MEP on the connected router. The server MEP transmits periodic frames with ETH-BN information including the interval, the nominal, and currently available bandwidth. A port MEP with the ETH-BN feature enabled will process the information contained in the CFM PDU and the associated port egress rate can be modified appropriately to adjust the rate of traffic sent to the radio.
A port MEP, that is not a LAG member port, supports the client side reception and processing of the ETH-BN CFM PDU sent by the server MEP. By default, processing is disabled. The config>port>ethernet>eth-cfm>mep>eth-bn>no receive CLI command sets the ETH-BN processing state on the port MEP. A port MEP supports untagged packet processing of ETH-CFM PDUs at domain levels zero (0) and one (1) only. The port client MEP sends the ETH-BN rate information received to be applied to the port egress rate in a QoS update. A pacing mechanism limits the number of QoS updates sent. The config>port>ethernet>eth-cfm>mep>eth-bn>rx-update-pacing CLI command allows the updates to be paced using a configurable range of one (1) to 600 seconds (the default is five seconds). The pacing timer begins to countdown following the most recent QoS update sent to the system for processing. When the timer expires, the most recent update that arrived from the server MEP is compared to the most recent value sent for system processing. If the value of the current bandwidth is different than the previously processed value, the update is sent and the process begins again. Updates with a different current bandwidth that arrive when the pacing timer has already expired are not be subject to a timer delay. Refer to the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide for more information on these commands.
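The pacing behavior can be sketched as a small Python model. The class and method names are hypothetical, and the caller is assumed to drive time by calling tick; the sketch only models the comparison and timer rules described above, not the QoS update itself.

```python
class BnUpdatePacer:
    """Hypothetical model of rx-update-pacing for ETH-BN QoS updates."""

    def __init__(self, pacing_seconds: float = 5.0) -> None:
        if not 1 <= pacing_seconds <= 600:
            raise ValueError("pacing range is 1 to 600 seconds")
        self.pacing = pacing_seconds
        self.last_sent_bw = None  # most recent value sent for processing
        self.latest_rx_bw = None  # most recent value from the server MEP
        self.timer_expiry = None  # None means the timer is not running
        self.sent = []            # (time, bandwidth) of each QoS update

    def _send(self, now: float, bandwidth: int) -> None:
        self.last_sent_bw = bandwidth
        self.timer_expiry = now + self.pacing  # timer restarts on each send
        self.sent.append((now, bandwidth))

    def tick(self, now: float) -> None:
        """Handle pacing timer expiry, if it has occurred by 'now'."""
        if self.timer_expiry is not None and now >= self.timer_expiry:
            self.timer_expiry = None
            if self.latest_rx_bw != self.last_sent_bw:
                self._send(now, self.latest_rx_bw)

    def on_bnm(self, now: float, current_bw: int) -> None:
        """Process the current bandwidth from a received ETH-BNM."""
        self.tick(now)
        self.latest_rx_bw = current_bw
        # With no timer running, a changed value is sent without delay.
        if self.timer_expiry is None and current_bw != self.last_sent_bw:
            self._send(now, current_bw)
```

In this model, a burst of changes inside one pacing window results in a single QoS update carrying the latest value when the timer expires.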
A complementary QoS configuration is required to allow the system to process nominal bandwidth updates from the CFM engine. The config>port>ethernet>no eth-bn-egress-rate-changes CLI command is required to enable the QoS function to update the port egress rates based on the current available bandwidth updates from the CFM engine. By default, the function is disabled.
Both the CFM and the QoS functions must be enabled for the changes in current bandwidth to dynamically update the egress rate.
When the MEP enters a state that prevents it from receiving the ETH-BNM, the current bandwidth last sent for processing is cleared and the egress rate reverts to the configured rate. Under these conditions, the last update cannot be guaranteed as current. Explicit notification is required to dynamically update the port egress rate. The following types of conditions lead to ambiguity:
If the eth-bn-egress-rate-changes is disabled using the no option, CFM continues to send updates, but the updates are held without affecting the port egress rate.
The ports supporting ETH-BN MEPs can be configured for network, access, or hybrid modes. When ETH-BN is enabled on a port MEP and the config>port>ethernet>eth-cfm>mep>eth-bn>receive and the QoS config>port>ethernet>eth-bn-egress-rate-changes contexts are configured, the egress rate is dynamically changed based on the current available bandwidth indicated by the ETH-BN server.
The port egress rate is capped by the lesser of the configured egress-rate and the maximum port rate. The minimum egress rate is one kbyte. If a current bandwidth of zero is received, it does not affect the egress port rate and the previously processed current bandwidth continues to be used.
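As a minimal sketch of these capping rules, the function below computes the rate that would be applied. The function name, parameter names, and the assumption that all values share one unit are hypothetical; unit conversion between the BNM fields and the port rate is deliberately left out.

```python
from typing import Optional


def effective_egress_rate(
    current_bw: int,
    previous_bw: Optional[int],
    configured_rate: Optional[int],
    max_port_rate: int,
    min_rate: int = 1,
) -> int:
    """Return the egress rate to apply (all values assumed to share one unit)."""
    # The cap is the lesser of the configured egress-rate and the port maximum.
    if configured_rate is None:
        cap = max_port_rate
    else:
        cap = min(configured_rate, max_port_rate)

    # A current bandwidth of zero is ignored; the previous value stays in use.
    bw = current_bw if current_bw > 0 else previous_bw
    if bw is None:
        # Nothing processed yet: fall back to the configured (capped) rate.
        return cap

    # Never exceed the cap and never drop below the minimum rate.
    return max(min_rate, min(bw, cap))
```

For example, with a configured rate of 1,000,000 and a port maximum of 10,000,000, a reported bandwidth of 400,000 is applied as is, while a reported bandwidth of 2,000,000 is capped at 1,000,000.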
The client MEP requires explicit notification of changes to update the port egress rate. The system does not time out previously processed current bandwidth rates. The specification does allow a timeout of the current bandwidth if a frame has not been received in 3.5 times the ETH-BNM interval. However, this implicit approach can lead to misrepresented conditions and has not been implemented.
When starting or restarting the system, the configured egress rate is used until an ETH-BNM arrives on the port with a new bandwidth request from the ETH-BN server MEP.
An event log is generated each time the egress rate is changed based on reception of a BNM. If a BNM is received that does not result in a bandwidth change, no event log is generated.
The destination MAC address can be a Class 1 multicast MAC address (that is, 01-80-C2-00-00-3x) or the MAC address of the port MEP configured. Standard CFM validation and identification must be successful to process any CFM PDU.
For information on the eth-bn-egress-rate-changes command, refer to the 7450 ESS, 7750 SR, 7950 XRS, and VSR Interface Configuration Guide.
The PDU used for ETH-BN information is called the Bandwidth Notification Message (BNM). It is a sub-OpCode within the Ethernet Generic Notification Message (ETH-GNM).
Table 13 shows the BNM PDU format fields.
Label | Description |
MEG Level | Carries the MEG level of the client MEP (0 to 7). This field must be set to either 0 or 1 to be recognized as a port MEP. |
Version | The current version is 0. |
OpCode | The value for this PDU type is GNM (32). |
Flags | Contains one information element: Period (3 bits), which indicates how often ETH-BN messages are transmitted by the server MEP. Valid values are: |
TLV Offset | This value is set to 13. |
Sub-OpCode | The value for this PDU type is BNM (1). |
Nominal Bandwidth | The nominal full bandwidth of the link, in Mbytes/s. This information is reported in the display but not used to influence QoS egress rates. |
Current Bandwidth | The current bandwidth of the link in Mbytes/s. The value is used to influence the egress rate. |
Port ID | A non-zero unique identifier for the port associated with the ETH-BN information, or zero if not used. This information is reported in the display, but is not used to influence QoS egress rates. |
End TLV | An all zeros octet value. |
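To make the field layout in the table concrete, the following sketch parses a BNM PDU. The table does not give field widths or bit positions, so these are assumptions: the MEG level is taken from the top three bits of the first octet and the version from the low five bits, the Period is taken from the low three bits of Flags, and the two bandwidth fields and the Port ID are treated as 4-octet unsigned integers (which is consistent with a TLV Offset of 13). The names `Bnm` and `parse_bnm` are hypothetical.

```python
import struct
from dataclasses import dataclass

GNM_OPCODE = 32       # OpCode value from the table
BNM_SUB_OPCODE = 1    # Sub-OpCode value from the table
BNM_TLV_OFFSET = 13   # TLV Offset value from the table
BNM_MIN_LEN = 18      # 4-octet header + 13 octets + End TLV (assumed layout)


@dataclass
class Bnm:
    meg_level: int
    version: int
    period: int
    nominal_bw: int
    current_bw: int
    port_id: int


def parse_bnm(pdu: bytes) -> Bnm:
    """Parse a BNM PDU under the assumed layout described above."""
    if len(pdu) < BNM_MIN_LEN:
        raise ValueError("PDU too short for a BNM")
    first, opcode, flags, tlv_offset, sub_opcode = struct.unpack_from("!5B", pdu, 0)
    if opcode != GNM_OPCODE or sub_opcode != BNM_SUB_OPCODE:
        raise ValueError("not a BNM (unexpected OpCode or Sub-OpCode)")
    if tlv_offset != BNM_TLV_OFFSET:
        raise ValueError("unexpected TLV Offset")
    nominal_bw, current_bw, port_id = struct.unpack_from("!3I", pdu, 5)
    # The first TLV starts tlv_offset octets after the TLV Offset field.
    if pdu[4 + tlv_offset] != 0:
        raise ValueError("End TLV is not an all-zeros octet")
    return Bnm(
        meg_level=first >> 5,        # assumed: top 3 bits
        version=first & 0x1F,        # assumed: low 5 bits
        period=flags & 0x07,         # assumed: low 3 bits of Flags
        nominal_bw=nominal_bw,
        current_bw=current_bw,
        port_id=port_id,
    )
```

A receiver modeled this way would use only `current_bw` to influence the egress rate, and would accept the PDU as port MEP traffic only when `meg_level` is 0 or 1.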
The show eth-cfm mep eth-bandwidth-notification display output includes the ETH-BN values received and extracted from the PDU, including a last reported value and the pacing timer. If the n/a value appears in the field, it means that field has not been processed.
The base show eth-cfm mep output is expanded to include the disposition of the ETH-BN receive function and the configured pacing timer.
The show port port-id detail output is expanded to include an Ethernet Bandwidth Notification Message Information section. This section includes the ETH-BN Egress Rate disposition and the current Egress BN rate being used.
A number of statistics are available to view the current overall processing requirements for CFM. Any packet that is counted against the CFM resource is included in the statistics counters. These counters do not include sub-second CCM, ETH-CFM PDUs that are generated by non-ETH-CFM functions (including OAM-PM and SAA), or PDUs that are filtered by an applicable security configuration.
SAA and OAM-PM use standard CFM PDUs. The reception of these packets is included in the receive statistics. However, these two functions are responsible for launching their own test packets and do not consume ETH-CFM transmission resources.
Per-system and per-MEP statistics are available with a per-OpCode breakdown. Use the show>eth-cfm>statistics command to view the statistics at the system level. Use the show>eth-cfm>mep mep-id domain md-index association ma-index statistics command to view the per-MEP statistics. These statistics may be cleared by substituting the clear command for the show command. The clear function only clears the statistics for that function. For example, clearing the system statistics does not clear the individual MEP statistics, because each maintains its own unique counters.
All known OpCodes are listed in transmit and receive columns. Different versions for the same OpCode are not distinguished for this display. This does not imply the network element supports all listed functions in the table. Unknown OpCodes will be dropped.
It is also possible to view the top ten active MEPs on the system. An active MEP is any MEP that is in a no shutdown state. The tools dump eth-cfm top-active-meps command can be used to see the top ten active MEPs on the system. The counts are based on the last time the command was issued with the clear option. MEPs that are in a shutdown state still terminate packets, but they do not appear on the active list.
These statistics help operators determine the busiest active MEPs on the system, as well as a breakdown of per-OpCode processing at the system and MEP level.
The debug infrastructure supports the decoding of both received and transmitted valid ETH-CFM packets for MEPs and MIPs that have been tagged for decoding. The eth-cfm hierarchy has been added to the existing debug CLI command tree. When a MEP or MIP is tagged by the debug process, valid ETH-CFM PDUs will be decoded and presented to the logging infrastructure for operator analysis. Fixed queue limits restrict the overall packet rate for decoding. The receive and transmit ETH-CFM debug queues are serviced independently. Receive and transmit correlation is not guaranteed across the receive and transmit debug queues. The tools dump eth-cfm debug-packet command will display message queue exceptions.
Valid ETH-CFM packets must pass a multiple-phase validity check before being passed to the debug parsing function. The MAC addresses must be non-zero. If the destination MAC address is multicast, the last nibble of the multicast address must match the expected level of a MEP or MIP tagged for decoding. Packet length and TLV formation, usage, and, where applicable, field validation are performed. Finally, the OpCode-specific TLV structural checks are performed against the remainder of the PDU.
An ETH-CFM packet that passes the validation process is passed to the debug decoding process for tagged MEPs or MIPs. The decoding process parses the PDU for analysis. Truncation of individual TLVs will occur when:
The number of printable bytes is dependent on the reason for truncation.
Any standard fields in the PDU that are defined for a certain length with a Must Be Zero (MBZ) attribute in the specification will be decoded based on the specification field length. There is no assumption that packets adhere to the MBZ requirement in the byte field; for example, the MEP-ID is a 2-byte field with three reserved MBZ bits, which translates into a standard MEP-ID range of 0 to 8191. If the MBZ bits are violated, then the 2-byte field will be decoded using all non-zero bits in the 2-byte field.
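The MEP-ID case can be sketched as follows. This is an illustration of the described decode rule only; the function name is hypothetical, and it assumes the three reserved MBZ bits are the most significant bits of the 2-byte field, which is what a standard range of 0 to 8191 implies.

```python
from typing import Tuple

MEP_ID_MASK = 0x1FFF   # 13 usable bits: standard range 0..8191
MBZ_MASK = 0xE000      # assumed: the three most significant bits are MBZ


def decode_mep_id(field: bytes) -> Tuple[int, bool]:
    """Decode a 2-byte MEP-ID field; return (value, mbz_violated)."""
    if len(field) != 2:
        raise ValueError("MEP-ID field must be exactly 2 bytes")
    raw = int.from_bytes(field, "big")
    mbz_violated = bool(raw & MBZ_MASK)
    # No assumption that the sender honored MBZ: all 16 bits are decoded,
    # so a violating value falls outside the standard 0..8191 range.
    return raw, mbz_violated
```

A compliant field such as 0x1FFF decodes to 8191, while a field with a reserved bit set, such as 0x2001, decodes to 8193 and is flagged as violating the MBZ requirement.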
The decoding function is logically positioned between ETH-CFM and the forwarding plane. Any ETH-CFM PDU discarded by an applicable security configuration will not be passed to the debug function. Any packet that is discarded by squelching (using the config>service>sap>eth-cfm>squelch-ingress-levels and config>service>sap>eth-cfm>squelch-ingress-ctag-levels commands) or CPU protection (using the config>service>sap>eth-cfm>cpu-protection eth-cfm-monitoring command), will bypass the decoding function. Care must be taken when interpreting specific ETH-CFM PDU decodes. Those PDUs that have additional, subsequent, or augmented information applied by the forwarding mechanisms may not be part of the decoded packet. Augmentation includes the timestamp (the stamping of hardware based counters [LMM]) applied to ETH-CFM PDUs by the forwarding plane.
This function allows for enhanced troubleshooting of ETH-CFM PDUs to and from tagged MEPs and MIPs. Only defined and node-supported functionality is decoded, possibly with truncation. Unsupported or unknown functionality on the node is treated on a best-effort basis, typically with a decode that produces a truncated number of hex bytes.
This functionality does not support decoding of sub-second CCM, or any ETH-CFM PDUs that are processed by non-ETH-CFM entities (which includes SAA CFM transmit functions), or MIPs created using the default-domain table.
UP MEPs and Down MEPs have been aligned to better emulate service data. When an UP MEP or DOWN MEP is the source of the ETH-CFM PDU the priority value configured, as part of the configuration of the MEP or specific test, will be treated as the Forwarding Class (FC) by the egress QoS policy. The numerical ETH-CFM priority value resolves FCs using the following mapping:
If there is no egress QoS policy, the priority value will be mapped to the CoS values in the frame. An ETH-CFM frame utilizing VLAN tags will have the DEI bit mark the frame as “discard ineligible”. However, egress QoS Policy may overwrite this original value. The Service Assurance Agent (SAA) uses [fc {fc-name} [profile {in | out}]] to accomplish similar functionality.
UP MEPs and DOWN MEPs terminating an ETH-CFM PDU will use the received FC as the return priority for the appropriate response, again feeding into the egress QoS policy as the FC.
This does not include the Ethernet Linktrace Response (ETH-LTR). The specification requires that the highest priority on the bridge port be used in response to an Ethernet Linktrace Message (ETH-LTM). This provides the highest possible chance of the response returning to the source. Operators may configure the linktrace response priority of the MEP using the ccm-ltm-priority command. MIPs inherit the MEP's priority unless mip-ltr-priority is configured under the bridging instance for the association (config>eth-cfm>domain>assoc>bridge).
OAM mapping is a mechanism that enables a way of deploying OAM end-to-end in a network where different OAM tools are used in different segments. For instance, an Epipe service could span across the network using Ethernet access (CFM used for OAM), pseudowire (T-LDP status signaling used for OAM), and Ethernet access (E-LMI used for OAM). Another example allows an Ipipe service, where one end is Ethernet and the other end is Frame Relay, ATM, PPP, MLPPP, or HDLC.
In the SR OS implementation, the Service Manager (SMGR) is used as the central point of OAM mapping. It receives and processes the events from different OAM components, then decides the actions to take, including triggering OAM events to remote peers.
Fault propagation for CFM is by default disabled at the MEP level to maintain backward compatibility. When required, it can be explicitly enabled by configuration.
Fault propagation for a MEP can only be enabled when the MA is comprised of no more than two MEPs (point-to-point).
Fault propagation cannot be enabled for eth-tun control MEPs (MEPs configured under the eth-tun primary and protection paths). However, failure of the eth-tun (meaning both paths fail) is propagated by SMGR because all the SAPs on the eth-tun go down.
CFM MEP declares a connectivity fault when its defect flag is equal to or higher than its configured lowest defect priority. The defect can be any of the following depending on configuration:
The following additional fault condition applies to Y.1731 MEPs:
Setting the lowest defect priority to allDef may cause problems when fault propagation is enabled in the MEP. In this scenario, when MEP A sends a CCM to MEP B with interface status down, MEP B responds with a CCM with RDI set. If MEP A is configured to accept RDI as a fault, it enters a deadlock state in which both MEPs declare a fault and can never recover. The default lowest defect priority is DefMACstatus. In general terms, when a MEP propagates a fault to a peer, the peer receiving the fault must not reciprocate with a fault condition equal to or higher than the originating MEP's low-priority-defect setting. It is also very important that different Ethernet OAM strategies do not overlap each other's span. In some cases, independent functions attempting to perform their normal fault handling can negatively impact each other. This interaction can lead to fault propagation in the direction toward the original fault, a false positive, or worse, a deadlock condition that may require the operator to modify the configuration to escape the condition. For example, overlapping Link Loss Forwarding (LLF) and ETH-CFM fault propagation could cause these issues.
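The deadlock described above can be illustrated with a short sketch of the fault-declaration rule. The numeric defect priorities used here are the ones defined in IEEE 802.1ag and are not stated in this section, so treat them as an assumption; the names `Defect`, `ALL_DEF`, `DEFAULT_LOWEST`, and `declares_fault` are hypothetical.

```python
from enum import IntEnum


class Defect(IntEnum):
    """Defect priorities, assumed to follow IEEE 802.1ag (1 is lowest)."""
    NONE = 0
    RDI_CCM = 1       # DefRDICCM
    MAC_STATUS = 2    # DefMACstatus
    REMOTE_CCM = 3    # DefRemoteCCM
    ERROR_CCM = 4     # DefErrorCCM
    XCON_CCM = 5      # DefXconCCM


ALL_DEF = Defect.RDI_CCM            # allDef: every defect counts as a fault
DEFAULT_LOWEST = Defect.MAC_STATUS  # default named in the text: DefMACstatus


def declares_fault(defect: Defect, lowest_defect_priority: Defect) -> bool:
    """A fault is declared when the defect is at or above the configured floor."""
    return defect != Defect.NONE and defect >= lowest_defect_priority


# MEP A propagates a fault by signaling interface status down.
# MEP B sees a MAC status defect, declares a fault, and replies with RDI.
b_faulty = declares_fault(Defect.MAC_STATUS, DEFAULT_LOWEST)

# MEP A now receives RDI. With allDef, A treats RDI as a fault as well,
# so both MEPs hold each other in a fault state (the deadlock case).
a_faulty_all_def = declares_fault(Defect.RDI_CCM, ALL_DEF)

# With the default setting, RDI is below the floor and A does not reciprocate.
a_faulty_default = declares_fault(Defect.RDI_CCM, DEFAULT_LOWEST)
```

In this model `b_faulty` and `a_faulty_all_def` are both true, while `a_faulty_default` is false, which is why the default setting avoids the reciprocal fault.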
When CFM is the OAM module at the other end, it is required to use any of the following methods (depending on local configuration) to notify the remote peer:
For using AIS for fault propagation, AIS must be enabled for the MEP. The AIS configuration needs to be updated to support the MD level of the MEP (currently it only supports the levels above the local MD level).
Note that the existing AIS procedure still applies even when fault propagation is disabled for the service or the MEP. For example, when a MEP loses connectivity to a configured remote MEP, it generates AIS if it is enabled. The new procedure that is defined in this document introduces a new fault condition for AIS generation, fault propagated from SMGR, that is used when fault propagation is enabled for the service and the MEP.
The transmission of CCM with interface status TLV is triggered and does not wait for the expiration of the remaining CCM interval transmission. This rule applies to CFM fault notification for all services.
For a specific SAP and SDP-binding, CFM and SMGR can only propagate one single fault to each other for each direction (up or down).
When there are multiple MEPs (at different levels) on a single SAP and SDP-binding, the fault reported from CFM to SMGR will be the logical OR of results from all MEPs. Basically, the first fault from any MEP will be reported, and the fault will not be cleared as long as there is a fault in any local MEP on the SAP and SDP-binding.
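The logical OR behavior can be sketched as a small aggregator that reports only state changes. The class and method names are hypothetical, and the interface between CFM and SMGR is reduced to a single boolean for illustration.

```python
from typing import Dict, Optional


class SapFaultAggregator:
    """Hypothetical model of per-SAP fault aggregation across MEPs."""

    def __init__(self) -> None:
        self._mep_faults: Dict[int, bool] = {}
        self.reported = False   # fault state last reported toward SMGR

    def update(self, mep_id: int, faulty: bool) -> Optional[bool]:
        """Record one MEP's state; return the new SAP state if it changed."""
        self._mep_faults[mep_id] = faulty
        combined = any(self._mep_faults.values())   # logical OR of all MEPs
        if combined != self.reported:
            self.reported = combined
            return combined
        return None   # no change to report
```

With two MEPs on the same SAP, the first fault is reported, the second fault changes nothing, and the fault is cleared only when the last faulty MEP recovers.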
Both down and up MEPs, as well as fault propagation, are supported for Epipe services. When both up and down MEPs are configured in the same SAP and SDP-binding and both MEPs have fault propagation enabled, a fault detected by one of them is propagated to the other, which in turn propagates the fault in its own direction.
When a MEP detects a fault and fault propagation is enabled for the MEP, CFM needs to communicate the fault to SMGR, so that SMGR marks the SAP or SDP-binding as faulty but still oper up. CFM traffic can still be transmitted to or received from the SAP and SDP-binding to ensure that, when the fault is cleared, the SAP returns to its normal operational state. Since the operational status of the SAP and SDP-binding is not affected by the fault, no fault handling is performed. For example, applications relying on the operational status are not affected.
If the MEP is an up MEP, the fault is propagated to the OAM components on the same SAP or SDP binding; if the MEP is a down MEP, the fault is propagated to the OAM components on the mate SAP or SDP-binding at the other side of the service.
When a SAP or SDP-binding becomes faulty (oper-down, admin-down, or pseudowire status faulty), SMGR needs to propagate the fault to the up MEPs on the same SAP or SDP-binding, as well as to OAM components (such as down MEPs and E-LMI) on the mate SAP or SDP-binding.
This section describes procedures for the scenario where an Epipe service is down due to the following:
When a fault occurs on the SAP side, the pseudowire status bit is set for both active and standby pseudowires. When only one of the pseudowires is faulty, SMGR does not notify CFM. The notification occurs only when both pseudowires become faulty, at which point the SMGR propagates the fault to CFM.
Since there is no fault handling in the pipe service, any CFM fault detected on an SDP binding is not used in the pseudowire redundancy’s algorithm to choose the most suitable SDP binding to transmit on.
Deployment of solutions that include legacy to Ethernet aggregation should involve fault inter-working consideration. Protocols like Frame Relay propagate fault using the Local Management Interface (LMI). However, other protocols do not include a dedicated management interface over which to indicate fault. PPP, MLPPP and Cisco HDLC must use a different mechanism to communicate fault between the two different connection types.
The eth-legacy-fault-notification option and the associated parameters along with Ethernet CFM fault propagation on the Ethernet SAP MEP must be enabled in order to properly inter-work the Ethernet and PPP, MLPPP or Cisco HDLC connections. Figure 54 shows the various high level functions that inter-work Ethernet aggregation and legacy interfaces using point to point Ipipe services.
In general the Ipipe service requires the ce-address information to be learned or manually configured as part of the Ethernet SAP object before the legacy interface connection can be established. IPv6 includes an optimization that uses the Link Local IPv6 address to start the legacy negotiation process and does not require the ce-addressing described previously. This IPv6 optimization does not align well with fault inter-working functions and is disabled when the eth-legacy-fault-notification function is enabled.
Fault propagation is not active from the Ethernet SAP to the legacy connection if the ce-address information for the Ethernet SAP has not been learned or configured. If both IPv4 and IPv6 are configured, each protocol requires its ce-addressing to be learned or configured before fault inter-working is enabled for that protocol. Once the ce-address has been learned or configured for that protocol, fault inter-working is active for that protocol. If either IPv4 or IPv6 ce-addressing from the Ethernet SAP is resident, the access legacy SAP will be operational. The NCP layer indicates which unique protocol is operational. Faults from the legacy connection are still propagated toward the Ethernet SAP, even if the ce-address is not resident within the Ipipe, under the following conditions: any SAP or the service is shut down, or the legacy SAP is not configured.
The learned Ethernet ce-address is a critical component in Ipipe service operation and fault propagation. To maintain the address information, the keep option must be configured as part of the ce-address-discovery command. If the keep option is not configured, the address information is lost when the Ethernet SAP transitions to a non-operational state. When the address information is flushed, the Ipipe service propagates the fault to the legacy PPP, MLPPP, and Cisco HDLC connections. The lack of ce-addressing on the Ethernet SAPs may cause a deadlock condition that requires operator intervention to resolve. The keep option must be configured when the eth-legacy-fault-notification functionality is enabled with PPP, MLPPP, and Cisco HDLC legacy interfaces, and fault propagation is required using this type of aggregation deployment. The keep option is specific to, and only supported when, eth-legacy-fault-notification is configured. If the keep option is configured as part of the ce-address-discovery command, eth-legacy-fault-notification cannot be removed. Configuration changes to the ce-address-discovery command may affect the stored ce-address information. For example, if ce-address-discovery ipv6 keep is changed to ce-address-discovery keep, the stored IPv6 ce-address information is flushed. If the keep option is removed, all discovered ce-address information is flushed if the SAP is operationally down.
The ce-address stored in the Ipipe service as part of the discovery process is updated if a new ARP arrives from the layer three device connected to the Ethernet SAP. If that device does not send an ARP to indicate that the addressing information has changed, the ce-address stored locally by the previous discovery function is maintained. The stored ce-address retained by the Ipipe as a result of the keep operation becomes stale if changes are made to the layer three device that would alter the ARP information and either that device does not generate an ARP packet or the Ipipe inter-working device does not receive it (for example, because the Ethernet SAP is admin down for IPv4, or the service is operationally down for IPv6). This stale information results in a black hole for service traffic. The clear service id service-id arp command can be used to flush stale ARP information. This does not solicit an ARP from a peer.
The keep option will not maintain the ce-address information when the Ethernet SAP is administratively shutdown or when the node reboots.
Once all the ce-addressing has been populated in the Ipipe, establishment of the legacy interfaces begins. The successful establishment of these connections renders the Ipipe service functional. Legacy connection faults and Ethernet SAP faults may now be propagated.
Should the Ethernet SAP enter a non-operational state as a result of a cable or validation protocol (ETH-CCM), the fault will be inter-worked with the specific legacy protocol. Ethernet faults will inter-work with the legacy interfaces in the following manner:
As previously stated, inter-working faults on the legacy connection with the Ethernet infrastructure requires a CCM-enabled Down MEP configured on the Ethernet SAP with fault-propagation enabled. There are two different methods to propagate fault from a CCM-enabled MEP: use-int-tlv or suspend-ccm. The use-int-tlv approach causes the CCM message to include the Interface Status TLV with a value of isDown. This raises a defMACStatus priority error on the peer MEP. The suspend-ccm approach causes the local MEP to suspend transmission of CCM messages to the peer MEP. This raises a defRemoteCCM timeout condition on the peer. The peer must accept these notifications and process these fault conditions on the local MEP. When the MEP receives these errors, it must not include a defect condition in the CCM messages it generates that is above the peer's low-priority-defect setting. In standard operation, the MEP receiving the error should only set the RDI bit in the CCM header. If the MEP improperly responds with a defect condition that is higher than the low-priority-defect of the MEP that generated the initial fault, a deadlock condition will occur and operator intervention will be required. The two CFM propagation methods and the proper responses are shown in Figure 55.
From a protocol (NCP) perspective, PPP and MLPPP connections have a micro view. Those connections understand the different protocols carried over the PPP and MLPPP connections, and the individual protocol errors that can occur. The Ethernet SAP has a macro view without this layer three understanding. When dual-stack IPv4 and IPv6 is deployed, fault can only be propagated from the legacy connection toward the associated Ethernet SAP if both protocols fail on the PPP or MLPPP connection. If either of the protocols is operational, PPP or MLPPP will not propagate fault in the direction of the Ethernet connection.
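The dual-stack rule amounts to an all-protocols-down check, sketched below. The function name and the dictionary keyed by protocol name are hypothetical and are used only to illustrate the rule stated above.

```python
from typing import Dict


def legacy_fault_toward_ethernet(ncp_operational: Dict[str, bool]) -> bool:
    """Return True only when every configured protocol has failed.

    ncp_operational maps a protocol name (for example "ipv4" or "ipv6")
    to whether its NCP is currently operational on the PPP or MLPPP link.
    """
    if not ncp_operational:
        return False   # no protocols configured: nothing to evaluate
    # A single operational protocol is enough to suppress propagation.
    return not any(ncp_operational.values())
```

For a dual-stack connection, a failure of IPv6 alone does not propagate a fault toward the Ethernet SAP; the fault is propagated only once IPv4 has failed as well.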
Ethernet connection faults are prioritized over legacy faults. When an Ethernet fault is detected, any fault previously propagated from the PPP, MLPPP, or Cisco HDLC connection is squelched in favor of the higher priority Ethernet SAP failure. All legacy fault conditions, including admin port down, are in turn dismissed for the duration of the Ethernet fault and are not rediscovered until the recovery-timer expires. This configurable timer value is the amount of time the process waits to allow the legacy connections to recover and establish following the clearing of the Ethernet fault. If the timer value is too short, false positive propagation will occur from the legacy side to the Ethernet connection. If the timer value is too long, secondary legacy faults will not be propagated to the associated Ethernet SAP for an extended period of time, delaying the proper state on the layer three device connected to the Ethernet SAP. Any packets arriving on the Ethernet SAP are dropped until the legacy connection has recovered. As soon as the legacy connection recovers, forwarding across the Ipipe occurs regardless of the amount of time remaining on the recovery timer. Operators are required to adjust this timer value to their specific network requirements. If the timer is adjusted while the service is active, the new value replaces the old value and starts counting down the next time the timer is called.
If the eth-legacy-fault-notification command is disabled from an active Ipipe service then any previously reported fault will be cleared and the recovery-timer will be started. If the eth-legacy-fault-notification command is added to an active Ipipe service, the process will check for outstanding faults and take the appropriate action.
Cisco HDLC behavior must be modified in order to better align with the fault inter-working function. In order to enable the eth-legacy-fault-notification, keepalives must be enabled. The following describes the new behavior for the Cisco HDLC port:
The show service command has been expanded to include the basic Ethernet Legacy Fault Notification information and the specific SAP configuration.
The “Eth Legacy Fault Notification” section displays the configured recovery-timer value and whether the eth-legacy-fault-notification is active “Admin State: inService” (no shutdown) or inactive “Admin State: outOfService” (shutdown).
The “Ipipe SAP Configuration Information” displays the current Ethernet fault propagated to the associated legacy connection state; “Legacy Fault Notify”: False indicates no fault is currently being propagated and True indicates fault is currently being propagated. The “Recvry Timer Rem” is used to show the amount of time remaining before the recovery timer expires. A time in seconds will only be displayed for this parameter if an Ethernet fault has cleared and the recovery timer is currently counting down to 0.0 seconds.
A number of examples have been included using the service configuration below to demonstrate the various conditions. Many of the display commands have been trimmed in an effort to present feature relevant information.
The following output is an example of a service fully operational with no faults.
The same service is used to demonstrate an Ethernet SAP failure condition propagating fault to the associated PPP connection. In this case an ETH-CCM time out has occurred. Only the changes have been highlighted.
The log events below will be specific to the failure type and the protocols involved.
The following output is an example of the transitional state that occurs when the Ethernet fault condition clears.
The following is an example of legacy fault propagation to the associated Ethernet SAP and the remote peer using ETH-CFM fault propagation, assuming no Ethernet fault is taking precedence.
This feature is only supported for an Ipipe service that has a single legacy connection with an encap-type PPP, MLPPP or Cisco-HDLC and an Ethernet SAP. No other combinations are supported. Deployments using APS cannot use this fault propagation functionality.
The propagation of fault is based on the interaction of a number of resources and software functions. This means that propagation and recovery will vary based on the type of failure, the scale of the failure, the legacy protocol, the system overhead at the time of the action, and other interactions.
Before maintenance operations are performed, the operator should be aware of the operational state of the service and any fault propagation state. An administratively down legacy port does not in itself cause fault propagation; it is the operational port state that conveys fault. During a Major ISSU operation, legacy faults are cleared and not propagated toward the Ethernet network. To prevent this clearing of faults, the operator may consider shutting down the Ethernet port or the ETH-CFM MEPs to cause a timeout upstream.
Note: The CLI commands for these functions can be found in the 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 2 Services and EVPN Guide: VLL, VPLS, PBB, and EVPN.
When a SAP or SDP-binding becomes faulty (oper-down, admin-down, or pseudowire status faulty), SMGR propagates the fault to OAM components on the mate SAP or SDP-binding.
When the service is administratively shutdown, SMGR propagates the fault to OAM components on both SAP or SDP-bindings.
When the fault occurs on the SAP side, the pseudowire status bit is set for both active and standby pseudowires.
When only one of the pseudowires is faulty, SMGR does not notify CFM. The notification only occurs when both pseudowires become faulty. Then the SMGR propagates the fault to CFM. Since there is no fault handling in the pipe service, any CFM fault detected on an SDP-binding is not used in the pseudowire redundancy’s algorithm to choose the most suitable SDP-binding to transmit on.
For VPLS services, only down MEPs are supported for fault propagation.
When a MEP detects a fault and fault propagation is enabled for the MEP, CFM communicates the fault to the SMGR. The SMGR marks the SAP and SDP-binding as oper-down. Note that oper-down is used here in VPLS instead of “oper-up but faulty” as in the pipe services. CFM traffic can still be transmitted to or received from the SAP and SDP-binding to ensure that, when the fault is cleared, the SAP returns to its normal operational state.
Note that as stated in CFM Connectivity Fault Conditions, a fault is raised whenever a remote MEP is down (not all remote MEPs have to be down). When it is not desirable to trigger fault handling actions in some cases when a down MEP has multiple remote MEPs, operators can disable fault propagation for the MEP.
If the MEP is a down MEP, SMGR performs the fault handling actions for the affected service(s). Local actions done by the SMGR include (but are not limited to):
If the service instance is a B-VPLS, and an associated B-MAC address is configured for the failed SAP and SDP-binding, the SMGR performs a lookup using the B-MAC address to find out which pipe services must be notified and then propagates the fault to these services.
Within the same B-VPLS service, all SAPs/SDP-bindings configured with the same fault propagation B-MACs must be faulty or oper down for the fault to be propagated to the appropriate pipe services.
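The B-MAC rule above can be sketched as a small grouping check. This is only an illustrative model, not the SMGR implementation: the `Endpoint` type, its field names, and the `bmacs_to_propagate` helper are hypothetical names introduced for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical model of a SAP or SDP-binding in one B-VPLS service."""
    name: str     # illustrative identifier only
    bmac: str     # fault propagation B-MAC configured on the endpoint
    faulty: bool  # True when the endpoint is faulty or oper-down


def bmacs_to_propagate(endpoints):
    """Return the B-MACs for which every configured endpoint is faulty.

    A fault is propagated to the pipe services behind a B-MAC only when
    all endpoints sharing that B-MAC are faulty or oper-down.
    """
    states_by_bmac = defaultdict(list)
    for ep in endpoints:
        states_by_bmac[ep.bmac].append(ep.faulty)
    return sorted(b for b, states in states_by_bmac.items() if all(states))
```

In this sketch, a B-MAC shared by one faulty and one healthy endpoint is not returned, so no fault would be propagated for it until the second endpoint also fails.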
When a VPLS service is down:
A SAP or SDP binding that has a down MEP fault is made operationally down. This causes pseudowire redundancy or Spanning Tree Protocol (STP) to take the appropriate actions.
However, the reverse is not true. If the SAP or SDP binding is blocked by STP, or is not tx-active due to pseudowire redundancy, no fault is generated for this entity.
For IES and VPRN services, only down MEP is supported on Ethernet SAPs and spoke SDP bindings.
When a down MEP detects a fault and fault propagation is enabled for the MEP, CFM communicates the fault to the SMGR. The SMGR marks the SAP/SDP binding as operationally down. CFM traffic can still be transmitted to or received from the SAP and SDP-binding to ensure that when the fault is cleared, the SAP returns to the normal operational state.
Because the SAP and SDP-binding go down, they are not usable by upper applications. In this case, the IP interface on the SAP and SDP-binding goes down, and the prefix is withdrawn from routing updates to the remote PEs. The same applies to subscriber group interface SAPs on the 7450 ESS and 7750 SR.
When the IP interface is administratively shutdown, the SMGR notifies the down MEP and a CFM fault notification is generated to the CPE through interface status TLV or suspension of CCM based on local configuration.
When the node acts as a pseudowire switching node, meaning two pseudowires are stitched together at the node, the SMGR does not communicate pseudowire failures to CFM. Such failures are expected to be communicated by pseudowire status messages, and CFM runs end-to-end between the head-end and tail-end of the stitched pseudowire for failure notification.
LLF and CFM fault propagation are mutually exclusive. CLI protection is in place to prevent enabling both LLF and CFM fault propagation in the same service, on the same node and at the same time. However, there are still instances where irresolvable fault loops can occur when the two schemes are deployed within the same service on different nodes. This is not preventable by the CLI. At no time should these two fault propagation schemes be enabled within the same service.
802.3ah EFM OAM declares a link fault when any of the following occurs:
When 802.3ah EFM OAM declares a fault, the port goes into operational state down. The SMGR communicates the fault to CFM MEPs in the service.
OAM fault propagation in the opposite direction (SMGR to EFM OAM) is not supported.
Service Application Agent (SAA) is a tool that allows operators to configure a number of different tests that can be used to provide performance information like delay, jitter and loss for services or network segments. The test results are saved in SNMP tables or summarized XML files. These results can be collected and reported on using network management systems.
SAA uses the resources allocated to the various OAM processes. These processes are not dedicated to SAA but shared throughout the system. Table 14 provides guidance on how these different OAM functions are logically grouped.
Test | Description |
Background | Tasks configured outside of the SAA hierarchy that consume OAM task resources. Specifically, these include SDP-Keep Alive, Static route cpe-check, filter redirect-policy, ping-test, and vrrp policy host-unreachable. These are critical tasks that ensure the network operation and may affect data forwarding or network convergence. |
SAA Continuous | SAA tests configured as continuous (always scheduled). |
SAA non-continuous | SAA tests that are not configured as continuous, hence scheduled outside of the SAA application. These tests require the oam saa test-name start command to initiate the test run. |
Non-SAA (Directed) | Any task that does not include any configuration under SAA. These tests are launched through SNMP or the CLI and are used to troubleshoot or profile network conditions. They take the form “oam test-type”, or ping or traceroute with the specific test parameters. |
SAA test types are restricted to those that utilize a request response mechanism, single-ended tests. Dual-ended tests that initiate the test on one node but require the statistical gathering on the other node are not supported under SAA. As an example, Y.1731 defines two approaches for measuring frame delay and frame delay variation, single-ended and dual-ended. The single-ended approach is supported under SAA.
Post-processing analysis of individual test runs can be used to determine the success or failure of the individual runs. The operator can set rising and falling thresholds for delay, jitter, and loss. Exceeding a threshold causes the test to have a failed result. A trap can be generated when the test fails. The operator is also able to configure a probe failure threshold and generate a trap when that threshold is exceeded.
Each supported test type has configuration properties specific to that test. Not all options, intervals, and parameters are available for all tests. Some configuration parameters, such as the sub-second probe interval, require specific hardware platforms.
The SAA ping style commands, listed in the CLI description, may be configured as continuous, meaning automatically re-scheduled. Several closure and rescheduling functions occur that affect the probe spacing between runs.
Trace type tests apply the timeout to each individual packet, which may affect spacing. This is required because a packet timeout may be needed to move from one probe to the next probe. For tests that do not require this type of behavior, typically ping and ETH-CFM PM functions, the probes are sent at the specified probe interval and the timeout is only applied at the end of the test if any probe has been lost during the run. When the timeout is applied at the end of the run, the test is considered complete when either all responses have been received or the timeout expires at the end of the test run. For tests marked as continuous (always scheduled), the spacing between the runs may be delayed by the timeout value when a packet is lost. The test run is complete when all probes have either been received back or the timeout value has expired.
To preserve system resources, specifically memory, the operator should store only summarized history results. By default, only summary results are stored for tests that are configured with sub-second probe intervals, that have a probe count above 100, or that are written to a file. By default, per-probe information is stored for tests that are configured with a probe interval of one second or more and a probe count of 100 or less, and that are not written to a file. The operator may override these defaults using the probe-history {keep | drop | auto} option. The auto option applies the defaults above. The other options override the default retention scheme based on the operator requirements: keep retains per-probe information, and drop retains summary information only. The probe data can be viewed using the show saa test command. If the per-probe information is retained, this probe data is available at the completion of the test run. The summary data is updated throughout the test run. The overall system memory usage is available using the show system memory-pools command. The OAM entry represents the overall memory usage. This includes the history data stored for SAA tests. A clear saa testname option is available to release the memory and flush the test results.
SAA-launched tests maintain the two most recently completed test runs and one in-progress test run. It is important to ensure that the collection and accounting record process is configured to write the data to file before it is overwritten. Once the results are overwritten they are lost.
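The default retention behavior described above can be summarized as a small decision function. This is a sketch of the stated rules only; the function name and its parameters are hypothetical and do not correspond to any system API.

```python
def keeps_per_probe_history(mode, probe_interval_s, probe_count, writes_to_file):
    """Return True if per-probe data is retained, False if summary only.

    mode mirrors the probe-history option: "keep", "drop", or "auto".
    The remaining arguments are illustrative stand-ins for the test
    configuration and are only consulted in "auto" mode.
    """
    if mode == "keep":
        return True   # per-probe retention forced by the operator
    if mode == "drop":
        return False  # summary-only retention forced by the operator
    if mode != "auto":
        raise ValueError(f"unknown probe-history mode: {mode!r}")
    # auto: per-probe data only for intervals of one second or more,
    # probe counts of 100 or less, and results not written to a file.
    return probe_interval_s >= 1 and probe_count <= 100 and not writes_to_file
```

For example, under these assumptions a test with a one-second interval and 100 probes that is not written to a file retains per-probe data in auto mode, while the same test with 101 probes stores summary results only.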
Any data not written to file will be lost on a CPU switch over.
There are a number of show commands to help the operator monitor the test-oam tool set.
show test-oam oam-config-summary: Provides information about the configured tests.
show test-oam oam-perf: Provides the transmit rate information for tests launched from the local network element and the receive rate for remotely launched tests.
clear test-oam oam-perf: Clears the test-oam performance statistics to give a current view of the different rates in the oam-perf command above.
monitor test-oam oam-perf: Makes use of the monitor command to provide time-sliced performance statistics for test-oam functions.
OAM Performance Monitoring (OAM-PM) provides an overall architecture for gathering and computing key performance indicators (KPI) using standard protocols and a robust collection model. The architecture is comprised of a number of foundational components.
The hierarchy of the architecture is captured in Figure 56. This diagram is only meant to show the relationship between the components. It is not meant to depict all the detailed parameters required.
OAM-PM configurations are not dynamic environments. All aspects of the architecture must be carefully considered before configuring the components, making external references to other related components, or activating the OAM-PM architecture. No modifications are allowed to any components that are active or have any active sub-components. Any function being referenced by an active OAM-PM function or test cannot be modified or have its state shutdown. For example, to change any configuration element of a session, all active tests must be in a shut down state. To change any bin group configuration (other than the description parameter) all sessions with a delay-capable test that references the bin group (config>oam-pm>session>bin-group bin-group-number) must be shut down.
Session source and destination configuration parameters are not validated by the test that makes use of that information. Once the test is activated with a no shutdown, the test engine attempts to send the test packets even if the session source and destination information does not accurately represent the entity that must exist to successfully transmit or terminate the packets. If the session is a MEP-based Ethernet session and the source-based MEP does not exist, the transmit count for the test will be zero. If the source-based session is TWAMP Light, the OAM-PM transmit counter will increment but the receive counter will not.
OAM-PM is not a hitless operation. If a high availability event occurs that causes the backup CPM to become the newly active CPM, or when ISSU functions are performed, tests in flight will not be completed, open files may not be closed, and test data not written to a properly closed XML file will be lost. There is no synchronization of state between the active and the backup control modules. All OAM-PM statistics stored in volatile memory will be lost. Once the reload or high availability event is completed and all services are operational, the OAM-PM functions will commence.
It is possible that during times of network convergence, high CPU utilizations or contention for resources, OAM-PM may not be able to detect changes to an egress connection or allocate the necessary resources to perform its tasks.
This is the overall collection of different tests, the test parameters, measurement intervals, and mapping to configured storage models. It is the overall container that defines the attributes of the session.
Session Type: Assigns the nature of the test as either proactive (default) or on-demand. Individual test timing parameters are influenced by this setting. A proactive session starts immediately following the no shutdown of the test and continues to execute until a manual shutdown stops the individual test. On-demand tests do not start immediately following the no shutdown command. The operator must start an on-demand test by using the oam oam-pm session start command and specifying the applicable protocol. The operator can override the no test-duration default by configuring a fixed amount of time the test will execute, up to 24 hours (86400 seconds). If an on-demand test is configured with a test-duration, it is important to shut down and delete the tests when they are completed and all the results are collected. This frees all system memory that has been reserved for storing the results. If a high-availability event causes the backup CPM to become the newly active CPM, all on-demand tests must be restarted manually using the oam oam-pm session start command for the specific protocol.
Test Family: The main branch of testing that addresses a specific technology. The available test parameters for the session are based on the test family. The destination, source, and priority are common to all tests under the session and are defined separately from the individual test parameters.
Test Parameters: The parameters include individual tests with the associated parameters including start and stop times and the ability to activate and deactivate the individual test.
Measurement Interval: Assignment of collection windows to the session with the appropriate configuration parameters and accounting policy for that specific session.
The “Session” can be viewed as the single container that brings all aspects of individual tests and the various OAM-PM components under a single umbrella. If any aspect of the session is incomplete, the individual test may fail to be activated with a no shutdown command. If this situation occurs, an error indicating “Invalid session parameters” is displayed.
A number of standards bodies define performance monitoring packets that can be sent from a source, processed, and responded to by a reflector. A protocol may be focused on measuring a single specific performance criterion or on multiple criteria. The protocols available to carry out the measurements are based on the test family type configured for the session.
Ethernet PM delay measurements are carried out using the Two Way Delay Measurement Protocol version 1 (DMMv1) defined in Y.1731 by the ITU-T. This allows for the collection of Frame Delay (FD), InterFrame Delay Variation (IFDV), Frame Delay Range (FDR) and Mean Frame Delay (MFD) measurements, round trip, forward, and backward.
DMMv1 adds the following to the original DMM definition:
DMMv1 and DMM are backwards compatible and the interaction is defined in Y.1731 ITU-T-2011 Section 11 “OAM PDU validation and versioning.”
Ethernet PM loss measurements are carried out using the Synthetic Loss Measurement (SLM) defined in Y.1731 by the ITU-T. This allows for the calculation of Frame Loss Ratio (FLR) and availability. The ITU-T also defines a frame loss measurement (LMM) approach that provides frame loss ratio (FLR) and raw transmit and receive frame counters in each direction and availability metrics.
IP Performance data uses the TWAMP test packet for gathering both delay and loss metrics. OAM-PM supports Appendix I of RFC 5357 (TWAMP Light). The SR OS supports the gathering of delay metrics Frame Delay (FD), InterFrame Delay Variation (IFDV), Frame Delay Range (FDR) and Mean Frame Delay (MFD) round trip, forward and backward.
A session can be configured with one test or multiple tests. Depending on the session's test family, one or more test configurations may need to be included in the session to gather both delay and loss performance information. Each test that is configured within a session shares the common session parameters and common measurement intervals. However, each test can be configured with unique per-test parameters. Using Ethernet as an example, both DMM and SLM would be required to capture both delay and loss performance data. IP performance measurement uses a single TWAMP packet for both delay and synthetic loss.
Each test must be configured with a TestID as part of the test parameters. This uniquely identifies the test within the specific protocol. A TestID must be unique within the same test protocol. Again using Ethernet as an example, DMM and SLM tests within the same session can use the same TestID because they are different protocols. However, if a TestID is applied to a test protocol (like DMM or SLM) in any session, it cannot be used for the same protocol in any other session. When a TestID is carried in the protocol, as it is with DMM and SLM, this value does not have global significance. When a responding entity must index for the purpose of maintaining sequence numbers, as in the case of SLM, the tuple TestID, Source MAC, and Destination MAC are used to maintain the uniqueness on the responder. This means the TestID has only local and not global significance. TWAMP test packets also require a TestID to be configured but do not carry this information in the PDU. However, it is required for uniform provisioning under the OAM-PM architecture. TWAMP uses a four tuple Source IP, Destination IP, Source UDP, and Destination UDP to maintain unique session indexes.
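The TestID scoping rules above can be illustrated with a short sketch. It assumes a hypothetical in-memory registry (`IdRegistry`) and a hypothetical `slm_responder_key` helper; neither is part of the product, and both exist only to show how per-protocol uniqueness and the responder tuple fit together.

```python
class IdRegistry:
    """Hypothetical registry enforcing TestID uniqueness per test protocol."""

    def __init__(self):
        self._owner = {}  # (protocol, test_id) -> owning session name

    def assign(self, session, protocol, test_id):
        """Bind a TestID to a session for one protocol, rejecting reuse
        of the same TestID by the same protocol in another session."""
        key = (protocol, test_id)
        owner = self._owner.get(key)
        if owner is not None and owner != session:
            raise ValueError(
                f"TestID {test_id} already used for {protocol} in session {owner}"
            )
        self._owner[key] = session


def slm_responder_key(test_id, src_mac, dst_mac):
    """Tuple a responder could use to index SLM sequence numbers.

    The TestID alone is not globally unique; the source and destination
    MAC addresses complete the key.
    """
    return (test_id, src_mac.lower(), dst_mac.lower())
```

In this model, DMM and SLM tests in one session may share TestID 10, while a second session that tries to use TestID 10 for DMM is rejected. Two remote sources using the same SLM TestID still map to different responder keys because their source MAC addresses differ.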
Each OAM-PM session test contains both an administrative and an operational state. The administrative state is linked to the shutdown or no shutdown configuration for the test. The operational state indicates if the test is actively sending or trying to send test frames. For all proactive session types, the administrative and operational states are linked. The test immediately starts and attempts to send test frames as soon as the no shutdown command is accepted. Sessions configured using the session-type on-demand command do not link the two states. The operational state is up when the test is actively generating or attempting to generate test frames. This does not occur at the time the no shutdown command is accepted. The oam oam-pm ... start command is required to commence the transmission of the test frames for on-demand tests. When the manually issued start command is accepted, the operational state changes to up. At the expiration of the test duration, or when the test is manually stopped using the oam oam-pm ... stop command, the operational state transitions to down.
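The relationship between the administrative and operational states can be modeled as a small state holder. This is a simplified sketch: the class and method names are hypothetical, and the requirement that an on-demand test be administratively enabled before it can be started is an assumption made for the example rather than a statement from the text above.

```python
class PmTestState:
    """Hypothetical model of the admin and oper states of an OAM-PM test."""

    def __init__(self, session_type="proactive"):
        if session_type not in ("proactive", "on-demand"):
            raise ValueError(f"unknown session type: {session_type!r}")
        self.session_type = session_type
        self.admin_up = False
        self.oper_up = False

    def no_shutdown(self):
        self.admin_up = True
        # Proactive tests link the two states and start immediately.
        if self.session_type == "proactive":
            self.oper_up = True

    def shutdown(self):
        self.admin_up = False
        self.oper_up = False

    def start(self):
        """Model of the manual start command for on-demand tests."""
        if self.session_type != "on-demand":
            raise RuntimeError("only on-demand tests are started manually")
        if not self.admin_up:
            # Assumption for this sketch: the test must be enabled first.
            raise RuntimeError("test is administratively down")
        self.oper_up = True

    def stop(self):
        """Model of a manual stop or test-duration expiry (on-demand only)."""
        if self.session_type == "on-demand":
            self.oper_up = False
```

With this model, a proactive test becomes operationally up as soon as `no_shutdown()` is called, whereas an on-demand test remains operationally down until `start()` is issued and returns to down after `stop()` while staying administratively up.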
A new field, “Detectable Tx Err”, has been added to each Ethernet and IP OAM-PM test to indicate if there is a detectable transmit error that is preventing the test frames from being sent. This does not affect the two existing states. Detectable Tx Err information can be used to assist in troubleshooting, as it conveys information about a current detectable error state. No log events are created and no historical references are maintained when the error condition clears or changes.
The Detectable Tx Err condition is checked for each test which is operationally up and attempting to send frames. The audit function checks to see if a transmit error condition exists that could prevent the packet from being sent; for example, the source MEP is not fully configured, or the IP interface associated with the source is down. A raised detectable transmit error is cleared for the test if the audit process executes and finds no further detectable transmit error conditions. The interval of the audit function depends on a number of factors, such as the test family, the probe interval, and the number of active tests.
This is an ongoing maintenance function that executes while the test is operationally up. A maximum of one detectable transmit error can be presented to the operator. However, more underlying conditions may be detected should the existing condition be cleared.
When a test’s operational state is anything other than up, the detection process stops and any Detectable Tx Err fields display a value of “none”. History is not maintained for detectable transmit errors.
Not all errors are detectable. This function is only meant to guide the operator to a potential area of concern. There is a very large dependency on the direction of the MEP, the service type, the test protocol, and resource requirements that the test maintains over the underlying entity; any of these can influence the reporting of certain errors. Tests requiring the resources of an Up MEP, for example, will typically only report the conditions relating to incomplete configuration or the administrative state of the MEP. Up MEPs ignore the presence of egress connections and service states. However, LMM tests will report an unexpected error condition when the Up MEP SDP binding is down because it explicitly requires resources from the non-operational SDP binding. The same cannot be said if an LMM test was executing from an Up MEP configured over a non-operational SAP.
The TmnxOamPmDetectableTxError TEXTUAL-CONVENTION in the TIMETRA-OAM-PM-MIB lists all possible detectable error conditions.
The show oam-pm session session-name command provides a detailed view of the session, including each test and the Detectable Tx Err field for those tests.
The show oam-pm sessions detectable-tx-errors command lists all sessions that include a test with a detectable transmit error and the associated error.
A measurement interval is a window of time that compartmentalizes the measurements gathered for an individual test during that time. Allocation of measurement intervals, which equates to system memory, is based on the metrics being collected. This means that when both delay and loss metrics are being collected, each allocates its own set of measurement intervals. If the operator is executing multiple delay and loss tests under a single session, multiple measurement intervals are allocated, one per criterion per test.
Measurement intervals can be five minutes (5-min), 15 minutes (15-min), one hour (1-hour), or one day (1-day) in duration. The boundary-type defines the start of the measurement interval and can be aligned to the local time of day clock (wall clock), with or without an optional offset. Alternatively, the boundary-type can be test-aligned, which means the start of the measurement interval coincides with the no shutdown of the test, for proactive tests. By default, the start boundary is clock-aligned without an offset. When this configuration is deployed, the measurement interval starts at zero, in relation to the length. When a boundary is clock-aligned and an offset is configured, that amount of time is applied to the measurement interval. Offsets are configured on a per measurement interval basis and are applicable only to clock-aligned, not test-aligned, measurement intervals. Only offsets less than the measurement interval duration are allowed. Table 15 provides some examples of the start times of each measurement interval.
Offset | 5-min | 15-min | 1-hour | 1-day |
0 (default) | 00,5,10,15..55 | 00,15,30,45 | 00 (top of the hour) | midnight |
10 minutes | rejected | 10,25,40,55 | 10 min after the hour | 10 minutes after midnight |
30 minutes | rejected | rejected | 30 minutes after the hour | 30 minutes after midnight |
60 minutes | rejected | rejected | rejected | 01:00am |
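The start times in Table 15 follow from simple modular arithmetic on the minutes after midnight. The sketch below reproduces that arithmetic; the function name is hypothetical and the calculation is an illustration of the table, not the system's scheduler.

```python
MINUTES_PER_DAY = 24 * 60
ALLOWED_DURATIONS = (5, 15, 60, MINUTES_PER_DAY)  # 5-min, 15-min, 1-hour, 1-day


def interval_start_minutes(duration_min, offset_min=0):
    """Return the minutes after midnight at which clock-aligned
    measurement intervals of the given duration start.

    The offset must be less than the duration; otherwise it is rejected.
    """
    if duration_min not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported duration: {duration_min}")
    if not 0 <= offset_min < duration_min:
        raise ValueError("offset rejected: must be less than the duration")
    return list(range(offset_min, MINUTES_PER_DAY, duration_min))
```

For a 15-min interval with a 10-minute offset, the first four values are 10, 25, 40, and 55, matching the table row. A 10-minute offset on a 5-min interval raises an error, which corresponds to the "rejected" entries, and a 60-minute offset on a 1-day interval yields a single start at minute 60 (01:00).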
Although test aligned approaches may seem beneficial for simplicity, there are some drawbacks that need to be considered. The goal of time based and well-defined collection windows allows for the comparison of measurements across common windows of time throughout the network and for relating different tests or sessions. It is suggested that proactive sessions use the default clock-aligned boundary type. On-demand sessions may make use of test-aligned boundaries. On-demand tests are typically used for troubleshooting or short term monitoring that does not require alignment or comparison to other PM data.
The statistical data collected and the computed results from each measurement interval are maintained in volatile system memory by default. The number of intervals stored is configurable per measurement interval. Different measurement interval lengths have different defaults and ranges. The intervals-stored parameter defines the number of completed individual test runs to store in volatile memory. There is an additional allocation to account for the active measurement interval. To view the statistical information for the individual tests and a specific measurement interval stored in volatile memory, use the show oam-pm statistics … interval-number command. If there is an active test, it can be viewed using interval-number 1. In this case, the first completed record is 2, and previously completed records number back to the maximum intervals-stored value plus one.
As new tests for the measurement interval complete, the older entries will get renumbered to maintain their relative position to the current test. As the retained test data for a measurement interval consumes the final entry, any subsequent entries will cause the removal of the oldest data.
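The renumbering and aging behavior can be pictured as a bounded queue with the active interval in front. The following sketch is a conceptual model only; `IntervalStore` and its methods are hypothetical names, and the labels stand in for the stored statistics.

```python
from collections import deque


class IntervalStore:
    """Hypothetical model of volatile storage for one measurement interval."""

    def __init__(self, intervals_stored):
        # Completed intervals, newest first. When the deque is full,
        # appendleft() silently discards the oldest entry at the far end.
        self._completed = deque(maxlen=intervals_stored)
        self._active = None

    def begin(self, label):
        """Close the active interval (if any) and start a new one."""
        if self._active is not None:
            self._completed.appendleft(self._active)
        self._active = label

    def get(self, interval_number):
        """Look up by interval number: 1 is the active interval,
        2 is the most recently completed, and so on."""
        if interval_number < 1:
            raise IndexError("interval numbers start at 1")
        if interval_number == 1:
            return self._active
        return self._completed[interval_number - 2]
```

With two intervals stored and four intervals started in sequence (A, B, C, D), interval number 1 is D, 2 is C, and 3 is B, while A has been removed as the oldest data.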
There are obvious drawbacks to this storage model. Any high availability function that causes an active CPM switch will flush the results that were in volatile memory. Another consideration is the large amount of system memory consumed using this type of model. Given the risks and resource consumption this model incurs, an alternate method of storage is supported. An accounting policy can be applied to each measurement interval in order to write the completed data in system memory to non-volatile flash in an XML format. The amount of system memory consumed by historically completed test data must be balanced with an appropriate accounting policy. It is recommended that only the necessary data be stored in non-volatile memory to avoid unacceptable risk and unnecessary resource consumption. It is also suggested that a large overlap between the data written to flash and stored in volatile memory is unnecessary.
The statistical information in system memory is also available by SNMP. If this method is chosen, a balance must be struck between the intervals retained and the times at which the SNMP queries collect the data. Caution is required when determining the collection times through SNMP. If a file completes while another file is being retrieved through SNMP, the indexing will change to maintain the relative position to the current run. Proper spacing of the collection is key to ensuring data integrity.
The OAM-PM XML File contains the following keywords and MIB references.
XML File Keyword | Description | TIMETRA-OAM-PM-MIB Object |
oampm | — | None - header only |
Keywords Shared by all OAM-PM Protocols | ||
sna | OAM-PM session name | tmnxOamPmCfgSessName |
mi | Measurement Interval record | None - header only |
dur | Measurement Interval duration (minutes) | tmnxOamPmCfgMeasIntvlDuration (enumerated) |
ivl | measurement interval number | tmnxOamPmStsIntvlNum |
sta | Start timestamp | tmnxOamPmStsBaseStartTime |
ela | Elapsed time in seconds | tmnxOamPmStsBaseElapsedTime |
ftx | Frames sent | tmnxOamPmStsBaseTestFramesTx |
frx | Frames received | tmnxOamPmStsBaseTestFramesRx |
sus | Suspect flag | tmnxOamPmStsBaseSuspect |
dmm | Delay Record | None - header only |
mdr | minimum frame delay, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xdr | maximum frame delay, round-trip | tmnxOamPmStsDelayDmm2wyMax |
adr | average frame delay, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mdf | minimum frame delay, forward | tmnxOamPmStsDelayDmmFwdMin |
xdf | maximum frame delay, forward | tmnxOamPmStsDelayDmmFwdMax |
adf | average frame delay, forward | tmnxOamPmStsDelayDmmFwdAvg |
mdb | minimum frame delay, backward | tmnxOamPmStsDelayDmmBwdMin |
xdb | maximum frame delay, backward | tmnxOamPmStsDelayDmmBwdMax |
adb | average frame delay, backward | tmnxOamPmStsDelayDmmBwdAvg |
mvr | minimum inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xvr | maximum inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyMax |
avr | average inter-frame delay variation, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mvf | minimum inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdMin |
xvf | maximum inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdMax |
avf | average inter-frame delay variation, forward | tmnxOamPmStsDelayDmmFwdAvg |
mvb | minimum inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdMin |
xvb | maximum inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdMax |
avb | average inter-frame delay variation, backward | tmnxOamPmStsDelayDmmBwdAvg |
mrr | minimum frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyMin |
xrr | maximum frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyMax |
arr | average frame delay range, round-trip | tmnxOamPmStsDelayDmm2wyAvg |
mrf | minimum frame delay range, forward | tmnxOamPmStsDelayDmmFwdMin |
xrf | maximum frame delay range, forward | tmnxOamPmStsDelayDmmFwdMax |
arf | average frame delay range, forward | tmnxOamPmStsDelayDmmFwdAvg |
mrb | minimum frame delay range, backward | tmnxOamPmStsDelayDmmBwdMin |
xrb | maximum frame delay range, backward | tmnxOamPmStsDelayDmmBwdMax |
arb | average frame delay range, backward | tmnxOamPmStsDelayDmmBwdAvg |
fdr | frame delay bin record, round-trip | None - header only |
fdf | frame delay bin record, forward | None - header only |
fdb | frame delay bin record, backward | None - header only |
fvr | inter-frame delay variation bin record, round-trip | None - header only |
fvf | inter-frame delay variation bin record, forward | None - header only |
fvb | inter-frame delay variation bin record, backward | None - header only |
frr | frame delay range bin record, round-trip | None - header only |
frf | frame delay range bin record, forward | None - header only |
frb | frame delay range bin record, backward | None - header only |
lbo | Configured lower bound of the bin | tmnxOamPmCfgBinLowerBound |
cnt | Number of measurements within the configured delay range. Note that the session_name, interval_duration, interval_number, {fd, fdr, ifdv}, bin_number, and {forward, backward, round-trip} indices are all provided by the surrounding XML context. | tmnxOamPmStsDelayDmmBinFwdCount tmnxOamPmStsDelayDmmBinBwdCount tmnxOamPmStsDelayDmmBin2wyCount |
slm | Synthetic Loss Measurement Record | None - header only |
txf | Transmitted frames in the forward direction | tmnxOamPmStsLossSlmTxFwd |
rxf | Received frames in the forward direction | tmnxOamPmStsLossSlmRxFwd |
txb | Transmitted frames in the backward direction | tmnxOamPmStsLossSlmTxBwd |
rxb | Received frames in the backward direction | tmnxOamPmStsLossSlmRxBwd |
avf | Available count in the forward direction | tmnxOamPmStsLossSlmAvailIndFwd |
avb | Available count in the backward direction | tmnxOamPmStsLossSlmAvailIndBwd |
uvf | Unavailable count in the forward direction | tmnxOamPmStsLossSlmUnavlIndFwd |
uvb | Unavailable count in the backward direction | tmnxOamPmStsLossSlmUnavlIndBwd |
uaf | Undetermined available count in the forward direction | tmnxOamPmStsLossSlmUndtAvlFwd |
uab | Undetermined available count in the backward direction | tmnxOamPmStsLossSlmUndtAvlBwd |
uuf | Undetermined unavailable count in the forward direction | tmnxOamPmStsLossSlmUndtUnavlFwd |
uub | Undetermined unavailable count in the backward direction | tmnxOamPmStsLossSlmUndtUnavlBwd |
hlf | Count of HLIs in the forward direction | tmnxOamPmStsLossSlmHliFwd |
hlb | Count of HLIs in the backward direction | tmnxOamPmStsLossSlmHliBwd |
chf | Count of CHLIs in the forward direction | tmnxOamPmStsLossSlmChliFwd |
chb | Count of CHLIs in the backward direction | tmnxOamPmStsLossSlmChliBwd |
mff | minimum FLR in the forward direction | tmnxOamPmStsLossSlmMinFlrFwd |
xff | maximum FLR in the forward direction | tmnxOamPmStsLossSlmMaxFlrFwd |
aff | average FLR in the forward direction | tmnxOamPmStsLossSlmAvgFlrFwd |
mfb | minimum FLR in the backward direction | tmnxOamPmStsLossSlmMinFlrBwd |
xfb | maximum FLR in the backward direction | tmnxOamPmStsLossSlmMaxFlrBwd |
afb | average FLR in the backward direction | tmnxOamPmStsLossSlmAvgFlrBwd |
lmm | Frame Loss Measurement Record | None - header only |
txf | Transmitted frames in the forward direction | tmnxOamPmStsLossLmmTxFwd |
rxf | Received frames in the forward direction | tmnxOamPmStsLossLmmRxFwd |
txb | Transmitted frames in the backward direction | tmnxOamPmStsLossLmmTxBwd |
rxb | Received frames in the backward direction | tmnxOamPmStsLossLmmRxBwd |
mff | minimum FLR in the forward direction | tmnxOamPmStsLossLmmMinFlrFwd |
xff | maximum FLR in the forward direction | tmnxOamPmStsLossLmmMaxFlrFwd |
aff | average FLR in the forward direction | tmnxOamPmStsLossLmmAvgFlrFwd |
mfb | minimum FLR in the backward direction | tmnxOamPmStsLossLmmMinFlrBwd |
xfb | maximum FLR in the backward direction | tmnxOamPmStsLossLmmMaxFlrBwd |
afb | average FLR in the backward direction | tmnxOamPmStsLossLmmAvgFlrBwd |
ave | lmm availability enabled/disabled | No TIMETRA-OAM-PM-MIB entry |
avf | available count in the forward direction | tmnxOamPmStsLossLmmAvailIndFwd |
avb | available count in the backward direction | tmnxOamPmStsLossLmmAvailIndBwd |
uvf | unavailable count in the forward direction | tmnxOamPmStsLossLmmUnavlIndFwd |
uvb | unavailable count in the backward direction | tmnxOamPmStsLossLmmUnavlIndBwd |
uaf | undetermined available count in the forward direction | tmnxOamPmStsLossLmmUndtAvlFwd |
uab | undetermined available count in the backward direction | tmnxOamPmStsLossLmmUndtAvlBwd |
uuf | undetermined unavailable count in the forward direction | tmnxOamPmStsLossLmmUndtUnavlFwd |
uub | undetermined unavailable count in the backward direction | tmnxOamPmStsLossLmmUndtUnavlBwd |
hlf | count of HLIs in the forward direction | tmnxOamPmStsLossLmmHliFwd |
hlb | count of HLIs in the backward direction | tmnxOamPmStsLossLmmHliBwd |
chf | count of CHLIs in the forward direction | tmnxOamPmStsLossLmmChliFwd |
chb | count of CHLIs in the backward direction | tmnxOamPmStsLossLmmChliBwd |
udf | undetermined delta-t in the forward direction | tmnxOamPmStsLossLmmUndetDelTsFwd |
udb | undetermined delta-t in the backward direction | tmnxOamPmStsLossLmmUndetDelTsBwd |
TLD | TWAMP Light Delay Record | None - header only |
mdr | minimum frame delay, round-trip | tmnxOamPmStsDelayTwl2wyMin |
xdr | maximum frame delay, round-trip | tmnxOamPmStsDelayTwl2wyMax |
adr | average frame delay, round-trip | tmnxOamPmStsDelayTwl2wyAvg |
mdf | minimum frame delay, forward | tmnxOamPmStsDelayTwlFwdMin |
xdf | maximum frame delay, forward | tmnxOamPmStsDelayTwlFwdMax |
adf | average frame delay, forward | tmnxOamPmStsDelayTwlFwdAvg |
mdb | minimum frame delay, backward | tmnxOamPmStsDelayTwlBwdMin |
xdb | maximum frame delay, backward | tmnxOamPmStsDelayTwlBwdMax |
adb | average frame delay, backward | tmnxOamPmStsDelayTwlBwdAvg |
mvr | minimum inter-frame delay variation, round-trip | tmnxOamPmStsDelayTwl2wyMin |
xvr | maximum inter-frame delay variation, round-trip | tmnxOamPmStsDelayTwl2wyMax |
avr | average inter-frame delay variation, round-trip | tmnxOamPmStsDelayTwl2wyAvg |
mvf | minimum inter-frame delay variation, forward | tmnxOamPmStsDelayTwlFwdMin |
xvf | maximum inter-frame delay variation, forward | tmnxOamPmStsDelayTwlFwdMax |
avf | average inter-frame delay variation, forward | tmnxOamPmStsDelayTwlFwdAvg |
mvb | minimum inter-frame delay variation, backward | tmnxOamPmStsDelayTwlBwdMin |
xvb | maximum inter-frame delay variation, backward | tmnxOamPmStsDelayTwlBwdMax |
avb | average inter-frame delay variation, backward | tmnxOamPmStsDelayTwlBwdAvg |
mrr | minimum frame delay range, round-trip | tmnxOamPmStsDelayTwl2wyMin |
xrr | maximum frame delay range, round-trip | tmnxOamPmStsDelayTwl2wyMax |
arr | average frame delay range, round-trip | tmnxOamPmStsDelayTwl2wyAvg |
mrf | minimum frame delay range, forward | tmnxOamPmStsDelayTwlFwdMin |
xrf | maximum frame delay range, forward | tmnxOamPmStsDelayTwlFwdMax |
arf | average frame delay range, forward | tmnxOamPmStsDelayTwlFwdAvg |
mrb | minimum frame delay range, backward | tmnxOamPmStsDelayTwlBwdMin |
xrb | maximum frame delay range, backward | tmnxOamPmStsDelayTwlBwdMax |
arb | average frame delay range, backward | tmnxOamPmStsDelayTwlBwdAvg |
fdr | frame delay bin record, round-trip | None - header only |
fdf | frame delay bin record, forward | None - header only |
fdb | frame delay bin record, backward | None - header only |
fvr | inter-frame delay variation bin record, round-trip | None - header only |
fvf | inter-frame delay variation bin record, forward | None - header only |
fvb | inter-frame delay variation bin record, backward | None - header only |
frr | frame delay range bin record, round-trip | None - header only |
frf | frame delay range bin record, forward | None - header only |
frb | frame delay range bin record, backward | None - header only |
lbo | Configured lower bound of the bin | tmnxOamPmCfgBinLowerBound |
cnt | Number of measurements within the configured delay range. Note that the session_name, interval_duration, interval_number, {fd, fdr, ifdv}, bin_number, and {forward, backward, round-trip} indices are all provided by the surrounding XML context. | tmnxOamPmStsDelayTwlBinFwdCount tmnxOamPmStsDelayTwlBinBwdCount tmnxOamPmStsDelayTwlBin2wyCount |
TLL | TWAMP Light Loss Record | None - header only |
slm | Synthetic Loss Measurement Record | None - header only |
txf | Transmitted frames in the forward direction | tmnxOamPmStsLossTwlTxFwd |
rxf | Received frames in the forward direction | tmnxOamPmStsLossTwlRxFwd |
txb | Transmitted frames in the backward direction | tmnxOamPmStsLossTwlTxBwd |
rxb | Received frames in the backward direction | tmnxOamPmStsLossTwlRxBwd |
avf | Available count in the forward direction | tmnxOamPmStsLossTwlAvailIndFwd |
avb | Available count in the backward direction | tmnxOamPmStsLossTwlAvailIndBwd |
uvf | Unavailable count in the forward direction | tmnxOamPmStsLossTwlUnavlIndFwd |
uvb | Unavailable count in the backward direction | tmnxOamPmStsLossTwlUnavlIndBwd |
uaf | Undetermined available count in the forward direction | tmnxOamPmStsLossTwlUndtAvlFwd |
uab | Undetermined available count in the backward direction | tmnxOamPmStsLossTwlUndtAvlBwd |
uuf | Undetermined unavailable count in the forward direction | tmnxOamPmStsLossTwlUndtUnavlFwd |
uub | Undetermined unavailable count in the backward direction | tmnxOamPmStsLossTwlUndtUnavlBwd |
hlf | Count of HLIs in the forward direction | tmnxOamPmStsLossTwlHliFwd |
hlb | Count of HLIs in the backward direction | tmnxOamPmStsLossTwlHliBwd |
chf | Count of CHLIs in the forward direction | tmnxOamPmStsLossTwlChliFwd |
chb | Count of CHLIs in the backward direction | tmnxOamPmStsLossTwlChliBwd |
mff | minimum FLR in the forward direction | tmnxOamPmStsLossTwlMinFlrFwd |
xff | maximum FLR in the forward direction | tmnxOamPmStsLossTwlMaxFlrFwd |
aff | average FLR in the forward direction | tmnxOamPmStsLossTwlAvgFlrFwd |
mfb | minimum FLR in the backward direction | tmnxOamPmStsLossTwlMinFlrBwd |
xfb | maximum FLR in the backward direction | tmnxOamPmStsLossTwlMaxFlrBwd |
afb | average FLR in the backward direction | tmnxOamPmStsLossTwlAvgFlrBwd |
dm | MPLS Delay Record | None - header only |
mdr | minimum frame delay, round-trip | tmnxOamPmStsDelayMpls2wyMin |
xdr | maximum frame delay, round-trip | tmnxOamPmStsDelayMpls2wyMax |
adr | average frame delay, round-trip | tmnxOamPmStsDelayMpls2wyAvg |
mdf | minimum frame delay, forward | tmnxOamPmStsDelayMplsFwdMin |
xdf | maximum frame delay, forward | tmnxOamPmStsDelayMplsFwdMax |
adf | average frame delay, forward | tmnxOamPmStsDelayMplsFwdAvg |
mdb | minimum frame delay, backward | tmnxOamPmStsDelayMplsBwdMin |
xdb | maximum frame delay, backward | tmnxOamPmStsDelayMplsBwdMax |
adb | average frame delay, backward | tmnxOamPmStsDelayMplsBwdAvg |
mvr | minimum inter-frame delay variation, round-trip | tmnxOamPmStsDelayMpls2wyMin |
xvr | maximum inter-frame delay variation, round-trip | tmnxOamPmStsDelayMpls2wyMax |
avr | average inter-frame delay variation, round-trip | tmnxOamPmStsDelayMpls2wyAvg |
mvf | minimum inter-frame delay variation, forward | tmnxOamPmStsDelayMplsFwdMin |
xvf | maximum inter-frame delay variation, forward | tmnxOamPmStsDelayMplsFwdMax |
avf | average inter-frame delay variation, forward | tmnxOamPmStsDelayMplsFwdAvg |
mvb | minimum inter-frame delay variation, backward | tmnxOamPmStsDelayMplsBwdMin |
xvb | maximum inter-frame delay variation, backward | tmnxOamPmStsDelayMplsBwdMax |
avb | average inter-frame delay variation, backward | tmnxOamPmStsDelayMplsBwdAvg |
mrr | minimum frame delay range, round-trip | tmnxOamPmStsDelayMpls2wyMin |
xrr | maximum frame delay range, round-trip | tmnxOamPmStsDelayMpls2wyMax |
arr | average frame delay range, round-trip | tmnxOamPmStsDelayMpls2wyAvg |
mrf | minimum frame delay range, forward | tmnxOamPmStsDelayMplsFwdMin |
xrf | maximum frame delay range, forward | tmnxOamPmStsDelayMplsFwdMax |
arf | average frame delay range, forward | tmnxOamPmStsDelayMplsFwdAvg |
mrb | minimum frame delay range, backward | tmnxOamPmStsDelayMplsBwdMin |
xrb | maximum frame delay range, backward | tmnxOamPmStsDelayMplsBwdMax |
arb | average frame delay range, backward | tmnxOamPmStsDelayMplsBwdAvg |
fdr | frame delay bin record, round-trip | None - header only |
fdf | frame delay bin record, forward | None - header only |
fdb | frame delay bin record, backward | None - header only |
fvr | inter-frame delay variation bin record, round-trip | None - header only |
fvf | inter-frame delay variation bin record, forward | None - header only |
fvb | inter-frame delay variation bin record, backward | None - header only |
frr | frame delay range bin record, round-trip | None - header only |
frf | frame delay range bin record, forward | None - header only |
frb | frame delay range bin record, backward | None - header only |
cnt | The number of measurements within the configured delay range. Note that the session_name, interval_duration, interval_number, {fd, fdr, ifdv}, bin_number, and {forward, backward, round-trip} indices are all provided by the surrounding XML context. | tmnxOamPmStsDelayMplsBinFwdCount tmnxOamPmStsDelayMplsBinBwdCount tmnxOamPmStsDelayMplsBin2wyCount |
By default, the 5-minute measurement interval stores 33 test runs (32+1), with a configurable range of 1 to 96. By default, the 15-minute measurement interval also stores 33 test runs (32+1), with a configurable range of 1 to 96. The 5-minute and 15-minute measurement intervals share the same pool, up to a maximum of 96. In the unlikely case where both the 5-minute and 15-minute measurement intervals are configured for the same OAM-PM session, the total combined intervals stored cannot exceed 96. By default, the 1-hour measurement interval stores 9 test runs (8+1), with a configurable range of 1 to 24. The 1-day measurement interval always stores 2 test runs (1+1). When the 1-day measurement interval is configured, this is the only value for intervals stored, and it cannot be changed.
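The storage limits above can be expressed as a small pre-check. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes that the configured value is what is compared against the stated ranges.

```python
# Limits taken from the text above; all names here are hypothetical.
SHARED_POOL_MAX = 96   # 5-minute and 15-minute intervals share this pool
ONE_HOUR_MAX = 24


def check_intervals_stored(five_min=None, fifteen_min=None, one_hour=None):
    """Return a list of problems. None means the interval is not configured."""
    problems = []
    for name, value in (("5-minute", five_min), ("15-minute", fifteen_min)):
        if value is not None and not 1 <= value <= SHARED_POOL_MAX:
            problems.append(f"{name} intervals stored must be 1 to {SHARED_POOL_MAX}")
    if five_min is not None and fifteen_min is not None:
        if five_min + fifteen_min > SHARED_POOL_MAX:
            problems.append("5-minute and 15-minute intervals combined exceed 96")
    if one_hour is not None and not 1 <= one_hour <= ONE_HOUR_MAX:
        problems.append(f"1-hour intervals stored must be 1 to {ONE_HOUR_MAX}")
    return problems


print(check_intervals_stored(five_min=32, fifteen_min=32, one_hour=8))  # no problems
print(check_intervals_stored(five_min=64, fifteen_min=48))              # shared pool exceeded
```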
All four measurement intervals may be included for a single session if required. Each measurement interval that is included in a session is updated simultaneously for each test that is being executed. If a measurement interval duration is not required, it should not be configured, as this consumes unnecessary resources. In addition to the four predefined lengths, a fifth measurement interval, the “raw” measurement interval, is always on and is allocated at test creation. It is a valuable tool for real-time troubleshooting because it maintains the same performance information and relates to the same bins as the fixed-length collection windows. The operator may clear the contents of the raw measurement interval to flush stale statistical data and look at current conditions. This measurement interval has no configuration options, cannot be written to flash, and cannot be disabled. It is a single never-ending collection window.
Memory allocation for the measurement intervals is performed when the test transitions from an operationally down state to an operationally up state. Any previous stored test data will be cleared from volatile memory in favor of the new allocation. This will result in the loss of all data that has not been written to the XML file or collected by some other means. Volatile memory will be flushed and completely released when the test is deleted from the configuration, or a high availability event causes the backup CPM to become the newly active CPM, or some other event clears the active CPM system memory. Following an HA event, memory reallocation occurs when the operational state of the test changes from down to up. Shutting down a test does not release the allocated memory for the test.
Measurement intervals also include a suspect flag. The suspect flag is used to indicate that data collected in the measurement interval may not be representative. The flag is set to true only under the following conditions:
The suspect flag is not set to true when there are times of service disruption, maintenance windows, discontinuity, low packet counts, or other such events. Higher level systems are required to interpret and correlate those types of events for measurement intervals that are executed during the time that relates to the specific interruption or condition. Since each measurement interval contains a start and stop time, the information is readily available to those higher level systems to discount the specific windows of time.
There are two main metrics that are the focus of OAM-PM, delay and loss. The different metrics have their own unique storage structures and will allocate their own measurement intervals for these structures. This is regardless of whether the performance data is gathered with a single packet or multiple packet types.
Delay metrics include the following:
Unidirectional and round trip results are stored for each metric.
Unidirectional frame delay and frame delay range measurements require exceptional time of day clock synchronization. If the time of day clocks do not exhibit extremely tight synchronization, unidirectional measurements will not be representative. In one direction, the measurement will be artificially increased by the difference in the clocks. In the other direction, the measurement will be artificially decreased by the same difference. This level of clocking accuracy is not available with NTP. To achieve this level of time of day clock synchronization, consideration must be given to Precision Time Protocol (PTP) 1588v2.
Round trip metrics do not require clock synchronization between peers since the four timestamps allow for accurate representation of the round trip delay. The mathematical computation removes remote processing and any difference in time of day clocking. Round trip measurements do require stable local time of day clocks.
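The effect of a clock offset on unidirectional and round trip results can be illustrated with the usual four-timestamp arithmetic. The following Python sketch uses hypothetical values in microseconds. The timestamp names t1 to t4 and the scenario constants are assumptions chosen for illustration, not values taken from the router.

```python
def round_trip_delay(t1, t2, t3, t4):
    """t1: source transmit, t2: responder receive,
    t3: responder transmit, t4: source receive.
    Subtracting (t3 - t2) removes remote processing time and any clock offset."""
    return (t4 - t1) - (t3 - t2)


def forward_delay(t1, t2):
    return t2 - t1   # mixes source and responder clocks


def backward_delay(t3, t4):
    return t4 - t3   # mixes responder and source clocks


# Hypothetical scenario: the responder clock runs 250 us ahead of the source clock.
TRUE_FORWARD, TRUE_BACKWARD, PROCESSING, CLOCK_OFFSET = 400, 600, 50, 250

t1 = 1000                                   # source clock
t2 = t1 + TRUE_FORWARD + CLOCK_OFFSET       # responder clock
t3 = t2 + PROCESSING                        # responder clock
t4 = (t3 - CLOCK_OFFSET) + TRUE_BACKWARD    # back on the source clock

print(round_trip_delay(t1, t2, t3, t4))  # 1000: matches 400 + 600, offset cancelled
print(forward_delay(t1, t2))             # 650: inflated by the 250 us offset
print(backward_delay(t3, t4))            # 350: deflated by the 250 us offset
```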
Any delay metric that is negative is treated as zero and placed in bin 0, the lowest bin, which has a lower boundary of 0 microseconds. To isolate these outlying negative results, the lower boundary of bin 1 for the frame delay type could be set to a value of 1 microsecond. Bin 0 would then only collect results that are 1 microsecond or less, which gives an indication of the number of negative results being collected.
Delay results are mapped to the measurement interval that is active when the result arrives back at the source.
Loss metrics are only unidirectional and will report Frame Loss Ratio (FLR) and availability information. Frame loss ratio is the percentage computation of loss (lost/sent). Loss measurements during periods of unavailability are not included in the FLR calculation as they are counted against the unavailability metric.
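As a minimal sketch of the FLR computation for one direction, the following Python function sums sent and received counts only over small windows that are considered available. The tuple layout and the function name are hypothetical, and returning 0.0 when nothing was sent is a choice made for this example.

```python
def frame_loss_ratio_pct(small_windows):
    """small_windows: list of (sent, received, available) tuples for one direction.
    Windows that count toward unavailability are left out of the FLR."""
    sent = sum(s for s, _, available in small_windows if available)
    received = sum(r for _, r, available in small_windows if available)
    if sent == 0:
        return 0.0  # nothing to measure; a choice made for this sketch
    return 100.0 * (sent - received) / sent


windows = [(10, 10, True), (10, 5, False), (10, 9, True)]
print(frame_loss_ratio_pct(windows))  # 5.0: 1 lost out of 20 sent while available
```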
Availability requires relating three different functions. First, the individual probes are lost or received based on sequence numbers in the protocol. A number of probes are rolled up into a small measurement window (delta-t). Frame loss ratio is computed over all the probes in a small window. If the resulting percentage is higher than the configured threshold, the small window is marked as high loss. If the resulting percentage is lower than or equal to the threshold, the small window is marked as non-high loss. A sliding window is defined as some number of small windows. The sliding window is used to determine availability and unavailability events. Switching from one state to the other requires every small window in the sliding window to be the same state and different from the current state. The maximum size of the sliding window cannot be greater than 100 seconds. The default values for these availability parameters can differ from PDU type to PDU type.
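The relationship between probes, small windows, and the sliding window can be sketched as follows. This Python example is a simplified model under stated assumptions: the function names are hypothetical, a small window with no probes sent is treated as non-high loss, and only the point at which a state change is declared is reported.

```python
from collections import deque


def small_window_is_high_loss(sent, received, threshold_pct):
    """High loss only when the FLR is strictly higher than the threshold."""
    if sent == 0:
        return False  # assumption for this sketch
    flr_pct = 100.0 * (sent - received) / sent
    return flr_pct > threshold_pct


def availability_transitions(high_loss_flags, sliding_window_size, available=True):
    """Return (small window index, new state) for every declared state change."""
    recent = deque(maxlen=sliding_window_size)
    transitions = []
    for index, high_loss in enumerate(high_loss_flags):
        recent.append(high_loss)
        if len(recent) < sliding_window_size:
            continue
        # Every small window in the sliding window must agree and must
        # differ from the current state before the state can change.
        if available and all(recent):
            available = False
            transitions.append((index, "unavailable"))
        elif not available and not any(recent):
            available = True
            transitions.append((index, "available"))
    return transitions


flags = [False] * 3 + [True] * 4 + [False] * 5
print(availability_transitions(flags, sliding_window_size=3))
# [(5, 'unavailable'), (9, 'available')]
```

A loss of exactly the threshold value (for example 5 of 10 probes with a 50% threshold) is not marked as high loss in this model, following the "higher than" wording above.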
Availability and unavailability counters are incremented based on the number of small windows that have occurred in all available and unavailable windows.
Availability and unavailability reporting is not meant to capture and report on service outages or communication failures. Communication failures of a bidirectional or unidirectional nature must be captured using some other means of connectivity verification, alarming, or continuity checking. During periods of complete or extended failure, it becomes necessary to time out individual test probes. It is not possible to determine the direction of the loss because no response packets are being received at the source. In this case, the statistics calculation engine maintains the previous state and updates the appropriate directional availability or unavailability counter. At the same time, an additional per-direction undetermined counter is updated. This undetermined counter indicates that the availability or unavailability statistics were indeterminable for a number of small windows.
During connectivity outages the higher level systems could be used to discount the loss measurement interval which covers the same span as the outage.
Availability and unavailability computations may delay the completion of a measurement interval. The declaration of a state change or the delay to closing a measurement interval could be equal to the length of the sliding window and the timeout of the last packet. A measurement interval cannot be closed until the sliding window has determined availability or unavailability. If the availability state is changing and the determination is crossing two measurement intervals, the measurement interval will not complete until the declaration has occurred. Typically, standards bodies indicate the timeout value per packet. For Ethernet, the timeout value for DMMv1, LMM, and SLM is set at 5s and is not configurable.
There are no log events based on availability or unavailability state changes. Based on the subjective nature of these counters, considering complete failure or total loss when it may not be possible to determine availability or unavailability, these counters represent the raw values that must be interpreted.
During times of availability, there can be high loss intervals (HLI) or consecutive high loss intervals (CHLI). These are indicators that the service was available, but individual small windows or consecutive small windows experienced frame loss ratios exceeding the configured acceptable limit. An HLI is any single small window that exceeds the configured frame loss ratio. This could equate to a severely errored second, assuming the small window is one second in length. A CHLI is a run of consecutive high loss intervals that exceeds a consecutive threshold within the sliding window. Only one CHLI is counted within a window. By default, HLI and CHLI counters are only incremented during periods of availability. These counters are not incremented during periods of unavailability.
The optional hli-force-count command can be used to modify the HLI counting behavior. When included as part of the loss parameters, counting of HLI and, by extension, CHLI, will continue during times of unavailability or undetermined unavailability. This optional configuration parameter does not influence how the availability states are determined or counted.
Both ETH-SLM and ETH-LMM provide methods for reporting loss. ETH-SLM uses the synthetic packets on the wire. ETH-LMM monitors the amount of service data. ETH-LMM frame loss counting is significantly different from ETH-SLM synthetic packet counting. The PDUs provide the largest variance.
The SLM PDU includes a sequence number that allows each message to be mapped to the appropriate response. The LMM PDU is fixed, without support for optional TLVs. ETH-LMM has no method of correlating message and response. This means the LMR could represent a sample window equating to more than one LMM. SLM produces a known and constant load on the network, whereas service data frames, counted by LMM, will vary or possibly be null. These two facts require slightly different approaches to availability and reliability.
LMM requires the availability command to not be shut down in order to enable the collection and computation of availability. When availability is not enabled, the existing behavior for frame loss ratio min/max/avg remains unchanged. Determining if any new min/max has been encountered and computing the avg is based on every individual LMR. When LMM availability is enabled, the determination of the new min/max is based on the delta-t window. The longer sample size reduces the impact of gain and loss that may be introduced by reordering. No new FLR min/max values will be considered during periods of determined unavailability, regardless of the configuration for the availability option. The average FLR computation depends on the configuration of the availability metric. If availability is enabled, FLR is based on the sum of all LMR calculations received during periods of availability. If availability is not enabled, FLR is based on the LMR received prior to the closing of the measurement interval.
LMM will continue to count and report transmit and receive delta frames collected by the collect-lmm-stats command for the SAP, regardless of availability state. LMM will continue to display data frames counted during periods of unavailability; however, these frames will not count towards the average FLR of the measurement interval. In contrast, SLM does not count the synthetic packets on the wire during periods of unavailability. The raw transmit and receive information gathered by LMM has significant value regardless of whether the state is available or unavailable.
LMM includes a new counter, undetermined-delta-t, for the forward and backward directions. This counter counts the number of delta-t windows that have no LMR responses recorded in that window of time, and provides an indication of the quality and scope of the delta-ts.
Figure 57 looks at loss in a single direction using synthetic packets. It demonstrates what happens when a possible unavailability event crosses a measurement interval boundary. As shown, the first 13 small windows are all marked available (1). This means that the lost probes that fit into each of those small windows did not equal or exceed a frame loss ratio of 50%. The next 11 small windows are marked as unavailable. This means that the lost probes that fit into each of those small windows were equal to or above a frame loss ratio of 50%. After the 10th consecutive small window of unavailability, the state transitions from available to unavailable. The 25th small window is the start of the new available state which is declared following the 10th consecutive available small window. Notice that the frame loss ratio (FLR) is 00.00%. This is because all the small windows that are marked as unavailable are counted towards unavailability and as such are excluded from impacting the FLR. If there were any small windows of unavailability that were outside an unavailability event, they would be marked as HLI or CHLI and be counted as part of the FLR.
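The sequence described for the figure can be reproduced with a short Python sketch. It assumes, based on the description above, that once a state change is declared, the small windows that led to the declaration are attributed to the new state. The function name, the number of probes per small window, and the number of lost probes are hypothetical.

```python
def classify_small_windows(high_loss_flags, sliding_window_size=10, available=True):
    """Return one availability state (True = available) per small window."""
    n = sliding_window_size
    states = []
    for i in range(len(high_loss_flags)):
        states.append(available)          # provisional: keep the current state
        recent = high_loss_flags[max(0, i - n + 1): i + 1]
        if len(recent) < n:
            continue
        if available and all(recent):
            available = False
        elif not available and not any(recent):
            available = True
        else:
            continue
        # A change was declared: attribute the deciding windows to the new state.
        states[i - n + 1: i + 1] = [available] * n
    return states


# 13 low-loss windows, 11 high-loss windows, then low-loss windows again.
flags = [False] * 13 + [True] * 11 + [False] * 12
states = classify_small_windows(flags)

PROBES_PER_WINDOW = 10     # hypothetical
LOST_WHEN_HIGH_LOSS = 6    # hypothetical
sent = sum(PROBES_PER_WINDOW for s in states if s)
lost = sum(LOST_WHEN_HIGH_LOSS for s, hl in zip(states, flags) if s and hl)
flr_pct = 100.0 * lost / sent

print(states.count(True), states.count(False), f"{flr_pct:.2f}%")  # 25 11 0.00%
```

All 11 high-loss windows end up in the unavailable count, so none of them contributes to the FLR, and the 25th small window (index 24) is the first window of the new available state.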
Bin groups are templates that are referenced by the session. Three types of binnable statistics are available:
Each of these metrics can have up to 10 bins configured to group the results. Bins are configured by indicating a lower boundary. Bin 0 has a lower boundary that is always zero and not configurable. The microsecond range of a bin is the difference between the adjacent lower boundaries. For example, bin-type fd bin 1 configured with a lower-bound of 1000 microseconds means bin 0 will capture all frame delay results between 0 and 1 ms. Bin 1 will capture all results above 1 ms and below the bin 2 lower boundary. The last configured bin collects all the results at and above its lower boundary. Not all ten bins must be configured.
A bin group configuration may relegate the first and last bin to capture anomalous measurements. Anomalous measurements can result from legacy equipment that queues a number of packets prior to circuit establishment. The relegation model characterizes these bins as non-representative of real network delay measurements. Results in these bins should be omitted from the average (avg) calculation. To accommodate these models, the command exclude-from-avg is available under the config>oam-pm>bin-group>bin-type hierarchy. This excludes results from the rolling average calculation that map to the excluded bins. The bins statistics will still accumulate all of the results, but the results are not part of the average computation. Every configured bin is included in the average calculation by default.
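The bin mapping and the effect of excluding bins from the average can be sketched in Python. This is an illustration under stated assumptions: a result equal to a lower boundary is placed in the bin that starts at that boundary, the average is computed over a complete list of results instead of as a rolling value, and all names and boundary values are hypothetical.

```python
from bisect import bisect_right


def bin_index(delay_us, lower_bounds_us):
    """lower_bounds_us is sorted ascending and starts with 0 (bin 0)."""
    delay_us = max(delay_us, 0)  # negative results are treated as zero
    return bisect_right(lower_bounds_us, delay_us) - 1


def summarize(delays_us, lower_bounds_us, excluded_bins=frozenset()):
    """Return (per-bin counts, average over results in non-excluded bins)."""
    counts = [0] * len(lower_bounds_us)
    total = 0
    included = 0
    for delay in delays_us:
        index = bin_index(delay, lower_bounds_us)
        counts[index] += 1            # every result is still counted in its bin
        if index not in excluded_bins:
            total += max(delay, 0)
            included += 1
    average = total / included if included else None
    return counts, average


bounds = [0, 100, 10000]              # hypothetical lower boundaries in microseconds
delays = [-20, 5, 200, 400, 50000]

print(summarize(delays, bounds))                        # ([2, 2, 1], 10121.0)
print(summarize(delays, bounds, excluded_bins={0, 2}))  # ([2, 2, 1], 300.0)
```

With the first and last bins excluded, the anomalous results still appear in the bin counts but no longer distort the average.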
Each binnable delay metric type requires its own values for the bin groups. Each bin in a type is configurable for one value. It is not possible to configure a bin with different values for round trip, forward, and backward. Consideration must be given to the configuration of the boundaries that represent the important statistics for that specific service or the values that meet the desired goals.
This is not a dynamic environment. If a bin group is referenced by any active delay-capable test, the bin group cannot be shut down. To modify the bin group, the delay-capable tests must be shut down. To change the setting of a bin group where a large number of sessions reference a bin group, consider migrating existing sessions to a new bin group with the new parameters to reduce the maintenance window.
Bin group 1 is the default bin group. Every session requires a bin group to be assigned. By default, bin group 1 is assigned to every OAM-PM session that does not have a bin group explicitly configured. Bin group 1 cannot be modified. Any bin lower-bound value that aligns with the 5000 μs (5 ms) default value (bin number × 5000 μs) will not be displayed as part of the output of the info command within the configuration. The info command does not display default values. The info detail command is required to show the default values. The bin group 1 configuration parameters are shown below.
Service Level Agreements (SLAs) typically require performance data to be collected over measurement intervals of 5 minutes or longer. Network optimization tools require average performance values to be computed over shorter periods of time. The OAM PM streaming function takes advantage of the OAM PM architecture and test definitions to provide the basis for streaming short-window average results.
The delay-template configuration allows for the definition of the common parameters: the metric type, the direction, the length of the sample window, and an integrity value. Once the template is defined, it can be assigned to a test of a supporting technology for collection, calculation, and reporting. The results are sent to subscribers using on-change update notifications.
The streaming function is supported for:
The updates are only sent if a subscription is registered for the on-change values. The keys are common per individual test: the session [session-name], the metric [metric-id], and the newest-closed [direction]. The following values are sent for each completed sample window: close-time (UTC), sample-count (the number of samples used to compute the delay), suspect (no if equal to or higher than the window-integrity percentage, yes if lower), and delay (the value computed over the window, in microseconds).
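The values reported for a completed sample window can be modeled with a short Python sketch. It assumes that the window-integrity percentage is compared against the ratio of samples collected to samples expected in the window, and that the delay value is the average of the samples. The class, function, and parameter names are hypothetical and do not represent the streaming schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class NewestClosed:
    close_time: datetime        # UTC
    sample_count: int           # samples used to compute the delay
    suspect: bool               # True when integrity is below the configured value
    delay_us: Optional[float]   # average over the window, in microseconds


def close_sample_window(samples_us, expected_samples, window_integrity_pct,
                        close_time=None):
    count = len(samples_us)
    # Integer comparison avoids rounding issues: count/expected < integrity/100.
    suspect = count * 100 < window_integrity_pct * expected_samples
    delay = sum(samples_us) / count if count else None
    return NewestClosed(close_time or datetime.now(timezone.utc),
                        count, suspect, delay)


result = close_sample_window([100, 200, 300], expected_samples=4,
                             window_integrity_pct=50)
print(result.sample_count, result.suspect, result.delay_us)  # 3 False 200.0
```

Because no history is kept, a consumer of these values would need to store each result as it arrives.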
There is no history maintained on the node for this information. The single statistic is overwritten every time a sample window closes for a test configured to use a delay template. It is up to the higher-level systems to store and use this data.
Figure 58 brings together all the concepts discussed in the OAM-PM architecture. It shows a more detailed hierarchy than previously shown in the introduction including the relationship between the tests, the measurement intervals, and the storage of the results.
Figure 58 is a logical representation and not meant to represent the exact flow between elements in the architecture. For example, the line connecting the Acct-Policy and the Intervals Stored & Collected is not intended to show the accounting policy being responsible for the movement of data from completed records to Be Collected to Collected.
The following configuration includes information specific to the TWAMP Light session controller. It does not include the IP network configuration.
The following is an example of a service configuration launch for a VPRN service.
The following displays a TWAMP light reflector configuration for a VPRN service.
The following provides an example of configurations that are comprised of the different MPLS OAM PM elements for the various Label Switched Paths (LSPs). This example only includes configuration on the querier, excluding the basic MPLS and IP configurations. The equivalent MPLS configuration must be completed on all responders. Enabling MPLS DM is required on all queriers and responders.
The following describes the accounting policy configuration.
The following configuration enables MPLS DM.
The following shows the RSVP LSP configuration.
The following shows the RSVP-Auto LSP configuration.
The following shows the MPLS-TP LSP configuration.
The following shows the MPLS OAM-PM configuration.
The following configurations provide examples of different Ethernet OAM PM elements using ETH-CFM tools.
The RAW measurement interval can also use the monitor command to automatically update the statistics.
The following configuration and show commands provide an example of how frame loss measurement (ETH-LMM) can be used to collect frame loss metrics and the statistics gathered.
The LMM reflector must be configured to collect the statistics on the SAP or MPLS SDP binding where the terminating MEP has been configured.
The launch point must also enable statistical collection on the SAP or MPLS SDP binding of the MEP launch point.
The launch point must configure the OAM-PM session parameters. The CLI below shows a session configured with DMM for delay measurements (1s intervals) and LMM for frame loss measurements (10s interval). When using LMM for frame loss, the frame loss ratio and the raw frame transmit and receive statistics are captured, along with basic measurement interval and protocol information.
The previous section described the OAM-PM architecture, which provides a powerful and well-defined mechanism to collect key performance information. This data is typically uploaded to higher level systems for consolidation, reporting, and tracking of performance trends and conformance to Service Level Agreements (SLAs). Event monitoring (event-mon) allows thresholds to be applied to the well-defined counters, percentages, and binned results for a single measurement interval per session. This Threshold Crossing Alert (TCA) function can be used to raise a log event when a configured threshold is reached. Optionally, the TCA can be cleared if a clear threshold is not breached in a subsequent measurement interval.
Thresholds can be applied to binned delay metrics and the various loss metric counters or percentages. The type of the TCA is based on the configuration of the two threshold values, threshold raise-threshold and clear clear-threshold. The on network element TCA functions are provided to log an event that is considered an exception condition that requires immediate attention. A single threshold can be applied to the collected metric.
Stateless TCAs are those events that do not include a configured clear-threshold. Stateless TCAs will raise the event when the raise-threshold is reached but do not share state with any following measurement intervals. Each subsequent measurement interval is treated as a unique entity without previous knowledge of any alerts raised. Each measurement interval will consider only its data collection and raise all TCAs as the thresholds are reached. A stateless event raised in one measurement interval silently expires at the end of that measurement interval without an explicit clear event.
Stateful TCAs require the configuration of the optional clear clear-threshold. Stateful TCAs raise the event when the raise-threshold is reached and carry that state forward to subsequent measurement intervals. That state is maintained, and no further raise events are generated for that monitored event, until a subsequent measurement interval completes and the value specified by the clear-threshold is not exceeded. When a subsequent measurement interval completes and the clear-threshold is not exceeded, an explicit clear log event is generated. A clear threshold of zero is supported, which means that the event being cleared must have no errors at the completion of the measurement interval to clear a previous raise event. At this point, the event is considered cleared and a new raise is possible the next time the threshold raise-threshold is reached.
The raise threshold must be higher than the clear threshold. The only time both can be equal is if they are disabled.
Alerts can only be raised and cleared once per measurement interval per threshold. Once a raise is issued, no further monitoring for that event occurs in that measurement interval. A clear is only logged at the end of a subsequent measurement interval following a raise, and only for stateful event monitoring.
Changing the threshold values or the events to monitor for the measurement interval does not require the individual tests within the session or the related resource (bin-group) to be shut down. Starting the monitoring process, adding a new event to monitor, or altering a threshold stops the existing function that has changed, with the new parameters activated at the start of the next measurement interval. Stopping the monitoring or removing an event maintains the current state until the completion of the adjacent measurement interval, after which any existing state is cleared.
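The stateless and stateful behaviors described above can be sketched as a small simulation. This is an illustrative model only, not the router implementation: the function name evaluate_tca and the per-interval totals are hypothetical, a raise is assumed to occur when the interval total is equal to or higher than the raise threshold, and a clear is assumed to occur when a later interval total does not exceed the clear threshold (which is what allows a clear threshold of zero to work).

```python
from typing import List, Optional, Tuple


def evaluate_tca(values: List[int], raise_threshold: int,
                 clear_threshold: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return (interval_index, "raise" | "clear") events for per-interval totals.

    clear_threshold=None models a stateless TCA; any integer models a stateful one.
    """
    if clear_threshold is not None and raise_threshold <= clear_threshold:
        raise ValueError("raise threshold must be higher than the clear threshold")

    events: List[Tuple[int, str]] = []
    raised = False  # state carried between intervals (stateful TCAs only)
    for index, value in enumerate(values):
        if clear_threshold is None:
            # Stateless: each interval is judged only on its own data.
            if value >= raise_threshold:
                events.append((index, "raise"))
            continue
        if not raised:
            if value >= raise_threshold:
                events.append((index, "raise"))  # at most one raise per interval
                raised = True
        elif value <= clear_threshold:
            # Checked at the end of a subsequent interval, after a raise.
            events.append((index, "clear"))
            raised = False
    return events
```

With interval totals of 5, 12, 15, 3, 0, 11 and a raise threshold of 10, the stateless form raises in every interval that reaches 10, while the stateful form with a clear threshold of 0 raises once, stays silent until an interval finishes with no errors, logs a clear, and only then can raise again.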
OAM-PM sessions may have multiple measurement intervals. Event monitoring can only be configured against a single configured measurement interval per session.
Delay event thresholds can be applied to Frame Delay (FD), InterFrame Delay Variation (IFDV), and Frame Delay Range (FDR). These are binned delay metrics with directionality: forward, backward, and round-trip. Event thresholds for these metrics are configured within config>oam-pm>bin-group bin-group-number and applied to a specific bin-type. The delay-event specifies the direction to be measured {forward | backward | round-trip}, the thresholds, and the lowest bin number. The lowest bin value applies the threshold to the cumulative results in that bin and all higher bins. The default bin group (bin-group 1) cannot be modified and therefore does not support the configuration of event thresholds. A session that makes use of a bin group inherits that bin group's attributes, including delay event threshold settings.
If the operator subscribes to a model that relegates one or more of the highest bins to anomalous results, these results should not be included in the TCA count. The delay-event-exclusion command is available under the config>oam-pm>bin-group>bin-type hierarchy. This command will exclude any results in the specified bin, along with the results in any bin higher than the one specified, from the TCA count. In order to use this command, a delay-event in the same direction for the same bin type must be configured. The excluded bins must be higher than the TCA threshold configured using the threshold raise-threshold command. This command is similar to the delay-event command. This does not require the bin group to be shut down. On-the-fly changes will cause the delay event to be suspended until the next measurement interval for the affected bin type.
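As a rough illustration of how the lowest bin and the exclusion bin bound the count that is compared against the raise threshold, consider the following sketch. The function name delay_event_count and the bin values are hypothetical, bins are assumed to be numbered from 0, and the check that the exclusion bin sits above the lowest bin is an assumption made for this example.

```python
from typing import Optional, Sequence


def delay_event_count(bin_counts: Sequence[int], lowest_bin: int,
                      exclusion_bin: Optional[int] = None) -> int:
    """Sum the results in lowest_bin and all higher bins, minus excluded bins.

    exclusion_bin, when given, removes that bin and every bin above it.
    """
    if not 0 <= lowest_bin < len(bin_counts):
        raise ValueError("lowest_bin is outside the configured bins")
    upper = len(bin_counts)
    if exclusion_bin is not None:
        if not lowest_bin < exclusion_bin < len(bin_counts):
            raise ValueError("exclusion_bin must be a configured bin above lowest_bin")
        upper = exclusion_bin
    return sum(bin_counts[lowest_bin:upper])
```

For five bins holding 900, 80, 15, 4, and 1 results, a lowest bin of 2 counts 15 + 4 + 1 results toward the TCA; adding an exclusion at bin 4 drops the single anomalous result in the highest bin from that count.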
Ethernet supports gathering delay information using the ETH-DMM protocol. IP supports the gathering of delay information using the TWAMP Light function.
Loss events and thresholds are configured within the session under the specific loss-based protocol. Loss event thresholds can be applied to the average Frame Loss Ratio (FLR) in the forward and backward direction. This event is analyzed at the end of the measurement interval to see if the computed FLR is equal to or higher than the configured threshold as a percentage. The availability and reliability loss events may be configured against the counts in the forward and backward direction as well as the aggregate (sum of both directions). The aggregate is only computed for thresholds and is not stored as an independent value in the standard OAM-PM loss dataset. The availability and reliability loss events include the high loss interval (HLI), consecutive HLI (CHLI), unavailability, undetermined availability, and undetermined unavailability.
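The end-of-interval FLR comparison and the aggregate count can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, FLR is computed as a single ratio of lost to transmitted frames for the interval rather than the averaged value the router maintains, and the counters are illustrative.

```python
def frame_loss_ratio(tx_frames: int, rx_frames: int) -> float:
    """Return the frame loss ratio for one direction as a percentage."""
    if tx_frames <= 0:
        return 0.0  # nothing transmitted, so no loss can be computed
    lost = max(tx_frames - rx_frames, 0)
    return 100.0 * lost / tx_frames


def flr_threshold_reached(tx_frames: int, rx_frames: int,
                          threshold_percent: float) -> bool:
    """True when the computed FLR is equal to or higher than the threshold."""
    return frame_loss_ratio(tx_frames, rx_frames) >= threshold_percent


def aggregate_count(forward: int, backward: int) -> int:
    """Aggregate used only for threshold checks: the sum of both directions."""
    return forward + backward
```

For example, 1000 frames sent and 990 received gives an FLR of 1 percent, which reaches a 1 percent threshold; 995 received gives 0.5 percent, which does not.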
Ethernet supports the gathering of loss information using ETH-SLM and ETH-LMM. IP supports the gathering of loss information using TWAMP Light functionality. ETH-SLM, ETH-LMM, and TWAMP Light support threshold configuration for FLR and the availability and reliability loss events.
Configuring the event thresholds and their behavior, stateless or stateful, completes the first part of the requirement. The event monitoring function must also be enabled per major function, delay or loss. This is configured under the measurement interval that is used to track events. Only one measurement interval per session can be configured to track events. If event tracking of either type, delay or loss, is configured against a measurement interval within the session, no other measurement interval can be used to track events. For example, if the measurement interval 15-min for oam-pm session eth-pm-session has delay-events active, no other measurement interval within that session can be used to track delay or loss-events.
When a raise threshold is reached, a log event warning is generated from the OAM application using the number 2300. If the event is stateful (clear clear-threshold configured), an explicit clear is logged when a subsequent measurement interval does not exceed the clear threshold. The clear event is also a warning message from the OAM protocol but uses number 2301.
The session name is included as part of the subject.
A more detailed message is included immediately following the subject. This includes:
Only those events deemed important should be configured and activated per session.
A simple Ethernet session example is provided to show the basic configuration and monitoring of threshold event monitoring.
The bin group is configured for the required thresholds.
The OAM-PM session contains all the session attributes, test attributes and the loss event thresholds and the configuration of the event monitoring functions.
Bidirectional Forwarding Detection (BFD) provides efficient, short-duration detection of failures in the path between two systems. If a system stops receiving BFD messages for a long enough period (based on configuration), it is assumed that a failure along the path has occurred and the associated protocol or service is notified of the failure.
BFD can provide a mechanism used for failure detection over any media, at any protocol layer, with a wide range of detection times and overhead, to avoid a proliferation of different methods.
SR OS supports asynchronous and on-demand modes of BFD in which BFD messages are sent to test the path between systems.
If multiple protocols are running between the same two BFD endpoints, only a single BFD session is established, and all associated protocols will share the single BFD session.
As well as the typical asynchronous mode, there is also an echo function defined within RFC 5880, Bidirectional Forwarding Detection, that allows either of the two systems to send a sequence of BFD echo packets to the other system, which loops them back within that system’s forwarding plane. If a number of these echo packets are lost, the BFD session is declared down.
The base BFD specification does not specify the encapsulation type to be used for sending BFD control packets. Instead, use the appropriate encapsulation type for the medium and network. The encapsulation for BFD over IPv4 and IPv6 networks is specified in RFC 5881, Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop), and RFC 5883, Bidirectional Forwarding Detection (BFD) for Multihop Paths. These specifications require that BFD control packets be sent over UDP with a destination port number of 3784 (single hop) or 4784 (multihop paths) and a source port number within the range 49152 to 65535.
In addition, all transmitted BFD packets must have an IP TTL of 255. All BFD packets received must have an IP TTL of 255 if authentication is not enabled. If authentication is enabled, the IP TTL should be 255, but the packet can still be processed if it is not (assuming the packet passes the enabled authentication mechanism).
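The port and TTL rules above can be expressed as a simple acceptance check. The sketch below mirrors the text as written and is not taken from any router code: the function name accept_bfd_packet is hypothetical, and the authenticated flag is assumed to mean that authentication is enabled and the packet has already passed it.

```python
SINGLE_HOP_PORT = 3784
MULTI_HOP_PORT = 4784
SOURCE_PORT_RANGE = range(49152, 65536)  # 49152 to 65535 inclusive


def accept_bfd_packet(dst_port: int, src_port: int, ttl: int,
                      authenticated: bool = False) -> bool:
    """Apply the UDP port and IP TTL rules to a received BFD control packet."""
    if dst_port not in (SINGLE_HOP_PORT, MULTI_HOP_PORT):
        return False
    if src_port not in SOURCE_PORT_RANGE:
        return False
    if ttl != 255 and not authenticated:
        # Without authentication, only a TTL of 255 is acceptable.
        return False
    return True
```

A packet to port 4784 with a TTL of 254 is rejected in this model unless it has passed authentication, in which case it can still be processed.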
If multiple BFD sessions exist between two nodes, the BFD discriminator is used to de-multiplex the BFD control packet to the appropriate BFD session.
The BFD control packet has two sections: a mandatory section and an optional authentication section.
Field | Description |
Vers | The version number of the protocol. The initial protocol version is 0. |
Diag | A diagnostic code specifying the local system’s reason for the last transition of the session from Up to some other state. Possible values are: 0 - No diagnostic; 1 - Control detection time expired; 2 - Echo function failed; 3 - Neighbor signaled session down; 4 - Forwarding plane reset; 5 - Path down; 6 - Concatenated path down; 7 - Administratively down |
D Bit | The demand mode bit. (Not supported) |
P Bit | The poll bit. If set, the transmitting system is requesting verification of connectivity, or of a parameter change. |
F Bit | The final bit. If set, the transmitting system is responding to a received BFD control packet that had the poll (P) bit set. |
Rsvd | Reserved bits. These bits must be zero on transmit and ignored on receipt. |
Length | Length of the BFD control packet, in bytes. |
My Discriminator | A unique, non-zero discriminator value generated by the transmitting system, used to demultiplex multiple BFD sessions between the same pair of systems. |
Your Discriminator | The discriminator received from the corresponding remote system. This field reflects back the received value of my discriminator, or is zero if that value is unknown. |
Desired Min TX Interval | This is the minimum interval, in microseconds, that the local system would like to use when transmitting BFD control packets. |
Required Min RX Interval | This is the minimum interval, in microseconds, between received BFD control packets that this system is capable of supporting. |
Required Min Echo RX Interval | This is the minimum interval, in microseconds, between received BFD echo packets that this system is capable of supporting. If this value is zero, the transmitting system does not support the receipt of BFD echo packets. |
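To make the field layout concrete, the following sketch decodes the 24-byte mandatory section of a BFD control packet. The bit positions follow the packet format in RFC 5880; the Sta, C, A, and M bits and the Detect Mult field come from that RFC rather than from the table above. The function name parse_bfd_control is hypothetical.

```python
import struct
from typing import Dict

# Four single-byte fields followed by five 32-bit fields, network byte order.
_MANDATORY = struct.Struct("!BBBBIIIII")  # 24 bytes


def parse_bfd_control(packet: bytes) -> Dict[str, int]:
    """Decode the mandatory section of a BFD control packet (RFC 5880 layout)."""
    if len(packet) < _MANDATORY.size:
        raise ValueError("packet shorter than the 24-byte mandatory section")
    (b0, b1, detect_mult, length, my_disc, your_disc,
     min_tx, min_rx, min_echo_rx) = _MANDATORY.unpack_from(packet)
    return {
        "vers": b0 >> 5,           # top 3 bits of the first byte
        "diag": b0 & 0x1F,         # low 5 bits of the first byte
        "state": b1 >> 6,          # Sta, 2 bits
        "poll": (b1 >> 5) & 1,     # P bit
        "final": (b1 >> 4) & 1,    # F bit
        "demand": (b1 >> 1) & 1,   # D bit
        "detect_mult": detect_mult,
        "length": length,          # in bytes
        "my_discriminator": my_disc,
        "your_discriminator": your_disc,
        "desired_min_tx_us": min_tx,
        "required_min_rx_us": min_rx,
        "required_min_echo_rx_us": min_echo_rx,
    }
```

A receiver would typically use your_discriminator from the decoded result to select the matching local session when several sessions exist between the same pair of systems.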
Echo support means that the router implements the reflecting side of the BFD echo function: it loops back received BFD echo messages to the original sender based on the destination IP address in the packet.
The echo function is useful when the local router does not have sufficient CPU power to handle a periodic polling rate at a high frequency. Instead, the echo sender sends a high rate of BFD echo messages through the receiver node, where they are processed only by the receiver’s forwarding path. This allows the echo sender to send BFD echo packets at any rate.
SR OS does not support the sending of echo requests, only the response to echo requests.
The following applications of centralized BFD require BFD to run on the SF/CPM.
One application for a central BFD implementation is so BFD can be supported over spoke SDPs used to inter-connect IES or VPRN interfaces. When there are spoke SDPs for inter-connections over an MPLS network between two routers, BFD is used to speed up failure detections between nodes so re-convergence of unicast and multicast routing information can begin as quickly as possible.
The MPLS LSP associated with the spoke SDP can enter or egress from multiple interfaces on the router. BFD for these types of interfaces cannot exist on the IOM/XCM by itself.
A second application for a central BFD implementation is so BFD can be supported over a LAG or VSM interface. This is useful where BFD is not used for link failure detection, but for node failure detection. In this application, the BFD session can run between the IP interfaces associated with the LAG or VSM interface, but there is only one session between the two nodes. There is no requirement for the message flow to cross a certain link, or VSM, to get to the remote node.
BFD sessions can be associated with an unnumbered IPv4 interface to monitor the liveness of the connection for IP and MPLS routing protocols, when routing protocol adjacencies and static routes are configured to utilize this function. When a BFD session is associated with an unnumbered interface as the local anchor point, the BFD parameters are taken from the BFD configuration under the unnumbered interface context. If the BFD parameters are not configured within the unnumbered interface context, then BFD sessions are not attempted. All BFD sessions associated with an unnumbered interface are automatically run on the FP complex associated with the CPM.
BFD is supported over MPLS-TP, RSVP, and LDP LSPs, as well as over pseudowires that support Layer 2 services such as Epipe VPLS spoke-SDPs and mesh-SDPs using centralized BFD. Refer to the 7450 ESS, 7750 SR, 7950 XRS, and VSR MPLS Guide and 7450 ESS, 7750 SR, 7950 XRS, and VSR Layer 2 Services and EVPN Guide: VLL, VPLS, PBB, and EVPN for more information.
For more information, refer to Seamless Bidirectional Forwarding Detection.
Seamless BFD, RFC 7880, Seamless Bidirectional Forwarding Detection (S-BFD), is a form of BFD that avoids the negotiation and state establishment for the BFD sessions. This is done primarily by pre-determining the session discriminator and then using other mechanisms to distribute the discriminators to a remote network entity. This allows client applications or protocols to more quickly initiate and perform connectivity tests. Furthermore, a per-session state is maintained only at the head end of a session. The tail end simply reflects BFD control packets back to the head end.
A seamless BFD session is established between an initiator and a reflector. There is only one instance of a reflector per SR OS router. A discriminator is assigned to the reflector. Each of the initiators on a router is also assigned a discriminator.
Seamless BFD sessions are created on the request of a client application, for example, MPLS. This user guide describes the base S-BFD configuration required on initiator and reflector routers. Application-specific configuration is required to create S-BFD sessions.
The S-BFD reflector is configured by using the following CLI:
The discriminator value must be allocated from the S-BFD reflector pool, 524288 - 526335. When the router receives an S-BFD packet from the initiator, with the local router's S-BFD discriminator as the “YourDiscriminator” field, then the local node sends the S-BFD packet back to the initiator via a routed path. The state field in the reflected packet is populated with either the Up or AdminDown state value based on the local-state configuration.
Note: Only a single reflector discriminator per node is supported, and the reflector cannot be enabled (no shutdown) unless a discriminator is configured. |
Seamless BFD control packets are discarded when the reflector is not configured, is shutdown, or the “YourDiscriminator” field does not match the discriminator of the reflector. Both IPv4 and IPv6 are supported, but in the case of IPv6, the reflector can only reflect BFD control packets with a global unicast destination address.
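The reflector rules above (pool range, single discriminator, discard conditions, and reflected state) can be modeled in a few lines. This is a sketch under stated assumptions, not the SR OS implementation: the Reflector class and its method names are hypothetical, and the IPv6 global unicast restriction is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

REFLECTOR_POOL = range(524288, 526336)  # 524288 to 526335 inclusive


@dataclass
class Reflector:
    discriminator: Optional[int] = None
    enabled: bool = False          # False models "shutdown"
    local_state: str = "Up"        # "Up" or "AdminDown"

    def enable(self) -> None:
        """Model "no shutdown": requires a discriminator from the reflector pool."""
        if self.discriminator is None:
            raise ValueError("a discriminator must be configured first")
        if self.discriminator not in REFLECTOR_POOL:
            raise ValueError("discriminator is outside the S-BFD reflector pool")
        self.enabled = True

    def reflect(self, your_discriminator: int) -> Optional[str]:
        """Return the state placed in the reflected packet, or None if discarded."""
        if not self.enabled or self.discriminator is None:
            return None
        if your_discriminator != self.discriminator:
            return None
        return self.local_state
```

In this model a packet whose YourDiscriminator field does not match the configured value is dropped silently, and a matching packet is answered with whichever state the local-state setting selects.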
Before an application can request the establishment of an S-BFD session, a mapping table of remote discriminators to peer far-end IP addresses must exist. The mapping can be accomplished in two ways:
See Static S-BFD Discriminator Configuration and Automated S-BFD Discriminator Distribution for more information about mapping remote discriminators to IP-addresses and to originated router-id.
To statically map a Seamless BFD remote IP address with its discriminator, use the following CLI commands:
The S-BFD initiator immediately starts sending S-BFD packets if the discriminator value of the far-end reflector is known; no session setup is required.
With S-BFD sessions, there is no INIT state. The initiator state changes from AdminDown to Up when it begins to send (initiate) S-BFD packets.
The S-BFD initiator sends the BFD packet to the reflector using the following fields:
If the initiator receives a valid response from the reflector with an Up state, the initiator declares the S-BFD session state as Up.
If the initiator fails to receive a certain number of responses, as determined by the BFD multiplier in the BFD template for the session, the initiator declares the S-BFD session state as Failed.
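The failure rule can be illustrated with a short sketch. It assumes that the "certain number of responses" means that many consecutive missed responses, equal to the BFD multiplier; the actual detection is time-based, so this is a simplification. The function name first_failure_index is hypothetical.

```python
from typing import Optional, Sequence


def first_failure_index(responses: Sequence[bool], multiplier: int) -> Optional[int]:
    """Return the index at which the session would be declared Failed.

    responses[i] is True when the reflector answered the i-th packet.
    Returns None if the miss count never reaches the multiplier.
    """
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    missed = 0
    for index, received in enumerate(responses):
        missed = 0 if received else missed + 1  # any response resets the count
        if missed >= multiplier:
            return index
    return None
```

With a multiplier of 3, two missed responses followed by a valid one do not fail the session; only three misses in a row do.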
If any of the discriminators change, the session fails and the router attempts to restart with the new values. If the reflector discriminator changes at the far-end peer, the session fails, but the mapping may not have been updated locally before the system checks for a new reflector discriminator from the local mapping table. The session is bounced, bringing it up with the new values. If any discriminator is deleted, the corresponding S-BFD sessions are deleted.
It is possible to automatically map an S-BFD remote IP address with its discriminator using IGP routing protocol extensions. The required protocol extensions are introduced by RFC 7883 for IS-IS and RFC 7884 for OSPF. These extensions provide the encodings to advertise the S-BFD discriminators as opaque information within the advertised IGP link state information. BGP-LS added extensions allow the export of IS-IS and OSPF S-BFD discriminator information using encodings defined in draft-ietf-idr-bgp-ls-sbfd-extensions-01.
Two preconditions must apply before automated mapping of S-BFD discriminators is enabled:
Note: traffic-engineering is not supported in VPRN or for OSPFv3. |
The following is an example of an OSPF configuration output:
This section provides sample output of the traceroute OAM tool when the ICMP tunneling feature is enabled in a few common applications.
The ICMP tunneling feature is described in Tunneling of ICMP Reply Packets over MPLS LSP. It provides support for appending the MPLS Label Stack object defined in RFC 4950 to ICMP replies of type Time Exceeded. The MPLS Label Stack object permits an LSR to include label stack information, including label value, EXP, and TTL field values, from the encapsulation header of the packet that expired at the LSR node.
The hashing visibility tool allows operators to define a test packet and then inject that packet into a specified ingress port. The result of the test displays the egress port, routing context, egress interface name, and the IP next hop used to forward the packet.
There are three major steps when running this test:
The steps to configure the header templates are:
The steps to configure parameter overrides and header sequences are:
Execute the test with the oam find-egress packet packet-id ingress-port port-id command. This injects the specified test frame or packet at the specified port and reports the result.
The ETH-CFM architecture defines the hierarchy that supports separation of Ethernet CFM OAM domains of responsibility. Typically, encapsulation methods are used to tunnel traffic transparently through intermediate segments. Using a network topology as shown in Figure 62, CE traffic arrives at the aggregation node and is encapsulated with a service-provider tag, which hides the customer-specific tag as the packet moves through segments of the network. This method treats the ETH-CFM traffic in the same manner. The application of additional tags prevents ETH-CFM conflicts in the various Ethernet CFM OAM domains, even if the domain levels (Level 4 in this example) are reused.
In some scenarios, this additional tagging principle is not deployed, and this may result in conflicts and collisions. For example, as shown in Figure 63, an additional pair of Domain Level 4 UP MEPs are configured on the aggregation nodes. These aggregation nodes are c-tag aware. The ETH-CFM packets that are transmitted from the CE pass transparently through the passive side of the MEP (the side facing away from the ETH-CFM packet transmission) and arrive on the active side of the unintended peer MEP. This could cause a number of defect conditions to occur on the unexpectedly terminating MEP and on the unreachable MEP. For simplicity, only the direction from the left CE to the right side of the network is shown, although the problem exists in both directions.
These issues can be resolved through communication of ETH-CFM domain-level ownership using a business agreement. However, this communication is only a business agreement and could be violated by misconfiguration. Network-level enforcement of this agreement is important to protect both ETH-CFM OAM domains of responsibility.
ETH-CFM ingress squelching capabilities are available to enforce the agreement and prevent unwanted ETH-CFM packets from entering a domain of responsibility that should not be exposed to ETH-CFM packets from outside its domain. Figure 64 shows the generic enforcement of the agreement using squelching. In this agreement, Domain Levels 4 and lower are reserved by the provider of the EVC. Domain Levels 5 and above are outside the EVC provider’s scope and must pass transparently through the Ethernet CFM OAM domain. The EVC provider’s boundaries are configured to enforce this agreement and silently discard all ETH-CFM packets that arrive on the ingress points of the boundary at Domain Level 4 and below.
Two different squelch functions are supported, using the squelch-ingress-levels command and the squelch-ingress-ctag-levels command.
The squelch-ingress-levels command configures an exact service delineation SAP and binding match at the ingress followed immediately by Ethernet type 0x8902. This configuration silently discards the appropriate ETH-CFM packets according to the configured levels of the command, regardless of the presence of an ingress ETH-CFM management point, MEP or MIP. This squelch function occurs prior to other ETH-CFM packet processing functions.
The squelch-ingress-ctag-levels command is supported for Epipe and VPLS services only. It configures an exact service delineation SAP and binding match at the ingress, skipping one additional tag, for a maximum total of two tags, followed by Ethernet type 0x8902. This configuration silently discards the appropriate ETH-CFM packets according to the levels that match the configured squelch levels, if an additional tag beyond the service delineation exists. It ignores the value of the additional tag, exposing that entire range to this squelch function, if there is no ingress ETH-CFM management point, MEP or MIP, at one of the configured levels covered by the squelch configuration. This squelch function differs from the one configured by squelch-ingress-levels because it occurs after the processing of an ingress MEP or ingress MIP configured with a primary VLAN within the configured squelch levels. When a primary VLAN ingress MEP or ingress MIP is configured at a VLAN within the squelch level, that entire primary VLAN ETH-CFM function follows regular ETH-CFM primary VLAN rules. In this configuration, the ingress ETH-CFM packets that do not have an ingress MEP or ingress MIP configured for that VLAN are exposed to the squelching rules instead of the primary VLAN rules of ETH-CFM processing. In this case, ETH-CFM primary VLAN ingress processing occurs before the squelch-ingress-ctag-levels functions.
Both variants can be configured together on supported connections and within their supported services. Figure 65 shows the logical processing chain and interaction using an ingress QinQ SAP in the form VID.* and various ingress ETH-CFM MEPs. Although not shown in Figure 65, the processing rules are the same for ingress MIPs, which are ETH-LBM (loopback) and ETH-CFM-LTM (linktrace) aware.
There is no requirement to configure ingress MEPs or ingress MIPs if the desired goal is simply to silently discard ETH-CFM packets matching a domain level criterion. The squelch commands require the levels to be configured as a contiguous range.
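The level-based discard decision can be sketched as follows. This is a deliberately simplified model: it does not reproduce the SAP and tag matching or the primary VLAN ordering described above, only the contiguity requirement and the level comparison. The function names are hypothetical. The domain level is read from the top three bits of the first byte of the CFM PDU, as laid out in the IEEE 802.1Q CFM common header.

```python
from typing import Iterable, List

ETH_CFM_ETHERTYPE = 0x8902


def validate_squelch_levels(levels: Iterable[int]) -> List[int]:
    """Return the sorted levels, or raise if they are not a contiguous range."""
    ordered = sorted(set(levels))
    if not ordered:
        raise ValueError("at least one level is required")
    if ordered[0] < 0 or ordered[-1] > 7:
        raise ValueError("domain levels range from 0 to 7")
    if ordered != list(range(ordered[0], ordered[-1] + 1)):
        raise ValueError("squelch levels must be contiguous")
    return ordered


def should_squelch(ethertype: int, cfm_pdu: bytes, levels: Iterable[int]) -> bool:
    """True when an ingress ETH-CFM PDU falls within the configured levels."""
    if ethertype != ETH_CFM_ETHERTYPE or not cfm_pdu:
        return False
    md_level = cfm_pdu[0] >> 5  # top 3 bits of the first PDU byte
    return md_level in validate_squelch_levels(levels)
```

With levels 0 through 4 configured, as in the agreement shown in Figure 64, a Level 4 PDU is discarded and a Level 5 PDU passes through.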