Each BGP RIB with IP routes (unlabeled IPv4, labeled-unicast IPv4, unlabeled IPv6, and labeled-unicast IPv6) submits its best path for each prefix to the common IP route table, unless the disable-route-table-install command is configured or the selective-label-ipv4-install command has prevented the installation. The best path is selected by the BGP decision process. The default preference for BGP routes submitted by the label-IPv4 and label-IPv6 RIBs (these appear in the route table and FIB as having a BGP-LABEL protocol type) can be modified by using the label-preference command. The default preference for BGP routes submitted by the unlabeled IPv4 and IPv6 RIBs can be modified by using the preference command.
If a BGP RIB has multiple BGP paths for the same IPv4 or IPv6 prefix that qualify as the best path up to a specific point in the comparison process, then a specified number of these multipaths can be submitted to the common IP route table. This is called BGP multipath and must be explicitly enabled using one or more commands in the multi-path context. These commands specify the maximum number of BGP paths, including the overall best path, that each BGP RIB can submit to the route table for any particular IPv4 or IPv6 prefix. If ECMP, with a limit of n, is enabled in the base router instance, then up to n paths are selected for installation in the IP FIB. In the data-path, traffic matching the IP route is load-shared across the ECMP next hops based on a per-packet hash calculation.
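The hash-based load sharing described above can be illustrated with a minimal sketch. It is not the SR OS hashing algorithm: the header fields used as hash inputs, the use of SHA-256, and the function names `installed_next_hops` and `pick_next_hop` are all assumptions made for illustration only.

```python
import hashlib


def installed_next_hops(multipaths, ecmp_limit):
    """Keep at most ecmp_limit of the submitted multipaths (hypothetical helper)."""
    return multipaths[:ecmp_limit]


def pick_next_hop(flow, next_hops):
    """Pick one ECMP next hop from a hash of the packet's header fields.

    flow is a tuple of header fields; the choice of fields is an assumption.
    SHA-256 is only a stand-in for whatever hash the datapath really uses.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Usage: packets with identical header fields always map to the same next hop.
hops = installed_next_hops(["10.0.0.1", "10.0.0.2", "10.0.0.3"], ecmp_limit=2)
flow = ("192.0.2.1", "198.51.100.7", 6, 40000, 443)
chosen = pick_next_hop(flow, hops)
```

Because the index is computed modulo the number of next hops, removing a next hop changes the divisor and can remap flows that were not using the failed next hop.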
By default, the hashing is not sticky, meaning that when one or more of the ECMP BGP next hops fail, all traffic flows matching the route are potentially moved to new BGP next hops. If required, a BGP route can be marked (using the sticky-ecmp action in route policies) for sticky ECMP behavior so that BGP next hop failures are handled by moving only the affected traffic flows to the remaining next hops as evenly as possible. If new ECMP BGP next hops become available for a marked BGP route, then flows are moved as evenly as possible onto the resultant set of next hops. For more information about sticky ECMP, see BGP support for sticky ECMP.
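One common way to obtain sticky behavior is a fixed table of hash buckets, where only the buckets that point at a failed next hop are reassigned. The sketch below illustrates that idea under stated assumptions; it is not the SR OS implementation, and the bucket count and the function names `build_buckets`, `remove_next_hop`, and `add_next_hop` are hypothetical.

```python
from collections import Counter


def build_buckets(next_hops, n_buckets=12):
    """Spread a fixed number of hash buckets round-robin over the next hops."""
    return [next_hops[i % len(next_hops)] for i in range(n_buckets)]


def remove_next_hop(buckets, failed):
    """Reassign only the failed next hop's buckets, as evenly as possible."""
    survivors = sorted(set(buckets) - {failed})
    if not survivors:
        raise ValueError("no remaining next hops")
    load = Counter(hop for hop in buckets if hop != failed)
    result = list(buckets)
    for i, hop in enumerate(result):
        if hop == failed:
            # Give the bucket to the least-loaded surviving next hop.
            target = min(survivors, key=lambda h: (load[h], h))
            result[i] = target
            load[target] += 1
    return result


def add_next_hop(buckets, new_hop):
    """Move buckets from the most-loaded next hops onto a new next hop."""
    result = list(buckets)
    load = Counter(result)
    fair_share = len(result) // (len(load) + 1)
    moved = 0
    while moved < fair_share:
        donor = max((h for h in load if h != new_hop),
                    key=lambda h: (load[h], h))
        result[result.index(donor)] = new_hop
        load[donor] -= 1
        moved += 1
    return result


before = build_buckets(["A", "B", "C"])
after_failure = remove_next_hop(before, "B")
after_recovery = add_next_hop(after_failure, "D")
```

In this sketch a flow hashes to a bucket index rather than directly to a next hop, so flows whose bucket is untouched keep their next hop across failures.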
A BGP route to an IPv4 or IPv6 prefix is a candidate for installation as an ECMP next hop only if it meets all of the following criteria:
The multipath route must be the same type of route as the best path (same AFI/SAFI and, in some cases, same next-hop resolution method).
The multipath route must tie with the best path for all criteria of greater significance than next-hop cost, except for criteria that are configured to be ignored.
If the best path selection reaches the next-hop cost comparison, the multipath route must have the same next-hop cost as the best route, unless unequal-cost is configured.
The multipath route must not have the same BGP next hop as the best path or any other multipath route.
The multipath route must not cause the ECMP limit of the routing instance to be exceeded. Use the ecmp command to configure the ECMP limit of the routing instance.
The multipath route must not cause the applicable maximum paths limit to be exceeded.
The multipath route must have the same neighbor AS in its AS path as the best path if restrict is set to same-neighbor-as. By default, any path with the same AS path length as the best path, regardless of the neighbor AS, is eligible to be considered a multipath.
The multipath route must have the same AS path as the best path if restrict is set to exact-as-path. By default, any path with the same AS path length as the best path, regardless of the AS numbers, is eligible to be considered a multipath.
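The criteria above can be read as a filter applied to each candidate path in turn. The following sketch is a simplified illustration, not the SR OS decision process: the `Path` fields, the single `tie_key` tuple standing in for every comparison more significant than next-hop cost, the treatment of the two limits as one combined cap, and the function name `select_multipaths` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    family: str          # AFI/SAFI label, for example "ipv4"
    tie_key: tuple       # outcome of all comparisons above next-hop cost
    as_path: tuple       # AS numbers, neighbor AS first
    next_hop: str        # BGP next hop
    next_hop_cost: int   # cost to reach the BGP next hop


def select_multipaths(best, candidates, max_paths, ecmp_limit,
                      restrict=None, unequal_cost=False):
    """Return the best path plus every candidate that passes all checks."""
    chosen = [best]
    limit = min(max_paths, ecmp_limit)  # assumption: both limits cap the set
    for path in candidates:
        if len(chosen) >= limit:
            break
        if path.family != best.family:
            continue
        if path.tie_key != best.tie_key:
            continue
        if not unequal_cost and path.next_hop_cost != best.next_hop_cost:
            continue
        if any(path.next_hop == c.next_hop for c in chosen):
            continue
        if restrict == "same-neighbor-as" and path.as_path[:1] != best.as_path[:1]:
            continue
        if restrict == "exact-as-path" and path.as_path != best.as_path:
            continue
        chosen.append(path)
    return chosen


best = Path("ipv4", (100, 2), (65001, 65010), "10.0.0.1", 10)
other_as = Path("ipv4", (100, 2), (65002, 65010), "10.0.0.2", 10)
farther = Path("ipv4", (100, 2), (65001, 65010), "10.0.0.3", 20)
multipaths = select_multipaths(best, [other_as, farther], max_paths=4, ecmp_limit=4)
```

With the default settings only `other_as` joins the best path, because `farther` has a different next-hop cost; passing `unequal_cost=True` would admit it as well.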
SR OS also supports IBGP multipath. In some topologies, a BGP next hop is resolved by an IP route that has multiple ECMP next hops. When ibgp-multipath is not configured, only one of the ECMP next hops is programmed as the next hop of the BGP route in the IOM. When ibgp-multipath is configured, the IOM attempts to use all the ECMP next hops of the resolving route in the forwarding state. Although the name of the ibgp-multipath command implies that it is specific to IBGP-learned routes, this is not the case. It also applies to routes learned from any multi-hop BGP session including routes learned from multi-hop EBGP peers.
The multi-path and ibgp-multipath commands are not mutually exclusive and work together. The multi-path context enables ECMP load-sharing across different BGP next hops (corresponding to different BGP routes), while the ibgp-multipath command enables ECMP load-sharing across the next hops of the IP routes that resolve the BGP next hops.
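How the two features combine can be sketched as a two-level expansion: first across BGP next hops, then across the ECMP next hops of each resolving route. This is an illustrative model only; the function name `forwarding_next_hops`, the dictionary layout, and the choice of the first resolving next hop when ibgp-multipath is off are assumptions, since the text does not state which single next hop is programmed.

```python
def forwarding_next_hops(bgp_next_hops, resolving_routes, ibgp_multipath):
    """Expand BGP next hops into (BGP next hop, resolving next hop) pairs.

    bgp_next_hops: BGP next hops selected by BGP multipath.
    resolving_routes: maps each BGP next hop to the ECMP next hops of the
        IP route that resolves it.
    ibgp_multipath: when False, only one resolving next hop is used
        (the first one here, which is an assumption).
    """
    pairs = []
    for bgp_nh in bgp_next_hops:
        resolving = resolving_routes[bgp_nh]
        used = resolving if ibgp_multipath else resolving[:1]
        pairs.extend((bgp_nh, hop) for hop in used)
    return pairs


routes = {
    "10.0.0.1": ["192.168.1.1", "192.168.2.1"],
    "10.0.0.2": ["192.168.3.1"],
}
without = forwarding_next_hops(["10.0.0.1", "10.0.0.2"], routes, ibgp_multipath=False)
with_ibgp = forwarding_next_hops(["10.0.0.1", "10.0.0.2"], routes, ibgp_multipath=True)
```

In this example, two BGP next hops yield two forwarding entries without ibgp-multipath and three with it, because the route resolving 10.0.0.1 has two ECMP next hops.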
Finally, ibgp-multipath does not control traffic load sharing toward a BGP next hop that is resolved by a tunnel, as when dealing with BGP shortcuts or labeled routes (VPN-IP, label-IPv4, or label-IPv6). When a BGP next hop is resolved by a tunnel that supports ECMP, the load-sharing of traffic across the ECMP next hops of the tunnel is automatic.
SR OS supports direct resolution of a BGP next hop to multiple RSVP-TE or SR-TE tunnels. In addition, a BGP next hop can be resolved by multiple LDP ECMP next hops that each correspond to a separate LDP-over-RSVP or LDP-over-SRTE tunnel. It is also possible for a BGP next hop to be resolved by an IGP shortcut route that has multiple RSVP-TE or SR-TE tunnels as its ECMP next hops.