Weighted ECMP for BGP routes

In some cases, the ECMP BGP next-hops of an IP route correspond to paths with very different bandwidths, and it makes sense for the ECMP load-balancing algorithm to distribute traffic across the BGP next-hops in proportion to their relative bandwidths. The bandwidth associated with a path can be signaled to other BGP routers by including a link-bandwidth extended community in the BGP route. The link-bandwidth extended community is optional and non-transitive, and it encodes an autonomous system (AS) number and a bandwidth.
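The following Python sketch illustrates how an AS number and a bandwidth can be packed into an 8-octet extended community. The field layout (type 0x40, sub-type 0x04, 2-octet AS number, 4-octet IEEE 754 floating-point bandwidth in bytes per second) is an assumption taken from the IETF link-bandwidth draft rather than from this section, and the function names are hypothetical.

```python
import struct
from typing import Tuple

# Assumed layout (IETF link-bandwidth draft, not stated in this section):
# 1 octet type, 1 octet sub-type, 2 octet AS number,
# 4 octet IEEE 754 float carrying the bandwidth in bytes per second.
LBW_TYPE = 0x40      # non-transitive, two-octet-AS-specific
LBW_SUBTYPE = 0x04   # link bandwidth


def encode_link_bandwidth(asn: int, bytes_per_sec: float) -> bytes:
    """Pack an AS number and a bandwidth into 8 octets, network byte order."""
    return struct.pack("!BBHf", LBW_TYPE, LBW_SUBTYPE, asn, bytes_per_sec)


def decode_link_bandwidth(raw: bytes) -> Tuple[int, float]:
    """Return (AS number, bandwidth in bytes per second) from 8 octets."""
    ext_type, subtype, asn, bandwidth = struct.unpack("!BBHf", raw)
    if (ext_type, subtype) != (LBW_TYPE, LBW_SUBTYPE):
        raise ValueError("not a link-bandwidth extended community")
    return asn, bandwidth
```

Because the bandwidth is carried as a single-precision float, values that are not exactly representable in 32 bits are rounded on encoding.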

The SR OS implementation supports the link-bandwidth extended community in routes associated with the following address families: IPv4, IPv6, label-IPv4, label-IPv6, VPN-IPv4, and VPN-IPv6. The router automatically performs weighted ECMP for an IP BGP route when all of the ECMP BGP next-hops of the route include a link-bandwidth extended community. The relative weight of traffic sent to each BGP next-hop is visible in the output of the show router route-table extensive and show router fib extensive commands.
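As a rough illustration of the proportional split, the Python sketch below computes the share of traffic per BGP next-hop from the signaled bandwidths. The function name and the dictionary input are hypothetical, and the fallback to an equal split when any next-hop lacks a usable bandwidth is an assumption about ordinary ECMP behavior, not a description of the router's internal algorithm.

```python
from fractions import Fraction
from typing import Dict, Optional


def ecmp_shares(next_hops: Dict[str, Optional[float]]) -> Dict[str, Fraction]:
    """Map each next-hop to its share of traffic.

    next_hops maps a next-hop address to its signaled bandwidth, or to
    None when the path carries no link-bandwidth extended community.
    """
    if not next_hops:
        return {}
    # Weighted ECMP applies only when every next-hop has a bandwidth.
    # Assumption: otherwise (or for non-positive values) split equally.
    if any(bw is None or bw <= 0 for bw in next_hops.values()):
        count = len(next_hops)
        return {nh: Fraction(1, count) for nh in next_hops}
    total = sum(Fraction(bw) for bw in next_hops.values())
    return {nh: Fraction(bw) / total for nh, bw in next_hops.items()}
```

For example, two next-hops signaling 10000 and 40000 Mb/s receive one fifth and four fifths of the traffic respectively.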

A route with a link-bandwidth extended community can be received from any IBGP peer. If such a route is received from an EBGP peer, the link-bandwidth extended community is stripped from the route unless an accept-from-ebgp command applies to that EBGP peer. However, a link-bandwidth extended community can be added to routes received from a directly connected (single hop) EBGP peer, potentially replacing the received link-bandwidth extended community. This is accomplished using the add-to-received-ebgp command, which is available in group and neighbor configuration contexts.

When a route with a link-bandwidth extended community is advertised to an EBGP peer, the link-bandwidth extended community is removed by default. However, transitivity across an AS boundary can be allowed by configuring the send-to-ebgp command.

When a route with a link-bandwidth extended community is advertised to a peer using next-hop-self, the link-bandwidth extended community is usually removed if it was not added locally (that is, by policy or by the add-to-received-ebgp command). However, in the special case that a route is readvertised (with next-hop-self) toward a peer covered by the scope of an aggregate-used-paths command, and the readvertising router has installed multiple ECMP paths toward the destination, each associated with a link-bandwidth extended community, the route is readvertised with a link-bandwidth extended community encoding the total bandwidth of all the used multi-paths.
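The next-hop-self rules above can be sketched as a small decision function in Python. The Path class, its fields, and the function name are hypothetical, and evaluating the aggregate case before the locally-added case is an assumption about precedence that the text does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    bandwidth: Optional[float]   # None when no link-bandwidth community
    locally_added: bool = False  # added by policy or add-to-received-ebgp


def advertised_bandwidth(best: Path, used_paths: List[Path],
                         aggregate_used_paths: bool) -> Optional[float]:
    """Bandwidth to advertise with next-hop-self, or None if removed."""
    # Special case: the peer is in scope of aggregate-used-paths and
    # multiple ECMP paths are installed, each with a bandwidth.
    if (aggregate_used_paths and len(used_paths) > 1
            and all(p.bandwidth is not None for p in used_paths)):
        return sum(p.bandwidth for p in used_paths)
    # Otherwise the community survives only if it was added locally.
    if best.bandwidth is not None and best.locally_added:
        return best.bandwidth
    return None
```

With two used paths of 10000 and 40000 Mb/s and aggregation in effect, the sketch returns 50000; without aggregation, a received (not locally added) bandwidth yields None, meaning the community is removed.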

The link-bandwidth extended community associated with a BGP route can be displayed using the show router bgp routes command. For the bandwidth value, the system automatically converts the binary value in the extended community to a decimal number in units of Mb/s (1 000 000 b/s).
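The conversion to Mb/s can be illustrated with a short Python sketch. It assumes the 4-octet bandwidth field is an IEEE 754 single-precision float expressing bytes per second, as in the IETF link-bandwidth draft; this section does not state the encoding or how the system rounds the displayed value, and the function name is hypothetical.

```python
import struct


def bandwidth_field_to_mbps(field: bytes) -> float:
    """Convert the 4-octet bandwidth field to Mb/s (1 000 000 b/s).

    Assumption: the field is a big-endian IEEE 754 float in bytes/s.
    """
    (bytes_per_sec,) = struct.unpack("!f", field)
    return bytes_per_sec * 8 / 1_000_000
```

Under that assumption, a field encoding 125 000 000 bytes per second corresponds to 1000 Mb/s.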

Weighted ECMP across the BGP next-hops of an IP BGP route is supported in combination with ECMP at the level of the route or tunnel that resolves one or more of the ECMP BGP next-hops. This ECMP at the resolving level can also be weighted ECMP when the following conditions all apply: