EVPN Weighted ECMP for IP prefix routes

SR OS supports Weighted ECMP for EVPN IP Prefix routes (IPv4 and IPv6), in the EVPN Interface-less (EVPN-IFL) and EVPN Interface-ful (EVPN-IFF) models.

Based on draft-ietf-bess-evpn-unequal-lb, the EVPN Link Bandwidth extended community is used in the IP Prefix routes to indicate a weight that the receiver PE must consider when load balancing traffic across multiple EVPN next hops, CE next hops, or both. The supported weight in the extended community is of type Generalized weight and encodes the count of CEs that advertised prefix N to a PE in a BGP PE-CE route. The following figure shows the use of EVPN Weighted ECMP.

Figure: Weighted ECMP for IP Prefix routes use case

In the preceding figure, some multi-rack Container Network Functions (CNFs) are connected to a few TORs in the EVPN network. Each CNF advertises the same anycast service network 10.1.1.0/24 using a single PE-CE BGP session. Without Weighted ECMP, TOR2, TOR3, and TOR4 would re-advertise the prefix in an EVPN IP-Prefix route, and flows to 10.1.1.0/24 from Border Leaf-1 would be equally distributed among TOR2, TOR3, and TOR4. However, the needed load balancing distribution is based on the count of CNFs that are attached to each TOR. That is, out of five flows to 10.1.1.0/24, three should be directed to TOR3 (because it has three CNFs attached), one to TOR4, and one to either TOR2 or TOR1 (because CNF1 is dual-homed to both).

Weighted ECMP achieves the needed unequal load balancing based on the CNF count on each TOR. In the example shown in Figure: Weighted ECMP for IP Prefix routes use case, if Weighted ECMP is enabled, each TOR adds a weight encoded in the EVPN IP Prefix route, where the weight matches the count of CNFs that the TOR has locally. The Border Leaf creates an ECMP set for prefix 10.1.1.0/24 where the weights are considered when distributing the load to the prefix.
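
The following Python sketch illustrates how the advertised weight could be derived on each TOR, assuming the weight is simply the number of distinct local CNFs that advertised the prefix over PE-CE BGP. It is a conceptual model only, not SR OS code; the function name and the CNF names (CNF-a to CNF-d) are hypothetical, and only the TOR3 and TOR4 counts come from the figure description.

```python
from collections import defaultdict

def advertised_weights(pe_ce_routes):
    """Return {(tor, prefix): weight}.

    Assumption: weight = number of distinct CEs that advertised
    the prefix to that TOR in a PE-CE BGP route.
    """
    ces = defaultdict(set)
    for tor, ce, prefix in pe_ce_routes:
        ces[(tor, prefix)].add(ce)
    return {key: len(members) for key, members in ces.items()}

# Hypothetical PE-CE routes: three CNFs on TOR3, one CNF on TOR4.
routes = [
    ("TOR3", "CNF-a", "10.1.1.0/24"),
    ("TOR3", "CNF-b", "10.1.1.0/24"),
    ("TOR3", "CNF-c", "10.1.1.0/24"),
    ("TOR4", "CNF-d", "10.1.1.0/24"),
]

weights = advertised_weights(routes)
```

With these inputs, TOR3 would attach a weight of 3 and TOR4 a weight of 1 to their EVPN IP Prefix routes for 10.1.1.0/24.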

The procedures associated with EVPN Weighted ECMP for IP Prefix routes can be divided into advertising and receiving procedures.

Example: EVPN-IFL service configuration

Suppose PE2, PE4, and PE5 are attached to the same EVPN-IFL service on vprn 2000. PE4 is connected to two CEs (CE-41 and CE-42) and PE5 to one CE (CE-51). The three CEs advertise the same prefix 192.168.1.0/24 using PE-CE BGP, and the goal is for PE2 to distribute twice as many flows (to 192.168.1.0/24) to PE4 as to PE5.
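
The intended 2:1 split on PE2 can be pictured with the following Python sketch. It assumes weights of 2 for PE4 and 1 for PE5, and models weighted flow hashing by giving each next hop as many buckets as its weight. The hash function and bucket scheme are illustrative assumptions and do not represent the SR OS forwarding implementation; both function names are hypothetical.

```python
import hashlib
from fractions import Fraction

def expected_shares(weights):
    """Fraction of flows each next hop should receive."""
    total = sum(weights.values())
    return {nh: Fraction(w, total) for nh, w in weights.items()}

def pick_next_hop(flow_key, weights):
    """Map a flow to a next hop; a next hop with weight w owns w buckets."""
    buckets = [nh for nh, w in sorted(weights.items()) for _ in range(w)]
    digest = hashlib.sha256(flow_key.encode()).digest()
    return buckets[int.from_bytes(digest[:8], "big") % len(buckets)]

weights = {"PE4": 2, "PE5": 1}
shares = expected_shares(weights)   # PE4 -> 2/3, PE5 -> 1/3
```

Because the selection depends only on the flow key, all packets of a given flow follow the same next hop, while the aggregate of many flows approaches the 2/3 and 1/3 shares.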

The configuration of PE4 and PE5 follows:

*A:PE-4# configure service vprn 2000 
*A:PE-4>config>service>vprn# info 
----------------------------------------------
            ecmp 10
            autonomous-system 64500
            interface "to-CE41" create
                address 10.41.0.1/24
                sap pxc-3.a:401 create
                exit
            exit
            interface "to-CE42" create
                address 10.42.0.1/24
                sap pxc-3.a:402 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.4:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            bgp
                multi-path
                    ipv4 10
                exit
                eibgp-loadbalance
                router-id 4.4.4.4
                rapid-withdrawal
                group "pe-ce"
                    family ipv4 ipv6
                    neighbor 10.41.0.2
                        peer-as 64541
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                    neighbor 10.42.0.2
                        peer-as 64542
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
                no shutdown
            exit
            no shutdown


A:PE-5# configure service vprn 2000 
A:PE-5>config>service>vprn# info 
----------------------------------------------
            autonomous-system 64500
            interface "to-CE51" create
                address 10.51.0.1/24
                sap pxc-3.a:501 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.5:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            bgp
                multi-path
                    ipv4 10
                exit
                eibgp-loadbalance
                router-id 5.5.5.5
                rapid-withdrawal
                group "pe-ce"
                    family ipv4 ipv6
                    neighbor 10.51.0.2
                        peer-as 64551
                        evpn-link-bandwidth
                            add-to-received-bgp 1
                        exit
                    exit
                exit
                no shutdown
            exit
            no shutdown

The configuration on PE2 follows:

*A:PE-2# configure service vprn 2000 
*A:PE-2>config>service>vprn# info 
----------------------------------------------
            ecmp 10
            interface "to-PE" create
                address 20.10.0.1/24
                sap pxc-3.a:2000 create
                exit
            exit
            bgp-evpn
                mpls
                    auto-bind-tunnel
                        resolution any
                    exit
                    evi 2000
                    evpn-link-bandwidth
                        advertise
                        weighted-ecmp
                    exit
                    route-distinguisher 192.0.2.2:2000
                    vrf-target target:64500:2000
                    no shutdown
                exit
            exit
            no shutdown

Example: PE4 and PE5 IP Prefix route advertisement

As a result of the preceding configuration, PE4 (next-hop 2001:db8::4) and PE5 (next-hop 2001:db8::5) advertise the IP Prefix route from the CEs with weights 2 and 1, respectively:

*A:PE-2# show router bgp routes evpn ip-prefix prefix 192.168.1.0/24 community target:64500:2000 hunt 
===============================================================================
 BGP Router ID:192.0.2.2        AS:64500       Local AS:64500      
===============================================================================
 Legend -
 Status codes  : u - used, s - suppressed, h - history, d - decayed, * - valid
                 l - leaked, x - stale, > - best, b - backup, p - purge
 Origin codes  : i - IGP, e - EGP, ? - incomplete

===============================================================================
BGP EVPN IP-Prefix Routes
===============================================================================
-------------------------------------------------------------------------------
RIB In Entries
-------------------------------------------------------------------------------
Network        : n/a
Nexthop        : 2001:db8::4
Path Id        : None                   
From           : 2001:db8::4
Res. Nexthop   : fe80::b446:ffff:fe00:142
Local Pref.    : 100                    Interface Name : int-PE-2-PE-4
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:2000 evpn-bandwidth:1:2
                 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.4
Flags          : Used Valid Best IGP 
Route Source   : Internal
AS-Path        : 64541 
EVPN type      : IP-PREFIX              
ESI            : ESI-0
Tag            : 0                      
Gateway Address: 00:00:00:00:00:00
Prefix         : 192.168.1.0/24
Route Dist.    : 192.0.2.4:2000         
MPLS Label     : LABEL 524283           
Route Tag      : 0                      
Neighbor-AS    : 64541
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 01h19m43s              
 
Network        : n/a                  
Nexthop        : 2001:db8::5
Path Id        : None                   
From           : 2001:db8::5
Res. Nexthop   : fe80::b449:1ff:fe01:1f
Local Pref.    : 100                    Interface Name : int-PE-2-PE-5
Aggregator AS  : None                   Aggregator     : None
Atomic Aggr.   : Not Atomic             MED            : None
AIGP Metric    : None                   IGP Cost       : 10
Connector      : None
Community      : target:64500:2000 evpn-bandwidth:1:1
                 bgp-tunnel-encap:MPLS
Cluster        : No Cluster Members
Originator Id  : None                   Peer Router Id : 192.0.2.5
Flags          : Used Valid Best IGP 
Route Source   : Internal
AS-Path        : 64551 
EVPN type      : IP-PREFIX              
ESI            : ESI-0
Tag            : 0                      
Gateway Address: 00:00:00:00:00:00
Prefix         : 192.168.1.0/24
Route Dist.    : 192.0.2.5:2000         
MPLS Label     : LABEL 524285           
Route Tag      : 0                      
Neighbor-AS    : 64551
Orig Validation: N/A                    
Source Class   : 0                      Dest Class     : 0
Add Paths Send : Default                
Last Modified  : 00h08m45s              
 
-------------------------------------------------------------------------------
RIB Out Entries
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Routes : 2
===============================================================================
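
When checking many routes, the weights can be pulled out of this kind of output with a short script. The following Python sketch parses a trimmed copy of the preceding output and maps each BGP next hop to its weight. It assumes that the last field of the evpn-bandwidth community string is the weight (consistent with the weights 2 and 1 stated above); the function name is hypothetical.

```python
import re

# Trimmed excerpt of the "hunt" output shown above.
SAMPLE = """\
Nexthop        : 2001:db8::4
Res. Nexthop   : fe80::b446:ffff:fe00:142
Community      : target:64500:2000 evpn-bandwidth:1:2
                 bgp-tunnel-encap:MPLS
Nexthop        : 2001:db8::5
Res. Nexthop   : fe80::b449:1ff:fe01:1f
Community      : target:64500:2000 evpn-bandwidth:1:1
                 bgp-tunnel-encap:MPLS
"""

def weights_by_next_hop(text):
    """Return {next_hop: weight} from show-command text."""
    weights = {}
    next_hop = None
    for line in text.splitlines():
        # Match only lines that start with "Nexthop", not "Res. Nexthop".
        m = re.match(r"\s*Nexthop\s*:\s*(\S+)", line)
        if m:
            next_hop = m.group(1)
            continue
        # Assumption: evpn-bandwidth:<first field>:<weight>
        m = re.search(r"evpn-bandwidth:(\d+):(\d+)", line)
        if m and next_hop is not None:
            weights[next_hop] = int(m.group(2))
    return weights

parsed = weights_by_next_hop(SAMPLE)
```

For the excerpt above, the result associates 2001:db8::4 with weight 2 and 2001:db8::5 with weight 1.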

Example: PE2 prefix installation

The show router id route-table extensive command, run on PE2, shows that PE2 installs the prefix with weights 2 and 1 for PE4 and PE5, respectively:

*A:PE-2# show router 2000 route-table 192.168.1.0/24 extensive 

===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : EVPN-IFL
  Age                   : 01h22m47s
  Preference            : 170
  Indirect Next-Hop     : 2001:db8::4
    Label               : 524283
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 2
    Resolving Next-Hop  : 2001:db8::4 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================

*A:PE-2# show router 2000 fib 1 192.168.1.0/24 extensive                                              

===============================================================================
FIB Display (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : EVPN-IFL
  Installed             : Y
  Indirect Next-Hop     : 2001:db8::4
    Label               : 524283
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 2
    Resolving Next-Hop  : 2001:db8::4 (LDP tunnel)
      ECMP-Weight       : 1
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      ECMP-Weight       : 1
===============================================================================
Total Entries : 1
===============================================================================

Example: EVPN-IFL handling

In the case of EVPN-IFL, Weighted ECMP is also supported for EIBGP load balancing among EVPN and CE next hops. For example, PE4 installs the same prefix with an EVPN-IFL next hop and two CE next hops, each with its normalized weight:

*A:PE-4# /show router 2000 route-table 192.168.1.0/24 extensive 

===============================================================================
Route Table (Service: 2000)
===============================================================================
Dest Prefix             : 192.168.1.0/24
  Protocol              : BGP
  Age                   : 00h02m27s
  Preference            : 170
  Indirect Next-Hop     : 10.41.0.2
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 10.41.0.2
      Interface         : to-CE41
      Metric            : 0
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 10.42.0.2
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 10.42.0.2
      Interface         : to-CE42
      Metric            : 0
      ECMP-Weight       : N/A
  Indirect Next-Hop     : 2001:db8::5
    Label               : 524285
    QoS                 : Priority=n/c, FC=n/c
    Source-Class        : 0
    Dest-Class          : 0
    ECMP-Weight         : 1
    Resolving Next-Hop  : 2001:db8::5 (LDP tunnel)
      Metric            : 10
      ECMP-Weight       : N/A
-------------------------------------------------------------------------------
No. of Destinations: 1
===============================================================================