BGP best-external in a VPRN context

If two or more PE routers connect to a multi-homed site and learn routes for a common set of IP prefixes from that site, the failure of one of the PE routers or of a PE-CE link can be handled by rerouting traffic over the alternate paths. The failover time in this situation can be reduced if all the PE routers have advance knowledge of the potential backup paths and do not have to wait for BGP route advertisements or withdrawals before reprogramming their forwarding tables. This is difficult to achieve with normal BGP procedures because a PE router is not allowed to advertise, to other PE routers, a BGP route that it has learned from a connected CE device if that route is not its active route for the destination in the route table. If the multi-homing scenario calls for all traffic destined for an IP prefix to be carried over a preferred primary path (passing through PE1-CE1, for example), then all other PE routers (PE2, PE3, and so on) have the VPN route from PE1 as their active route for the destination, and they are not able to advertise their own CE-learned routes for the same IP prefix.

The SR OS supports a VPRN feature, configured using the export-inactive-bgp command, that resolves the issue described above. When a VPRN is configured with this command, it is allowed to advertise (as a VPN-IP route toward other PEs) its best CE-BGP route for an IP prefix, even when that CE-BGP route is inactive in the route table because of the presence of a more-preferred VPN-IP route from another PE. The CE-BGP route is advertised only if it is accepted by the VRF export policy. When a VPN-IP route is advertised because of the export-inactive-bgp command, the label carried in the route is either a per-next-hop label corresponding to the next-hop IP address of the CE-BGP route or a per-prefix label; this helps avoid packet loops caused by unsynchronized IP FIBs.
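The advertisement rule described above can be illustrated with a small Python model. This is a minimal sketch, not SR OS code: the Route class, the select_active and route_to_advertise functions, and the single numeric preference field are all hypothetical simplifications (the real BGP decision process compares many attributes), and the addresses are example values.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str      # "ce-bgp" (learned from a connected CE) or "vpn-ip" (learned from another PE)
    next_hop: str
    preference: int  # toy stand-in for the BGP decision process; higher wins


def select_active(routes: List[Route]) -> Route:
    """Pick the most-preferred route (assumed model: highest preference)."""
    return max(routes, key=lambda r: r.preference)


def route_to_advertise(
    routes: List[Route],
    export_inactive_bgp: bool,
    vrf_export_accepts: Callable[[Route], bool] = lambda r: True,
) -> Optional[Route]:
    """Return the CE-BGP route this PE would advertise as a VPN-IP route, or None."""
    ce_routes = [r for r in routes if r.source == "ce-bgp"]
    if not ce_routes:
        return None
    best_ce = select_active(ce_routes)
    active = select_active(routes)
    # Normal BGP procedure: an inactive CE-BGP route is not advertised.
    if best_ce != active and not export_inactive_bgp:
        return None
    # With or without the feature, the VRF export policy must accept the route.
    if not vrf_export_accepts(best_ce):
        return None
    return best_ce


# PE2's view: the VPN-IP route from PE1 is preferred over its own CE-BGP route.
pe2_routes = [
    Route("10.1.0.0/16", "vpn-ip", "PE1", preference=200),
    Route("10.1.0.0/16", "ce-bgp", "192.0.2.2", preference=100),
]
print(route_to_advertise(pe2_routes, export_inactive_bgp=False))
print(route_to_advertise(pe2_routes, export_inactive_bgp=True))
```

In this model, PE2 advertises nothing without the feature and advertises its CE-BGP route as a backup path with it, which is the behavior the text attributes to export-inactive-bgp.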

When a PE router that advertised a backup path for an IP prefix receives a withdrawal for the VPN-IP route that it was using as the primary/active route, its backup path may be promoted to the primary path; that is, the CE-BGP route may become the active route for the destination. In this case, the PE router is required to re-advertise the VPN-IP route with a per-VRF label if that is the default allocation policy and there is no label-per-prefix policy override. It takes some time for the new VPN-IP route to reach all the ingress routers and for them to update their forwarding tables. In the meantime, traffic continues to be received with the old per-next-hop label. The egress PE drops this in-flight traffic unless label retention is configured using the bgp-labels-hold-timer command in the config>router>mpls-labels context. This command configures a delay (in seconds) between the withdrawal of a VPN-IP route with a per-next-hop label and the deletion of the corresponding label forwarding entry in the IOM. The value of bgp-labels-hold-timer should be large enough to account for the propagation delay of the route withdrawal to all the ingress routers.
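The effect of label retention on in-flight traffic can be sketched in the same way. The following Python model is an illustration under stated assumptions, not the SR OS implementation: the LabelTable class and its methods are hypothetical, time is passed in explicitly as seconds rather than read from a clock, a hold timer of 0 is assumed to mean immediate deletion, and the label and timer values are arbitrary examples.

```python
from typing import Dict, Optional


class LabelTable:
    """Toy model of an egress PE's label forwarding entries with delayed deletion."""

    def __init__(self, hold_timer_s: float) -> None:
        self.hold_timer_s = hold_timer_s
        self._entries: Dict[int, str] = {}    # label -> next hop
        self._expiry: Dict[int, float] = {}   # label -> time at which a withdrawn label is deleted

    def install(self, label: int, next_hop: str) -> None:
        self._entries[label] = next_hop
        self._expiry.pop(label, None)

    def withdraw(self, label: int, now: float) -> None:
        """The route carrying this label was withdrawn at time `now`."""
        if label not in self._entries:
            return
        if self.hold_timer_s == 0:
            del self._entries[label]           # no retention: delete immediately
        else:
            self._expiry[label] = now + self.hold_timer_s

    def forward(self, label: int, now: float) -> Optional[str]:
        """Return the next hop for a labeled packet, or None if it is dropped."""
        expiry = self._expiry.get(label)
        if expiry is not None and now >= expiry:
            # Hold timer has run out: remove the forwarding entry.
            self._entries.pop(label, None)
            del self._expiry[label]
        return self._entries.get(label)


# The per-next-hop label 1001 is withdrawn at t=0; a packet still arrives at t=0.5.
no_hold = LabelTable(hold_timer_s=0)
no_hold.install(1001, "192.0.2.2")
no_hold.withdraw(1001, now=0.0)

with_hold = LabelTable(hold_timer_s=5)
with_hold.install(1001, "192.0.2.2")
with_hold.withdraw(1001, now=0.0)

print(no_hold.forward(1001, now=0.5))
print(with_hold.forward(1001, now=0.5))
```

Without retention the packet that arrives with the old label is dropped; with a hold timer longer than the convergence time of the ingress routers it is still forwarded to the CE.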