The RIB API service proto definition allows each MPLS tunnel and each MPLS label entry to have multiple next-hop-groups, each with a primary next hop and, optionally, one backup next hop. When a tunnel or label entry has more than one next-hop-group, the router sprays matching traffic across the next-hop-groups using an ECMP or weighted-ECMP algorithm.
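To illustrate what spraying across next-hop-groups means, the following is a minimal sketch of weighted flow hashing, not the router's actual algorithm. The function name pick_group, the flow-key format, and the integer weights are all hypothetical; a real forwarding plane hashes packet header fields in hardware. With equal weights the sketch reduces to plain ECMP.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence


def pick_group(flow_key: bytes, weights: Sequence[int]) -> int:
    """Return the index of the next-hop-group chosen for one flow.

    Hypothetical helper: each group owns a number of hash slots equal
    to its weight, so a group with weight 3 attracts roughly three
    times as many flows as a group with weight 1.
    """
    cumulative = list(accumulate(weights))   # e.g. [3, 1] -> [3, 4]
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Hash the flow key so that every packet of a flow maps to the same slot.
    digest = hashlib.sha256(flow_key).digest()
    slot = int.from_bytes(digest[:8], "big") % total
    # The first group whose cumulative weight exceeds the slot owns it.
    return bisect_right(cumulative, slot)
```

Because the choice depends only on the flow key and the weights, all packets of one flow stay on the same next-hop-group, which is the usual reason for hashing rather than distributing packets round-robin.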
At any time, traffic hashed to a particular next-hop-group is forwarded over at most one next hop: either the primary or the backup, never both at once. The selection of the active next hop within each next-hop-group is influenced by failures and by next-hop-switch Request messages sent by the owning gRPC client. The specific rules are:
If the primary next hop is resolved to an up interface when the next-hop-group is initially activated then it immediately becomes the active next hop.
If the primary next hop is unresolved when the next-hop-group is initially activated, no next hop is activated immediately (even if the backup next hop is up) and a fixed three-second wait timer is started. If the primary next hop comes up while the timer is running, it is activated immediately. If the timer expires before the primary comes up, the backup next hop is activated, and it stays active even if the primary comes up after the timer has expired.
If the currently active next hop fails, the system automatically activates the other next hop.
If the system receives a next-hop-switch Request targeting this specific entry and next-hop-group, the next hop indicated in the Request message is immediately activated, as long as it is up. If the requested next hop is down, the message is ignored.
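The rules above can be read as a small state machine per next-hop-group. The sketch below is one possible model, written for illustration only; the class and method names (NextHopGroup, activate, set_state, tick, switch_request) are hypothetical and are not part of the RIB API. Two behaviors are assumptions that the rules do not spell out: if the surviving next hop is also down when the active one fails, the group is left with no active next hop, and the backup is activated on timer expiry only if it is up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

WAIT_TIMER_SECONDS = 3.0  # fixed wait timer described in the rules


class Hop(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


@dataclass
class NextHopGroup:
    primary_up: bool
    backup_up: bool = False              # False also models "no backup"
    active: Optional[Hop] = None
    timer_deadline: Optional[float] = None

    def is_up(self, hop: Hop) -> bool:
        return self.primary_up if hop is Hop.PRIMARY else self.backup_up

    def activate(self, now: float) -> None:
        """Initial activation of the next-hop-group."""
        if self.primary_up:
            self.active = Hop.PRIMARY
        else:
            # Nothing is activated yet, even if the backup is up.
            self.active = None
            self.timer_deadline = now + WAIT_TIMER_SECONDS

    def tick(self, now: float) -> None:
        """Handle wait-timer expiry."""
        if self.timer_deadline is not None and now >= self.timer_deadline:
            self.timer_deadline = None
            # Assumption: the backup is activated only if it is up.
            if self.active is None and self.backup_up:
                self.active = Hop.BACKUP

    def set_state(self, hop: Hop, up: bool, now: float) -> None:
        """Record an up/down transition of one next hop."""
        self.tick(now)  # settle an expired timer before applying the event
        if hop is Hop.PRIMARY:
            self.primary_up = up
        else:
            self.backup_up = up
        if up and hop is Hop.PRIMARY and self.timer_deadline is not None:
            # Primary came up inside the wait window.
            self.active = Hop.PRIMARY
            self.timer_deadline = None
        elif not up and self.active is hop:
            # Active next hop failed: fall over to the other one.
            alt = Hop.BACKUP if hop is Hop.PRIMARY else Hop.PRIMARY
            # Assumption: no active next hop if the other one is down too.
            self.active = alt if self.is_up(alt) else None

    def switch_request(self, hop: Hop) -> None:
        """Apply a next-hop-switch Request; ignored if the target is down."""
        if self.is_up(hop):
            self.active = hop
```

For example, a group activated at time 0 with the primary down and the backup up has no active next hop until tick is called at or after time 3.0, at which point the backup becomes active; a primary that comes up at time 3.5 does not take over unless switch_request(Hop.PRIMARY) is issued.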