5. BGP

This chapter provides information about configuring BGP.

5.1. BGP Overview

Border Gateway Protocol (BGP) is an inter-autonomous system routing protocol. An autonomous system (AS) is a network or group of routers logically organized and controlled by a common network administration. BGP enables routers to exchange network reachability information, including information about other ASs that traffic must traverse to reach other routers in other ASs. To implement BGP, the ASN must be specified in the config>router context. A 7210 SAS BGP configuration must contain at least one group and include information about at least one 7210 SAS neighbor (peer).

AS paths are the routes to each destination. Other attributes, such as the origin of the path, the multiple exit discriminator (MED), the local preference, and communities included with the route are called path attributes. When BGP interprets routing and topology information, loops can be detected and eliminated. Route preference for routes learned from the configured peers can be enabled among groups of routes to enforce administrative preferences and routing policy decisions.

Note:

  1. MP-BGP (family IPv4 and IPv6) for use in Layer 3 VPN services (also known as VPRN services) is supported on all 7210 SAS platforms as described in this document.
  2. BGP (family IPv4 and IPv6) is not available for use in the base routing instance. It is only available for use as a PE-CE routing protocol.

5.1.1. BGP Communication

There are two types of BGP peers, internal BGP (iBGP) and external BGP (eBGP) (Figure 19).

  1. iBGP is used to communicate with peers in the same autonomous system. Routes received from an iBGP peer in the same autonomous system are not advertised to other iBGP peers (unless the router is a route reflector) but can be advertised to an eBGP peer.
  2. eBGP is used to communicate with peers in different autonomous systems. Routes received from an router in a different AS can be advertised to both eBGP and iBGP peers.

Autonomous systems share routing information, such as routes to each destination and information about the route or AS path, with other ASs using BGP. Routing tables contain lists of known routers, reachable addresses, and associated path cost metrics to each router. BGP uses the information and path attributes to compile a network topology.

5.1.1.1. Message Types

Four message types are used by BGP to negotiate parameters, exchange routing information and indicate errors. They are:

  1. Open message
    After a transport protocol connection is established, the first message sent by each side is an Open message. If the Open message is acceptable, a Keepalive message confirming the Open is sent back. When the Open is confirmed, Update, Keepalive, and Notification messages can be exchanged.
    Open messages consist of the BGP header and the following fields:
    1. version
      The current BGP version number is 4.
    2. local ASN
      The autonomous system number is configured in the config>router context.
    3. hold time
      Configure the maximum time BGP will wait between successive messages (either keep alive or update) from its peer, before closing the connection. Configure the local hold time with in the config>router>bgp context.
    4. BGP identifier
      IP address of the BGP system or the router ID. The router ID must be a valid host address.
  2. Update message
    Update messages are used to transfer routing information between BGP peers. The information contained in the packet can be used to construct a graph describing the relationships of the various autonomous systems. By applying rules, routing information loops and some other anomalies can be detected and removed from the inter-AS routing,
    The Update messages consist of a BGP header and the following optional fields:
    1. unfeasible routes length
      The field length which lists the routes being withdrawn from service because they are considered unreachable.
    2. withdrawn routes
      The associated IP address prefixes for the routes withdrawn from service.
    3. total path attribute length
      The total length of the path field that provides the attributes for a possible route to a destination.
    4. path attributes
      The path attributes presented in variable length TLV format.
    5. network layer reachability information (NLRI)
      IP address prefixes of reachability information.
  3. Keepalive message
    Keepalive messages, consisting of only a 19 octet message header, are exchanged between peers frequently so hold timers do not expire. The keepalive messages determine whether a link is unavailable.
  4. Notification message
    A Notification message is sent when an error condition is detected. The peering session is terminated and the BGP connection (TCP connection) is closed immediately after sending it.
    Figure 19 shows BGP configuration.
    Figure 19:  BGP Configuration 

5.1.2. Group Configuration and Peers

To enable BGP routing, participating routers must have BGP enabled and be assigned to an autonomous system and the neighbor (peer) relationships must be specified. A router typically belongs to only one AS. TCP connections must be established in order for neighbors to exchange routing information and updates. Neighbors exchange BGP open messages that includes information such as AS numbers, BGP versions, router IDs, and hold-time values. Keepalive messages determine whether a connection is established and operational. The hold-time value specifies the maximum time BGP will wait between successive messages (either keep alive or update) from its peer, before closing the connection.

In BGP, peers are arranged into groups. A group must contain at least one neighbor. A neighbor must belong to a group. Groups allow multiple peers to share similar configuration attributes.

Although neighbors do not have to belong to the same AS, they must be able to communicate with each other. If TCP connections are not established between two neighbors, the BGP peering will not be established and updates will not be exchanged.

Peer relationships are defined by configuring the IP address of the routers that are peers of the local BGP system. When neighbor and peer relationships are configured, the BGP peers exchange Update messages to advertise network reachability information.

5.1.3. Hierarchical Levels

BGP parameters are initially applied on the global level. These parameters are inherited by the group and neighbor (peer) levels. Parameters can be modified and overridden on a level-specific basis. BGP command hierarchy consists of three levels:

  1. global level
  2. group level
  3. neighbor level

Many of the hierarchical BGP commands can be modified on different levels. The most specific value is used. That is, a BGP group-specific command takes precedence over a global BGP command. A neighbor-specific statement takes precedence over a global BGP and group-specific command; for example, if you modify a BGP neighbor-level command default, the new value takes precedence over group- and global- level settings.

Note:

Careful planning is essential to implement commands that can affect the behavior of global, group, and neighbor-levels. Because the BGP commands are hierarchical, analyze the values that can disable features on the global or group levels that must be enabled at the neighbor level. For example, if you enable the damping command on the global level but want it disabled only for a specific neighbor (not for all neighbors within the group), you cannot configure a double no command (no no damping) to enable the feature.

5.1.4. Route Reflection

In a standard BGP configuration, all BGP speakers within an AS, must have full BGP mesh to ensure that all externally learned routes are redistributed through the entire AS. iBGP speakers do not re-advertise routes learned from one iBGP peer to another iBGP peer. If a network grows, scaling issues could emerge because of the full mesh configuration requirement. Instead of peering with all other iBGP routers in the network, each iBGP router only peers with a router configured as a route reflector.

Route reflection circumvents the full mesh requirement but maintains the full distribution of external routing information within an AS. Route reflection is effective in large networks because it is manageable, scalable, and easy to implement. Route reflection is implemented in autonomous systems with a large internal BGP mesh to reduce the number of iBGP sessions required within an AS.

Note:

The 7210 SAS can be configured only as route reflector clients. Only the client functionality of a route reflector described here is available for use with the 7210 SAS. The route reflector server-side functionality cannot be used on the 7210 SAS.

A large AS can be sub-divided into one or more clusters. Each cluster contains at least one route reflector which is responsible for redistributing route updates to all clients. Route reflector clients do not need to maintain a full peering mesh between each other. They only require a peering to the route reflector(s) in their cluster. The route reflectors must maintain a full peering mesh between all non-clients within the AS.

Each route reflector must be assigned a cluster ID and specify which neighbors are clients and which are non-clients to determine which neighbors should receive reflected routes and which should be treated as a standard iBGP peer. Additional configuration is not required for the route reflector besides the typical BGP neighbor parameters.

Figure 20 shows a simple full-mesh configuration with several BGP routers. When SR-A receives a route from SR-1 (an external neighbor), it must advertise route information to all of its iBGP peers (SR-B, SR-C, SR-D, etc). To prevent loops, iBGP learned routes are not re-advertised to other iBGP peers.

Figure 20:  Fully Meshed BGP Configuration 

When route reflectors are configured, the routers within a cluster do not need to be fully meshed. Figure 20 shows a fully meshed network and Figure 21 shows the same network but with route reflectors configured to minimize the iBGP mesh between SR-A, SR-B, SR-C, and SR-D. SR-A, configured as the route reflector, is responsible for redistributing route updates to clients SR-B, SR-C, and SR-D. iBGP peering between SR-B, SR-C and SR-D is not necessary because even iBGP learned routes are reflected to the route reflector’s clients.

In Figure 21, SR-E and SR-F are shown as non-clients of the route reflector. As a result, a full mesh of iBGP peerings must be maintained between, SR-A, SR-E and SR-F.

Figure 21:  BGP Configuration with Route Reflectors 

A route reflector enables communication between the clients and non-client peers. Clients of a route reflector do not need to be fully meshed but non-client peers need to be fully meshed within an AS.

A grouping, called a cluster, is composed of a route reflector (or a redundant pair of route reflectors configured with the same cluster-id) and its client peers. Each route reflector is assigned a cluster ID and this defines the cluster that it and its clients belong to. Multiple route reflectors can be configured within a cluster for redundancy. A router assumes the role as a route reflector by configuring the cluster cluster-id command. No other command is required unless you want to disable reflection to specific clients.

When a route reflector receives an advertised route, it selects the best path. If the best path was received from an eBGP peer then it is typically advertised, with next hop unchanged, to all clients and non-client peers of the route reflector. If the best path was received from a non-client peer then it is advertised to all clients of the route reflector. If the best path was received from a client then it is advertised to all clients and non-client peers.

5.1.5. Fast External Failover

Fast external failover on a group and neighbor basis is supported. For eBGP neighbors, this feature controls whether the router should drop an eBGP session immediately upon an interface-down event, or whether the BGP session should be kept up until the hold-time expires.

When fast external failover is disabled, the eBGP session stays up until the hold-time expires or the interface comes back up. If the BGP routes become unreachable as a result of the down IP interface, BGP withdraws the unavailable route immediately from other peers.

5.1.6. Sending of BGP Communities

The capability to explicitly enable or disable the sending of the BGP community attribute to BGP neighbors, other than through the use of policy statements, is supported.

This feature allows an administrator to enable or disable the sending of BGP communities to an associated peer. This feature overrides communities that are already associated with a specific route or that may have been added via an export route policy. That is, even if the export policies leave BGP communities attached to a specific route, when the disable-communities feature is enabled, no BGP communities are advertised to the associated BGP peers.

5.1.7. ECMP and BGP Route Tunnels

Note:

ECMP is not supported for BGP route tunnels.

ECMP is not available for BGP route tunnels and also not on the transport LSP that is used to resolve BGP next-hop. If multiple LSP next-hops are available, only then the first next-hop is used and the rest ignored.

5.1.8. Next-Hop Resolution of BGP Labeled Routes to Tunnels

The user enables the resolution of RFC 3107 BGP label route prefixes using tunnels to BGP next hops in the TTM with using following commands:

CLI Syntax:
config>router>bgp>next-hop-res
labeled-route-transport-tunnel
[no] family family
resolution {any | disabled | filter}
resolution-filter
[no] ldp
[no] rsvp
[no] sr-isis
[no] sr-ospf

The transport-tunnel context allows the user to configure different types of BGP label routes: label-IPv4 and VPN routes (which includes both VPN-IPv4 and VPN-IPv6 routes). By default, all labeled routes resolve to LDP, even if the preceding CLI commands are not configured in the system.

If the resolution command is set to disabled, the default binding to LDP tunnels resumes. If resolution is set to any, the supported tunnel type selection is based on the TTM preference. The order of preference of TTM tunnels is the following:

  1. RSVP
  2. LDP
  3. segment routing OSPF
  4. segment routing IS-IS

If the rsvp option is enabled, BGP searches for the best metric RSVP LSP to the address of the BGP next-hop. The address can correspond to the system interface or to another loopback used by the BGP instance on the remote node. MPLS provides the LSP metric in the tunnel table. In the case of multiple RSVP LSPs with the same lowest metric, BGP selects the LSP with the lowest tunnel ID.

If the ldp option is enabled, BGP searches for an LDP LSP with a FEC prefix corresponding to the address of the BGP next-hop.

If the sr-isis or sr-ospf option is enabled, an SR tunnel to the BGP next-hop is selected in the TTM from the lowest preference IS-IS or OSPF instance. If many instances have the same lowest preference, the lowest numbered IS-IS or OSPF instance is chosen.

If one or more explicit tunnel types are specified using the resolution-filter option, only these tunnel types are selected again following the TTM preference. The resolution command must be set to filter to activate the list of tunnel types configured in the resolution-filter context.

5.1.8.1. VPN-IPv4 and VPN-IPv6 Route Resolution

The user enables the resolution of VPN-IPv4 and VPN-IPv6 prefixes using tunnels to MP-BGP peers using the following commands:

CLI Syntax:
config>service>vprn
auto-bind-tunnel
resolution {any|disabled|filter}
resolution-filter
[no] ldp
[no] rsvp
[no] sr-isis
[no] sr-ospf

The auto-bind-tunnel context configures the binding of VPRN routes to tunnels. The user must configure the resolution command to enable auto-bind resolution to tunnels in the TTM. If the resolution command is set to disabled, auto-binding to a tunnel is removed.

If the resolution command is set to any, any supported tunnel type in the vprn context is selected following the TTM preference. If one or more explicit tunnel types are specified using the resolution-filter command, only these tunnel types are selected again following the TTM preference. The following tunnel types are supported in a vprn context in order of preference: RSVP, LDP, and segment routing (SR).

If the rsvp command is enabled, BGP searches for the best metric RSVP LSP to the address of the BGP next-hop. This address can correspond to the system interface or to another loopback that the BGP instance uses on the remote node. MPLS provides the LSP metric in the tunnel table. In the case of multiple RSVP LSPs with the same lowest metric, BGP selects the LSP with the lowest tunnel ID.

If the ldp command is enabled, BGP searches for an LDP LSP with a FEC prefix corresponding to the address of the BGP next-hop.

If the sr-isis or sr-ospf command is configured, an SR tunnel to the BGP next-hop is selected in the TTM from the lowest preference ISIS or OSPF instance. If many instances have the same lowest preference, the lowest numbered IS-IS or OSPF instance is chosen.

The BGP tunnel type is not explicitly configured in VPRN resolution and is therefore implicit. It is always preferred over any other tunnel type enabled in the auto-bind-tunnel context. However, the BGP tunnel type is configurable as a new tunnel type for BGP EVPN prefixes. The user must enable the BGP tunnel type to ensure that inter-area or inter-as prefixes are resolved.

The user must set the resolution command to filter to activate the list of tunnel types configured under resolution-filter.

When configured in a VPRN service (using the configure>service>vprn>spoke-sdp command), an explicit SDP to a BGP next-hop overrides the auto-bind-tunnel selection for that BGP next-hop only. There is no support for reverting automatically to the auto-bind-tunnel selection if the explicit SDP goes down. The user must delete the explicit spoke-SDP in the VPRN service to resume using the auto-bind-tunnel selection for the BGP next-hop.

5.1.9. Route Selection Criteria

For each prefix in the routing table, the routing protocol selects the best path. Then, the best path is compared to the next path in the list until all paths in the list are exhausted. The following parameters are used to determine the best path:

  1. Routes are not considered if they are unreachable.
  2. An RTM’s preference is lowered as well as the hierarchy of routes from a different protocol. The lower the preference the higher the chance of the route being the active route.
  3. Routes with higher local preference have preference.
  4. Routes with the shorter AS path have preference.
  5. Routes with the lower origin have preference. IGP = 0 EGP = 1 INCOMPLETE = 2
  6. Routes with the lowest MED metric have preference. Routes with no MED value are exempted from this step unless always-compare-med is configured.
  7. Routes learned by an eBGP peer rather than those learned from an iBGP peer are preferred.
  8. Routes with the lowest IGP cost to the next-hop path attribute are preferred.
  9. Routes with the lowest BGP-ID are preferred.
  10. Routes with shortest cluster list are preferred.
  11. Routes with lowest next-hop IP address are preferred.

Notes:

  1. For BGP-VPN routes with the same prefix but a different Route Distinguisher (RD) that are imported in a VRF, if ECMP is not enabled in that VRF, the preceding selection criteria are used until parameter point 8. If all selection criteria are still the same after that point, the last updated route will be selected.
  2. For BGP-VPN routes with the same prefix but a different Route Distinguisher (RD) that reach parameter point 8 in the selection criteria, all routes are flagged as BEST and USED, although the actual number of used routes depends on the ECMP value configured in the VRF.
    Note:

    7210 SAS devices do not support BGP ECMP (multi-path). That is, an ECMP value of 1 is always used.

  3. For BGP-VPN routes with the same prefix and same Route Distinguisher (RD) that reach parameter point 8 in the selection criteria, such routes are flagged as BEST but parameter points 9 to 11 determine which routes are submitted to the VRF and marked as USED in accordance with the ECMP value configured in the VRF.

5.1.10. BGP Path Attributes

A BGP route for a specific NLRI is distinguished from other BGP routes for the same NLRI by its set of path attributes. Each path attribute is encoded as a TLV in the Path Attributes field of the Update message, and describes a property of the path. The type field of the TLV identifies the path attribute and the value field carries data specific to the attribute type.

The 7210 SAS supports the following path attributes:

  1. ORIGIN (well-known mandatory)
  2. AS_PATH (well-known mandatory)
  3. NEXT_HOP (well-known; required only in Update messages that have IPv4 prefixes in the NLRI field); see Next-hop Indirection for information about the NEXT_HOP attribute
  4. MED (optional non-transitive)
  5. LOCAL_PREF (well-known; required only in Update messages sent to iBGP peers)
  6. ATOMIC_AGGR (well-known discretionary)
  7. AGGREGATOR (optional transitive)
  8. COMMUNITY (optional transitive)
  9. ORIGINATOR_ID (optional non-transitive)
  10. CLUSTER_LIST (optional non-transitive)
  11. MP_REACH_NLRI (optional non-transitive)
  12. MP_UNREACH_NLRI (optional non-transitive)
  13. EXT_COMMUNITY (optional transitive)
  14. AS4_PATH (optional transitive)
  15. AS4_AGGREGATOR (optional transitive)
  16. CONNECTOR (optional transitive)
  17. PMSI_TUNNEL (supported only on platforms that support NG-MVPN with BGP signaling; refer to the 7210 SAS OS Release Notes for more information about NG-MVPN with BGP signaling)

5.1.10.1. NEXT_HOP Attribute

The NEXT_HOP attribute indicates the IPv4 address of the BGP router that is the next hop to reach the IPv4 prefixes in the NLRI field. If the Update message is advertising routes other than IPv4 unicast routes, next hop of these routes is encoded in the MP_REACH_NLRI attribute and the NEXT_HOP attribute is not included in the Update message.

5.1.10.1.1. Next-hop Indirection

The 7210 SAS supports next-hop indirection for most types of BGP routes. Next-hop indirection means that BGP next hops are logically separated from resolved next hops in the forwarding plane (IOMs). The separation allows the grouping of routes that share the same BGP next hops such that if the method of BGP next-hop resolution changes, only one forwarding plane update is required, instead of one update for each route in the group. The convergence time after the next-hop resolution change is uniform, and not linear, with the number of prefixes. The next-hop indirection technology supports Prefix-Independent Convergence (PIC). The 7210 SAS uses next-hop indirection whenever possible; there is no option to disable the functionality.

On the 7210 SAS, the following families support next-hop indirection:

  1. label-IPv4
  2. VPN-IPv4
  3. label-IPv6
  4. VPN-IPv6
  5. L2-VPN
  6. PW route

5.1.11. BGP Routing Information Base

The entire set of BGP routes learned and advertised by a BGP router make up its BGP Routing Information Base (RIB). Conceptually, the BGP RIB contains three parts:

  1. RIB-IN
  2. LOC-RIB
  3. RIB-OUT

The RIB-IN (or Adj-RIBs-In, as defined in RFC 4271) contains the BGP routes received from peers that the router has stored in its memory.

The LOC-RIB contains modified versions of the BGP routes in the RIB-IN. The path attributes of a RIB-IN route can be modified using BGP import policies. All LOC-RIB routes for the NLRI are compared using the BGP decision process, which selects the best path for each NLRI. The local router uses the best paths in the LOC-RIB for forwarding, filtering, auto-discovery, and other tasks.

The RIB-OUT (or Adj-RIBs-Out, as defined in RFC 4271) contains the BGP routes selected for advertisement to peers. A BGP route is generally not advertised to a peer; that is, the router is not held in the RIB-OUT unless it is used locally, but there are exceptions. BGP export policies modify the path attributes of a LOC-RIB route to create the path attributes of the RIB-OUT route. A specific LOC-RIB route can be advertised with different path attribute values to different peers, and a 1:N relationship may exist between LOC-RIB and RIB-OUT routes.

5.1.11.1. LOC-RIB Features

The 7210 SAS implements the following LOC-RIB processing features:

  1. BGP decision process
  2. BGP route installation in the route table
  3. BGP route installation in the tunnel table
  4. BGP fast reroute
  5. route flap damping (RFD)

See BGP Fast Reroute for more information about BGP fast reroute.

5.1.11.2. BGP Fast Reroute

BGP fast reroute uses indirection techniques in the forwarding plane and BGP backup path precomputation in the control plane to support the fast reroute of BGP traffic around unreachable or failed BGP next hops. BGP fast reroute is supported for label-IPv4 routes.

Table 61 describes the scenarios supported in the base router BGP context.

Table 61:  BGP Fast Reroute Scenarios (Base Router Context) 

Ingress Packet

Primary Route

Backup Route

PIC

IPv4

IPv4 route with next-hop A resolved by an IPv4 route or any shortcut tunnel

IPv4 route with next-hop B resolved by an IPv4 route or any shortcut tunnel

No

IPv4

Label-IPv4 route with next-hop A resolved by any transport tunnel

Label-IPv4 route with next-hop B resolved by any transport tunnel

Yes

IPv4

Label-IPv4 route with next-hop A resolved by a local route

Label-IPv4 route with next-hop B resolved by a local route

Yes

IPv4

Label-IPv4 route with next-hop A resolved by a static route

Label-IPv4 route with next-hop B resolved by a static route

Yes

5.1.11.2.1. Calculating Backup Paths

BGP fast reroute is optional on the 7210 SAS. Use the bgp backup-path command to enable the feature.

Note:

On the 7210 SAS, the backup-path command is supported only for label-IPv4 routes.

In the base router context, the backup-path command is used to control fast reroute on a per-RIB basis (labeled IPv4 routes). When the command specifies a particular family, BGP attempts to find a backup path for every prefix learned by the associated BGP RIB.

The backup path is the best path after the primary path and any paths using the same BGP next hop as the primary path have been removed.

5.1.11.2.2. Failure Detection and Switchover to the Backup Path

When BGP fast reroute is enabled, BGP decides when a primary path is no longer usable and notifies the IOM. Based on BGP input, the IOM immediately reroutes affected traffic to the backup path.

When BGP fast reroute is enabled, the IOM reroutes traffic onto a backup path based on input from BGP. When BGP decides that a primary path is no longer usable, it notifies the IOM and affected traffic is immediately switched to the backup path.

The following events trigger failure notifications to the IOM and traffic rerouting to backup paths:

  1. peer IP address is unreachable and peer tracking is enabled
  2. BFD session associated with the BGP peer goes down
  3. BGP session is terminated with the peer (for example, send or receive NOTIFICATION)
  4. there is no longer any route (allowed by the next-hop resolution policy, if configured) that can resolve the BGP next-hop address
  5. BGP tunnel that resolves the next hop goes down because the BGP label-IPv4 route is withdrawn by the peer or becomes invalid due to an unresolved next hop

5.1.12. BGP Add-Path

5.1.12.1. Receiving Multiple Paths per Prefix from a BGP Peer

If the 7210 SAS receives an advertisement of an NLRI and path from a specific peer and that peer subsequently advertises the same NLRI with different path information (a different next-hop or different path attributes), the new path overwrites the existing path.

However, when the add-path has been negotiated with the peer, the newly advertised path is stored in the RIB-IN along with all paths previously advertised (and not withdrawn) by the peer.

For router A to receive multiple paths per NLRI from peer B for a specific address family (AFI x, SAFI y), the BGP capabilities advertisement during session setup must indicate that peer B must send multiple paths for (AFI x, SAFI y) and router A is willing to receive multiple paths for (AFI x, SAFI y).

When the add-path receive capability for (AFI x, SAFI y) has been negotiated with a peer, all advertisements and withdrawals of NLRI within that address family by that peer include a path identifier.

If the add-path has been negotiated with a peer and a path identifier is expected but missing, or if the add-path has not been negotiated with a peer and a path identifier is present but not expected, a Notification message is sent with the error subcode indicating Invalid Network Field, in accordance with standard BGP error handling procedures.

The path identifiers have no significance to the receiving peer. If the combination of NLRI and path identifier in an advertisement from a peer is unique (does not match an existing route in the RIB-IN from that peer), the route is added to the RIB-IN. If the combination of NLRI and path identifier in a received advertisement is the same as an existing route in the RIB-IN from the peer, the new route replaces the existing one. If the combination of NLRI and path identifier in a received withdrawal matches an existing route in the RIB-IN from the peer, that route is removed from the RIB-IN.

A BGP Update message from an add-path peer may advertise and withdraw more than one NLRI belonging to one or more address families. In this case, the add-path may be supported for some address families and not others. In this situation, the receiving BGP router should not require that all path identifiers in the Update message be the same.

Figure 22 shows an Update message carrying an IPv4 NLRI with a path identifier.

Figure 22:  BGP Update Message with Path Identifier for IPv4 NLRI 

Currently, add-path is only supported for the iBGP sessions with other add-path capable peers. The add-path capability is not supported for eBGP sessions or for native IPv4 and IPv6 routes (that is, IPv4 and IPv6 routes advertised without a label) in iBGP sessions.The ability to receive multiple paths per prefix from an add-path peer is configurable per route type. The supported route types are the following:

  1. label-IPv4
  2. label-IPv6

5.1.12.2. Path Selection with Add-Paths

The LOC-RIB may have multiple paths for a prefix. The path selection mode refers to the algorithm used to decide which of these paths to advertise to an add-path peer. SR OS supports the Add-N path selection algorithm described in draft-ietf-idr-add-paths-guidelines. The Add-N algorithm selects as candidates for advertisement the N best paths with unique BGP next-hops. In the SR OS implementation, the default value of N is configurable per address family at the BGP instance, group, and neighbor levels; however, this default value can be overridden for specific prefixes using route policies. The maximum number of paths to advertise for a prefix to an add-path neighbor is the value N assigned by a BGP import policy to the best path for P; otherwise, it defaults to the neighbor, group, or instance level configuration of N for the address family to which P belongs.

Add-paths allows non-best paths to be advertised to a peer, but it still complies with basic BGP advertisement rules, such as the iBGP split horizon rule that a route learned from an iBGP neighbor cannot be readvertised to another iBGP neighbor unless the router is configured as a route reflector.

5.1.12.3. BGP Decision Process with Add-Path

To use multiple paths per NLRI for forwarding and to advertise multiple paths per NLRI to add-path peers, a router implementing an add-path must run a modified version of the BGP decision process. The existing BGP decision algorithm selects the one best path for any particular NLRI. Paths that are second best or third best remain in the RIB-IN but are not installed in the LOC-RIB and not advertised to peers.

The system automatically changes its BGP decision process for routes belonging to a particular address family whenever either of the following applies:

  1. BGP edge PIC is enabled for the address family
  2. the add-path send capability is enabled for that address family on one or more peering sessions

When BGP PIC is enabled, the BGP decision process selects a backup path per prefix or NLRI to install in the LOC_RIB. The algorithm is summarized as follows.

  1. Select the single best path based on a full evaluation of all the BGP tie-breaking rules, as described in the following examples.
    1. Select the route with the highest route preference.
    2. From all routes with an AIGP metric, select the route with the lowest sum of the AIGP metric value stored with the RIB-IN copy of the route and the iteratively resolved distance between the calculating router and the BGP NEXT_HOP of the route.
    3. Select the route with the highest local preference (LOCAL_PREF).
    4. Select the route with the shortest AS path.
    5. Select the route with the lowest origin.
    6. Among routes advertised by the same neighbor AS (unless always-compare-med is configured). Select the route with the lowest MED.
    7. Prefer routes learned from eBGP peers over routes learned from iBGP peers.
    8. Select the route with the lowest IGP cost (unless ignore-nh-metric is configured).
    9. Select the route received by the peer with the lowest originator ID or BGP identifier.
    10. Select the route with the shortest cluster list.
    11. Select the route received from the lowest peer IP address.
  2. Select up to one additional second best path among the paths remaining after removing from consideration all paths with a NEXT_HOP or BGP identifier (or originator ID) in common with any of the previously-selected best paths. A full evaluation of all the BGP tie-breaking rules is required to find this single second-best path, as shown in the following examples.
    1. Select the route with the highest route preference.
    2. Select the route with the highest local preference (LOCAL_PREF).
    3. Select the route with the shortest AS path (unless as-path-ignore is configured).
    4. Select the route with the lowest origin.
    5. Among routes advertised by the same neighbor AS (unless always-compare-med is configured) select the route with the lowest MED.
    6. Prefer routes learned from eBGP peers over routes learned from iBGP peers.
    7. Select the route with the lowest IGP cost.
    8. Select the route received by the peer with the lowest originator ID or BGP identifier.
    9. Select the route with the shortest cluster list.
    10. Select the route received from the lowest peer IP address.

5.1.12.4. Advertising Multiple Paths Using Add-Path

For router A to send multiple paths per NLRI to peer B for a particular address family (AFI x, SAFI y), the BGP capability advertisement during session setup must indicate that router A must send multiple paths for (AFI x, SAFI y), and peer B is willing to receive multiple paths for (AFI x, SAFI y).

By default, unless changed through configuration, all paths for a particular NLRI in the LOC-RIB are advertised to all add-path peers with which the send capability has been negotiated. All such advertisements (and any subsequent withdrawals) include a path identifier. Each advertised path for a specific NLRI must have a unique path identifier. When a path is reflected or propagated from one peer to another, the path identifier is expected to change, even if there has been no change in the next-hop. A BGP Update message sent to an add-path peer may advertise and withdraw more than one NLRI belonging to one or more address families. In this case, the add-path may be supported for some address families and not others, and the path identifiers associated with different NLRI in the Update message may be the same or different.

In the current implementation, the add-path is only supported by the iBGP sessions it forms with other add-path capable peers. The add-path capability is not supported for eBGP sessions or for native IPv4 and IPv6 routes (that is, IPv4 and IPv6 routes advertised without a label) in iBGP sessions.The ability to receive multiple paths per prefix from an add-path peer is configurable per route type. The following route types are supported:

  1. label-IPv4
  2. label-IPv4

5.1.12.5. Limiting the Number of Paths per Prefix

Advertising multiple paths per prefix to a peer means that the peer must maintain more entries in its RIB-IN than would be the case without add-path. The memory and CPU resources associated with these extra paths may not be justified if the peer cannot take advantage of them. Operators may therefore want precise control over the number of paths per prefix to send to particular peers.

The new add-paths CLI node (BGP, group or neighbor level) has address family-specific commands to set the maximum number of paths to send per prefix.

To ensure routing consistency in cases where an add-path speaking router has a mix of add-path and non add-path peers and where the number of paths to send for a particular prefix can vary by add-path peer, the following behavior should be enforced: if the advertising router advertises n paths for prefix XYZ to peer1 and m paths to peer2, and n < m, then all the paths advertised to peer1 must be included in the paths advertised to peer2. Suppose the LOC-RIB has N paths for prefix XYZ. The preceding behavior can be guaranteed if:

  1. the N paths are sorted in strict order of their preference by the BGP decision process: p1 (overall best path found during step 1 of BGP Decision Process with Add-Path), p2, p3, …, pN (a path found during step 2 or 3 of BGP Decision Process with Add-Path)
  2. p1 (only) is advertised to non add-path peers, add-path peers that indicate a send-only capability and add-path peers for which the configured path-limit is 1
  3. (p1, p2) is advertised to add-path peers for which the configured path-limit is 2
  4. (p1, p2, p3, …, pN) is advertised to add-path peers for which the configured path-limit is N, or else the path limit is configured as max (default)

5.1.13. AIGP Metric

The accumulated IGP (AIGP) metric is an optional, non-transitive attribute that can be attached to selected routes using route policies. In networks that use AIGP, BGP paths with a lower end-to-end IGP cost are preferred, even if the compared paths span more than one AS or IGP instance. AIGP differs from MED in the following important ways.

  1. AIGP is not transitive between completely distinct autonomous systems. It is only transitive across internal AS boundaries.
  2. AIGP is always compared in paths that have the AIGP attribute, regardless of whether they are located in different neighbor ASs.
  3. AIGP is more important than MED in the BGP decision process.
  4. AIGP is automatically incremented every time there is a BGP next-hop change, so that the system can track the end-to-end IGP cost. All arithmetic operations on MED attributes must be performed manually, such as by using route policies.

On the 7210 SAS, AIGP is supported only in the base router BGP instance and only for label-IPv4 and 6PE routes. The AIGP attribute is only sent to peers configured using the aigp command. If the attribute is received from a peer that is not configured for AIGP, or if the attribute is received in a non-supported route type, the attribute is discarded and not propagated to other peers. The AIGP attribute is still displayed in BGP show command output.

When the 7210 SAS receives a route with an AIGP attribute and it re-advertises the route to an AIGP-enabled peer without changes to the BGP next hop, the AIGP metric value is unchanged by the advertisement (RIB-OUT) process. However, if the route is re-advertised with a new BGP next hop, the AIGP metric value is automatically incremented, either by the route table or tunnel table cost to reach the received BGP next hop, or by a value configured using route policies.

5.1.14. Command Interactions and Dependencies

This section describes the BGP command interactions and dependencies that apply to the configuration or operational maintenance of 7210 SAS routers.

See the BGP Command Reference for detailed descriptions of the BGP configuration commands.

5.1.14.1. Changing the ASN

If the autonomous system number (ASN) is changed on a router with an active BGP instance, the new ASN is not used until the BGP instance is restarted, either by administratively disabling or enabling the BGP instance or by rebooting the system with the new configuration.

5.1.14.2. Changing the Local ASN

Changing the local ASN of an active BGP instance:

  1. at the global level, causes the BGP instance to restart with the new local ASN
  2. at the group level, causes BGP to re-establish the peer relationships with all peers in the group with the new local ASN
  3. at the neighbor level, causes BGP to re-establish the peer relationship with the new local ASN

5.1.14.3. Changing the Router ID at the Configuration Level

If you configure a new router ID in the config>router context, protocols are not automatically restarted with the new router ID. The updated router ID is only used the next time the protocol is initialized or reinitialized. An interim period can occur when the protocols use different router IDs.

5.1.14.4. Hold Time and Keep Alive Timer Dependencies

The BGP hold time specifies the maximum time BGP will wait between successive messages (either keep alive or update) from its peer, before closing the connection. This configuration parameter can be set at three levels. The most specific value is used.

  1. Global level — applies to all peers
  2. Group level — applies to all peers in group
  3. Neighbor level — only applies to specified peer

Although the keep alive time can be user specified, the configured keep alive timer is overridden by the value of hold time under the following circumstances:

  1. If the hold time specified is less than the configured keep alive time, then the operational keep alive time is set to one third of the specified hold time; the configured keep alive time is unchanged.
  2. If the hold time is set to zero, then the operational value of the keep alive time is set to zero; the configured keep alive time is unchanged. This means that the connection with the peer will be up permanently and no keep alive packets are sent to the peer.

If the hold time or keep alive values are changed, the changed timer values take effect when the new peering relationship is established. Changing the values cause the peerings to restart. The changed timer values are used when renegotiating the peer relationship.

5.1.14.5. Import and Export Route Policies

Import and export route policy statements are specified for BGP on the global, group, and neighbor level. Up to five unique policy statement names can be specified in the command line per level. The most specific command is applied to the peer. Defining the policy statement name is not required before being applied. Policy statements are evaluated in the order in which they are specified within the command context.

The import and export policies configured on different levels are not cumulative. The most specific value is used. An import or export policy command specified on the neighbor level takes precedence over the same command specified on the group or global level. An import or export policy command specified on the group level takes precedence over the same command specified on the global level.

5.1.14.6. Route Damping and Route Policies

To prevent BGP systems from sending excessive route changes to peers, BGP route damping can be implemented. Damping can reduce the number of Update messages sent between BGP peers, to reduce the load on peers, without adversely affecting the route convergence time for stable routes.

The damping profile defined in the policy statement is applied to control route damping parameters. Route damping characteristics are specified in a route damping profile and are referenced in the action for the policy statement or in the action for a policy entry. Damping can be specified at the global, group, or neighbor level with the most specific command applied to the peer.

5.1.14.7. AS Override

The BGP-4 Explicit AS Override simplifies the use of the same ASN across multiple RFC 2547 VPRN sites.

The Explicit AS Override feature can be used in VPRN scenarios where a customer is running BGP as the PE-CE protocol and some or all of the CE locations are in the same Autonomous System (AS). With normal BGP, two sites in the same AS would not be able to reach each other directly since there is an apparent loop in the ASPATH.

With AS Override enabled on an egress eBGP session, the Service Provider network can rewrite the customer ASN in the ASPATH with its own ASN as the route is advertised to the other sites within the same VPRN.

5.1.15. Configuration Guideline for BGP

The following are the configuration guidelines for BGP:

  1. 7210 SAS can act only as a route reflector client.
  2. 7210 SAS support IPv4 family for PE-CE eBGP instance and for RFC 3107 labeled IPv4 routes. It supports IPv4 family in the base routing instance to exchange IPv4 routes.
  3. 7210 SAS support IPv6 family for PE-CE eBGP instance and for RFC 3107 labeled IPv6 routes. It supports IPv6 family in the base routing instance to exchange IPv6 routes.

5.2. BGP Configuration Process Overview

Figure 23 shows the process to provision basic BGP parameters.

Figure 23:  BGP Configuration and Implementation Flow 

5.3. Configuration Notes

This section describes BGP configuration caveats.

5.3.1. General

  1. Before BGP can be configured, the router ID (a valid host address, not the MAC address default) and autonomous system global parameters must be configured.
  2. BGP instances must be explicitly created on each BGP peer. There are no default BGP instances on a 7210 SAS.

5.3.1.1. BGP Defaults

The following list summarizes the BGP configuration defaults:

  1. By default, the 7210 SAS is not assigned to an AS.
  2. A BGP instance is created in the administratively enabled state.
  3. A BGP group is created in the administratively enabled state.
  4. A BGP neighbor is created in the administratively enabled state.
  5. No BGP router ID is specified. If no BGP router ID is specified, BGP uses the router system interface address.
  6. The 7210 SAS BGP timer defaults are the values recommended in IETF drafts and RFCs (see BGP MIB Notes)
  7. If no import route policy statements are specified, all BGP routes are accepted.
  8. If no export route policy statements specified, all best and used BGP routes are advertised and non-BGP routes are not advertised.

5.3.1.2. BGP MIB Notes

The 7210 SAS implementation of the RFC 1657 MIB variables listed in Table 62 differs from the IETF MIB specification.

Table 62:  7210 SAS and IETF MIB Variations  

MIB Variable

Description

RFC 1657 Allowed Values

Allowed Values

bgpPeerMinASOriginationInterval

Time interval in seconds for the MinASOriginationInterval timer. The suggested value for this timer is 15 seconds.

1 to 65535

2 to 255

bgpPeerMinRouteAdvertisementInterval

Time interval in seconds for the MinRouteAdvertisementInterval timer. The suggested value for this timer is 30.

1 to 65535

1 to 255  a

    Note:

  1. A value of 0 is supported when the rapid-update command is applied to an address family that supports it.

If SNMP is used to set a value of X to the MIB variable in Table 63, there are three possible results:

Table 63:  MIB Variable with SNMP 

Condition

Result

X is within IETF MIB values

and

X is within 7210 SAS values

SNMP set operation does not return an error

MIB variable set to X

X is within IETF MIB values

and

X is outside 7210 SAS values

SNMP set operation does not return an error

MIB variable set to “nearest” 7210 SAS supported value (e.g. 7210 SAS range is 2 - 255 and X = 65535, MIB variable will be set to 255)

Log message generated

X is outside IETF MIB values

and

X is outside 7210 SAS values

SNMP set operation returns an error

When the value set using SNMP is within the IETF allowed values and outside the 7210 SAS values as specified in Table 62 and Table 63, a log message is generated. The log messages that display are similar to the following log messages:

Sample Log Message for setting bgpPeerMinASOriginationInterval to 65535

576 2006/11/12 19:45:48 [Snmpd] BGP-4-
bgpVariableRangeViolation: Trying to set bgpPeerMinASOrigInt to 65535 -
 valid range is [2-255] - setting to 255

Sample Log Message for setting bgpPeerMinASOriginationInterval to 1

594 2006/11/12 19:48:05 [Snmpd] BGP-4-
bgpVariableRangeViolation: Trying to set bgpPeerMinASOrigInt to 1 -
 valid range is [2-255] - setting to 2

Sample Log Message for setting bgpPeerMinRouteAdvertisementInterval to 256

535 2006/11/12 19:40:53 [Snmpd] BGP-4-
bgpVariableRangeViolation: Trying to set bgpPeerMinRouteAdvInt to 256 -
 valid range is [2-255] - setting to 255

Sample Log Message for setting bgpPeerMinRouteAdvertisementInterval to 1

566 2006/11/12 19:44:41 [Snmpd] BGP-4-
bgpVariableRangeViolation: Trying to set bgpPeerMinRouteAdvInt to 1 -
 valid range is [2-255] - setting to 2