Health status and failure events

A health value determines the activity of a NAT group within a pair of redundant nodes. The health value of a NAT group is calculated internally. The system can automatically decrease this value in response to events that reduce its ability to perform NAT at the required capacity.

A NAT group with a higher health value becomes active.

Table: Activity states at equal health shows the activity states when paired NAT groups have equal health values on both nodes. Preferred is a configuration parameter that influences the activity state of a pair of NAT groups with equal health values (a typical use case is load balancing per NAT group).

Table: Activity states at equal health
Node 1 | Node 2 | Active node | Comments
--- | --- | --- | ---
no preferred configured | no preferred configured | Whichever node becomes active first, remains active | If both nodes are becoming active simultaneously, the node with the highest system chassis MAC address becomes a controller node that decides which node becomes the active node and which the standby, based on the health and preference values. When the health and preference are equal, the controller node does not preempt (trigger a switchover) an already active node.
preferred configured | no preferred configured | Node 1 | Node 1 always preempts Node 2 (if the health values are equal)
no preferred configured | preferred configured | Node 2 | Node 2 always preempts Node 1 (if the health values are equal)
preferred configured | preferred configured | Whichever node becomes active first, remains active | Same as for no preference on both nodes
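
The selection rules in the table can be illustrated with a minimal Python sketch. It is not the system's implementation: the names NatGroupNode, controller_node, and select_active, and the MAC string format, are assumptions made for illustration. The text does not state which node the controller picks when both nodes start simultaneously with equal health and preference, so the sketch returns None in that case rather than guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NatGroupNode:
    """Hypothetical view of one node's state for a paired NAT group."""
    name: str          # illustrative label, for example "node1"
    health: int        # internally calculated health value
    preferred: bool    # True if "preferred" is configured on this node
    chassis_mac: str   # system chassis MAC, assumed format "00:11:22:33:44:55"


def mac_to_int(mac: str) -> int:
    """Convert a MAC string to an integer so two MACs can be compared."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def controller_node(a: NatGroupNode, b: NatGroupNode) -> NatGroupNode:
    """The node with the highest chassis MAC arbitrates a simultaneous start."""
    return a if mac_to_int(a.chassis_mac) > mac_to_int(b.chassis_mac) else b


def select_active(a: NatGroupNode, b: NatGroupNode,
                  current_active: Optional[str]) -> Optional[str]:
    """Return the name of the node whose NAT group should be active."""
    # Rule 1: the NAT group with the higher health value becomes active.
    if a.health != b.health:
        return a.name if a.health > b.health else b.name
    # Rule 2: at equal health, a node that alone has "preferred" preempts.
    if a.preferred != b.preferred:
        return a.name if a.preferred else b.name
    # Rule 3: full tie, so an already active node is not preempted.
    # If no node is active yet, the controller node decides; the outcome
    # is not specified in the text, so None is returned here.
    return current_active
```

Calling select_active with equal health values and "preferred" set only on the second node returns the second node's name, even if the first node is currently active, which corresponds to the third row of the table.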

The health parameter is initially set to a value of 1000 under the following circumstances:

The above circumstances imply that the system is fully operational with no failures that would affect NAT operation.

However, the health value can also be influenced by events that affect NAT operation but are not ISA-related failures, for example, unhealthy ports and paths that carry traffic in and out of the NAT node. Such events are explicitly tracked or monitored so that the health value can be adjusted dynamically, which in turn influences the activity of the NAT groups.
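
The dynamic adjustment described above can be sketched as follows. This is only an illustration under stated assumptions: the text gives the initial value of 1000 but does not say how much each event decreases it, so the per-object penalties, the object names, the current_health function, and the floor at 0 are all hypothetical.

```python
from typing import Dict, Set

BASE_HEALTH = 1000  # initial health value stated in the text


def current_health(penalties: Dict[str, int], failed: Set[str]) -> int:
    """Subtract the penalty of every monitored object that is currently down.

    penalties: hypothetical mapping of monitored object -> health decrease
    failed:    names of monitored objects that are currently failed
    """
    drop = sum(p for obj, p in penalties.items() if obj in failed)
    # Assumption: the health value never goes below zero.
    return max(0, BASE_HEALTH - drop)


# Hypothetical monitored objects and penalties, for illustration only.
example_penalties = {
    "inside-port": 200,
    "outside-port": 200,
    "uplink-oper-group": 400,
}
healthy_node = current_health(example_penalties, set())
degraded_node = current_health(example_penalties, {"outside-port"})
```

With these assumed penalties, a node that loses its monitored outside port ends up with a lower health value than a fully healthy peer, so the peer's NAT group becomes active.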

Stateful inter-chassis NAT redundancy protects against the following failures:

Port and oper-group state changes influence the reachability of the NAT node and, consequently, network-wide NAT operation. If the port or path capacity in and out of the NAT node drops below a specific level, a switchover to a healthier NAT node may be needed.

Port states can be tracked or monitored on the private side (inside) and on the public side (outside) of NAT.

Oper-groups are constructs that track the states of BFD-enabled interfaces, SAPs, and VRRP instances.

BFD sessions targeted to the next hop can traverse intermediate Layer 2 nodes and can have longer reach than port tracking.

Another benefit of monitoring ports and paths is that it can help reduce the amount of traffic on the inter-chassis communication link (ICL) if the active node loses its direct connection to the node downstream or upstream from it. The ICL must always be present for synchronization purposes. However, it does not need to be designed to carry the heavy traffic loads over extended periods that would occur if traffic-bearing ports were not colocated with the active node. Instead, the link carries such traffic only during the shorter transient periods caused by switchovers.