Flow synchronization

The goal of flow synchronization is to minimize service interruption after a switchover. Minimum service interruption in this context allows some packet drop during a switchover, but the user sessions are preserved without requiring a user to restart them. Flows are synchronized directly between the ISAs at the time of creation and deletion, and always only from the active side to the standby side.

The amount of traffic carried across the inter-chassis link is proportional to the number of flows that are synchronized, and it also depends on the NAT type (NAT44, DS-Lite or NAT64). The size of a flow record on the wire is around 100 bytes and a number of flow records can be packed into a single frame whose MTU is adjustable. With approximately 90 bytes of header overhead per frame, and the number of synchronized flows, it is relatively easy to estimate the required bandwidth of the inter-chassis links. The minimum recommended bandwidth of the inter-chassis link is 10 Gb/s.

Excluding short-lived flows from synchronization could further reduce the necessary bandwidth for synchronization. The flow replication threshold is configurable:

 configure
      isa
         nat-group <id>
            active-mda-limit <limit>   
            inter-chassis-redundancy
               replication-threshold <seconds>   [0..300]

For the replication-threshold, Nokia recommends choosing a value that is larger than any of the typical short-lived flows that are not closed by the protocol itself but rely on the timeout value for its expiration (udp-dns, udp-initial, or icmp-query).

After a switchover, a resynchronization of flows occurs. The new standby ISA starts clearing all its flows and the synchronization process restarts. This means that an attempt is made to resynchronize flows from the currently active side to the standby side. During this process, the active ISAs continue to forward traffic and create new flows.

While the flow synchronization is in progress, a switchover is not allowed unless the health on the active side drops to 0, which means that one or more ISAs on the active side have failed.