A VC split can occur as a result of any of the following failures:
a single stacking link failure
a single node failure
a double failure consisting of two failed links, two failed nodes, or a failed link and a failed node
In the event of a single stacking link failure in a VC, all nodes, including the active and standby CPM-IMM nodes, can continue to communicate with each other. There is no impact on the control plane and no services are lost, because an alternate path exists around the point of failure. Services on all the IMM-only nodes continue to forward traffic; however, there might be an impact to the switching throughput, because the stacking port bandwidth is reduced by half.
Similarly, in the event of a single node failure in a VC, the VC can continue to operate with the active CPM-IMM node (or the standby CPM-IMM node, if the active node failed). There is no impact to the control plane, and services on all the surviving IMM-only nodes continue to forward traffic. However, services provisioned on the failed node are lost, and there might be an impact to the switching throughput, because the stacking port bandwidth is reduced by half.
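The halving of the stacking bandwidth follows from the topology: while the stacking ring is intact, traffic can be load-shared across both directions, but a single link or node failure leaves only one path around the ring. The following minimal sketch illustrates the arithmetic, assuming a ring of equal-speed stacking ports; the function name and the 10 Gb/s figure are illustrative assumptions, not platform specifications.

```python
def available_stacking_bandwidth(port_gbps: float, ring_intact: bool) -> float:
    """Usable stacking bandwidth for a node in the VC ring (illustrative)."""
    # With the ring intact, both stacking ports carry traffic; after a
    # single link or node failure, only one path remains.
    return 2 * port_gbps if ring_intact else port_gbps

# Hypothetical 10 Gb/s stacking ports:
assert available_stacking_bandwidth(10.0, ring_intact=True) == 20.0
assert available_stacking_bandwidth(10.0, ring_intact=False) == 10.0  # halved
```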
If a double failure occurs (two links fail, two nodes fail, or a link and a node fail), the VC is split into two islands of nodes. One of these islands must take ownership of the VC. To decide which island of nodes owns the VC and continues normal operation, the following rules apply (see the sketch after this list):
If an island has only IMM-only nodes, they all reboot.
If one of the islands has both the active and standby CPM-IMM nodes, the nodes in that island are unaffected: they continue to operate normally, and services configured on them continue to operate in the VC without impact. The nodes in the other island reboot.
If one of the islands has the active CPM-IMM node and the other has the standby CPM-IMM node, the island with the greater number of nodes continues to function in the VC and all the nodes in the other island reboot. If software determines that the island with the active CPM-IMM has the greater number of nodes, that island continues normal operation and all the nodes in the island with the standby CPM-IMM reboot, including the standby node itself. If software determines that the island with the standby CPM-IMM has the greater number of nodes, the standby node takes over the role of active CPM-IMM and that island continues operation; all the nodes in the other island reboot.
If both islands have the same number of nodes, the node that was the standby CPM-IMM node before the failure becomes the active CPM-IMM. All the nodes in the island with the previously active CPM-IMM node (that is, the node that was active before the failure) reboot. While running boot.tim during the reboot, if a CPM-IMM node is unable to contact the current active node, it reboots again. An IMM-only node reboots and waits to hear from the active CPM; in effect, it waits for the operator to fix the problem and rejoin the islands.
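The election logic above can be condensed into a small decision function. The following Python sketch is illustrative only: the Island model and the island_survives function are hypothetical names, not part of the actual software, which is not exposed.

```python
from dataclasses import dataclass

@dataclass
class Island:
    nodes: int         # total nodes (CPM-IMM and IMM-only) in this island
    has_active: bool   # island contains the node that was the active CPM-IMM
    has_standby: bool  # island contains the node that was the standby CPM-IMM

def island_survives(island: Island, other: Island) -> bool:
    """Return True if this island keeps ownership of the VC after a split."""
    # An island with only IMM-only nodes never owns the VC; it reboots.
    if not (island.has_active or island.has_standby):
        return False
    # An island holding both CPM-IMM nodes always keeps the VC.
    if island.has_active and island.has_standby:
        return True
    # Active and standby are in different islands: the larger island wins.
    if island.nodes != other.nodes:
        return island.nodes > other.nodes
    # Size tie: the island with the previous standby wins, and the standby
    # node takes over the active CPM-IMM role.
    return island.has_standby

# Example: the active CPM-IMM sits in a 2-node island, the standby in a
# 4-node island; the larger (standby) island keeps the VC.
a = Island(nodes=2, has_active=True, has_standby=False)
b = Island(nodes=4, has_active=False, has_standby=True)
assert not island_survives(a, b) and island_survives(b, a)
```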
CPM-IMM nodes store information to detect whether a reboot is occurring because of a VC split. If a split has occurred, the node attempts to connect to the current active CPM-IMM. Once it reaches the active CPM-IMM and boots up successfully, it clears the stored VC split information. If the CPM-IMM node is unable to reach the current active CPM node, it reboots and starts the process over again until it can successfully reach an active CPM.
Users can clear the VC split information stored in the database, and thereby stop the reboot cycle, by interrupting the boot process and entering a Yes response at the boot loader prompt; a sketch of this cycle follows.
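This is a minimal, hypothetical illustration of the reboot cycle and the operator override: the nvram marker, reach_active_cpm, and operator_cleared_at_boot_prompt are assumed placeholder names for the behavior described above, not actual interfaces.

```python
import random
import time

def reach_active_cpm() -> bool:
    """Placeholder: probe for the current active CPM-IMM (illustrative)."""
    return random.random() < 0.5

def operator_cleared_at_boot_prompt() -> bool:
    """Placeholder: operator answered Yes at the boot loader prompt."""
    return False

def boot_after_split(nvram: dict) -> None:
    """Sketch of CPM-IMM boot behavior when a VC split marker is stored."""
    while nvram.get("vc_split"):
        if operator_cleared_at_boot_prompt():
            nvram["vc_split"] = False  # operator override stops the cycle
        elif reach_active_cpm():
            break                      # active CPM reached: boot can proceed
        else:
            time.sleep(1)              # stand-in for a full reboot cycle
    # ...normal boot from boot.tim; a successful boot clears the marker...
    nvram["vc_split"] = False

boot_after_split({"vc_split": True})
```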