11. Facility Alarms

11.1. Facility Alarms Overview

Facility Alarms provide a useful tool for operators to easily track and display the basic status of their equipment facilities. Facility Alarm support is intended to cover a focused subset of router states that are likely to indicate service impacts (or imminent service impacts) related to the overall state of hardware assemblies (cards, fans, links, and so on).

In the CLI, for brevity, the keyword or command alarm is used for commands related to Facility Alarms. This chapter may occasionally use the term alarm as a short form for facility alarm.

The CLI display for show routines allows the system operator to easily identify current facility alarm conditions and recently cleared facility alarms without searching event logs or monitoring various card and port show commands to determine the health of basic equipment in the system such as cards and ports.

The SR OS alarm model is based on RFC 3877, Alarm Management Information Base (MIB), which evolved from the IETF Disman drafts.

11.2. Facility Alarms vs. Log Events

Facility Alarms are different from log events. A facility alarm has a state (at least two states: active and clear) and a duration, and can be modeled with state transition events (raised, cleared). A log event occurs when the state of some object in the system changes. Log events notify the operator of a state change (for example, a port going down, an IGP peering session coming up, and so on). Facility alarms show the list of hardware objects that are currently in a bad state. Facility alarms can be examined at any time by an operator, whereas log events are sent by a router asynchronously when they occur (for example, as an SNMP notification or trap, or a syslog event).
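The difference between a stateful alarm and a one-time log event can be illustrated with a minimal sketch. The class and field names below (FacilityAlarm, raised_at, cleared_at) are hypothetical and are not part of SR OS; the sketch only assumes that a raising log event starts the alarm and a clearing log event ends it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FacilityAlarm:
    """Hypothetical model: an alarm has a state and a duration, unlike a log event."""
    name: str
    raised_at: float                    # time of the raising log event (seconds)
    cleared_at: Optional[float] = None  # None while the alarm is still active

    @property
    def state(self) -> str:
        return "active" if self.cleared_at is None else "clear"

    def duration(self, now: float) -> float:
        """Seconds the alarm was (or has so far been) active."""
        end = now if self.cleared_at is None else self.cleared_at
        return end - self.raised_at


# A linkDown log event raises the alarm; a later linkUp log event clears it.
alarm = FacilityAlarm(name="linkDown", raised_at=100.0)
state_while_down = alarm.state
alarm.cleared_at = 160.0
state_after_clear = alarm.state
```

The alarm can be queried at any time for its state and duration, whereas each log event in this sketch is only a point-in-time trigger.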

While log events provide notifications about a large number of different types of state changes in SR OS, facility alarms are intended to cover a focused subset of router states that are likely to indicate service impacts (or imminent service impacts) related to the overall state of hardware assemblies (cards, fans, links, and so on).

The facility alarm module processes log events in order to generate the raised and cleared state for the facility alarms. If a raising log event is suppressed under event-control, then the associated facility alarm will not be raised. If a clearing log event is suppressed under event-control, then it is still processed for the purpose of clearing the associated facility alarm. If a log event is a raising event for a Facility Alarm, and the associated Facility Alarm is raised, then changing the log event to suppress will clear the associated Facility Alarm.

Log event filtering, throttling and discarding of log events during overload do not affect facility alarm processing. In all cases, non-suppressed log events are processed by the facility alarm module before they are discarded.
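The event-control rules above can be sketched as follows. This is an illustrative model only: the AlarmModule class and its methods are hypothetical, and the sketch tracks one alarm per raising log event name rather than one per hardware object.

```python
class AlarmModule:
    """Hypothetical sketch of how suppression interacts with facility alarms."""

    def __init__(self, clearing_event_for):
        # Maps each raising log event name to its clearing log event name.
        self.clearing_event_for = dict(clearing_event_for)
        self.suppressed = set()  # log events suppressed under event-control
        self.active = set()      # raising event names with an active alarm

    def suppress(self, event):
        self.suppressed.add(event)
        # Suppressing a raising event clears its already-raised alarm.
        self.active.discard(event)

    def process(self, event):
        if event in self.clearing_event_for:
            # Raising event: does not raise the alarm when suppressed.
            if event not in self.suppressed:
                self.active.add(event)
            return
        # Clearing event: always processed, even when suppressed.
        for raising, clearing in self.clearing_event_for.items():
            if clearing == event:
                self.active.discard(raising)


module = AlarmModule({"linkDown": "linkUp",
                      "tmnxEqCardRemoved": "tmnxEqCardInserted"})
module.process("linkDown")
module.suppress("linkUp")
module.process("linkUp")  # a suppressed clearing event still clears the alarm
```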

Figure 38 illustrates the relationship of log events, facility alarms and the LEDs.

Figure 38:  Log Events, Facility Alarms and LEDs 

Facility Alarms are separate from, and independent of, other uses of the term alarm in SR OS, such as:

  1. Log events that use the term alarm (tmnxEqPortSonetAlarm)
  2. configure card fp hi-bw-mcast-src [alarm]
  3. configure mcast-management multicast-info-policy bundle channel source-override video analyzer alarms
  4. configure port ethernet report-alarm
  5. configure system thresholds no memory-use-alarm
  6. configure system thresholds rmon no alarm
  7. configure system security cpu-protection policy alarm

11.3. Facility Alarm Severities and Alarm LED Behavior

The Alarm LEDs on the CPM/CCM reflect the current status of the Facility Alarms:

  1. The Critical Alarm LED is lit if there are one or more active Critical Facility Alarms
  2. The Major and Minor Alarm LEDs behave the same way for active Major and Minor Facility Alarms
  3. The OT Alarm LED is not controlled by the Facility Alarm module

The supported alarm severities are as follows:

  1. Critical (with an associated LED on the CPM/CCM)
  2. Major (with an associated LED on the CPM/CCM)
  3. Minor (with an associated LED on the CPM/CCM)
  4. Warning (no LED)
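As an illustration of the LED rule, the following minimal sketch derives the three LED states from a list of active alarm severities. The function name led_states and the lowercase severity strings are assumptions made for this example; they are not SR OS identifiers.

```python
from collections import Counter

# Severities that have an associated LED on the CPM/CCM; warning has none.
LED_SEVERITIES = ("critical", "major", "minor")


def led_states(active_alarm_severities):
    """Return {severity: True if the LED is lit} for the three alarm LEDs."""
    counts = Counter(active_alarm_severities)
    return {sev: counts[sev] >= 1 for sev in LED_SEVERITIES}


# Two active major alarms and one warning: only the Major LED is lit.
leds = led_states(["major", "warning", "major"])
```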

Facility alarms inherit their severity from the raising log event.

A raising log event for a facility alarm configured with a severity of indeterminate or cleared will result in the facility alarm not being raised. However, a clearing log event is always processed in order to clear facility alarms, regardless of the severity of the clearing log event.

Changing the severity of a raising log event only affects subsequent occurrences of that log event and facility alarms. Facility alarms that are already raised when their raising log event severity is changed maintain their original severity.
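The severity rules can be sketched as below. SeverityModel and its methods are hypothetical names used only for illustration. The sketch also assumes that a repeated raising event does not alter an alarm that is already active, which the text does not state explicitly.

```python
RAISABLE = {"critical", "major", "minor", "warning"}


class SeverityModel:
    """Hypothetical sketch: alarms inherit severity from the raising log event."""

    def __init__(self):
        self.event_severity = {}  # configured severity per raising log event
        self.active = {}          # alarm name -> severity captured when raised

    def set_severity(self, event, severity):
        # Affects later occurrences only; active alarms keep their severity.
        self.event_severity[event] = severity

    def raise_event(self, event):
        severity = self.event_severity.get(event)
        # indeterminate or cleared (or unconfigured here) never raises an alarm.
        # Assumption: an already-active alarm is left unchanged.
        if severity in RAISABLE and event not in self.active:
            self.active[event] = severity

    def clear_event(self, raising_event):
        # Clearing is processed regardless of the clearing event's severity.
        self.active.pop(raising_event, None)


model = SeverityModel()
model.set_severity("tmnxEqCardFailure", "major")
model.raise_event("tmnxEqCardFailure")
model.set_severity("tmnxEqCardFailure", "critical")
severity_after_change = model.active["tmnxEqCardFailure"]  # original severity kept
```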

11.4. Facility Alarm Hierarchy

Facility Alarms for child objects are not raised when a parent object fails. For example, when an MDA or XMA fails (or is shut down), no port facility alarms are raised for its ports.

When a parent facility alarm is cleared, child facility alarms whose conditions still exist on the node appear in the active facility alarms list. For example, when a port fails there is a port facility alarm, but if the MDA or XMA is later shut down the port alarm is cleared (and a card alarm will be active for the MDA or XMA). If the MDA or XMA comes back into service, and the port is still down, then a port alarm becomes active once again.

The supported facility alarm hierarchy is as follows (parent objects that are down cause alarms in all children to be masked):

  1. CPM -> Compact Flash
  2. CCM -> Compact Flash
  3. IOM/IMM -> MDA -> Port -> Channel
  4. XCM -> XMA -> Port

Note:

A masked facility alarm is not the same as a cleared facility alarm. The cleared facility alarm queue does not display entries for previously raised facility alarms that are currently masked. If the masking event goes away, then the previously raised facility alarms will once again be visible in the active facility alarm queue.
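The masking behavior can be illustrated with a minimal sketch. The Hierarchy class and the object names are hypothetical, and the sketch simplifies "parent object is down" to "an ancestor has a raised facility alarm".

```python
class Hierarchy:
    """Hypothetical sketch of parent/child masking of facility alarms."""

    def __init__(self, parent_of):
        self.parent_of = dict(parent_of)  # child object -> parent object
        self.raised = set()               # objects with a raised facility alarm

    def is_masked(self, obj):
        """True if any ancestor of obj currently has a raised alarm."""
        parent = self.parent_of.get(obj)
        while parent is not None:
            if parent in self.raised:
                return True
            parent = self.parent_of.get(parent)
        return False

    def visible_active(self):
        """Alarms shown in the active list: raised and not masked."""
        return sorted(o for o in self.raised if not self.is_masked(o))


# IOM -> MDA -> Port chain with illustrative object names.
h = Hierarchy({"port-1/1/1": "mda-1/1", "mda-1/1": "iom-1"})
h.raised.add("port-1/1/1")
before_mda_down = h.visible_active()
h.raised.add("mda-1/1")          # parent goes down: the port alarm is masked
while_mda_down = h.visible_active()
h.raised.discard("mda-1/1")      # parent recovers: the port alarm is visible again
after_mda_up = h.visible_active()
```

In this sketch the masked port alarm is never removed from the raised set, which is why it reappears as soon as the masking condition goes away.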

11.5. Facility Alarm List

Table 49 and Table 50 show the supported Facility Alarms.

Table 49:  Facility Alarm, Facility Alarm Name, Raising Log Event, Sample Details String and Clearing Log Event  

Facility Alarm

Facility Alarm Name/Raising Log Event

Sample Details String

Clearing Log Event

59-2004-1

linkDown

Interface intf-towards-node-B22 is not operational

linkUp

64-2091-1

tmnxSysLicenseInvalid

Error - <reason> record. <hw> will reboot the chassis <timeRemaining>

tmnxSysLicenseValid

64-2092-1

tmnxSysLicenseExpiresSoon

The license installed on <hw> expires <timeRemaining>

tmnxSysLicenseValid

64-2221-1

tmnxSysStandbyLicensingError

CPM B is not licensed; license record not found

tmnxSysStandbyLicensingReady

64-2226-1

tmnxSysLicenseUpdateRequired

System license update is required

tmnxSysLicenseValid

93-2006-1

tmnxSatSyncIfTimHoldover

Synchronous timing interface on satellite esat-1 is in holdover state

tmnxSatSyncIfTimHoldoverClear

93-2008-1

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'los(1)'

Synchronous timing interface on satellite, alarm on reference 1

tmnxSatSyncIfTimRef1AlarmClear

93-2008-2

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oof(2)'

Synchronous timing interface on satellite, alarm on reference 1

same as 93-2008-1

93-2008-3

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oopir(3)'

Synchronous timing interface on satellite, alarm on reference 1

same as 93-2008-1

93-2010-x

same as 93-2008-x but for ref2

same as 93-2008-x but for ref2

same as 93-2008-x but for ref2

7-2001-1

tmnxEqCardFailure

Class MDA Module: failed, reason: Mda 1 failed startup tests

tmnxChassisNotificationClear

7-2003-1

tmnxEqCardRemoved

Class CPM Module: removed

tmnxEqCardInserted

7-2004-1

tmnxEqWrongCard

Class IOM Module: wrong type inserted

tmnxChassisNotificationClear

7-2005-1

tmnxEnvTempTooHigh

Chassis 1: temperature too high

tmnxChassisNotificationClear

7-2011-1

tmnxEqPowerSupplyRemoved

Power supply 1, power lost

tmnxEqPowerSupplyInserted

7-2017-1

tmnxEqSyncIfTimingHoldover

Synchronous Timing interface in holdover state

tmnxEqSyncIfTimingHoldoverClear

7-2019-1

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'los(1)'

Synchronous Timing interface, alarm los on reference 1

tmnxEqSyncIfTimingRef1AlarmClear

7-2019-2

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oof(2)'

Synchronous Timing interface, alarm oof on reference 1

same as 7-2019-1

7-2019-3

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oopir(3)'

Synchronous Timing interface, alarm oopir on reference 1

same as 7-2019-1

7-2021-x

same as 7-2019-x but for ref2

same as 7-2019-x but for ref2

same as 7-2019-x but for ref2

7-2030-x

same as 7-2019-x but for the BITS input

same as 7-2019-x but for the BITS input

same as 7-2019-x but for the BITS input

7-2033-1

tmnxChassisUpgradeInProgress

Class CPM Module: software upgrade in progress

tmnxChassisUpgradeComplete

7-2073-x

same as 7-2019-x but for the BITS2 input

same as 7-2019-x but for the BITS2 input

same as 7-2019-x but for the BITS2 input

7-2092-1

tmnxEqPowerCapacityExceeded

The system has reached maximum power capacity <x> watts

tmnxEqPowerCapacityExceededClear

7-2094-1

tmnxEqPowerLostCapacity

The system can no longer support configured devices. Power capacity dropped to <x> watts

tmnxEqPowerLostCapacityClear

7-2096-1

tmnxEqPowerOverloadState

The system has reached critical power capacity. Increase available power now

tmnxEqPowerOverloadStateClear

7-2104-1

tmnxEqLowSwitchFabricCap

The switch fabric capacity is less than the forwarding capacity of IOM 1 due to errors in fabric links

tmnxEqLowSwitchFabricCapClear

7-2134-1

tmnxSyncIfTimBITS2048khzUnsup

The revision of 1/1 does not meet the specifications to support the 2048kHz BITS interface type

tmnxSyncIfTimBITS2048khzUnsupClr

7-2136-1

tmnxEqMgmtEthRedStandbyRaise

The standby CPM's management Ethernet port A/1 is serving as the system's management Ethernet port

tmnxEqMgmtEthRedStandbyClear

7-2138-1

tmnxEqPhysChassPowerSupOvrTmp

Power supply 2 over temperature

tmnxEqPhysChassPowerSupOvrTmpClr

7-2140-1

tmnxEqPhysChassPowerSupAcFail

Power supply 1 AC failure

tmnxEqPhysChassPowerSupAcFailClr

7-2142-1

tmnxEqPhysChassPowerSupDcFail

Power supply 2 DC failure

tmnxEqPhysChassPowerSupDcFailClr

7-2144-1

tmnxEqPhysChassPowerSupInFail

Power supply 1 input failure

tmnxEqPhysChassPowerSupInFailClr

7-2146-1

tmnxEqPhysChassPowerSupOutFail

Power supply 1 output failure

tmnxEqPhysChassPowerSupOutFailClr

7-2148-1

tmnxEqPhysChassisFanFailure

Fan 2 failed

tmnxEqPhysChassisFanFailureClear

7-2153-1

tmnxCpmMemSizeMismatch

The standby CPM A has a different memory size than the active B

tmnxCpmMemSizeMismatchClear

7-2156-1

tmnxPhysChassPwrSupWrgFanDir

The front to back fan direction for chassis 1 power supply 1 is not supported

tmnxPhysChassPwrSupWrgFanDirClr

7-2157-1

tmnxPhysChassPwrSupPemACRect

Chassis 1 power supply 1 acRec1 failed or missing

tmnxPhysChassPwrSupPemACRectClr

7-2159-1

tmnxPhysChassPwrSupInputFeed

Chassis 1 power supply 1 inputFeedA not supplying power

tmnxPhysChassPwrSupInputFeedClr

7-2161-1

tmnxEqBpEpromFail

The active CPM is no longer able to access any of backplane EPROMs due to a hardware defect

tmnxEqBpEpromFailClear

7-2163-1

tmnxEqBpEpromWarning

The active CPM is no longer to access one backplane EPROM due to a hardware defect but a redundant EPROM is present and accessible.

tmnxEqBpEpromWarningClear

7-2165-1

tmnxPhysChassisPCMInputFeed

Chassis 1 pcm 1 1 not supplying power

tmnxPhysChassisPCMInputFeedClr

7-2190-1

tmnxPhysChassisPMOutFail

Chassis 1 Power Shelf 1 Power Module 4 output failure

tmnxPhysChassisPMOutFailClr

7-2192-1

tmnxPhysChassisPMInputFeed

Chassis 1 Power Shelf 1 Power Module 3 inputFeedA inputFeedB not supplying power

tmnxPhysChassisPMInputFeedClr

7-2194-1

tmnxPhysChassisFilterDoorOpen

Filter door is missing or open

tmnxPhysChassisFilterDoorClosed

7-2196-1

tmnxPhysChassisPMOverTemp

Chassis 1 Power Shelf 1 over temperature

tmnxPhysChassisPMOverTempClr

7-2203-x

same as 7-2019-x but for SyncE

same as 7-2019-x but for SyncE

same as 7-2019-x but for SyncE

7-2205-x

same as 7-2019-x but for E2

same as 7-2019-x but for E2

same as 7-2019-x but for E2

7-4001-1

tmnxInterChassisCommsDown

Control communications disrupted between the Active CPM and the chassis

tmnxInterChassisCommsUp

7-4003-1

tmnxCpmIcPortDown

CPM Interconnect Port is not operational. Error code = invalid-connection

tmnxCpmIcPortUp

7-4006-1

tmnxCpmIcPortSFFRemoved

CPM interconnect port SFF removed

tmnxCpmIcPortSFFInserted

7-4007-1

tmnxCpmNoLocalIcPort

CPM A can not reach the chassis using its local CPM interconnect ports

tmnxCpmLocalIcPortAvail

7-4017-1

tmnxSfmIcPortDown

SFM interconnect Port is not operational. Error code = invalid-connection to Fabric 10 IcPort 2

tmnxSfmIcPortUp

7-6002-1

tmnxPowerShelfCommsDown

Chassis 1 Power Shelf 1 lost communication with cpmA

tmnxPowerShelfCommsUp

7-6005-1

tmnxPowerShelfOutputStatusDown

Chassis 1 Power Shelf 2 output status switched to off

tmnxPowerShelfOutputStatusUp
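The timing reference rows in Table 49 follow a pattern in which the last field of the facility alarm identifier matches the tmnxSyncIfTimingNotifyAlarm value (1 for los, 2 for oof, 3 for oopir). The following minimal sketch decodes such identifiers. The function name and the lookup dictionaries are hypothetical helpers built only from the rows of Table 49; they are not an SR OS API.

```python
# Last field of the identifier -> alarm condition, as listed in Table 49.
TIMING_ALARM_CONDITION = {1: "los", 2: "oof", 3: "oopir"}

# Identifier prefix -> timing reference, as listed in Table 49.
TIMING_REFERENCE = {
    "7-2019": "ref1",
    "7-2021": "ref2",
    "7-2030": "BITS",
    "7-2073": "BITS2",
    "7-2203": "SyncE",
    "7-2205": "E2",
    "93-2008": "satellite ref1",
    "93-2010": "satellite ref2",
}


def describe_timing_alarm(alarm_id):
    """Return (reference, condition) for a timing alarm ID, else None."""
    prefix, _, suffix = alarm_id.rpartition("-")
    reference = TIMING_REFERENCE.get(prefix)
    condition = TIMING_ALARM_CONDITION.get(int(suffix)) if suffix.isdigit() else None
    if reference is None or condition is None:
        return None
    return reference, condition


example = describe_timing_alarm("7-2021-2")
```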

Table 50:  Facility Alarm Name/Raising Log Event, Cause, Effect and Recovery  

Facility Alarm

Facility Alarm Name/Raising Log Event

Cause

Effect

Recovery

59-2004-1

linkDown

A linkDown trap signifies that the SNMP entity, acting in an agent role, has detected that the ifOperStatus object for one of its communication links is about to enter the down state from some other state (but not from the notPresent state).

The indicated interface is taken down.

If the ifAdminStatus is down then the interface state is deliberate and there is no recovery.

If the ifAdminStatus is up then try to determine the cause of the interface going down: cable cut, far end went down, and so on.

64-2091-1

tmnxSysLicenseInvalid

Generated when the license becomes invalid for the reason specified in the log event/alarm.

The system will reboot at the end of the time remaining.

Configure a valid license file location and file name.

64-2092-1

tmnxSysLicenseExpiresSoon

Generated when the license is due to expire soon.

The system will reboot at the end of the time remaining.

Configure a valid license file location and file name.

64-2221-1

tmnxSysStandbyLicensingError

Generated when the standby detects a licensing failure. The reason is specified in tmnxSysLicenseErrorReason.

The standby CPM may not be synchronized and may be put into a failed state.

Configure a valid license file location and file name, given the value of tmnxSysLicenseErrorReason.

64-2226-1

tmnxSysLicenseUpdateRequired

The tmnxSysLicenseUpdateRequired notification is generated once after the system boots up when the system determines that the license is valid but must be updated to the correct software version.

The system will use the license until it is updated.

Update and activate the updated license.

93-2006-1

tmnxSatSyncIfTimHoldover

The tmnxSatSyncIfTimHoldover notification is generated when the synchronous equipment timing subsystem of the satellite transitions into a holdover state.

The transmit timing of all synchronous interfaces on the satellite is no longer synchronous with the host. This could result in traffic loss.

Investigate the state of the two input timing references on the satellite and the links between the host and the satellite (i.e. the uplinks) that drive them for failures.

93-2008-1

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'los(1)'

The tmnxSatSyncIfTimRef1Alarm notification is generated when an alarm condition on the first timing reference is detected.

If the other timing reference is free of faults, the satellite no longer has a backup timing reference. If the other timing reference also has a fault, the satellite will likely no longer be synchronous with the host.

Investigate the state of the link between the host and the satellite (i.e. the uplink) that drives the first timing reference on the satellite for faults.

93-2008-2

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oof(2)'

The same cause as 93-2008-1

The same effect as 93-2008-1

Investigate the state of the link between the host and the satellite (i.e. the uplink) that drives the first timing reference on the satellite for faults.

93-2008-3

tmnxSatSyncIfTimRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oopir(3)'

The same cause as 93-2008-1

The same effect as 93-2008-1

Investigate the state of the link between the host and the satellite (i.e. the uplink) that drives the first timing reference on the satellite for faults.

93-2010-x

same as 93-2008-x but for ref2

The same cause as 93-2008-x but for ref2

The same as 93-2008-x but for ref2

The same as 93-2008-x but for ref2

7-2001-1

tmnxEqCardFailure

Generated when one of the cards in a chassis has failed. The card type may be IOM (or XCM), MDA (or XMA), SFM, CCM, CPM, Compact Flash, and so on. The reason is indicated in the details of the log event or alarm, and also available in the tmnxChassisNotifyCardFailureReason attribute included in the SNMP notification.

The effect is dependent on the card that has failed. IOM (or XCM) or MDA (or XMA) failure will cause a loss of service for all services running on that card. A fabric failure can impact traffic to and from all cards.

7750 SR, 7450 ESS: If the IOM/IMM fails then the two associated MDAs for the slot will also go down.

7950 XRS: If one out of two XMAs fails in an XCM slot then the XCM will remain up. If the only remaining operational XMA within an XCM slot fails, then the XCM will go into a booting operational state.

Before taking any recovery steps collect a tech-support file, then try resetting (clear) the card. If unsuccessful, try removing and re-inserting the card. If that does not work then replace the card.

7-2003-1

tmnxEqCardRemoved

Generated when a card is removed from the chassis. The card type may be IOM (or XCM), MDA (or XMA), SFM, CCM, CPM, Compact Flash, and so on.

The effect is dependent on the card that has been removed. IOM (or XCM) or MDA (or XMA) removal will cause a loss of service for all services running on that card. A fabric removal can impact traffic to and from all cards.

Before taking any recovery steps collect a tech-support file, then try re-inserting the card. If unsuccessful, replace the card.

7-2004-1

tmnxEqWrongCard

Generated when the wrong type of card is inserted into a slot of the chassis. Even though a card may be physically supported by the slot, it may have been administratively configured to allow only certain card types in a particular slot location. The card type may be IOM (or XCM), MDA (or XMA), SFM, CCM, CPM, Compact Flash, and so on.

The effect is dependent on the card that has been incorrectly inserted. Incorrect IOM (or XCM) or MDA (or XMA) insertion will cause a loss of service for all services running on that card.

Insert the correct card into the correct slot, and ensure the slot is configured for the correct type of card.

7-2005-1

tmnxEnvTempTooHigh

Generated when the temperature sensor reading on an equipment object is greater than its configured threshold.

This could be causing intermittent errors and could also cause permanent damage to components.

Remove or power down the affected cards, or improve the cooling to the node. More powerful fan trays may also be required.

7-2011-1

tmnxEqPowerSupplyRemoved

Generated when:

  1. one of the power supplies is removed from the chassis
  2. low input voltage is detected. The operating voltage range for the 7750 SR-7/12 and the 7450 ESS-7/12 is -40 to -72 VDC. The alarm is raised if the system detects that the voltage of the power supply has dropped to -42.5 VDC.

Reduced power can cause intermittent errors and could also cause permanent damage to components.

Re-insert the power supply or raise the input voltage to above -42.5 VDC.

7-2017-1

tmnxEqSyncIfTimingHoldover

Generated when the synchronous equipment timing subsystem transitions into a holdover state.

Any node-timed ports will have very slow frequency drift limited by the central clock oscillator stability. The oscillator meets the holdover requirements of a Stratum 3 and G.813 Option 1 clock.

Address issues with the central clock input references.

7-2019-1

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'los(1)'

Generated when an alarm condition on the first timing reference is detected. The type of alarm (los, oof, and so on) is indicated in the details of the log event or alarm, and is also available in the tmnxSyncIfTimingNotifyAlarm attribute included in the SNMP notification. The SNMP notification will have the same indices as those of the tmnxCpmCardTable.

Timing reference 1 cannot be used as a source of timing into the central clock.

Address issues with the signal associated with timing reference 1.

7-2019-2

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oof(2)'

The same cause as 7-2019-1

The same effect as 7-2019-1

Address issues with the signal associated with timing reference 1.

7-2019-3

tmnxEqSyncIfTimingRef1Alarm with attribute tmnxSyncIfTimingNotifyAlarm == 'oopir(3)'

The same cause as 7-2019-1

The same effect as 7-2019-1

Address issues with the signal associated with timing reference 1.

7-2021-x

same as 7-2019-x but for ref2

The same cause as 7-2019-x but for the second timing reference

The same as 7-2019-x but for the second timing reference

The same as 7-2019-x but for the second timing reference

7-2030-x

same as 7-2019-x but for the BITS input

The same cause as 7-2019-x but for the BITS timing reference

The same as 7-2019-x but for the BITS timing reference

The same as 7-2019-x but for the BITS timing reference

7-2033-1

tmnxChassisUpgradeInProgress

The tmnxChassisUpgradeInProgress notification is generated only after a CPM switchover occurs and the new active CPM is running new software, while the IOMs or XCMs are still running old software. This is the start of the upgrade process. The tmnxChassisUpgradeInProgress notification will continue to be generated every 30 minutes while at least one IOM or XCM is still running older software.

A software mismatch between the CPM and IOM or XCM is generally fine for a short duration (during an upgrade) but may not allow for correct long term operation.

Complete the upgrade of all IOMs or XCMs.

7-2073-x

same as 7-2019-x but for the BITS2 input

The same as 7-2019-x but for the BITS 2 timing reference

The same as 7-2019-x but for the BITS 2 timing reference

The same as 7-2019-x but for the BITS 2 timing reference

7-2092-1

tmnxEqPowerCapacityExceeded

Generated when a device needs power to boot, but there is not enough power capacity to support the device.

A non-powered device will not boot until the power capacity is increased to support the device.

Add a new power supply to the system, or change the faulty power supply with a working one.

7-2094-1

tmnxEqPowerLostCapacity

Generated when a power supply fails or is removed which puts the system in an overloaded situation.

Devices are powered off in order of lowest power priority until the available power capacity can support the powered devices.

Add a new power supply to the system, or change the faulty power supply with a working one.

7-2096-1

tmnxEqPowerOverloadState

Generated when the overloaded power capacity cannot support the power requirements and there are no further devices that can be powered off.

The system runs a risk of experiencing brownouts while the available power capacity does not meet the required power consumption.

Add power capacity or manually shut down devices until the power capacity meets the power needs.

7-2104-1

tmnxEqLowSwitchFabricCap

The tmnxEqLowSwitchFabricCap alarm is generated when the total switch fabric capacity becomes less than the IOM capacity due to link failures. At least one of the taps on the IOM is below 100% capacity.

There is diminished switch fabric capacity to forward service-impacting information.

If the system does not self-recover, the IOM must be rebooted.

7-2134-1

tmnxSyncIfTimBITS2048khzUnsup

The tmnxSyncIfTimBITS2048khzUnsup notification is generated when the value of tSyncIfTimingAdmBITSIfType is set to 'g703-2048khz (5)' and the CPM does not meet the specifications for the 2048kHz BITS output signal under G.703.

The BITS input will not be used as the Sync reference and the 2048kHz BITS output signal generated by the CPM is squelched.

Replace the CPM with one that is capable of generating the 2048kHz BITS output signal, or set tSyncIfTimingAdmBITSIfType to a value other than 'g703-2048khz (5)'.

7-2136-1

tmnxEqMgmtEthRedStandbyRaise

The tmnxEqMgmtEthRedStandbyRaise notification is generated when the active CPM's management Ethernet port goes operationally down and the standby CPM's management Ethernet port is operationally up and now serving as the system's management Ethernet port.

The management Ethernet port is no longer redundant. The node can be managed via the standby CPM's management Ethernet port only.

Bring the active CPM's management Ethernet port operationally up.

7-2138-1

tmnxEqPhysChassPowerSupOvrTmp

Generated when the temperature sensor reading on a power supply module is greater than its configured threshold.

This could be causing intermittent errors and could also cause permanent damage to components.

Remove or power down the affected power supply module or improve the cooling to the node. More powerful fan trays may also be required. The power supply itself may be faulty so replacement may be necessary.

7-2140-1

tmnxEqPhysChassPowerSupAcFail

Generated when an AC failure is detected on a power supply.

Reduced power can cause intermittent errors and could also cause permanent damage to components.

First try re-inserting the power supply. If unsuccessful, replace the power supply.

7-2142-1

tmnxEqPhysChassPowerSupDcFail

Generated when a DC failure is detected on a power supply.

Reduced power can cause intermittent errors and could also cause permanent damage to components.

First try re-inserting the power supply. If unsuccessful, then replace the power supply.

7-2144-1

tmnxEqPhysChassPowerSupInFail

Generated when an input failure is detected on a power supply.

Reduced power can cause intermittent errors and could also cause permanent damage to components.

First try re-inserting the power supply. If that does not work, then replace the power supply.

7-2146-1

tmnxEqPhysChassPowerSupOutFail

Generated when an output failure is detected on a power supply.

Reduced power can cause intermittent errors and could also cause permanent damage to components.

First try re-inserting the power supply. If that does not work, then replace the power supply.

7-2148-1

tmnxEqPhysChassisFanFailure

Generated when one of the fans in a fan tray has failed.

This could cause the temperature to rise and result in intermittent errors and potentially permanent damage to components.

Replace the fan tray immediately, improve the cooling to the node, or reduce the heat being generated in the node by removing cards or powering down the node.

7-2153-1

tmnxCpmMemSizeMismatch

A tmnxCpmMemSizeMismatch notification is generated when the RAM memory size of the standby CPM (that is, tmnxChassisNotifyCpmCardSlotNum) is different from the active CPM (that is, tmnxChassisNotifyHwIndex).

There is an increased risk of memory overflow on the standby CPM during a CPM switchover.

Use CPMs with the same memory size.

7-2156-1

tmnxPhysChassPwrSupWrgFanDir

The tmnxPhysChassPwrSupWrgFanDirClr notification is generated when the airflow direction of the power supply's fan is corrected.

The fan is cooling the power supply in the proper direction.

No recovery required.

7-2157-1

tmnxPhysChassPwrSupPemACRect

The tmnxPhysChassPwrSupPemACRect notification is generated if any one of the AC rectifiers for a given power supply is in a failed state or is missing.

There is an increased risk of the power supply failing, causing insufficient power to the system.

Bring the AC rectifiers back online.

7-2159-1

tmnxPhysChassPwrSupInputFeed

The tmnxPhysChassPwrSupInputFeed notification is generated if any one of the input feeds for a given power supply is not supplying power.

There is an increased risk of system power brown-outs or black-outs.

Restore all of the input feeds that are not supplying power.

7-2161-1

tmnxEqBpEpromFail

The tmnxEqBpEpromFail alarm is generated when the active CPM is no longer able to access any of the backplane EPROMs due to a hardware defect.

The active CPM is at risk of failing to initialize after node reboot due to not being able to access the BP EPROM to read the chassis type.

The system does not self-recover and Nokia Support has to be contacted for further instructions.

7-2163-1

tmnxEqBpEpromWarning

The tmnxEqBpEpromWarning alarm is generated when the active CPM is no longer able to access one backplane EPROM due to a hardware defect but a redundant EPROM is present and accessible.

There is no effect on system operation.

No recovery action required.

7-2165-1

tmnxPhysChassisPCMInputFeed

The tmnxPhysChassisPCMInputFeed notification is generated if any one of the input feeds for a given PCM has gone offline.

There is an increased risk of system power brown-outs or black-outs.

Restore all of the input feeds that are not supplying power.

7-2190-1

tmnxPhysChassisPMOutFail

The tmnxPhysChassisPMOutFail notification is generated when an output failure occurs on the power module.

The power module is no longer operational.

Insert a new power module.

7-2192-1

tmnxPhysChassisPMInputFeed

The tmnxPhysChassisPMInputFeed notification is generated if any one of the input feeds for a given power module is not supplying power.

There is an increased risk of system power brownouts or blackouts.

Restore all of the input feeds that are not supplying power.

7-2194-1

tmnxPhysChassisFilterDoorOpen

The tmnxPhysChassisFilterDoorOpen notification is generated when the filter door is either open or not present.

Power shelf protection may be compromised.

If the filter door is not installed, install it. Close the filter door.

7-2196-1

tmnxPhysChassisPMOverTemp

The tmnxPhysChassisPMOverTemp notification is generated when a power module's temperature surpasses the temperature threshold.

The power module is no longer operational.

Check input feed and/or insert a new power module.

7-2203-x

same as 7-2019-x but for SyncE

The same cause as 7-2019-x but for SyncE

same as 7-2019-x but for SyncE

same as 7-2019-x but for SyncE

7-2205-x

same as 7-2019-x but for E2

The same cause as 7-2019-x but for E2

same as 7-2019-x but for E2

same as 7-2019-x but for E2

7-4001-1

tmnxInterChassisCommsDown

The tmnxInterChassisCommsDown alarm is generated when the active CPM cannot reach the far-end chassis.

The resources on the far-end chassis are not available. This event for the far-end chassis means that the CPM, SFM, and XCM cards in the far-end chassis will reboot and remain operationally down until communications are re-established.

Ensure that all CPM interconnect ports in the system are properly cabled together with working cables.

7-4003-1

tmnxCpmIcPortDown

The tmnxCpmIcPortDown alarm is generated when the CPM interconnect port is not operational. The reason may be a cable connected incorrectly, a disconnected cable, a faulty cable, or a misbehaving CPM interconnect port or card.

At least one of the control plane paths used for inter-chassis CPM communication is not operational. Other paths may be available.

A manual verification and testing of each CPM interconnect port is required to ensure fully functional operation. Physical replacement of cabling may be required.

7-4006-1

tmnxCpmIcPortSFFRemoved

The tmnxCpmIcPortSFFRemoved notification is generated when the SFF (for example, QSFP) is removed from the CPM interconnect port. Removing an SFF causes both this trap and a tmnxCpmIcPortDown event.

Removing the SFF will cause the CPM interconnect port to go down. This port will no longer be able to be used as part of the control plane between chassis but other paths may be available.

Insert a working SFF into the port.

7-4007-1

tmnxCpmNoLocalIcPort

The tmnxCpmNoLocalIcPort alarm is generated when the CPM cannot reach the other chassis using its local CPM interconnect ports.

Another control communications path may still be available between the CPM and the other chassis via the mate CPM in the same chassis. If that alternative path is not available then complete disruption of control communications to the other chassis will occur and the tmnxInterChassisCommsDown alarm is raised.

A tmnxCpmNoLocalIcPort alarm on the active CPM indicates that a further failure of the local CPM interconnect ports on the standby CPM will cause complete disruption of control communications to the other chassis and the tmnxInterChassisCommsDown alarm is raised.

A tmnxCpmNoLocalIcPort alarm on the standby CPM indicates that a CPM switchover may cause temporary disruption of control communications to the other chassis while the rebooting CPM comes back into service.

Ensure that all CPM interconnect ports in the system are properly cabled together with working cables.

7-4017-1

tmnxSfmIcPortDown

The tmnxSfmIcPortDown alarm is generated when the SFM interconnect port is not operational. The reason may be a cable connected incorrectly, a disconnected cable, a faulty cable, or a misbehaving SFM interconnect port or SFM card.

This port can no longer be used as part of the user plane fabric between chassis. Other fabric paths may be available resulting in no loss of capacity.

A manual verification and testing of each SFM interconnect port is required to ensure fully functional operation. Physical replacement of cabling may be required.

7-6002-1

tmnxPowerShelfCommsDown

The tmnxPowerShelfCommsDown notification is generated when there is a loss of communications with the power shelf controller.

If there is a power failure, it will not be detected since the power modules cannot be polled. The system will continue to report the state of the power modules as they were when last seen.

Correct the power shelf controller communications problem.

7-6005-1

tmnxPowerShelfOutputStatusDown

The tmnxPowerShelfOutputStatusDown notification is generated when the physical output switch on the power shelf is set to Standby.

The power output from the identified power shelf is switched off and does not supply power to the system.

Set output switch to On to restore power output.

The linkDown Facility Alarm is supported for the objects listed in Table 51 (note that not all objects are supported on all platforms):

Table 51:  linkDown Facility Alarm Support  

Object                                                    Supported
Ethernet Ports                                            Yes
Sonet Section, Line and Path (POS)                        Yes
TDM Ports (E1, T1, DS3) including CES MDAs                Yes
TDM Channels (DS3 channel configured in an STM-1 port)    Yes
ATM Ports                                                 Yes
Ethernet LAGs                                             No
APS groups                                                No
Bundles (MLPPP, IMA, and so on)                           No
ATM channels, Ethernet VLANs, Frame Relay DLCIs           No