Alarm Identity
|
Initial Perceived Severity
|
abort-error |
major |
Description
|
Recommended Action
|
An error occurred while aborting or reverting a transaction. The device's
configuration is likely to be inconsistent with the NCS CDB.
|
Inspect the configuration difference with compare-config, then
resolve any conflicts with sync-from or sync-to.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS achieves sync with the device, or receives a transaction
ID for a NETCONF session towards the device, the alarm is cleared.
|
|
Alarm Identity
|
alarm-type |
Description
|
Base identity for alarm types. A unique identification of the
fault, not including the managed object. Alarm types are used
to identify if alarms indicate the same problem or not, for
lookup into external alarm documentation, etc. Different
managed object types and instances can share alarm types. If
the same managed object reports the same alarm type, it is to
be considered to be the same alarm. The alarm type is a
simplification of the different X.733 and 3GPP alarm IRP alarm
correlation mechanisms and it allows for hierarchical
extensions.
A 'specific-problem' can be used in addition to the alarm type
in order to have different alarm types based on information not
known at design-time, such as values in textual SNMP
Notification varbinds.
|
|
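The identity rule in the alarm-type description (the same managed object reporting the same alarm type is the same alarm, while different objects may share a type) can be illustrated with a keyed table. This is only a minimal sketch: `AlarmKey`, `AlarmTable`, and the managed-object path strings are hypothetical names chosen for illustration and are not part of any NCS API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class AlarmKey:
    """Hypothetical key: managed object, alarm type, optional specific problem."""
    managed_object: str
    alarm_type: str
    specific_problem: str = ""


@dataclass
class AlarmTable:
    """Hypothetical in-memory table mapping each alarm key to a report count."""
    alarms: Dict[AlarmKey, int] = field(default_factory=dict)

    def report(self, key: AlarmKey) -> int:
        # Same object + same type (+ same specific problem) is the same alarm,
        # so a repeated report only increments the counter of one entry.
        self.alarms[key] = self.alarms.get(key, 0) + 1
        return self.alarms[key]


table = AlarmTable()
table.report(AlarmKey("/devices/device[name='ce0']", "connection-failure"))
table.report(AlarmKey("/devices/device[name='ce0']", "connection-failure"))
# A different managed object sharing the alarm type is a separate alarm.
table.report(AlarmKey("/devices/device[name='ce1']", "connection-failure"))
```

After these three reports the table holds two alarms, one of which has been reported twice.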
Alarm Identity
|
Initial Perceived Severity
|
bad-user-input |
critical |
Description
|
Recommended Action
|
Invalid input from the user. NCS cannot recognize the parameters needed to
connect to the device.
|
Verify that the user-supplied input is correct.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is not cleared. |
|
Alarm Identity
|
certificate-expiration |
Description
|
Recommended Action
|
The certificate is nearing its expiry or has already expired.
The severity depends on the time left until expiry and ranges from
warning to critical.
|
Replace certificate.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is cleared when the certificate is no longer loaded. |
|
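A severity that grows as the expiry date approaches might be computed along the following lines. The threshold values and the function name `expiry_severity` are assumptions made purely for illustration; the table only states that the severity ranges from warning to critical, not where the boundaries lie.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds (assumption): the documentation does not give them.
THRESHOLDS = [
    (timedelta(days=0), "critical"),   # already expired
    (timedelta(days=7), "major"),
    (timedelta(days=30), "minor"),
]


def expiry_severity(not_after: datetime, now: datetime) -> str:
    """Map the time left until `not_after` to a severity string."""
    remaining = not_after - now
    for limit, severity in THRESHOLDS:
        if remaining <= limit:
            return severity
    # More time left than the largest threshold: lowest severity in the range.
    return "warning"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
soon = expiry_severity(now + timedelta(days=3), now)
expired = expiry_severity(now - timedelta(days=1), now)
```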
Alarm Identity
|
Initial Perceived Severity
|
cluster-subscriber-failure |
critical |
Description
|
Recommended Action
|
Failure to establish a notification subscription towards
a remote node.
|
Verify IP connectivity between cluster nodes.
|
Alarm message(s)
|
-
Failed to establish netconf notification
subscription to node ~s, stream ~s
-
Commit queue items with remote nodes will not receive required
event notifications.
|
Clear condition(s)
|
This alarm is cleared if NCS succeeds in establishing a
subscription towards the remote node, or when the subscription
is explicitly stopped.
|
|
Alarm Identity
|
Initial Perceived Severity
|
commit-through-queue-blocked |
warning |
Description
|
A commit was queued behind a queue item that is waiting to
connect to one of its devices. This is potentially dangerous
since a single unreachable device can fill up the commit
queue indefinitely.
|
Alarm message(s)
|
|
Clear condition(s)
|
An alarm raised due to a transient error will be cleared
when NCS is able to reconnect to the device.
|
|
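The risk described for commit-through-queue-blocked can be pictured with a toy model. It assumes, for illustration only, that a queue item must wait when one of its devices is unreachable or when an earlier waiting item touches one of the same devices; the function `drain` and the device names are hypothetical and do not reflect the actual commit queue implementation.

```python
from typing import List, Set


def drain(queue: List[Set[str]], unreachable: Set[str]) -> List[Set[str]]:
    """Return the items still waiting after one pass over the queue."""
    blocked_devices: Set[str] = set()
    remaining: List[Set[str]] = []
    for devices in queue:
        # An item waits if it needs an unreachable device, or if an earlier
        # waiting item already involves one of its devices.
        if devices & unreachable or devices & blocked_devices:
            blocked_devices |= devices
            remaining.append(devices)
    return remaining


queue = [{"ce0"}, {"ce0", "ce1"}, {"ce1"}, {"ce2"}]
waiting = drain(queue, unreachable={"ce0"})
```

In this model the single unreachable device `ce0` also holds back the commit that only touches `ce1`, because it is queued behind an item spanning both devices; only the `ce2` commit gets through.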
Alarm Identity
|
Initial Perceived Severity
|
commit-through-queue-failed |
critical |
Description
|
Recommended Action
|
A queued commit failed.
|
Resolve with rollback if possible.
|
Alarm message(s)
|
-
Failed to authenticate towards device {device}: {reason}
-
Device {dev} is locked
-
{Reason}
-
Device {dev} is southbound locked
-
Commit queue item {CqId} rollback invoked
-
Commit queue item {CqId} has failed: Operation failed because:
inconsistent database
-
Remote commit queue item ~p cannot be unlocked:
cluster node not configured correctly
|
Clear condition(s)
|
This alarm is not cleared. |
|
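The alarm messages listed above are templates with placeholders such as `{dev}` or `~p`. When post-processing alarm text, one possible approach is to turn each template into a regular expression and extract the placeholder values. The sketch below is an assumption about how such matching could be done outside NCS; `template_to_regex`, `classify`, and the positional group names `arg0`, `arg1`, ... are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Matches {name} placeholders and Erlang-style ~s / ~p placeholders.
PLACEHOLDER = re.compile(r"\{(\w[\w-]*)\}|~[sp]")

TEMPLATES = [
    "Failed to authenticate towards device {device}: {reason}",
    "Device {dev} is locked",
    "Device {dev} is southbound locked",
    "Commit queue item {CqId} rollback invoked",
]


def template_to_regex(template: str) -> re.Pattern:
    """Build an anchored regex with one named group per placeholder."""
    parts = []
    pos = 0
    for n, m in enumerate(PLACEHOLDER.finditer(template)):
        parts.append(re.escape(template[pos:m.start()]))
        name = m.group(1)
        # {name} keeps its name; ~s and ~p get a positional name.
        group = name.replace("-", "_") if name else f"arg{n}"
        parts.append(f"(?P<{group}>.+?)")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def classify(text: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return the first matching template and its placeholder values."""
    for template in TEMPLATES:
        m = template_to_regex(template).match(text)
        if m:
            return template, m.groupdict()
    return None
```

A catch-all template such as `{Reason}` matches any text, so it would need to be tried last if it were added to the list.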
Alarm Identity
|
Initial Perceived Severity
|
commit-through-queue-failed-transiently |
critical |
Description
|
Recommended Action
|
A queued commit failed because it exhausted its retry attempts
on transient errors.
|
Resolve with rollback if possible.
|
Alarm message(s)
|
-
Failed to connect to device {dev}: {reason}
-
Connection to {dev} timed out
-
Failed to authenticate towards device {device}: {reason}
-
The configuration database is locked for device {dev}: {reason}
-
the configuration database is locked by session {id} {identification}
-
{Dev}: Device is locked in a {Op} operation by session {session-id}
-
resource denied
-
Commit queue item {CqId} rollback invoked
-
Commit queue item {CqId} has failed: Operation failed because:
inconsistent database
-
Remote commit queue item ~p cannot be unlocked:
cluster node not configured correctly
|
Clear condition(s)
|
This alarm is not cleared. |
|
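The difference between a transient failure that is retried and one that finally raises this alarm can be sketched as a bounded retry loop. The names `TransientError` and `run_with_retries`, and the retry limit passed in, are hypothetical; how NCS actually classifies errors and counts retries is not described in this table.

```python
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, locks)."""


def run_with_retries(attempt: Callable[[], None], max_retries: int) -> str:
    """Run `attempt`, retrying on TransientError up to `max_retries` times."""
    failures = 0
    while True:
        try:
            attempt()
            return "completed"
        except TransientError:
            failures += 1
            if failures > max_retries:
                # Retries exhausted: the point at which an alarm such as
                # commit-through-queue-failed-transiently would be raised.
                return "failed-transiently"


calls = []


def always_down() -> None:
    calls.append(1)
    raise TransientError("Connection to ce0 timed out")


outcome = run_with_retries(always_down, max_retries=2)
```

Any exception other than `TransientError` propagates immediately, which corresponds to a non-transient failure.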
Alarm Identity
|
Initial Perceived Severity
|
commit-through-queue-rollback-failed |
critical |
Description
|
Recommended Action
|
Rollback of a commit-queue item failed.
|
Investigate the status of the device and resolve the
situation by issuing the appropriate action, i.e., service
redeploy or a sync operation.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is not cleared. |
|
Alarm Identity
|
Initial Perceived Severity
|
configuration-error |
critical |
Description
|
Recommended Action
|
Invalid configuration of an NCS-managed device. NCS cannot recognize the
parameters needed to connect to the device.
|
Verify that the configuration parameters defined in the
tailf-ncs-devices.yang submodule are consistent for this device.
|
Alarm message(s)
|
-
Failed to resolve IP address for {dev}
-
the configuration database is locked by session {id} {identification}
-
{Reason}
-
Resource {resource} doesn't exist
|
Clear condition(s)
|
The alarm is cleared when NCS reads the configuration
parameters for the device, and is raised again if the
parameters are invalid.
|
|
Alarm Identity
|
Initial Perceived Severity
|
connection-failure |
major |
Description
|
Recommended Action
|
NCS failed to connect to a managed device before the timeout expired.
|
Verify the address, port, and authentication settings, and check that the
device is up and running. If the error occurs intermittently, increase
connect-timeout.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS successfully reconnects to the device, the alarm is cleared. |
|
Alarm Identity
|
Initial Perceived Severity
|
final-commit-error |
critical |
Description
|
Recommended Action
|
A managed device validated a configuration change, but failed to
commit. When this happens, NCS and the device are out of sync.
|
Reconcile by comparing the configurations and then invoking sync-from or sync-to.
|
Alarm message(s)
|
-
The connection to {dev} was closed
-
External error in the NED implementation for device {dev}: {reason}
-
Internal error in the NED NCS framework affecting device {dev}: {reason}
|
Clear condition(s)
|
If NCS achieves sync with a device, the alarm is cleared. |
|
Alarm Identity
|
ha-alarm |
Description
|
Base type for all alarms related to high availability.
This is never reported; sub-identities for the specific
high availability alarms are used in the alarms.
|
|
Alarm Identity
|
ha-node-down-alarm |
Description
|
Base type for all alarms related to nodes going down in
high availability. This is never reported; sub-identities
for the specific node down alarms are used in the alarms.
|
|
Alarm Identity
|
Initial Perceived Severity
|
ha-primary-down |
critical |
Description
|
Recommended Action
|
The node lost the connection to the primary node.
|
Make sure the HA cluster is operational, investigate why
the primary went down and bring it up again.
|
Alarm message(s)
|
-
Lost connection to primary due to: Primary closed connection
-
Lost connection to primary due to: Tick timeout
-
Lost connection to primary due to: code {Code}
|
Clear condition(s)
|
This alarm is never automatically cleared and has to be cleared
manually when the HA cluster has been restored.
|
|
Alarm Identity
|
Initial Perceived Severity
|
ha-secondary-down |
critical |
Description
|
Recommended Action
|
The node lost the connection to a secondary node.
|
Investigate why the secondary node went down, fix the
connectivity issue and reconnect the secondary to the
HA cluster.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is cleared when the secondary node is reconnected
to the HA cluster.
|
|
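The relationship between the never-reported base types and the concrete HA alarms can be sketched as a small identity tree with a derived-from check. The parent links below are an assumption inferred from the descriptions in this section rather than a copy of the YANG module, and `derived_from_or_self` is a hypothetical helper name.

```python
from typing import Dict, Optional

# child -> base identity (assumed hierarchy, inferred from the descriptions).
BASE_OF: Dict[str, Optional[str]] = {
    "alarm-type": None,
    "ha-alarm": "alarm-type",
    "ha-node-down-alarm": "ha-alarm",
    "ha-primary-down": "ha-node-down-alarm",
    "ha-secondary-down": "ha-node-down-alarm",
}


def derived_from_or_self(identity: str, base: str) -> bool:
    """True if `identity` equals `base` or has `base` among its ancestors."""
    current: Optional[str] = identity
    while current is not None:
        if current == base:
            return True
        current = BASE_OF.get(current)
    return False


# Select every concrete alarm type that falls under the HA base type.
ha_alarms = sorted(
    name for name in BASE_OF
    if derived_from_or_self(name, "ha-alarm") and name.endswith("-down")
)
```

A check of this kind lets a consumer filter on a base type such as `ha-alarm` even though only the sub-identities ever appear in reported alarms.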
Alarm Identity
|
Initial Perceived Severity
|
missing-transaction-id |
warning |
Description
|
Recommended Action
|
A device announced in its NETCONF hello message that
it supports the transaction-id as defined in
http://tail-f.com/yang/netconf-monitoring. However, when
NCS tries to read the transaction-id, no data is returned.
The NCS check-sync feature will not work. This is usually
caused by misconfigured NACM rules on the managed device.
|
Verify NACM rules on the concerned device.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS successfully reads a transaction ID that it had
previously failed to read, the alarm is cleared.
|
|
Alarm Identity
|
ncs-cluster-alarm |
Description
|
Base type for all alarms related to the cluster.
This is never reported; sub-identities for the specific
cluster alarms are used in the alarms.
|
|
Alarm Identity
|
ncs-dev-manager-alarm |
Description
|
Base type for all alarms related to the device manager.
This is never reported; sub-identities for the specific
device alarms are used in the alarms.
|
|
Alarm Identity
|
ncs-package-alarm |
Description
|
Base type for all alarms related to packages.
This is never reported; sub-identities for the specific
package alarms are used in the alarms.
|
|
Alarm Identity
|
ncs-service-manager-alarm |
Description
|
Base type for all alarms related to the service manager.
This is never reported; sub-identities for the specific
service alarms are used in the alarms.
|
|
Alarm Identity
|
ncs-snmp-notification-receiver-alarm |
Description
|
Base type for SNMP notification receiver alarms. This is never
reported; sub-identities for specific SNMP notification receiver
alarms are used in the alarms.
|
|
Alarm Identity
|
Initial Perceived Severity
|
ned-live-tree-connection-failure |
major |
Description
|
Recommended Action
|
NCS failed to connect to a managed device using one of the optional
live-status-protocol NEDs.
|
Verify the configuration of the optional NEDs.
If the error occurs intermittently, increase connect-timeout.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS successfully reconnects to the managed device,
the alarm is cleared.
|
|
Alarm Identity
|
Initial Perceived Severity
|
out-of-sync |
major |
Description
|
Recommended Action
|
A managed device is out of sync with NCS. This usually means that the
device has been configured out of band from NCS's point of view.
|
Inspect the difference with compare-config, reconcile by
invoking sync-from or sync-to.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS achieves sync with a device, the alarm is cleared. |
|
Alarm Identity
|
Initial Perceived Severity
|
package-load-failure |
critical |
Description
|
Recommended Action
|
NCS failed to load a package.
|
Check the package for the reason.
|
Alarm message(s)
|
|
Clear condition(s)
|
If NCS successfully loads a package for which an alarm
was previously raised, it will be cleared.
|
|
Alarm Identity
|
Initial Perceived Severity
|
package-operation-failure |
critical |
Description
|
Recommended Action
|
A package encountered a problem during its operation.
|
Check the package for the reason.
|
Clear condition(s)
|
This alarm is not cleared. |
|
Alarm Identity
|
Initial Perceived Severity
|
receiver-configuration-error |
major |
Description
|
Recommended Action
|
The snmp-notification-receiver could not set up its configuration,
either at startup or when reconfigured. SNMP notifications will be
missed until this is resolved.
|
Check the error-message and change the configuration.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is cleared when NCS is configured
to successfully receive SNMP notifications.
|
|
Alarm Identity
|
Initial Perceived Severity
|
revision-error |
major |
Description
|
Recommended Action
|
A managed device arrived with a known module, but with a revision that is too new.
|
Upgrade the device NED using the new YANG revision in order
to use the new features in the device.
|
Alarm message(s)
|
|
Clear condition(s)
|
If all device YANG modules are supported by NCS,
the alarm is cleared.
|
|
Alarm Identity
|
Initial Perceived Severity
|
service-activation-failure |
critical |
Description
|
Recommended Action
|
A service failed during re-deploy.
|
Corrective action and another re-deploy are needed.
|
Alarm message(s)
|
|
Clear condition(s)
|
If the service is successfully redeployed, the alarm is cleared. |
|
Alarm Identity
|
time-violation-alarm |
Description
|
Base type for all alarms related to time violations.
This is never reported; sub-identities for the specific
time violation alarms are used in the alarms.
|
|
Alarm Identity
|
Initial Perceived Severity
|
transaction-lock-time-violation |
warning |
Description
|
Recommended Action
|
The transaction lock time exceeded its threshold and might be stuck
in the critical section. This threshold is configured in
/ncs-config/transaction-lock-time-violation-alarm/timeout.
|
Investigate if the transaction is stuck and possibly
interrupt it by closing the user session which it is
attached to.
|
Alarm message(s)
|
|
Clear condition(s)
|
This alarm is cleared when the transaction has finished. |
|
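The raise and clear behaviour of transaction-lock-time-violation amounts to comparing how long a lock has been held against a configured threshold. The following is a minimal sketch of that comparison, not the NCS implementation; `LockWatch` and its methods are hypothetical, and the injected clock exists only to make the example testable.

```python
import time
from typing import Callable, Optional


class LockWatch:
    """Hypothetical watcher for how long a transaction lock has been held."""

    def __init__(self, timeout_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.acquired_at: Optional[float] = None

    def acquire(self) -> None:
        # Record when the lock was taken.
        self.acquired_at = self.clock()

    def release(self) -> None:
        # Transaction finished: corresponds to the alarm's clear condition.
        self.acquired_at = None

    def violated(self) -> bool:
        """True while the lock has been held longer than the threshold."""
        if self.acquired_at is None:
            return False
        return self.clock() - self.acquired_at > self.timeout_seconds


fake_now = [0.0]
watch = LockWatch(timeout_seconds=5.0, clock=lambda: fake_now[0])
watch.acquire()
fake_now[0] = 6.0
over_threshold = watch.violated()
```

A monotonic clock is used by default so that wall-clock adjustments do not trigger or hide a violation.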