Network Working GroupInternet Engineering Task Force (IETF) S. VallinInternet-DraftRequest for Comments: 8632 Stefan Vallin ABIntended status:Category: Standards Track M. BjorklundExpires: October 13, 2019ISSN: 2070-1721 CiscoApril 11,August 2019 A YANG Data Model for AlarmModule draft-ietf-ccamp-alarm-module-09Management Abstract This document defines a YANG module for alarm management. It includes functions foralarm listalarm-list management, alarmshelvingshelving, and notifications to inform management systems. There are also operations to manage the operator state of an alarm and administrative alarm procedures. The module carefully maps to relevant alarm standards. Status of This Memo ThisInternet-Draftissubmitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documentsan Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF).Note that other groups may also distribute working documents as Internet-Drafts. The listIt represents the consensus ofcurrent Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents validthe IETF community. It has received public review and has been approved fora maximumpublication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 ofsix monthsRFC 7841. Information about the current status of this document, any errata, and how to provide feedback on it may beupdated, replaced, or obsoleted by other documentsobtained atany time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on October 13, 2019.https://www.rfc-editor.org/info/rfc8632. Copyright Notice Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Terminology and Notation . . . . . . . . . . . . . . . . 3 2. Objectives . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. AlarmModuleData Model Concepts . . . . . . . . . . . . . . . . . .. .5 3.1. Alarm Definition . . . . . . . . . . . . . . . . . . . . 5 3.2. Alarm Type . . . . . . . . . . . . . . . . . . . . . . . 6 3.3. Identifying the Alarming Resource . . . . . . . . . . . . 8 3.4. Identifying Alarm Instances . . . . . . . . . . . . . . . 9 3.5. Alarm Lifecycle . . . . . . . . . . . . . . . . . . . . . 9 3.5.1. Resource Alarm Lifecycle . . . . . . . . . . . . . . 10 3.5.2. Operator Alarm Lifecycle . . . . . . . . . . . . . .1011 3.5.3. Administrative Alarm Lifecycle . . . . . . . . . . . 11 3.6. Root Cause, ImpactedResourcesResources, and Related Alarms . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.7. Alarm Shelving . . . . . . . . . . . . . . . . . . . . . 13 3.8. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 13 4. Alarm Data Model . . . . . . . . . . . . . . . . . . . . . . 13 4.1. Alarm Control . . . . . . . . . . . . . . . . . . . . . . 15 4.1.1. Alarm Shelving . . . . . . . . . . . . . . . . . . . 15 4.2. Alarm Inventory . . . . . . . . . . . . . . . . . . . . .1516 4.3. Alarm Summary . . . . . . . . . . . . . . . . . . . . . . 16 4.4. The Alarm List . . . . . . . . . . . . . . . . . . . . . 17 4.5. TheShelved AlarmsShelved-Alarm List . . . . . . . . . . . . . . . . . 19 4.6. Alarm Profiles . . . . . . . . . . . . . . . . . . . . . 19 4.7. Operations . . . . . . . . . . . . . . . . . . . . . . . 20 4.8. Notifications . . . . . . . . . . . . . . . . . . . . . . 20 5. Relationship to the ietf-hardware YANGmoduleModule . . . . . . . . 20 6. Alarm YANG Module . . . . . . . . . . . . . . . . . . . . . . 21 7.X.733 Extensions . . . . . . . . . . . . . . . . . . . . . . 53 8.The X.733 Mapping Module . . . . . . . . . . . . . . . . . . 539.8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 6510.9. Security Considerations . . . . . . . . . . . . . . . . . . . 6511. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 66 12.10. References . . . . . . . . . . . . . . . . . . . . . . . . . 6712.1.10.1. Normative References . . . . . . . . . . . . . . . . . . 6712.2.10.2. Informative References . . . . . . . . . . . . . . . . . 68 Appendix A.Vendor-specificVendor-Specific Alarm Types Example . . . . . . . . 69 Appendix B. Alarm Inventory Example . . . . . . . . . . . . . . 70 Appendix C. Alarm List Example . . . . . . . . . . . . . . . . . 71 Appendix D. Alarm Shelving Example . . . . . . . . . . . . . . . 72 Appendix E. X.733 Mapping Example . . . . . . . . . . . . . . . 73 Appendix F. Relationship toother alarm standardsOther Alarm Standards . . . . . . . 74 F.1.Alarm definition . .Definition of "Alarm" . . . . . . . . . . . . . . . . . . 74 F.2. DatamodelModel . . . . . . . . . . . . . . . . . . . . . . . 76 F.2.1. X.733 . . . . . . . . . . . . . . . . . . . . . . . . 76 F.2.2.RFC 3877, theThe Alarm MIB.(RFC 3877) . . . . . . . . . . . . . . 76 F.2.3. 3GPP Alarm IRP . . . . . . . . . . . . . . . . . . . 77 F.2.4. G.7710 . . . . . . . . . . . . . . . . . . . . . . . 77 Appendix G.Alarm UsabilityAlarm-Usability Requirements . . . . . . . . . . . .7778 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 81 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . .8182 1. Introduction This document defines a YANG[RFC7950]module [RFC7950] for alarm management. The purpose is to define a standardized alarm interface for network devices that can be easily integrated into management applications. The model is also applicable as a northbound alarm interface in the management applications. Alarm monitoring is a fundamental part of monitoring the network. Raw alarms from devices do not always tell the status of the network services or necessarily point to the root cause. However, being able to feed alarms to thealarm managementalarm-management application in a standardized format is a starting point for performinghigher levelhigher-level network assurance tasks. The design of the module is based on experience from using and implementing available alarm standards from ITU [X.733], 3GPP[ALARMIRP][ALARMIRP], and ANSI [ISA182]. 1.1. Terminology and Notation The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. The following terms are defined in [RFC7950]: o action o client o data tree o server The following terms are used within this document:oAlarm (the general concept): An alarm signifies an undesirable state in a resource that requires corrective action.oFault: A fault is the underlying cause of an undesired behavior. There is no trivial one-to-one mapping between faults and alarms. One fault may result in several alarms in case the system lacks root-cause and correlation capabilities. An alarm might not have an underlying fault as a cause. For example, imagine a bad Mean Opinion Score (MOS) alarm from a Voice over IP (VOIP) probe and the cause being non-optimal QoS configuration.oAlarm Type: An alarm type identifies a possible unique alarm state for a resource. Alarm types are names to identify the state like "link-alarm", "jitter-violation", and "high-disk-utilization".oResource: A fine-grained identification of the alarming resource, forexample:example, aninterface,interface and a process.oAlarm Instance: The alarm state for a specific resource and alarmtype. For exampletype, for example, ("GigabitEthernet0/15",link-alarm)."link-alarm"). An entry in the alarm list.oClearedalarm:Alarm: A cleared alarm is an alarm where the system considers the undesired state to be cleared. Operatorscan notcannot clearalarms,alarms; clearance is managed by the system. For example, alinkUp"linkUp" notification can be considered a clear condition for alinkDown"linkDown" state.oClosedalarm:Alarm: Operators can close alarms irrespective of the alarm being cleared or not. A closed alarm indicates that the alarm does not needattention,attention because eithersincethe corrective action has been taken orthatit can be ignored for other reasons.oAlarm Inventory: A list of all possible alarm types on a system.oAlarm Shelving: Blocking alarms according to specific criteria.oCorrective Action: An action taken by an operator or automation routine in order to minimize the impact of the alarm or resolve the root cause.oManagement System: Thealarm managementalarm-management application that consumes the alarms, i.e., acts as a client.oSystem: The system that implements this YANGalarmmodule, i.e., acts as a server. This corresponds to a network device or a management application that provides a northbound alarm interface. Tree diagrams used in this document follow the notation defined in [RFC8340]. 2. Objectives The objectives for the design of theAlarm Modulealarm data model are: oSimpleUsers find it simple to use. If a system supports this module, it shall bestraight-forwardstraightforward to integratethisit into aYANG basedYANG-based alarm manager. oView alarmsAlarms are viewed as states on resources and not as discrete notifications. oTo provide aA precise definition of "alarm" is provided in order to exclude general events that should not be forwarded as alarm notifications. oTo provide precisePrecise identification of alarm types and alarminstances.instances is provided. o A management system should be able to pull all available alarm types from a system, i.e., read the alarm inventory from a system. This makes it possible to prepare alarm operators with corresponding alarm instructions. oAddress alarm usability requirements,Alarm-usability requirements are addressed; see Appendix G. While IETF and telecom standards have addressed alarms mostly from a protocol perspective, the process industry has published several relevant standards addressing requirements for a useful alarm interface;[EEMUA],see [EEMUA] and [ISA182]. Thisalarm moduledocument defines usability requirements as well as a YANG data model. o Mapping to [X.733], which is a requirement for some alarmsystems.systems, is achievable. Still, keep some of the X.733 concepts out of the core model in order to make the model small and easy to understand. 3. AlarmModuleData Model Concepts This section defines the fundamental concepts behind the data model. This section is rooted in the works of Vallin et. al [ALARMSEM]. 3.1. Alarm Definition An alarm signifies an undesirable state in a resource that requires corrective action. There are two main things to remember from this definition: 1.the definitionIt focuses on leaving out events and logging information in general. Alarms should only be used for undesired states that require action. 2.the definitionIt also focuses on alarms as a state on a resource, not the notifications that report the state changes. See Appendix F for information on how this definition relates to other alarm standards. 3.2. Alarm Type This document defines an alarm type with analarm typealarm-type id and analarm typealarm-type qualifier. Thealarm typealarm-type id is modeled as a YANG identity. With YANG identities, new alarm types can be defined in a distributed fashion. YANG identities are hierarchical, which means that a hierarchy of alarm types can be defined. Standards and vendors should define their ownalarm typealarm-type identities based on this definition. The use of YANG identities means that all possible alarms are identified at design time. This explicit declaration of alarm types makes it easier to allow for alarm qualification reviews and preparation of alarm actions and documentation. There are occasions where the alarm types are not known at design time. An example is a system with digital inputs that allows users to connect detectors, such as smoke detectors, to the inputs. In thiscasecase, it is a configuration action that saysthatcertain connectors are firealarmsalarms, for example. In order to allow for dynamic addition of alarmtypestypes, the alarmmodule allows fordata model permits further qualification of the identity-based alarm type using a string. A potential drawback of this is that there is a significant risk that alarm operators will receive alarm types as a surprise. They do not know how to resolve the problem since a defined alarm procedure does not necessarily exist. To avoid thisriskrisk, the system MUST publish all possible alarm types in the alarminventory,inventory; see Section 4.2. A vendor or standards organization can define their ownalarm typealarm-type hierarchy. The example below shows a hierarchy based on X.733 event types: import ietf-alarms { prefix al; } identity vendor-alarms { base al:alarm-type; } identity communications-alarm { base vendor-alarms; } identity link-alarm { base communications-alarm; } Alarm types can be abstract. An abstract alarm type is used as a base for defining hierarchical alarm types. Concrete alarm types are used for alarm states and appear in the alarm inventory. There are two kinds of concrete alarm types: 1. The last subordinate identity in the "alarm-type-id" hierarchy is concrete, forexample:example, "alarm-identity.environmental- alarm.smoke". In thisexampleexample, "alarm-identity" and "environmental-alarm" are abstract YANG identities, whereas "smoke" is a concrete YANG identity. 2. The YANG identity hierarchy isabstractabstract, and the concrete alarm type is defined by the dynamicalarm qualifieralarm-qualifier string, forexample:example, "alarm-identity.environmental-alarm.external-detector" with alarm-type-qualifier "smoke". For example: // Alternative 1: concrete alarm type identity import ietf-alarms { prefix al; } identity environmental-alarm { base al:alarm-type; description "Abstract alarm type"; } identity smoke { base environmental-alarm; description "Concrete alarm type"; } // Alternative 2: concrete alarm type qualifier import ietf-alarms { prefix al; } identity environmental-alarm { base al:alarm-type; description "Abstract alarm type"; } identity external-detector { base environmental-alarm; description "Abstract alarmtype,type; arun-timeruntime configuration procedure sets the type of alarm detected. This will be reported in the alarm-type-qualifier."; } A server SHOULD strive to minimize the number of dynamically defined alarm types. 3.3. Identifying the Alarming Resource It is of vital importance to be able to refer to the alarming resource. This reference must be as fine-grained as possible. If the alarming resource exists in the datatree thentree, aninstance- identifierinstance-identifier MUST be used with the full path to the object. When the module is used in acontroller/orchestrator/managercontroller/orchestrator/manager, the original device resource identification can be modified to include the device in the path. The details depend on how devices areidentified,identified and are out of scope for this specification. Example: The original device alarm might identify the resource as "/dev:interfaces/dev:interface[dev:name='FastEthernet1/0']". The resource identification in the manager could look something like: "/mgr:devices/mgr:device[mgr:name='xyz123']/dev:interfaces/ dev:interface[dev:name='FastEthernet1/0']" This module also allows for alternate naming of the alarming resource if it is not available in the data tree. 3.4. Identifying Alarm Instances A primary goal ofthisthe alarmmoduledata model is to remove any ambiguity in how alarm notifications are mapped to an update of an alarm instance. The X.733 [X.733] and 3GPP [ALARMIRP] documents were not clear on this point. ThisYANGalarmmoduledata model states that the tuple (resource,alarm typealarm-type identifier,alarm typeand alarm-type qualifier) corresponds to a single alarm instance. This means that alarm notifications for the same resource and same alarm type are matched to update the same alarm instance. These three leafs are therefore used as the key in the alarm list: list alarm { key "resource alarm-type-id alarm-type-qualifier"; ... } 3.5. Alarm Lifecycle The alarm model clearly separates the resource alarm lifecycle from the operator and administrative lifecycles of an alarm. o resource alarm lifecycle: the alarm instrumentation that controls alarm raise, clearance, and severity changes. o operator alarm lifecycle: operators acting upon alarms with actions likeacknowledgmentacknowledging and closing. Closing an alarm implies that the operator considers the corrective action performed. Operators can alsoshelfshelve (block/filter) alarms in order to avoid nuisance alarms. o administrative alarm lifecycle: purging (deleting) unwanted alarms and compressing the alarmstatus changestatus-change list. This module exposes operations to manage the administrative lifecycle. The server may also perform these operations based on other policies, but how that is done is out of scope for this document. A server SHOULD describe how long it retains cleared/closedalarms:alarms until they are manually purged or if it has an automatic removal policy. How this is done is outside the scope of this document. 3.5.1. Resource Alarm Lifecycle From a resource perspective, an alarmcancan, forexampleexample, have the following lifecycle: raise, change severity, change severity, clear, being raised again, etc. All of these status changes can have different alarm texts generated by the instrumentation. Two important things to note: 1. Alarms are not deleted when they are cleared. Deleting alarms is an administrative process. Thealarm"ietf-alarms" YANG module defines an action "purge-alarms" that deletes alarms. 2. Alarms are not cleared byoperators,operators; only the underlying instrumentation can clear an alarm. Operators can close alarms. The YANG tree representation below illustrates theresource orientedresource-oriented lifecycle: +--ro alarm* [resource alarm-type-id alarm-type-qualifier] ... +--ro is-cleared boolean +--ro last-raised yang:date-and-time +--ro last-changed yang:date-and-time +--ro perceived-severity severity +--ro alarm-text alarm-text +--ro status-change* [time] {alarm-history}? +--ro time yang:date-and-time +--ro perceived-severity severity-with-clear +--ro alarm-text alarm-text For every status change from the resourceperspectiveperspective, a row is added to the "status-change" list, if the server implements the feature "alarm-history". The feature "alarm-history" is optional to implement, since keeping the alarm history may have an impact on the server's memory resources. The last status values are also represented as leafs for the alarm. Note well that the alarm severity does not include"cleared","cleared"; alarm clearance is a boolean flag.AnTherefore, an alarm canthereforelook like this: (("GigabitEthernet0/25",link- alarm,""),"link-alarm",""), false, 2018-04-08T08:20:10.00Z, 2018-04-08T08:20:10.00Z, major, "Interface GigabitEthernet0/25down")down"). 3.5.2. Operator Alarm Lifecycle Operators can act upon alarms using the set-operator-state action: +--ro alarm* [resource alarm-type-id alarm-type-qualifier] ... +--ro operator-state-change* [time] {operator-actions}? | +--ro time yang:date-and-time | +--ro operator string | +--ro state operator-state | +--ro text? string +---x set-operator-state {operator-actions}? +---w input +---w state writable-operator-state +---w text? string The operator state for an alarm canbe:be "none", "ack", "shelved", and "closed". Alarm deletion (using the action "purge-alarms") can use this state as a criterion. For example, a closed alarm is an alarm where the operator has performed any required corrective actions. Closed alarms are good candidates for being purged. 3.5.3. Administrative Alarm Lifecycle Deleting alarms from the alarm list is considered an administrative action. This is supported by the "purge-alarms" action. The "purge- alarms" action takes a filter as input. The filter selects alarms based on the operator and resource alarm lifecycle such as "all closed cleared alarms older than a time specification". The server may also perform these operations based on other policies, but how that is done is out of scope for this document. Purged alarms are removed from the alarm list. Notewell,well that if the alarm resource state changes after a purge, the alarm will reappear in the alarm list. Alarms can be compressed. Compressing an alarm deletes all entries in the alarm's "status-change" list except for the last status change. A client can perform this using the "compress-alarms" action. The server may also perform these operations based on other policies, but how that is done is out of scope for this document. 3.6. Root Cause, ImpactedResourcesResources, and Related Alarms The alarmmoduledata model does not mandate any requirements for the system to support alarm correlation or root-cause and service-impact analysis. However, if such features are supported, this section describes how the results of such analysis are represented in the data model. These parts of the model are optional. The module supports three scenarios:Root causeRoot-cause analysis: An alarm can indicate candidateroot causeroot-cause resources, forexample:example, a database issue alarm referring to afull discfull-disk partition.Service impactService-impact analysis: An alarm can refer to potential impacted resources, forexample:example, an interface alarm referring to impacted networkservicesservices. Alarm correlation: Dependencies betweenalarms,alarms; several alarms can be grouped as relating to each other, forexampleexample, a streaming media alarm relating to ahigh jitterhigh-jitter alarm. Different systems have varying degrees of alarm correlation and analysis capabilities, and the intent of the alarmmoduledata model is to enable any capability, including none. The general principle of this alarmmoduledata model is to limit the amount of alarms. In manycasescases, several resources are affected for a given underlying problem. A full disk will of course impact databases and applications as well. The recommendation is to have a single alarm for the underlying problem and list the affected resources in thealarm,alarm rather than having separate alarms for each resource. The alarm has one leaf-list to identify a possible"impacted-resources""impacted- resource" and a leaf-list to identify a possible"root-cause-resources"."root-cause- resource". Theseservesserve as hints only. It is up to the client application to use this information to present the overall status. Using thedisk fulldisk-full example, a good alarm would be to use thehardhard- disk partition as the alarming resource and add the database and applications into the"impacted-resources""impacted-resource" leaf-list. A system should always strive to identify the resource that can be acted upon as the "resource" leaf. The "impacted-resource" leaf-list shall be used to identify anyside-effectsside effects of the alarm. The impacted resourcescan notcannot be acted upon to fix the problem. The disk full example above illustrates the principle; youcan notcannot fix the underlying issue by database operations. However, you need to pay attention to the database to perform any operations thatlimitslimit the impact of the problem. On someoccasionsoccasions, the system might not be capable of detecting the root cause, the resource that can be acted upon. The instrumentation in this case only monitors theside-effectside effect and raises an alarm to indicate a situation requiring attention. The instrumentation still might identify possible candidates for the root-cause resource. In thiscasecase, the "root-cause-resource" leaf-list can be used to indicate the candidate root-cause resources. An example of this kind of alarm might be an active test tool that detectsan SLAa Service Level Agreement (SLA) violation on a VPN connection and identifies the devices along the chain as candidate root causes. The alarmmoduledata model also supports a way to associate different alarmstowith each otherwithusing the "related-alarm" list. This list enables the server to inform the client that certain alarms are related to other alarms. Note well that this module does not prescribe any dependencies or preference between the above alarm correlation mechanisms. Different systems have differentcapabilitiescapabilities, and the above described mechanisms are available to support the instrumentation features. 3.7. Alarm Shelving Alarm shelving is an important function in order foralarm managementalarm-management applications and operators to stop superfluous alarms. A shelved alarm implies that any alarms fulfilling these criteria are ignored (blocked/filtered). Shelved alarms appear in a dedicatedshelvedshelved- alarmlist in order tolist; thus, they can be filtered outfrom the alarm list in order forso that the main alarm listtoonlycontaincontains entries of interest. Shelved alarms do not generatenotificationsnotifications, but theshelved alarmshelved-alarm list is updated with anyalarm statealarm-state changes. Alarm shelving is optional to implement, since matching alarms against shelf criteria may have an impact on the server's processing resources. 3.8. Alarm Profiles Alarm profiles are used to configure further information to an alarm type. This module supports configuring severity levels overriding thesystem defaultsystem-default levels. This corresponds to the Alarm Severity AssignmentProfile, ASAP,Profile (ASAP) functionality in M.3100 [M.3100] and M.3160 [M.3160]. Other standard or enterprise modules can augment this list with furtheralarm typealarm-type information. 4. Alarm Data Model The fundamental parts of the data model are the "alarm-list" with associated notifications and the "alarm-inventory" list of all possible alarm types. These MUST be implemented by a system. The rest of the data modelareis made conditional withthethese YANGfeaturesfeatures: "operator-actions", "alarm-shelving", "alarm-history", "alarm- summary", "alarm-profile", and "severity-assignment". The data model has the following overall structure: +--rw control | +--rw max-alarm-status-changes? union | +--rw notify-status-changes? enumeration | +--rw notify-severity-level? severity | +--rw alarm-shelving {alarm-shelving}? | ... +--ro alarm-inventory | +--ro alarm-type* [alarm-type-id alarm-type-qualifier] | ... +--ro summary {alarm-summary}? | +--ro alarm-summary* [severity] | | ... | +--ro shelves-active? empty {alarm-shelving}? +--ro alarm-list | +--ro number-of-alarms? yang:gauge32 | +--ro last-changed? yang:date-and-time | +--ro alarm* [resource alarm-type-id alarm-type-qualifier] | | ... | +---x purge-alarms | | ... | +---x compress-alarms {alarm-history}? | ... +--ro shelved-alarms {alarm-shelving}? | +--ro number-of-shelved-alarms? yang:gauge32 | +--ro shelved-alarms-last-changed? yang:date-and-time | +--ro shelved-alarm* | | [resource alarm-type-id alarm-type-qualifier] | | ... | +---x purge-shelved-alarms | | ... | +---x compress-shelved-alarms {alarm-history}? | ... +--rw alarm-profile* [alarm-type-id alarm-type-qualifier-match resource] {alarm-profile}? +--rw alarm-type-id alarm-type-id +--rw alarm-type-qualifier-match string +--rw resource resource-match +--rw description string +--rw alarm-severity-assignment-profile {severity-assignment}? ... 4.1. Alarm Control The "/alarms/control/notify-status-changes" leaf controlsifwhether notifications are sent for all state changes, only raise and clear, or only notifications more severe than a configured level. Thisfeaturefeature, in combination with alarmshelvingshelving, corresponds to the ITU Alarm Report Controlfunctionality,functionality; see Appendix F.2.4. Every alarm has a list of status changes. The length of this list is controlled by "/alarms/control/max-alarm-status-changes". When the list is full and a new entry created, the oldest entry is removed. 4.1.1. Alarm Shelving The shelving control tree is shown below: +--rw control +--rw alarm-shelving {alarm-shelving}? +--rw shelf* [name] +--rw name string +--rw resource* resource-match +--rw alarm-type* | [alarm-type-id alarm-type-qualifier-match] | +--rw alarm-type-id alarm-type-id | +--rw alarm-type-qualifier-match string +--rw description? string Shelved alarms are shown in a dedicatedshelved alarmshelved-alarm list. Matching alarms MUST appear in the/alarms/shelved-alarms/shelved-alarm"/alarms/shelved-alarms/shelved-alarm" list, and non-matching/alarmsalarms MUST appear in the/alarms/alarm-list/alarm"/alarms/alarm-list/ alarm" list. The server does not send any notifications for shelved alarms. Shelving and unshelving can only be performed by editing the shelf configuration. It cannot be performed on individual alarms. The server will add an operator state indicating that the alarm was shelved/unshelved. Aleaf (/alarms/summary/shelves-active)leaf, "/alarms/summary/shelves-active", in the alarm summary indicates if there are shelved alarms. A system can selecttonot to support the shelving feature. 4.2. Alarm Inventory The alarm inventory represents all possible alarm types that may occur in the system. A management system may use this to build alarm procedures. The alarm inventory is relevant forseveralthe following reasons: The system might not implement all defined alarm type identities, and some alarm identities are abstract. The system has configured dynamic alarm types using the alarm qualifier. The inventory makes it possible for the management system to discover these. Note that the mechanism whereby dynamic alarm types are added using thealarm typealarm-type qualifier MUST populate this list. The optional leaf-list "resource" in the alarm inventory enables the system to publish for which resources a given alarm type may appear. A server MUST implement the alarm inventory in order to enable controlled alarm procedures in the client. A server implementer may want to document the alarm inventory foroff-lineoffline processing by clients. The file format defined in[I-D.ietf-netmod-yang-instance-file-format][YANG-INSTANCE] can be used for this purpose. The alarm inventory tree is shown below: +--ro alarm-inventory +--ro alarm-type* [alarm-type-id alarm-type-qualifier] +--ro alarm-type-id alarm-type-id +--ro alarm-type-qualifier alarm-type-qualifier +--ro resource* resource-match +--ro will-clear boolean +--roseverity-levels*severity-level* severity +--ro description string 4.3. Alarm Summary The alarm summary list summarizes alarms perseverity;severity: how many cleared, cleared and closed, and closed. It also gives an indication if there are shelved alarms. The alarm summary tree is shown below: +--ro summary {alarm-summary}? +--ro alarm-summary* [severity] | +--ro severity severity | +--ro total? yang:gauge32 | +--ro not-cleared? yang:gauge32 | +--ro cleared? yang:gauge32 | +--ro cleared-not-closed? yang:gauge32 | | {operator-actions}? | +--ro cleared-closed? yang:gauge32 | | {operator-actions}? | +--ro not-cleared-closed? yang:gauge32 | | {operator-actions}? | +--ro not-cleared-not-closed? yang:gauge32 | {operator-actions}? +--ro shelves-active? empty {alarm-shelving}? 4.4. The Alarm List The alarmlist (/alarms/alarm-list)list, "/alarms/alarm-list", is a function from the tuple (resource, alarm type,alarm typealarm-type qualifier) to the current composite alarm state. The composite state includes states for the resource alarm lifecycle such as severity, clearanceflagflag, and operator states such asacknowledgment.acknowledged. This means that for a given resource and alarmtypetype, the alarm list shows the current states of the alarm such as acknowledged andcleared status.cleared. +--ro alarm-list +--ro number-of-alarms? yang:gauge32 +--ro last-changed? yang:date-and-time +--ro alarm* [resource alarm-type-id alarm-type-qualifier] | +--ro resource resource | +--ro alarm-type-id alarm-type-id | +--ro alarm-type-qualifier alarm-type-qualifier | +--ro alt-resource* resource | +--ro related-alarm* | | [resource alarm-type-id alarm-type-qualifier] | | {alarm-correlation}? | | +--ro resource | | | -> /alarms/alarm-list/alarm/resource | | +--ro alarm-type-id leafref | | +--ro alarm-type-qualifier leafref | +--ro impacted-resource* resource | | {service-impact-analysis}? | +--ro root-cause-resource* resource | | {root-cause-analysis}? | +--ro time-created yang:date-and-time | +--ro is-cleared boolean | +--ro last-raised yang:date-and-time | +--ro last-changed yang:date-and-time | +--ro perceived-severity severity | +--ro alarm-text alarm-text | +--ro status-change* [time] {alarm-history}? | | +--ro time yang:date-and-time | | +--ro perceived-severity severity-with-clear | | +--ro alarm-text alarm-text | +--ro operator-state-change* [time] {operator-actions}? | | +--ro time yang:date-and-time | | +--ro operator string | | +--ro state operator-state | | +--ro text? string | +---x set-operator-state {operator-actions}? | | +---w input | | +---w state writable-operator-state | | +---w text? string | +---n operator-action {operator-actions}? | +-- time yang:date-and-time | +-- operator string | +-- state operator-state | +-- text? string +---x purge-alarms | +---w input | | +---w alarm-clearance-status enumeration | | +---w older-than! | | | +---w (age-spec)? | | | +--:(seconds) | | | | +---w seconds? uint16 | | | +--:(minutes) | | | | +---w minutes? uint16 | | | +--:(hours) | | | | +---w hours? uint16 | | | +--:(days) | | | | +---w days? uint16 | | | +--:(weeks) | | | +---w weeks? uint16 | | +---w severity! | | | +---w (sev-spec)? | | | +--:(below) | | | | +---w below? severity | | | +--:(is) | | | | +---w is? severity | | | +--:(above) | | | +---w above? severity | | +---w operator-state-filter! {operator-actions}? | | +---w state? operator-state | | +---w user? string | +--ro output | +--ro purged-alarms? uint32 +---x compress-alarms {alarm-history}? +---w input | +---w resource? resource-match | +---w alarm-type-id? | | -> /alarms/alarm-list/alarm/alarm-type-id | +---w alarm-type-qualifier? leafref +--ro output +--ro compressed-alarms? uint32 Every alarm has three importantstates,states: the resource clearance state "is-cleared", the severity"perceived-severity""perceived-severity", and the operator state available in theoperator stateoperator-state change list. In order to see the alarmhistoryhistory, the resource state changes are available in the "status-change"listlist, and the operator history is available in the "operator-state-change" list. 4.5. TheShelved AlarmsShelved-Alarm List Theshelved alarmshelved-alarm list has the same structure as the alarm list above. It shows all the alarms thatmatchesmatch the shelving criteria(/alarms/control/alarm-shelving)."/alarms/control/alarm-shelving". 4.6. Alarm Profiles Alarmprofiles (/alarms/alarm-profile)profiles, "/alarms/alarm-profile", is a list of configurable alarm types. The list supports configurable alarm severity levels in the container "alarm-severity-assignment-profile". If an alarm matches the configured alarmtypetype, it MUST use the configured severity level(s) instead of the system default. This configuration MUST also be represented in the alarm inventory. +--rw alarm-profile* [alarm-type-id alarm-type-qualifier-match resource] {alarm-profile}? +--rw alarm-type-id alarm-type-id +--rw alarm-type-qualifier-match string +--rw resource resource-match +--rw description string +--rw alarm-severity-assignment-profile {severity-assignment}? +--rwseverity-levels*severity-level* severity 4.7. Operations The alarmmoduledata model supports the following actions to manage the alarms:/alarms/alarm-list/purge-alarms:"/alarms/alarm-list/purge-alarms": Delete alarms from the"alarm-list""alarm- list" according to specific criteria, forexampleexample, all cleared alarms older than a specific date./alarms/alarm-list/compress-alarms:"/alarms/alarm-list/compress-alarms": Compress the "status-change" list for the alarms./alarms/alarm-list/alarm/set-operator-state:"/alarms/alarm-list/alarm/set-operator-state": Change the operator state for an alarm. For example, an alarm can be acknowledged by setting the operator state to "ack"./alarms/shelved-alarm-list/purge-shelved-alarms:"/alarms/shelved-alarm-list/purge-shelved-alarms": Delete alarms from the "shelved-alarm-list" according to specific criteria, forexampleexample, all alarms older than a specific date./alarms/shelved-alarm-list/compress-shelved-alarms:"/alarms/shelved-alarm-list/compress-shelved-alarms": Compress the "status-change" list for the alarms. 4.8. Notifications The alarmmoduledata model supports a general notification to reportalarmalarm- state changes. It carries all relevant parameters for thealarmalarm- management application. There is also a notification to report that an operator changed the operator state on an alarm, likeacknowledge.acknowledged. If the alarm inventory is changed, forexampleexample, a new card type is inserted, a notification will tell the management application that new alarm types are available. 5. Relationship to the ietf-hardware YANGmoduleModule RFC 8348 [RFC8348] defines the "ietf-hardware" YANG data model for the management of hardware. The "alarm-state" in RFC 8348 is a summary of the alarm severity levels that may be active on the specific hardware component. It does not say anything about how alarms are reported, and it doesn't provide any details of the alarms. The mapping between the alarm YANG datamodelmodel, prefix "al", and the "alarm-state" in RFC83488348, prefix "hw", is as follows:resource:"al:resource": Corresponds to an entry in the list"/hardware/component/" is-cleared:"/hw:hardware/hw:component/". "al:is-cleared": No bit set in"/hardware/component/state/alarm-state" perceived-severity:"/hw:hardware/hw:component/hw:state/ hw:alarm-state". "al:perceived-severity": Corresponding bit set in"/hardware/component/state/alarm-state". operator-state-change/state:"/hw:hardware/hw:component/hw:state/hw:alarm-state". "al:operator-state-change/al:state": If the alarm is acknowledged by the operator, the bit"under-repair""hw:under-repair" is set in"/hardware/component/state/ alarm-state"."/hw:hardware/hw:component/hw:state/hw:alarm-state". 6. Alarm YANG Module This YANG module references [RFC6991] and [XSD-TYPES]. <CODE BEGINS> file"ietf-alarms@2019-04-10.yang""ietf-alarms@2019-07-16.yang" module ietf-alarms { yang-version 1.1; namespace "urn:ietf:params:xml:ns:yang:ietf-alarms"; prefix al; import ietf-yang-types { prefix yang; reference "RFC 6991: Common YANG Data Types."; } organization "IETF CCAMP Working Group"; contact "WG Web:<http://tools.ietf.org/wg/ccamp><https://trac.ietf.org/trac/ccamp> WG List: <mailto:ccamp@ietf.org> Editor: Stefan Vallin <mailto:stefan@wallan.se> Editor: Martin Bjorklund <mailto:mbj@tail-f.com>";// RFC Ed.: replace XXXX with actual RFC number and // remove this note.description "This module defines an interface for managing alarms. Main inputs to the module design are the 3GPP AlarmIRP,Integration Reference Point (IRP), ITU-TX.733X.733, and ANSI/ISA-18.2 alarm standards. Main features of this module include: * Alarm list: A list of all alarms. Cleared alarms stay in the list until explicitly purged. * Operator actions on alarms: Acknowledging and closing alarms. * Administrative actions on alarms: Purging alarms from the list according to specific criteria. * Alarm inventory: A management application can read all alarm types implemented by the system. * Alarm shelving: Shelving (blocking) alarms according to specific criteria. * Alarm profiles: A management system can attach further information to alarm types, forexampleexample, overridingsystem defaultsystem-default severity levels. This module uses a stateful view on alarms. An alarm is a state for a specific resource (note that an alarm is not a notification). An alarm type is a possible alarm state for a resource. For example, the tuple: ('link-alarm', 'GigabitEthernet0/25') is an alarm of type 'link-alarm' on the resource 'GigabitEthernet0/25'. Alarm types are identified using YANG identities and an optional string-based qualifier. The string-based qualifier allows for dynamic extension of the statically defined alarm types. Alarm types identify a possible alarm state and not the individual notifications. For example, the traditional 'link-down' and 'link-up' notifications are two notifications referring to the same alarm type 'link-alarm'. With thisdesigndesign, there is no ambiguity about how alarm and alarm clear correlation should beperformed:performed; notifications that report the same resource and alarm type are considered updates of the same alarm, e.g., clearing an active alarm or changing the severity of an alarm. The instrumentation can update'severity'the severity and'alarm-text'alarm text on an existing alarm. The above alarm example can therefore looklike:like the following: (('link-alarm', 'GigabitEthernet0/25'), warning, 'interface down while interface admin state is up') There is a clear separation between updates on the alarm from the underlying resource, like clear, and updates from anoperatoroperator, likeacknowledgeacknowledging or closing an alarm: (('link-alarm', 'GigabitEthernet0/25'), warning, 'interface down while interface admin state is up', cleared, closed) Administrative actions like removing closed alarms older than a given time is supported. ThisalarmYANG module does not define how the underlying instrumentation detects and clears the specific alarms. That belongs to theSDOStandards Development Organization (SDO) or enterprise that owns that specific technology. The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 'MAY', and 'OPTIONAL' in this document are to be interpreted as described in BCP 14 (RFC 2119) (RFC 8174) when, and only when, they appear in all capitals, as shown here. Copyright (c) 2019 IETF Trust and the persons identified as authors of the code. All rights reserved. Redistribution and use in source and binary forms, with or without modification, is permitted pursuant to, and subject to the license terms contained in, the Simplified BSD License set forth in Section 4.c of the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info). This version of this YANG module is part of RFCXXXX (https://tools.ietf.org/html/rfcXXXX);8632; see the RFC itself for full legal notices.";// RFC Ed.: update the date below with the date of RFC publication // and remove this note.revision2019-04-102019-07-16 { description "Initial revision."; reference "RFCXXXX:8632: A YANG Data Model for AlarmModule";Management"; } /* * Features */ feature operator-actions { description "This feature indicates that the system supports operator states on alarms."; } feature alarm-shelving { description "This feature indicates that the system supports shelving (blocking) alarms. Alarm shelving may have an impact on server processing resources in order to match alarms against shelf criteria."; } feature alarm-history { description "This feature indicates that the server maintains a history of state changes for each alarm. For example, if an alarm toggles between cleared and active 10 times, these state changes are present in a separate list in the alarm. Keeping the alarm history may have an impact on server memory resources."; } feature alarm-summary { description "This feature indicates that the server summarizes the number of alarms per severity and operator state."; } feature alarm-profile { description "The systemsupportsenables clients to configure further information to each alarm type."; } feature severity-assignment { description "The system supports configurable alarm severity levels."; reference"M.3160/M.3100 Alarm Severity Assignment Profile, ASAP";"ITU-T Recommendation M.3100: Generic network information model ITU-T Recommendation M.3160: Generic, protocol-neutral management information model"; } feature root-cause-analysis { description "The system supports identifying candidate root-cause resources for an alarm, forexampleexample, adiscdisk partition root cause for a logger failure alarm."; } feature service-impact-analysis { description "The system supports identifyingcandidate impactedcandidate-impacted resources for analarm. Foralarm, for example, an interface state change resulting in a linkalarmalarm, which can refer to a link as being impacted."; } feature alarm-correlation { description "The system supports correlating/grouping alarms that belong together."; } /* * Identities */ identity alarm-type-id { description "Base identity for alarm types. A unique identification of the alarm, not including the resource. Different resources can share alarm types. If the resource reports the same alarm type, it isto beconsidered to be the same alarm. The alarm type is a simplification of the different X.733 and 3GPPalarmAlarm IRPalarmcorrelationmechanismsmechanisms, and it allows for hierarchical extensions. A string-based qualifier can be used in addition to the identity in order to have different alarm types based on information not known atdesign-time,design time, such as values in textual SNMP Notificationvar-binds.varbinds. Standards and vendors can define sub-identities to clearly identify specific alarm types. This identity is abstract and MUST NOT be used for alarms."; } /* * Common types */ typedef resource { type union { type instance-identifier { require-instance false; } type yang:object-identifier; type string; type yang:uuid; } description "This is an identification of the alarming resource, such as an interface. It should be as fine-grained as possiblebothto both guide the operator andtoguarantee uniqueness of the alarms. If the alarming resource is modeled in YANG, this type will be an instance-identifier. If the resource is an SNMP object, the type will be anobject-identifier.'object-identifier'. If the resource is anything else, forexampleexample, a distinguished name or aCIMCommon Information Model (CIM) path, this type will be a string. If the alarming object is identified by aUUIDUniversally Unique Identifier (UUID), use the uuid type. Be cautious when using this type, since a UUID is hard to use for an operator. If the server supports several models, the precedence should be in the order as given in the union definition."; } typedef resource-match { type union { type yang:xpath1.0; type yang:object-identifier; type string; } description "This type is used to match resources of type 'resource'. Since the type 'resource' is a union of different types, the 'resource-match' type is also a union of corresponding types. If the type is given as an XPath 1.0 expression, a resource of type 'instance-identifier' matches if the instance is part of the node set that is the result of evaluating the XPath 1.0 expression. For example, the XPath 1.0 expression: /ietf-interfaces:interfaces/ietf-interfaces:interface [ietf-interfaces:type='ianaift:ethernetCsmacd'] would match the resource instance-identifier: /if:interfaces/if:interface[if:name='eth1'], assuming that the interface 'eth1' is of type 'ianaift:ethernetCsmacd'. If the type is given as an object identifier, a resource of type 'object-identifier' matches if the match object identifier is a prefix of the resource's object identifier. For example, the value: 1.3.6.1.2.1.2.2 would match the resource object identifier: 1.3.6.1.2.1.2.2.1.1.5 If the type is given as an UUID or a string, it is interpreted as an XML Schema regular expression, which matches a resource of type 'yang:uuid' or 'string' if the given regular expression matches the resource string. If the type is given as an XPathexpressionexpression, it is evaluated in the following XPath context: o The set of namespace declarations is the set of prefix and namespace pairs for all YANG modules implemented by the server, where the prefix is the YANG module name and the namespace is as defined by the 'namespace' statement in the YANG module. If a leaf of this type is encoded in XML, all namespace declarations in scope on the leaf element are added to the set of namespace declarations. If a prefix found in the XML is already present in the set of namespace declarations, the namespace in the XML is used. o The set of variable bindings is empty. o The function library is the core functionlibrarylibrary, and the functions are defined in Section 10 of RFC 7950. o The context node is the root node in the data tree."; reference "XML Schema Part 2: Datatypes Second Edition, World Wide Web Consortium Recommendation REC-xmlschema-2-20041028"; } typedef alarm-text { type string; description "The string used to inform operators about the alarm. This MUST contain enough information for an operator to be able to understand the problem and how to resolve it. If this string contains structure, this format should be clearly documented for programs to be able to parse that information."; } typedef severity { type enumeration { enum indeterminate { value 2; description "Indicates that the severity level could not be determined. This level SHOULD be avoided."; } enum warning { value 3; description "The 'warning' severity level indicates the detection of a potential or impendingservice affectingservice-affecting fault, before any significant effects have been felt. Action should be taken to further diagnose (if necessary) and correct the problem in order to prevent it from becoming a more seriousservice affectingservice-affecting fault."; } enum minor { value 4; description "The 'minor' severity level indicates the existence of anon-service affectingnon-service-affecting fault condition and that corrective action should be taken in order to prevent a more serious (for example,service affecting)service-affecting) fault. Such a severity can be reported, for example, when the detected alarm condition is not currently degrading the capacity of the resource."; } enum major { value 5; description "The 'major' severity level indicates that aserviceservice- affecting condition has developed and an urgent corrective action is required. Such a severity can be reported, for example, when there is a severe degradation in the capability of the resource and its full capability must be restored."; } enum critical { value 6; description "The 'critical' severity level indicates that aserviceservice- affecting condition has occurred and an immediate corrective action is required. Such a severity can be reported, for example, when a resource becomes totally out of service and its capability must be restored."; } } description "The severity level of the alarm. Note well that the value 'clear' is not included.IfWhether or not an alarm is clearedor notis a separate boolean flag."; reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting Function"; } typedef severity-with-clear { type union { type enumeration { enum cleared { value 1; description "The alarm is cleared by the instrumentation."; } } type severity; } description "The severity level of the alarm including clear. This is used only in notifications reporting state changes for an alarm."; } typedef writable-operator-state { type enumeration { enum none { value 1; description "The alarm is not being taken care of."; } enum ack { value 2; description "The alarm is being taken care of. Corrective action not takenyet,yet or has failed"; } enum closed { value 3; description "Corrective action taken successfully."; } } description "Operator states on an alarm. The 'closed' state indicates that an operator considers the alarm being resolved. This is separate from the alarm's 'is-cleared' leaf."; } typedef operator-state { type union { type writable-operator-state; type enumeration { enum shelved { value 4; description "The alarm is shelved. Alarms in /alarms/shelved-alarms/ MUST be assigned this operator state by the server as the last entry in theoperator-state-change'operator-state-change' list. The text for that entry SHOULD include the shelf name."; } enum un-shelved { value 5; description "The alarm is moved back to 'alarm-list' from a shelf. Alarms that are moved from /alarms/shelved-alarms/ to /alarms/alarm-list MUST be assigned this state by the server as the last entry in the 'operator-state-change' list. The text for that entry SHOULD include the shelf name."; } } } description "Operator states on an alarm. The 'closed' state indicates that an operator considers the alarm being resolved. This is separate from the alarm's 'is-cleared' leaf."; } /* Alarm type */ typedef alarm-type-id { type identityref { base alarm-type-id; } description "Identifies an alarm type. The description of the alarm type id MUST indicateifwhether or not the alarm type isabstract or not.abstract. An abstract alarm type is used as a base for other alarm type ids and will not be used as a value for an alarm or be present in the alarm inventory."; } typedef alarm-type-qualifier { type string; description "If an alarm typecan notcannot be fully specified at design time byalarm-type-id,'alarm-type-id', this string qualifier is used in addition to fully define a unique alarm type. The definition of alarm qualifiers is consideredbeingto be part of the instrumentation and is out of scope for this module. An empty string is used when this is part of a key."; } /* * Groupings */ grouping common-alarm-parameters { description "Common parameters for an alarm. This grouping is used both in the alarm list and in the notification representing analarm statealarm-state change."; leaf resource { type resource; mandatory true; description "The alarming resource. See also 'alt-resource'. This could be, forexample beexample, a reference to the alarming interface"; } leaf alarm-type-id { type alarm-type-id; mandatory true; description "This leaf and the leaf 'alarm-type-qualifier' togetherprovidesprovide a unique identification of the alarm type."; } leaf alarm-type-qualifier { type alarm-type-qualifier; description "This leaf is used when the 'alarm-type-id' leaf cannot uniquely identify the alarm type. Normally, this is not the case, and this leaf is the empty string."; } leaf-list alt-resource { type resource; description "Used if the alarming resource is available over other interfaces. This field can contain SNMP OIDs, CIMpathspaths, or 3GPPDistinguished namesdistinguished names, for example."; } list related-alarm { if-feature "alarm-correlation"; key "resource alarm-type-id alarm-type-qualifier"; description "References to related alarms. Note that the related alarm might have been purged from the alarm list."; leaf resource { type leafref { path "/alarms/alarm-list/alarm/resource"; require-instance false; } description "The alarming resource for the related alarm."; } leaf alarm-type-id { type leafref { path "/alarms/alarm-list/alarm" + "[resource=current()/../resource]" + "/alarm-type-id"; require-instance false; } description "The alarm type identifier for the related alarm."; } leaf alarm-type-qualifier { type leafref { path "/alarms/alarm-list/alarm" + "[resource=current()/../resource]" + "[alarm-type-id=current()/../alarm-type-id]" + "/alarm-type-qualifier"; require-instance false; } description "The alarm qualifier for the related alarm."; } } leaf-list impacted-resource { if-feature "service-impact-analysis"; type resource; description "Resources that might be affected by this alarm. If the system creates an alarm on a resource and also has a mapping to other resources that might be impacted, these resources can be listed in this leaf-list. In thiswayway, the system can create one alarm instead of several. For example, if an interface has an alarm, the 'impacted-resource' can reference the aggregated port channels."; } leaf-list root-cause-resource { if-feature "root-cause-analysis"; type resource; description "Resources that are candidates for causing the alarm. If the system has a mechanism to understand the candidate root causes of an alarm, this leaf-list can be used to list theroot causeroot-cause candidate resources. In thiswayway, the system can create one alarm instead of several. An example might be a logging system (alarm resource) thatfails,fails; the alarm can reference thefile-systemfile system in the 'root-cause-resource' leaf-list. Note that the intended use is not to also send ananalarm with theroot-cause-resource'root-cause-resource' as an alarming resource. Theroot-cause-resource leaf list'root-cause-resource' leaf-list is a hint and should not also generate an alarm for the same problem."; } } grouping alarm-state-change-parameters { description "Parameters for analarm statealarm-state change. This grouping is used both in the alarm list's status-change list and in the notification representing analarm statealarm-state change."; leaf time { type yang:date-and-time; mandatory true; description "The time the status of the alarm changed. The value represents the time the realalarm statealarm-state change appeared in the resource and not when it was added to the alarm list. The /alarm-list/alarm/last-changed MUST be set to the same value."; } leaf perceived-severity { type severity-with-clear; mandatory true; description "The severity of the alarm as defined by X.733. Note that this may not be the original severity since the alarm may have changed severity."; reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting Function"; } leaf alarm-text { type alarm-text; mandatory true; description "Auser friendlyuser-friendly text describing thealarm statealarm-state change."; reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting Function"; } } grouping operator-parameters { description "This grouping defines parameters that can be changed by an operator."; leaf time { type yang:date-and-time; mandatory true; description "Timestamp for operator action on the alarm."; } leaf operator { type string; mandatory true; description "The name of the operator that has acted on this alarm."; } leaf state { type operator-state; mandatory true; description "The operator's view of the alarm state."; } leaf text { type string; description "Additional optional textual information provided by the operator."; } } grouping resource-alarm-parameters { description "Alarm parameters thatoriginatesoriginate from the resource view."; leaf is-cleared { type boolean; mandatory true; description "Indicates the current clearance state of the alarm. An alarm might toggle from active alarm to cleared alarm and back to active again."; } leaf last-raised { type yang:date-and-time; mandatory true; description "An alarm may change severity level and toggle between active and cleared during itslife-time.lifetime. This leaf indicates the last time it waslastraised(is-cleared('is-cleared' =false).";'false')."; } leaf last-changed { type yang:date-and-time; mandatory true; description "A timestamp when the 'status-change' or 'operator-state-change' list was last changed."; } leaf perceived-severity { type severity; mandatory true; description "The last severity of the alarm. If an alarm was raised with severity'warning','warning' but later changed to 'major', this leaf will show 'major'."; } leaf alarm-text { type alarm-text; mandatory true; description "The last reported alarm text. This text should contain information for an operator to be able to understand the problem and how to resolve it."; } list status-change { if-feature "alarm-history"; key "time"; min-elements 1; description "A list ofstatus changestatus-change events for this alarm. The entry with latesttime-stamptimestamp in this list MUST correspond to the leafs 'is-cleared','perceived-severity''perceived-severity', and 'alarm-text' for the alarm. This list is ordered according to the timestamps of alarm state changes. The first item corresponds to the latest state change. The following state changescreatescreate an entry in this list: - changed severity (warning, minor, major, critical) - clearancestatus,status; this also updates the 'is-cleared' leaf -alarm textalarm-text update"; uses alarm-state-change-parameters; } } grouping filter-input { description "Grouping to specify a filter construct on alarm information."; leaf alarm-clearance-status { type enumeration { enum any { description "Ignorealarm clearancealarm-clearance status."; } enum cleared { description "Filter cleared alarms."; } enum not-cleared { description "Filternot clearednot-cleared alarms."; } } mandatory true; description "The clearance status of the alarm."; } container older-than { presence "Age specification"; description "Matches the 'last-status-change' leaf in the alarm."; choice age-spec { description "Filter using date and time age."; case seconds { leaf seconds { type uint16; description "Age expressed in seconds."; } } case minutes { leaf minutes { type uint16; description "Age expressed in minutes."; } } case hours { leaf hours { type uint16; description "Age expressed in hours."; } } case days { leaf days { type uint16; description "Age expressed in days."; } } case weeks { leaf weeks { type uint16; description "Age expressed in weeks."; } } } } container severity { presence "Severity filter"; choice sev-spec { description "Filter based on severity level."; leaf below { type severity; description "Severity less than this leaf."; } leaf is { type severity; description "Severity level equal to this leaf."; } leaf above { type severity; description "Severity level higher than this leaf."; } } description "Filter based on severity."; } container operator-state-filter { if-feature "operator-actions"; presence "Operator state filter"; leaf state { type operator-state; description "Filter on operator state."; } leaf user { type string; description "Filter based on which operator."; } description "Filter based on operator state."; } } /* * The /alarms data tree */ container alarms { description "The top container for this module."; container control { description "Configuration to control the alarm behavior."; leaf max-alarm-status-changes { type union { type uint16; type enumeration { enum infinite { description "Thestatus changestatus-change entries are accumulated infinitely."; } } } default "32"; description "Thestatus-change'status-change' entries are kept in a circular list per alarm. When this number is exceeded, the oldest status change entry is automatically removed. If the value is 'infinite', thestatus changestatus-change entries are accumulated infinitely."; } leaf notify-status-changes { type enumeration { enum all-state-changes { description "Send notifications for all status changes."; } enum raise-and-clear { description "Send notifications only for raise, clear, and re-raise. Notifications forseverity levelseverity-level changes oralarm textalarm-text changes are not sent."; } enum severity-level { description "Only send notifications foralarm statealarm-state changes crossing the level specified in 'notify-severity-level'. Always send clear notifications."; } } must '. != "severity-level" or ../notify-severity-level' { description "When notify-status-changes is 'severity-level', a value must be given fornotify-severity-level.";'notify-severity-level'."; } default "all-state-changes"; description "This leaf controls the notifications sent for alarm status updates. There are three options: 1. Notifications are sent for all updates,severity level changes and alarm text changesseverity-level changes, and alarm-text changes. 2. Notifications are only sent for alarm raise andclearclear. 3. Notifications are sent for status changes equal to or above the specified severity level. Clear notifications shall always be sent. Notifications shall also be sent for state changes thatmakesmake an alarm less severe than the specified level. For example, in option 3,assumingassume that the severity level is set to major and that the alarm has the following state changes: [(Time, severity, clear)]: [(T1, major, -), (T2, minor, -), (T3, warning, -), (T4, minor, -), (T5, major, -), (T6, critical, -), (T7, major. -), (T8, major, clear)] In that case, notifications will be sent at times T1, T2, T5, T6,T7T7, and T8."; } leaf notify-severity-level { when '../notify-status-changes = "severity-level"'; type severity; description "Only send notifications foralarm statealarm-state changes crossing the specified level. Always send clear notifications."; } container alarm-shelving { if-feature "alarm-shelving"; description "Thealarm-shelving/shelf'alarm-shelving/shelf' list is used to shelve (block/filter) alarms. The conditions in the shelf criteria are logically ANDed. The first matching shelf is used, and an alarm is shelved only for this first match. Matching alarms MUST appear in the /alarms/shelved-alarms/shelved-alarm list, and non-matching /alarms MUST appear in the /alarms/alarm-list/alarm list. The server does not send any notifications for shelved alarms. The server MUST maintain states (e.g., severity changes) for the shelved alarms. Alarms that match the criteria shall have an operator state 'shelved'. When the shelf configuration removes an alarm from theshelfshelf, the server shall addanthe operator state 'un-shelved'."; list shelf { key "name"; ordered-by user; leaf name { type string; description "An arbitrary name for the alarm shelf."; } description "Each entry defines the criteria for shelving alarms. Criteria are ANDed. If no criteria are specified, all alarms will be shelved."; leaf-list resource { type resource-match; description "Shelve alarms for matching resources."; } list alarm-type { key "alarm-type-id alarm-type-qualifier-match"; description "Any alarm matching the combined criteria ofalarm-type-id'alarm-type-id' andalarm-type-qualifier-match'alarm-type-qualifier-match' MUST be matched."; leaf alarm-type-id { type alarm-type-id; description "Shelve all alarms that have analarm-type-id'alarm-type-id' that is equal to or derived from the givenalarm-type-id.";'alarm-type-id'."; } leaf alarm-type-qualifier-match { type string; description "An XML Schema regular expression that is used to match an alarm type qualifier. Shelve all alarms thatmatchesmatch this regular expression for the alarm type qualifier."; reference "XML Schema Part 2: Datatypes Second Edition, World Wide Web Consortium Recommendation REC-xmlschema-2-20041028"; } } leaf description { type string; description "An optional textual description of the shelf. This description should include the reason for shelving these alarms."; } } } } container alarm-inventory { config false; description"This alarm-inventory/alarm-type"The 'alarm-inventory/alarm-type' list contains all possible alarm types for the system. If the system knows for which resources a specific alarm type can appear,thisit is also identified in the inventory. The list also tells if each alarm type has a corresponding clear state. The inventory shall only contain concrete alarm types. The alarm inventory MUST be updated by the system when new alarms can appear. This can be the case when installing new software modules or inserting new card types. A notification 'alarm-inventory-changed' is sent when the inventory is changed."; list alarm-type { key "alarm-type-id alarm-type-qualifier"; description "An entry in this list defines a possible alarm."; leaf alarm-type-id { type alarm-type-id; description "The statically defined alarm type identifier for this possible alarm."; } leaf alarm-type-qualifier { type alarm-type-qualifier; description "The optionally dynamically defined alarm type identifier for this possible alarm."; } leaf-list resource { type resource-match; description "Optionally, specifies for which resources the alarm type is valid."; } leaf will-clear { type boolean; mandatory true; description "This leaf tells the operator if the alarm will be cleared when the correct corrective action has been taken. Implementations SHOULD strive for detecting the cleared state for all alarm types. If this leaf is 'true', the operator can monitor the alarm until it becomes cleared after the corrective action has been taken. If this leaf is 'false', the operator needs to validate that the alarm is no longer active using other mechanisms. Alarms can lack a corresponding clear due to missing instrumentation orthat there isno logical corresponding clear state."; } leaf-listseverity-levelsseverity-level { type severity; description "This leaf-list indicates the possible severity levels of this alarm type. Note well that 'clear' is not part of the severity type. In general, the severity level should be defined by the instrumentation based on the dynamicstate and notstate, rather than being defined statically by the alarmtypetype, in order to provide a relevant severity level based on dynamic state and context.HoweverHowever, most alarm types have a defined set of possible severitylevelslevels, and this should be provided here."; } leaf description { type string; mandatory true; description "A description of the possible alarm. It SHOULD include information on possible underlying root causes and corrective actions."; } } } container summary { if-feature "alarm-summary"; config false; description "This container gives a summary of the number of alarms."; list alarm-summary { key "severity"; description "A global summary of all alarms in the system. The summary does not include shelved alarms."; leaf severity { type severity; description "Alarm summary for this severity level."; } leaf total { type yang:gauge32; description "Total number of alarms of this severity level."; } leaf not-cleared { type yang:gauge32; description "Total number of alarms of this severity level that are not cleared."; } leaf cleared { type yang:gauge32; description "For this severity level, the number of alarms that are cleared."; } leaf cleared-not-closed { if-feature "operator-actions"; type yang:gauge32; description "For this severity level, the number of alarms that are cleared but not closed."; } leaf cleared-closed { if-feature "operator-actions"; type yang:gauge32; description "For this severity level, the number of alarms that are cleared and closed."; } leaf not-cleared-closed { if-feature "operator-actions"; type yang:gauge32; description "For this severity level, the number of alarms that are not cleared but closed."; } leaf not-cleared-not-closed { if-feature "operator-actions"; type yang:gauge32; description "For this severity level, the number of alarms that are not cleared and not closed."; } } leaf shelves-active { if-feature "alarm-shelving"; type empty; description "This is a hint to the operator that there are active alarm shelves. This leaf MUST exist if the /alarms/shelved-alarms/number-of-shelved-alarms is > 0."; } } container alarm-list { config false; description "The alarms in the system."; leaf number-of-alarms { type yang:gauge32; description "This object shows the total number of alarms in the system, i.e., the total number of entries in the alarm list."; } leaf last-changed { type yang:date-and-time; description "A timestamp when the alarm list was last changed. The value can be used by a manager to initiate an alarm resynchronization procedure."; } list alarm { key "resource alarm-type-id alarm-type-qualifier"; description "The list of alarms. Each entry in the list holds one alarm for a given alarm type and resource. An alarm can be updated from the underlying resource or by the user. The following leafs are maintained by the resource:is-cleared, last-change, perceived-severity,'is-cleared', 'last-change', 'perceived-severity', andalarm-text.'alarm-text'. An operator canchange: operator-statechange 'operator-state' andoperator-text.'operator-text'. Entries appear in the alarm list the first time an alarm becomes active for a givenalarm-typealarm type and resource. Entries do not get deleted when the alarm is cleared. Clear status is represented as a boolean flag. Alarm entries are removed, i.e., purged, from the list by an explicit purge action. For example, purge all alarms that are cleared and in closedoperator-stateoperator state that are older than 24 hours. Purged alarms are removed from the alarm list. If the alarm resource state changes after a purge, the alarm will reappear in the alarm list. Systems may also remove alarms based on locally configuredpolicies whichpolicies; this is out of scope for this module."; uses common-alarm-parameters; leaf time-created { type yang:date-and-time; mandatory true; description "Thetime-stamptimestamp when this alarm entry was created. This represents the first time the alarmappeared,appeared; it can also represent that the alarmre-appearedreappeared after a purge. Furtherstate-changesstate changes of the same alarmdoesdo not change thisleaf,leaf; these changes will update the 'last-changed' leaf."; } uses resource-alarm-parameters; list operator-state-change { if-feature "operator-actions"; key "time"; description "This list is used by operators to indicate the state of human intervention on an alarm. For example, if an operator has seen an alarm, the operator can add a new item to this list indicating that the alarm is acknowledged."; uses operator-parameters; } action set-operator-state { if-feature "operator-actions"; description "This is a means for the operator to indicate the level of human intervention on an alarm."; input { leaf state { type writable-operator-state; mandatory true; description "Set this operator state."; } leaf text { type string; description "Additional optional textual information."; } } } notification operator-action { if-feature "operator-actions"; description "This notification is used to report that an operator acted upon an alarm."; uses operator-parameters; } } action purge-alarms { description "This operation requests that the servertodelete entries from the alarm list according to the supplied criteria.TypicallyTypically, this operation is used to delete alarms that are in closed operator state and older than a specified time. The number of purged alarms is returned as an output parameter."; input { uses filter-input; } output { leaf purged-alarms { type uint32; description "Number of purged alarms."; } } } action compress-alarms { if-feature "alarm-history"; description "This operation requests that the servertocompress entries in the alarm list by removing all but the latest 'status-change' entry for all matching alarms. Conditions in the input are logically ANDed. If no input condition is given, all alarms are compressed."; input { leaf resource { type resource-match; description "Compress the alarms matching this resource."; } leaf alarm-type-id { type leafref { path "/alarms/alarm-list/alarm/alarm-type-id"; require-instance false; } description "Compress alarms with thisalarm-type-id.";'alarm-type-id'."; } leaf alarm-type-qualifier { type leafref { path "/alarms/alarm-list/alarm/alarm-type-qualifier"; require-instance false; } description "Compress the alarms with thisalarm-type-qualifier.";'alarm-type-qualifier'."; } } output { leaf compressed-alarms { type uint32; description "Number of compressed alarm entries."; } } } } container shelved-alarms { if-feature "alarm-shelving"; config false; description "The shelved alarms. Alarms appear here if they match the criteria in /alarms/control/alarm-shelving. This list does not generate any notifications. The list represents alarms that are considered not relevant by the operator. Alarms in this list have anoperator-state'operator-state' of 'shelved'. Thiscan notcannot be changed."; leaf number-of-shelved-alarms { type yang:gauge32; description "This object shows the total number ofcurrentlycurrent alarms, i.e., the total number of entries in the alarm list."; } leaf shelved-alarms-last-changed { type yang:date-and-time; description "A timestamp when theshelved alarmshelved-alarm list was last changed. The value can be used by a manager to initiate an alarm resynchronization procedure."; } list shelved-alarm { key "resource alarm-type-id alarm-type-qualifier"; description "The list of shelved alarms. Shelved alarms can only be updated from the underlyingresource,resource; no operator actions are supported."; uses common-alarm-parameters; leaf shelf-name { type leafref { path "/alarms/control/alarm-shelving/shelf/name"; require-instance false; } description "The name of the shelf."; } uses resource-alarm-parameters; list operator-state-change { if-feature "operator-actions"; key "time"; description "This list is used by operators to indicate the state of human intervention on an alarm. For shelved alarms, the system has set the list item in the list to 'shelved'."; uses operator-parameters; } } action purge-shelved-alarms { description "This operation requests that the servertodelete entries from theshelved alarmsshelved-alarm list according to the supplied criteria. In theshelved alarm listshelved-alarm list, it makes sense to delete alarms that are not relevant anymore. The number of purged alarms is returned as an output parameter."; input { uses filter-input; } output { leaf purged-alarms { type uint32; description "Number of purged alarms."; } } } action compress-shelved-alarms { if-feature "alarm-history"; description "This operation requests that the servertocompress entries in theshelved alarmshelved-alarm list by removing all but the latest 'status-change' entry for all matching shelved alarms. Conditions in the input are logically ANDed. If no input condition is given, all alarms are compressed."; input { leaf resource { type leafref { path "/alarms/shelved-alarms/shelved-alarm/resource"; require-instance false; } description "Compress the alarms with this resource."; } leaf alarm-type-id { type leafref { path "/alarms/shelved-alarms/shelved-alarm" + "/alarm-type-id"; require-instance false; } description "Compress alarms with thisalarm-type-id.";'alarm-type-id'."; } leaf alarm-type-qualifier { type leafref { path "/alarms/shelved-alarms/shelved-alarm" + "/alarm-type-qualifier"; require-instance false; } description "Compress the alarms with thisalarm-type-qualifier.";'alarm-type-qualifier'."; } } output { leaf compressed-alarms { type uint32; description "Number of compressed alarm entries."; } } } } list alarm-profile { if-feature "alarm-profile"; key "alarm-type-id alarm-type-qualifier-match resource"; ordered-by user; description "This list is used to assign further information or configuration for each alarm type. This module supports a mechanism where the client can override thesystem defaultsystem-default alarm severity levels. Thealarm-profile'alarm-profile' is also a useful augmentation point for specific additions to alarm types."; leaf alarm-type-id { type alarm-type-id; description "The alarm type identifier to match."; } leaf alarm-type-qualifier-match { type string; description "An XML Schema regular expression that is used to match the alarm type qualifier."; reference "XML Schema Part 2: Datatypes Second Edition, World Wide Web Consortium Recommendation REC-xmlschema-2-20041028"; } leaf resource { type resource-match; description "Specifies which resources to match."; } leaf description { type string; mandatory true; description "A description of the alarm profile."; } container alarm-severity-assignment-profile { if-feature "severity-assignment"; description "The client can override thesystem defaultsystem-default severity level."; reference"ITU M.3100, ITU M.3160 -"ITU-T Recommendation M.3100: GenericNetwork Information Model, Alarm Severity Assignment Profile";network information model ITU-T Recommendation M.3160: Generic, protocol-neutral management information model"; leaf-listseverity-levelsseverity-level { type severity; ordered-by user; description "Specifies the configured severity level(s) for the matching alarm. If the alarm has several severitylevelslevels, the leaf-list shall be given in rising severity order. The original M3100/M3160 ASAP function only allows for a one-to-one mapping between alarm type andseverityseverity, but sincethe IETF alarmYANG module supports statefulalarmsalarms, the mapping must allow for several severity levels. Assume a high-utilization alarm type with two thresholds with thesystem defaultsystem-default severity levels of threshold1 = warning and threshold2 = minor. Setting this leaf-list to (minor, major) will assign the severity levels as threshold1 = minor and threshold2 = major"; } } } } /* * Notifications */ notification alarm-notification { description "This notification is used to report a state change for an alarm. The same notification is used for reporting a newly raised alarm, a clearedalarmalarm, or changing the text and/or severity of an existing alarm."; uses common-alarm-parameters; uses alarm-state-change-parameters; } notification alarm-inventory-changed { description "This notification is used to report that the list of possible alarms has changed. This can happenwhenwhen, forexample ifexample, a new software module isinstalled,installed or a new physical card is inserted."; } } <CODE ENDS> 7. The X.733ExtensionsMapping Module Many alarm systems are based on theX.733, [X.733],X.733 [X.733] and X.736 [X.736] alarm standards. This module "ietf-alarms-x733" augments the alarm inventory, the alarmlistslists, and the alarm notification with X.733 and X.736 parameters. The module also supports a feature whereby the alarm manager can configure the mapping from alarm types to X.733 "event-type" and "probable-cause" parameters. This might be needed when the default mapping provided by the system is in conflict with other management systems or not considered correct. Note that theIETF Alarm Moduleterm "resource" in this document is synonymous to the ITU term "managed object".8. The X.733 Mapping ModuleThis YANG module references [RFC6991], [X.721],[X.733][X.733], and [X.736]. <CODE BEGINS> file"ietf-alarms-x733@2019-03-21.yang""ietf-alarms-x733@2019-07-16.yang" module ietf-alarms-x733 { yang-version 1.1; namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733"; prefix x733; import ietf-alarms { prefix al; } import ietf-yang-types { prefix yang; reference "RFC 6991: Common YANG Data Types"; } organization "IETF CCAMP Working Group"; contact "WG Web:<http://tools.ietf.org/wg/ccamp><https://trac.ietf.org/trac/ccamp> WG List: <mailto:ccamp@ietf.org> Editor: Stefan Vallin <mailto:stefan@wallan.se> Editor: Martin Bjorklund <mailto:mbj@tail-f.com>"; description "This module augments the ietf-alarms module with X.733 alarm parameters. The following structures are augmented with the X.733 event type and probable cause: 1) alarms/alarm-inventory: all possible alarm types 2) alarms/alarm-list: every alarm in the system 3) alarm-notification: notifications indicatingalarm statealarm-state changes 4) alarms/shelved-alarms The module also optionally allows thealarm managementalarm-management system to configure the mapping from theIETF Alarm moduleietf-alarms' alarm keys to the ITU tuple (event-type, probable-cause). The mapping does not include a correspondingX.733 specificproblemvalue.value specific to X.733. The recommendation is to use the 'alarm-type-qualifier'leafleaf, which serves the same purpose. The module uses an integer and a corresponding string for probable cause instead of a globally defined enumeration, in order to be able to manage conflicting enumeration definitions. A single globally defined enumeration is challenging to maintain. The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 'MAY', and 'OPTIONAL' in this document are to be interpreted as described in BCP 14 (RFC 2119) (RFC 8174) when, and only when, they appear in all capitals, as shown here. Copyright (c) 2019 IETF Trust and the persons identified as authors of the code. All rights reserved. Redistribution and use in source and binary forms, with or without modification, is permitted pursuant to, and subject to the license terms contained in, the Simplified BSD License set forth in Section 4.c of the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info). This version of this YANG module is part of RFCXXXX (https://tools.ietf.org/html/rfcXXXX);8632; see the RFC itself for full legal notices."; reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting Function"; revision2019-03-212019-07-16 { description "Initial revision."; reference "RFCXXXX:8632: A YANG Data Model for AlarmModule";Management"; } /* * Features */ feature configure-x733-mapping { description "The system supports configurable X733 mapping from theIETF alarm moduleietf-alarms' alarm-type to X733 event-type and probable-cause."; } /* * Typedefs */ typedef event-type { type enumeration { enum other { value 1; description "None of the below."; } enum communications-alarm { value 2; description "An alarm of this type is principally associated with the procedures and/or processes required to convey information from one point to another."; } enum quality-of-service-alarm { value 3; description "An alarm of this type is principally associated with a degradation in the quality of a service."; } enum processing-error-alarm { value 4; description "An alarm of this type is principally associated with a software or processing fault."; } enum equipment-alarm { value 5; description "An alarm of this type is principally associated with an equipment fault."; } enum environmental-alarm { value 6; description "An alarm of this type is principally associated with a condition relating to an enclosure in which the equipment resides."; } enum integrity-violation { value 7; description "An indication that information may have been illegally modified,insertedinserted, or deleted."; } enum operational-violation { value 8; description "An indication that the provision of the requested service was not possible due to the unavailability,malfunctionmalfunction, or incorrect invocation of the service."; } enum physical-violation { value 9; description "An indication that a physical resource has been violated in a way that suggests a security attack."; } enum security-service-or-mechanism-violation { value 10; description "An indication that a security attack has been detected by a security service or mechanism."; } enum time-domain-violation { value 11; description "An indication that an event has occurred at an unexpected or prohibited time."; } } description "The event types as defined by X.733 and X.736."; reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting FunctionITUITU-T Recommendation X.736: Information Technology - Open Systems Interconnection - System Management: Security Alarm Reporting Function"; } typedef trend { type enumeration { enum less-severe { description "There is at least one outstanding alarm of a severity higher (more severe) than that in the current alarm."; } enum no-change { description "The Perceived severity reported in the current alarm is the same as the highest (most severe) of any of the outstanding alarms"; } enum more-severe { description "The Perceived severity in the current alarm is higher (more severe) than that reported in any of the outstanding alarms."; } } description "This type is used to describe the severity trend of the alarmingresource";resource."; reference"ITU"ITU-T Recommendation X.721: Information Technology - Open Systems Interconnection - Structure of management information: Definition of management information Module Attribute-ASN1Module"; } typedef value-type { type union { type int64; type uint64; type decimal64 { fraction-digits 2; } } description "A generic union type to match the ITU choice of integer and real."; } /* * Groupings */ grouping x733-alarm-parameters { description "Common X.733 parameters for alarms."; leaf event-type { type event-type; description "The X.733/X.736 event type for this alarm."; } leaf probable-cause { type uint32; description "The X.733 probable cause for this alarm."; } leaf probable-cause-string { type string; description "Theuser friendlyuser-friendly string matching the probable cause integer value. The string SHOULD match the X.733 enumeration. For example, value 27 is 'localNodeTransmissionError'."; } container threshold-information { description "This parameter shall be present when the alarm is a result of crossing a threshold. "; leaf triggered-threshold { type string; description "The identifier of the threshold attribute that caused the notification."; } leaf observed-value { type value-type; description "The value of the gauge or counterwhichthat crossed the threshold. This may be different from the threshold value if, for example, the gauge may only take on discrete values."; } choice threshold-level { description "In the case of agaugegauge, the threshold level specifies a pair of thresholdvalues,values: the firstbeingis the value of the crossedthresholdthreshold, and thesecond,second is its corresponding hysteresis; in the case of acountercounter, the threshold level specifies only the threshold value."; case up { leaf up-high { type value-type; description "Thegoing upgoing-up threshold forrisingraising the alarm."; } leaf up-low { type value-type; description "The going-down thresholdlevelfor clearing the alarm. This is used for hysteresis functions for gauges."; } } case down { leaf down-low { type value-type; description "Thegoing downgoing-down threshold forrisingraising the alarm."; } leaf down-high { type value-type; description "The going-up thresholdlevelfor clearing the alarm. This is used for hysteresis functions for gauges."; } } } leaf arm-time { type yang:date-and-time; description "For a gauge threshold, it's the time at which the threshold was lastre-armed, namelyre-armed; namely, it's the time after the previous threshold crossing at which the hysteresis value of the threshold wasexceededexceeded, thus again permitting the generation of notifications when the threshold is crossed. For a counter threshold, it's the later of the time at which the threshold offset was lastapplied,applied or thetime at which thecounter was last initialized (for resettable counters)."; } } list monitored-attributes { uses attribute; key "id"; description "The Monitored attributes parameter, when present, defines one or more attributes of the resource and their corresponding values at the time of the alarm."; } leaf-list proposed-repair-actions { type string; description "This parameter, when present, is used if the cause is known and the system being managed can suggest one or more solutions (such as switch in standby equipment, retry, and replace media)."; } leaf trend-indication { type trend; description "This parameter specifies the current severity trend of the resource. Ifpresentpresent, it indicates that there are one or more alarms ('outstanding alarms')whichthat have not beencleared,cleared and that pertain to the same resource asthat to whichthis alarm ('current alarm')pertains.does. The possible values are: more-severe: The Perceived severity in the current alarm is higher (more severe) than that reported in any of the outstanding alarms. no-change: The Perceived severity reported in the current alarm is the same as the highest (most severe) of any of the outstanding alarms. less-severe: There is at least one outstanding alarm of a severity higher (more severe) than that in the current alarm."; } leaf backedup-status { type boolean; description "This parameter, when present, specifies whether or not the object emitting the alarm has beenbacked-up, andbacked up; therefore, it is possible to know whether or not services provided to the userhave, therefore, nothave beendisrupted.disrupted when this parameter is included. The use of this field in conjunction with theseverity'perceived-severity' field provides information in an independent form to qualify the seriousness of the alarm and the ability of the system as a whole to continue to provide services. If the value of this parameter is true, it indicates that the object emitting the alarm has beenbacked-up;backed up; if false, the object has not beenbacked-up.";backed up."; } leaf backup-object { type al:resource; description "This parametershallSHALL be present when theBacked-up status'backedup-status' parameter is present and has the valuetrue.'true'. This parameter specifies the managed object instance that is providing back-up services for the managed objectaboutto which the notification pertains. This parameter is useful, for example, when the back-up object is from a pool ofobjectsobjects, any of which may be dynamically allocated to replace a faulty object."; } list additional-information { key "identifier"; description "This parameter allows the inclusion ofa set ofan additional information set in the alarm. It is a series of datastructuresstructures, each of which contains three items of information: an identifier, a significance indicator, and the problem information."; leaf identifier { type string; description "Identifies thedata-typedata type of the information parameter."; } leaf significant { type boolean; description "Set totrue'true' if the receiving system must be able to parse the contents of the information subparameter for the event report to be fully understood."; } leaf information { type string; description "Additional information about the alarm."; } } leaf security-alarm-detector { type al:resource; description "This parameter identifies the detector of the security alarm."; } leaf service-user { type al:resource; description "This parameter identifies the service-user whose request for service led to the generation of the security alarm."; } leaf service-provider { type al:resource; description "This parameter identifies the intended service-provider of the service that led to the generation of the security alarm."; } reference"ITU"ITU-T Recommendation X.733: Information Technology - Open Systems Interconnection - System Management: Alarm Reporting FunctionITUITU-T Recommendation X.736: Information Technology - Open Systems Interconnection - System Management: Security Alarm Reporting Function"; } grouping x733-alarm-definition-parameters { description "Common X.733 parameters for alarm definitions. This grouping is used to define those alarm attributes that can be mapped from the alarm-type mechanism in theietf-alarmietf-alarms module."; leaf event-type { type event-type; description "The alarm type has this X.733/X.736 event type."; } leaf probable-cause { type uint32; description "The alarm type has this X.733 probable cause value. This module defines probable cause as an integer and not as an enumeration. The reason being that the primary use of probable cause is in the management application if it is based on the X.733 standard. However, most management applications have their own defined enum definitions and merging enums from different systems might create conflicts. By using a configurableuint32uint32, the system can be configured to match the enum values in the management application."; } leaf probable-cause-string { type string; description "This string can be used to give auser friendlyuser-friendly string to the probable cause value."; } } grouping attribute { description "A grouping to match the ITU generic reference to an attribute."; leaf id { type al:resource; description "The resource representing the attribute."; } leaf value { type string; description "The value represented as a string since it could be of any type."; } reference"ITU"ITU-T Recommendation X.721: Information Technology - Open Systems Interconnection - Structure of management information: Definition of management information Module Attribute-ASN1Module"; } /* * Add X.733 parameters to the alarm definitions, alarms, * and notification. */ augment "/al:alarms/al:alarm-inventory/al:alarm-type" { description "Augment X.733 mapping information to the alarm inventory."; uses x733-alarm-definition-parameters; } /* * Add X.733 configurable mapping. */ augment "/al:alarms/al:control" { description "Add X.733 mapping capabilities. "; list x733-mapping { if-feature "configure-x733-mapping"; key "alarm-type-id alarm-type-qualifier-match"; description "This list allows a management application to control the X.733 mapping for all alarm types in the system. Any entry in this list will allow the alarm manager toover-rideoverride the default X.733 mapping in thesystemsystem, and the final mapping will be shown in the alarm inventory."; leaf alarm-type-id { type al:alarm-type-id; description "Map the alarm type with this alarm type identifier."; } leaf alarm-type-qualifier-match { type string; description "A W3C regular expression that is used when mapping an alarm type and alarm-type-qualifier to X.733 parameters."; } uses x733-alarm-definition-parameters; } } augment "/al:alarms/al:alarm-list/al:alarm" { description "Augment X.733 information to the alarm."; uses x733-alarm-parameters; } augment "/al:alarms/al:shelved-alarms/al:shelved-alarm" { description "Augment X.733 information to the alarm."; uses x733-alarm-parameters; } augment "/al:alarm-notification" { description "Augment X.733 information to the alarm notification."; uses x733-alarm-parameters; } } <CODE ENDS>9.8. IANA Considerations This document registers two URIs in theIETF"IETF XMLregistryRegistry" [RFC3688]. Following the format in RFC 3688, the following registrationsare requested to behave been made. URI: urn:ietf:params:xml:ns:yang:ietf-alarms Registrant Contact: The IESG. XML:N/A,N/A; the requested URI is an XML namespace. URI: urn:ietf:params:xml:ns:yang:ietf-alarms-x733 Registrant Contact: The IESG. XML:N/A,N/A; the requested URI is an XML namespace. This document registers two YANG modules in theYANG"YANG ModuleNamesNames" registry [RFC6020]. name: ietf-alarms namespace: urn:ietf:params:xml:ns:yang:ietf-alarms prefix: al reference: RFCXXXX8632 name: ietf-alarms-x733 namespace: urn:ietf:params:xml:ns:yang:ietf-alarms-x733 prefix: x733 reference: RFCXXXX 10.8632 9. Security Considerations The YANGmodulemodules specified in this documentdefinesdefine a schema for data that is designed to be accessed via network management protocols such as NETCONF [RFC6241] or RESTCONF [RFC8040]. The lowest NETCONF layer is the secure transport layer, and the mandatory-to-implement secure transport is Secure Shell (SSH) [RFC6242]. The lowest RESTCONF layer is HTTPS, and the mandatory-to-implement secure transport is TLS [RFC8446]. The Network Configuration Access ControlmodelModel (NACM) [RFC8341] provides the means to restrict access for particular NETCONF or RESTCONF users to a preconfigured subset of all available NETCONF or RESTCONF protocol operations and content. The list of alarms itself may be potentially sensitive from a security perspective, in that it potentially gives an attacker an authoritative picture of the (broken) state of the network. There are a number of data nodes defined inthisthe YANGmodulemodules that are writable/creatable/deletable (i.e., config true, which is the default). These data nodes may be considered sensitive or vulnerable in some network environments. Write operations (e.g., edit-config) to these data nodes without proper protection can have a negative effect on network operations. These are the subtrees and data nodes in the "ietf-alarms" module and their sensitivity/vulnerability:/alarms/control/notify-status-changes:"/alarms/control/notify-status-changes": This leaf controls whether an alarm should notify based on various state changes. Unauthorized access to this leaf could have a negative impact on operational procedures relying on fine-grainedalarm statealarm-state changereporting /alarms/control/alarm-shelving/shelf:reporting. "/alarms/control/alarm-shelving/shelf": This list controls the shelving (blocking) of alarms. Unauthorized access to this list could jeopardize thealarm management proceduresalarm-management procedures, since these alarms will not be notifiedand notor be part of the alarm list./alarms/control/alarm-profile/alarm-severity-assignment-profile:"/alarms/control/alarm-profile/alarm-severity-assignment-profile": This list controls the severity levels of an alarm. Unauthorized access to thiscouldcould, forexampleexample, downgrade the severity of an alarm and thereby have a negative impact on thealarm monitoringalarm-monitoring process. Some of the RPC operations in this YANG module may be considered sensitive or vulnerable in some network environments. It is thus important to control access to these operations. These are the operations and their sensitivity/vulnerability:/alarms/alarm-list/purge-alarms:"/alarms/alarm-list/purge-alarms": This action deletes alarms from the alarm list. Unauthorized use of this action could jeopardize thealarm managementalarm-management procedures since the deleted alarms may be vital for thealarm managementalarm-management application./alarms/alarm-list/alarm/set-operator-state:"/alarms/alarm-list/alarm/set-operator-state": This action can be used by the operator to indicate the level of human intervention on an alarm. Unauthorized use of this action could result in alarms being ignored by operators.11. Acknowledgements The authors wish to thank Viktor Leijon and Johan Nordlander for their valuable input on forming the alarm model. The authors also wish to thank Nick Hancock, Joey Boyd, Tom Petch and Balazs Lengyel for their extensive reviews and contributions to this document. 12.10. References12.1.10.1. Normative References [M.3100] InternationalTelecommunicationsTelecommunication Union, "GenericNetwork Information Model",network information model", ITU-T Recommendation M.3100,2005.April 2005, <https://www.itu.int/rec/T-REC-M.3100-200504-I/en>. [M.3160] InternationalTelecommunicationsTelecommunication Union, "Generic,protocol-neutralprotocol- neutral management information model", ITU-T Recommendation M.3100,2008.November 2008, <https://www.itu.int/rec/T-REC-M.3160-200811-I>. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>. [RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, DOI 10.17487/RFC3688, January 2004, <https://www.rfc-editor.org/info/rfc3688>. [RFC6020] Bjorklund, M., Ed., "YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)", RFC 6020, DOI 10.17487/RFC6020, October 2010, <https://www.rfc-editor.org/info/rfc6020>. [RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, <https://www.rfc-editor.org/info/rfc6241>. [RFC6242] Wasserman, M., "Using the NETCONF Protocol over Secure Shell (SSH)", RFC 6242, DOI 10.17487/RFC6242, June 2011, <https://www.rfc-editor.org/info/rfc6242>. [RFC6991] Schoenwaelder, J., Ed., "Common YANG Data Types", RFC 6991, DOI 10.17487/RFC6991, July 2013, <https://www.rfc-editor.org/info/rfc6991>. [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", RFC 7950, DOI 10.17487/RFC7950, August 2016, <https://www.rfc-editor.org/info/rfc7950>. [RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, <https://www.rfc-editor.org/info/rfc8040>. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>. [RFC8341] Bierman, A. and M. Bjorklund, "Network Configuration Access Control Model", STD 91, RFC 8341, DOI 10.17487/RFC8341, March 2018, <https://www.rfc-editor.org/info/rfc8341>. [RFC8348] Bierman, A., Bjorklund, M., Dong, J., and D. Romascanu, "A YANG Data Model for Hardware Management", RFC 8348, DOI 10.17487/RFC8348, March 2018, <https://www.rfc-editor.org/info/rfc8348>. [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/info/rfc8446>. [X.721] InternationalTelecommunicationsTelecommunication Union, "InformationTechnologytechnology - Open Systems Interconnection - Structure of management information: Definition of management information", ITU-T Recommendation X.721,1992.February 1992, <https://www.itu.int/rec/T-REC-X.721-199202-I/en>. [X.733] InternationalTelecommunicationsTelecommunication Union, "InformationTechnologytechnology - Open Systems Interconnection - Systems Management: AlarmReporting Function",reporting function", ITU-T Recommendation X.733,1992.February 1992, <https://www.itu.int/rec/T-REC-X.733-199202-I/en>. [XSD-TYPES] Malhotra, A. and P. Biron, "XML Schema Part 2: Datatypes Second Edition", World Wide Web Consortium Recommendation REC-xmlschema-2-20041028, October 2004, <http://www.w3.org/TR/2004/REC-xmlschema-2-20041028>.12.2.10.2. Informative References [ALARMIRP] 3GPP, "Telecommunication management; Fault Management; Part 2: Alarm Integration Reference Point (IRP): Information Service (IS)", 3GPP TS32.111-2 3.4.0,32.111-2, March2005.2005, <http://www.3gpp.org/ftp/Specs/html-info/32111-2.htm>. [ALARMSEM] Wallin, S., Leijon, V., Nordlander, J., and N. Bystedt, "The semantics of alarm definitions: enabling systematic reasoning aboutalarms.alarms", International Journal of Network Management, Volume 22, Issue 3,John Wiley and Sons, Ltd, http://dx.doi.org/10.1002/nem.800", March 2012.May 2012, <http://dx.doi.org/10.1002/nem.800>. [EEMUA] "Alarm systems: a guide to design, management and procurement", EEMUA Publication No.191191, Engineering Equipment and Materials Users Association,London, 2 edition., "Alarm Systems: A Guide to Design, Management and Procurement.",Second Edition, 2007. [G.7710]ITU-T,International Telecommunication Union, "SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL SYSTEMS AND NETWORKS - Data over Transport - Generic aspects - Transport network controlaspects.aspects; Common equipment management function requirements", ITU-T Recommendation G.7710/Y.1701, Amendment 1, November 2012.[I-D.ietf-netmod-yang-instance-file-format] Lengyel, B. and B. Claise, "YANG Instance Data File Format", draft-ietf-netmod-yang-instance-file-format-02 (work in progress), February 2019.[ISA182] International Society ofAutomation,ISA, "ANSI/ISA- 18.2-2009 ManagementAutomation, "Management of Alarm Systems for the Process Industries",2009.ANSI/ISA - 18.2-2016, March 2016. [RFC3877] Chisholm, S. and D. Romascanu, "Alarm Management Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877, September 2004, <https://www.rfc-editor.org/info/rfc3877>. [RFC8340] Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams", BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018, <https://www.rfc-editor.org/info/rfc8340>. [X.736] InternationalTelecommunicationsTelecommunication Union, "InformationTechnologytechnology - Open Systems Interconnection - Systems Management: Security alarm reporting function", ITU-T Recommendation X.736,1992.January 1992, <https://www.itu.int/rec/T-REC-X.736-199201-I/en>. [YANG-INSTANCE] Lengyel, B. and B. Claise, "YANG Instance Data File Format", Work in Progress, draft-ietf-netmod-yang- instance-file-format-02, February 2019. Appendix A.Vendor-specificVendor-Specific Alarm Types Example This example shows how to define alarm types in a vendor-specific module. In thiscasecase, the vendor "xyz" has chosen to definetoptop- level identities according to X.733 event types. module example-xyz-alarms { namespace "urn:example:xyz-alarms"; prefix xyz-al; import ietf-alarms { prefix al; } identity xyz-alarms { base al:alarm-type-id; } identity communications-alarm { base xyz-alarms; } identity quality-of-service-alarm { base xyz-alarms; } identity processing-error-alarm { base xyz-alarms; } identity equipment-alarm { base xyz-alarms; } identity environmental-alarm { base xyz-alarms; } // communications alarms identity link-alarm { base communications-alarm; } // QoS alarms identity high-jitter-alarm { base quality-of-service-alarm; } } Appendix B. Alarm Inventory Example This shows an alarminventory, it showsinventory: one alarm type is defined only with theidentifier,identifier and another is dynamically configured. In the lattercasecase, a digital input has been connected to asmoke-detector, thereforesmoke detector; therefore, the "alarm-type-qualifier" is set to "smoke-detector" and the "alarm-type-id" to "environmental-alarm". <alarms xmlns="urn:ietf:params:xml:ns:yang:ietf-alarms" xmlns:xyz-al="urn:example:xyz-alarms" xmlns:dev="urn:example:device"> <alarm-inventory> <alarm-type> <alarm-type-id>xyz-al:link-alarm</alarm-type-id> <alarm-type-qualifier/> <resource> /dev:interfaces/dev:interface </resource> <will-clear>true</will-clear> <description> Linkfailure,failure; operational state down but admin state up </description> </alarm-type> <alarm-type> <alarm-type-id>xyz-al:environmental-alarm</alarm-type-id> <alarm-type-qualifier>smoke-alarm</alarm-type-qualifier> <will-clear>true</will-clear> <description> Connected smoke detector to digital input </description> </alarm-type> </alarm-inventory> </alarms> Appendix C. Alarm List Example In thisexampleexample, we show an alarm that has toggled [major, clear, major]. An operator has acknowledged the alarm. <alarms xmlns="urn:ietf:params:xml:ns:yang:ietf-alarms" xmlns:xyz-al="urn:example:xyz-alarms" xmlns:dev="urn:example:device"> <alarm-list> <number-of-alarms>1</number-of-alarms> <last-changed>2018-04-08T08:39:50.00Z</last-changed> <alarm> <resource> /dev:interfaces/dev:interface[name='FastEthernet1/0'] </resource> <alarm-type-id>xyz-al:link-alarm</alarm-type-id> <alarm-type-qualifier></alarm-type-qualifier> <time-created>2018-04-08T08:20:10.00Z</time-created> <is-cleared>false</is-cleared> <alt-resource>1.3.6.1.2.1.2.2.1.1.17</alt-resource> <last-raised>2018-04-08T08:39:40.00Z</last-raised> <last-changed>2018-04-08T08:39:50.00Z</last-changed> <perceived-severity>major</perceived-severity> <alarm-text> Link operationally down but administratively up </alarm-text> <status-change> <time>2018-04-08T08:39:40.00Z</time> <perceived-severity>major</perceived-severity> <alarm-text> Link operationally down but administratively up </alarm-text> </status-change> <status-change> <time>2018-04-08T08:30:00.00Z</time> <perceived-severity>cleared</perceived-severity> <alarm-text> Link operationally up and administratively up </alarm-text> </status-change> <status-change> <time>2018-04-08T08:20:10.00Z</time> <perceived-severity>major</perceived-severity> <alarm-text> Link operationally down but administratively up </alarm-text> </status-change> <operator-state-change> <time>2018-04-08T08:39:50.00Z</time> <state>ack</state> <operator>joe</operator> <text>Will investigate, ticket TR764999</text> </operator-state-change> </alarm> </alarm-list> </alarms> Appendix D. Alarm Shelving Example This example shows how toshelfshelve alarms. Weshelfshelve alarms related to thesmoke-detectorssmoke detectors, since they are being installed and tested. We alsoshelfshelve all alarms from FastEthernet1/0. <alarms xmlns="urn:ietf:params:xml:ns:yang:ietf-alarms" xmlns:xyz-al="urn:example:xyz-alarms" xmlns:dev="urn:example:device"> <control> <alarm-shelving> <shelf> <name>FE10</name> <resource> /dev:interfaces/dev:interface[name='FastEthernet1/0'] </resource> </shelf> <shelf> <name>detectortest</name> <alarm-type> <alarm-type-id> xyz-al:environmental-alarm </alarm-type-id> <alarm-type-qualifier-match> smoke-alarm </alarm-type-qualifier-match> </alarm-type> </shelf> </alarm-shelving> </control> </alarms> Appendix E. X.733 Mapping Example This example shows how to map a dynamic alarm type (alarm-type- id=environmental-alarm, alarm-type-qualifier=smoke-alarm) to the corresponding X.733 "event-type" and "probable-cause" parameters. <alarms xmlns="urn:ietf:params:xml:ns:yang:ietf-alarms" xmlns:xyz-al="urn:example:xyz-alarms"> <control> <x733-mapping xmlns="urn:ietf:params:xml:ns:yang:ietf-alarms-x733"> <alarm-type-id>xyz-al:environmental-alarm</alarm-type-id> <alarm-type-qualifier-match> smoke-alarm </alarm-type-qualifier-match> <event-type>quality-of-service-alarm</event-type> <probable-cause>777</probable-cause> </x733-mapping> </control> </alarms> Appendix F. Relationship toother alarm standardsOther Alarm Standards This section briefly describes how this alarmmoduledata model relates to other relevant standards. F.1.Alarm definitionDefinition of "Alarm" The table below summarizes relevant definitions of the term "alarm" in other alarm standards. +------------+---------------------------+--------------------------+ | Standard | Definition | Comment | +------------+---------------------------+--------------------------+ | X.733 | error: A deviation of a | The X.733 alarm | | [X.733] | system from normal | definition is focused on | | | operation. fault: The | the notification as such | | | physical or algorithmic | and not the state.X.733| | | cause of a malfunction. | X.733 defines an alarmas a| | | Faults manifest | as a deviation fromnormala | | | themselves as errors. |condition,normal condition butwithout| | | alarm: A notification, of | without the requirementthat it| | | the form defined by this | that it needs corrective | | | function, of a specific | actions. | | | event. An alarm may or | | | | may not represent an | | | | error. | | | | | | | G.7710 | Alarms are indications | The G.7710 definition is | | [G.7710] | that are automatically | close to the original | | | generated by a device as | X.733 definition. | | | a result of the | | | | declaration of a failure. | | | | | | | Alarm MIB | Alarm: Persistent | RFC 3877 defines the | | [RFC3877] | indication of a fault. | term alarm as referring | | | Fault: Lasting error or | back to "a deviation | | | warning condition. | from normal operation". | | | Error: A deviation of a | The Alarm YANGmodeldata | | | system from normal | model adds therequirement| | | operation. | requirement that it | | | | should require a | | | | corrective action and | | | | should be undesired, not | | | | only a deviation from | | | | normal. The alarm MIB | | | | is state oriented in the | | | | same way as the Alarm | | | |YANG,YANG module; it focuseson the| | | | on the "lastingcondition",| | | | condition", not the | | | | individual | | | | notifications. | | | | | | ISA | Alarm: An audible and/or | The ISA standard adds an | | [ISA182] | visible means of | important requirement to | | | indicating to the | the "deviation from | | | operator an equipment | normal condition state": | | | malfunction, process | requiring a response. | | |deviationdeviation, or abnormal | | | | condition requiring a | | | | response. | | | | | | | EEMUA | An alarm is an event to | This is the foundation | | [EEMUA] | which an operator must | for the definition of | | | knowinglyreact,respond,react, respond, | alarm in this document. | | | and acknowledge--- not | It focuses on the core | | | simply acknowledge and |criteriacriterion that an action | | | ignore. | is really needed. | | | | | | 3GPP Alarm | 3GPP v15: An alarm | The latest 3GPP Alarm | | IRP | signifies an undesired | IRP version uses | | [ALARMIRP] | condition of a resource | literally the same alarm | | |(e.g.(e.g., device, link) for | definition as this alarm | | | which an operator action |module.data model. It is worth | | | is required. It | noting that earlier | | | emphasizes a key | versions used a | | | requirement that | definition not requiring | | | operators [...] should | an operator action and | | | not be informed about an | themore broadmore-broad | | | undesired condition | definition of deviation | | | unless it requires | from normal condition. | | | operator action.3GPP| The earlier version also | | | 3GPP v12: alarm: abnormal | defined an alarm as a | | | network entity condition, | special case of "event". | | | which categorizes an | | | | event as a fault.fault:| | | | fault: a deviation of asystem| | | | system from normaloperation,| | | | operation, which may | | | | result in the loss of | | | |loss ofoperational capabilities | | | |capabilities[...] | | +------------+---------------------------+--------------------------+ Table 1: Definition ofalarmthe Term "Alarm" instandardsStandards The evolution of the definition of alarm moves from focused on events reporting a deviation from normal operation towards a definition to a undesired *state*whichthat *requires an operator action*. F.2. DatamodelModel This section describes how this YANG alarmmoduledata model relates to other standard data models. Note well that we cover other data models for alarminterfaces. Notinterfaces but not other standards such asSDOSDO- specificalarms for example.alarms. F.2.1. X.733 X.733 has acted as a base for several alarm data models over theyear.years. The YANG alarmmoduledata model differs in the following ways: X.733 models the alarm list as a list of notifications. The YANG alarmmoduledata model defines the alarm list as the current alarm states for the resources, which is generated from the state change reporting notifications. InX.733X.733, an alarm can have the severity levelclear."clear". In the YANG alarmmoduledata model, "clear" is not a severitylevel,level; it is a separate state of the alarm. An alarm can have the followingstatesstates, forexampleexample, (major,cleared),cleared) and (minor, notcleared)cleared). X.733 uses aflatflat, globally defined enumerated "probable-cause" to identify alarm types. This alarmmoduledata model uses a hierarchical YANGidentity,identity: "alarm-type". This enables delegation of alarm types within organizations. It alsoletsenables management to reason about abstract alarm types corresponding to baseidentities,identities; see Section 3.2. The YANG alarmmoduledata model has not included the majority of the X.733 alarm attributes.RatherRather, these are defined in an augmenting module [X.733] if "strict" X.733 compliance is needed. F.2.2.RFC 3877, theThe Alarm MIB (RFC 3877) The MIB in RFC 3877 takes a differentapproach,approach; rather than defining a concrete data model for alarms, it defines a model to map existingSNMP managedSNMP-managed objects and notifications into alarm states and alarm notifications. This was necessary since MIBs were already defined with both managed objects and notifications indicating alarms, forexample linkUpexample, "linkUp" andlinkDown"linkDown" notifications in combination withifAdminState"ifAdminState" andifOperState. So"ifOperState". So, RFC 3877can notcannot really be compared to the alarm YANG module in that sense. The Alarm MIB maps existing MIB definitions into alarms,alarmModelTable.such as "alarmModelTable". The upside of that is thataan SNMP Managercancan, atruntimeruntime, read the possible alarm types. This corresponds to thealarmInventory"alarmInventory" in the alarm YANG module. F.2.3. 3GPP Alarm IRP The 3GPP Alarm IRP is an evolution of X.733. Main differences between the alarm YANG module and 3GPPare:are as follows: 3GPP keeps the majority of the X.733 attributes, but the alarm YANG module does not. 3GPP introduced overlapping and possibly conflicting keys for alarms,alarmIdalarmId, and (managed object, event type, probable cause, specific problem). (See Example 3 in Annex Cin [X.733] Example 3).of [ALARMIRP]). In the YANG alarmmoduledata model, the key for identifying an alarm instance is clearly defined by ("resource", "alarm-type-id","alarm-type- qualifier")."alarm-type-qualifier"). See also Section 3.4 for more information. The alarm YANG module clearly separates the resource/ instrumentationlife cyclelifecycle from the operatorlife cycle.lifecycle. 3GPP allows operators to set the alarm severity toclear,clear; this is not allowed by thismodule, rathermodule. Rather, an operator closes analarmalarm, which does not affect the severity. F.2.4. G.7710 G.7710 is different than thepreviouspreviously referenced alarm standards. It does not define a data model for alarm reporting. It defines common equipment management function requirements including alarm instrumentation. The scope is transport networks. The requirements in G.7710correspondscorrespond to features in the alarm YANG module in the following way: Alarm Severity Assignment Profile (ASAP): the alarm profile "/alarms/alarm-profile/". Alarm Reporting Control (ARC): alarm shelving "/alarms/control/ alarm-shelving/" and the ability to control alarm notifications "/alarms/control/notify-status-changes". Alarm shelving corresponds to the use case of turning off alarm reporting for a specific resource, which is the NALM (No ALarM) state in M.3100. Appendix G.Alarm UsabilityAlarm-Usability Requirements This section defines usability requirements for alarms. Alarm usability is important for an alarm interface. A data model will help in defining theformatformat, but if the actual alarms are of lowvaluevalue, we have not gained the goal of alarm management. Common alarm problems andthe cause of the problemstheir causes are summarized in Table 2. This summary is adopted to networking based on the ISA [ISA182] andEEMUAEngineering Equipment Materials Users Association (EEMUA) [EEMUA] standards.+------------------+--------------------------------+---------------++-----------------+--------------------------------+----------------+ | Problem | Cause | How this | | | | module | | | |addressaddresses the | | | | cause |+------------------+--------------------------------+---------------++-----------------+--------------------------------+----------------+ | Alarms are | "Nuisance" alarms (chattering | Strict | |generatedgenerated, but | alarms and fleeting alarms), | definition of | | they areignored| faulty hardware, redundant | alarms | | ignored by theoperator.| alarms, cascading alarms, | requiring | | operator. | incorrect alarm settings, and | corrective | | | alarms that have not been | response. See | | |rationalized,rationalized; the alarms |Alarmalarm | | | represent log information | requirements | | | rather than true alarms. | in Table 3. | | | | | | When alarms | Insufficientalarm responsealarm-response | The alarm | | occur,operators| procedures and notwellwell- | inventory | | operators donot know how| defined alarm types. | lists all | | not know how torespond.| | alarm types | | respond. | | and| | | |corrective | | | | actions. See | | | |Alarmalarm | | | | requirements | | | | in Table 3. | | | | | | The alarm | Nuisance alarms, stale alarms, | The alarm | | display is full | and alarms from equipment notin| definition and | | of alarms, even | in service. |andalarm | | when there is | | shelving. | | nothing wrong. | | | | | | | | During a | Incorrect prioritization of | State-based | | failure, | alarms. Not using advanced | alarmmodel,model | | operators are | alarm techniques(e.g.(e.g., state- |alarm rateand alarm-rate | | flooded with so | based alarming). |requirementsrequirements; | | many alarmsthat| |in Tablesee Tables 4 | | that they donot know| | andTable 55, | | not know whichones are| | respectively. | | ones are themost| | | | most important. | | |+------------------+--------------------------------+---------------++-----------------+--------------------------------+----------------+ Table 2: Alarm Problems and Causes Based upon the aboveproblemsproblems, EEMUA gives the following definition of a good alarm: +----------------+--------------------------------------------------+ | Characteristic | Explanation | +----------------+--------------------------------------------------+ | Relevant | Not spurious or of low operational value. | | | | | Unique | Not duplicating another alarm. | | | | | Timely | Not long before any response is needed or too | | | late to do anything. | | | | | Prioritized | Indicating the importance that the operator | | | deals with the problem. | | | | | Understandable | Having a messagewhichthat is clear and easy to | | | understand. | | | | | Diagnostic | Identifying the problem that has occurred. | | | | | Advisory | Indicative of the action to be taken. | | | | | Focusing | Drawing attention to the most important issues. | +----------------+--------------------------------------------------+ Table 3: Definition of a Good Alarm Vendors SHOULD rationalize all alarms according to the table above. Another crucial requirement is acceptable alarm notification rates. Vendors SHOULD make sure that they do not exceed the recommendations from EEMUA below: +-----------------------------------+-------------------------------+ |Long TermLong-Term Alarm Rate in Steady | Acceptability | | Operation | | +-----------------------------------+-------------------------------+ | More than one per minute | Very likely to be | | | unacceptable. | | | | | One per 2 minutes | Likely to beover-demanding.overdemanding. | | | | | One per 5 minutes | Manageable. | | | | | Less than one per 10 minutes | Very likely to be acceptable. | +-----------------------------------+-------------------------------+ Table 4: Acceptable AlarmRates,Rates -- Steady State +----------------------------+--------------------------------------+ | Number of alarms displayed | Acceptability | | in 10 minutes following a | | | major network problem | | +----------------------------+--------------------------------------+ | More than 100 | Definitely excessive and very likely | | | to lead to the operatorto abandonabandoning | | | the use of the alarm system. | | | | | 20-100 | Hard to cope with. | | | | | Under 10 | Should bemanageable -manageable, but it may be | | | difficult if several of the alarms | | | require a complex operator response. | +----------------------------+--------------------------------------+ Table 5: Acceptable AlarmRates,Rates -- Burst The numbers inTableTables 4 andTable5 are the sum of all alarms for a network being managed from one alarm console. So every individual system orNMSNetwork Management System (NMS) contributes to these numbers. Vendors SHOULD make sure that the following rules are used in designing the alarm interface: 1. Rationalize the alarms in the system to ensure that every alarm is necessary, has a purpose, and follows the cardinal rule-that it requires an operator response. Adheres to the rules of Table33. 2. Audit the quality of the alarms. Talk with the operators about how well the alarm informationsupportsupports them. Do they know what to do in the event of an alarm? Are they able to quickly diagnose the problem and determine the corrective action? Does the alarm text adhere to the requirements in Table 3? 3. Analyze and benchmark the performance of the system and compare it to the recommended metrics inTableTables 4 andTable5. Start by identifying nuisance alarms, as well as standing alarms at normal state and startup. Acknowledgements The authors wish to thank Viktor Leijon and Johan Nordlander for their valuable input on forming the alarm model. The authors also wish to thank Nick Hancock, Joey Boyd, Tom Petch, and Balazs Lengyel for their extensive reviews and contributions to this document. Authors' Addresses Stefan Vallin Stefan Vallin AB Email: stefan@wallan.se Martin Bjorklund Cisco Email: mbj@tail-f.com