Internet Engineering Task Force (IETF)                        E. Osborne
Request for Comments: 7324                                     July 2014
Updates: 6378
Category: Standards Track
ISSN: 2070-1721


          Updates to MPLS Transport Profile Linear Protection

Abstract

   This document contains a number of updates to the Protection State
   Coordination (PSC) logic defined in RFC 6378, "MPLS Transport
   Profile (MPLS-TP) Linear Protection". These updates provide some
   rules and recommendations around the use of TLVs in PSC, address
   some issues raised in an ITU-T liaison statement, and clarify PSC's
   behavior in a case not well explained in RFC 6378.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF). It represents the consensus of the IETF community. It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG). Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7324.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Message Formatting and Error Handling
     2.1.  PSC TLV Format
     2.2.  Error Handling
       2.2.1.  Malformed Messages
       2.2.2.  Well-Formed but Unknown or Unexpected TLV
   3.  Incorrect Local Status after Failure
   4.  Handling a Capabilities Mismatch
     4.1.  Protection Type Mismatch
     4.2.  R Mismatch
     4.3.  Unsupported Modes
   5.  Reversion Deadlock Due to a Race Condition
   6.  Clarifying PSC's Behavior in the Face of Multiple Inputs
   7.  Security Considerations
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address

1.  Introduction

   This document contains a number of updates to PSC [RFC6378]. One
   provides some rules and recommendations around the use of TLVs in
   PSC. Three of the updates address issues #2, #7, and #8 as
   identified in the ITU's liaison statement "Recommendation ITU-T
   G.8131/Y.1382 revision - Linear protection switching for MPLS-TP
   networks" [LIAISON]. Another clears up a behavior that was not well
   explained in RFC 6378. These updates are not changes to the
   protocol's packet format or to PSC's design; they are corrections
   and clarifications to specific aspects of the protocol's procedures.
   This document does not introduce backward compatibility issues with
   implementations of RFC 6378.

   It should be noted that [RFC7271] contains protocol mechanisms for
   an alternate mode of operating MPLS-TP PSC. Those modes are built on
   the message structures and procedures of [RFC6378], and so, while
   this document does not update [RFC7271], it has an impact on that
   work through its update to [RFC6378].

   This document assumes familiarity with RFC 6378 and its terms,
   conventions, and acronyms. Any term used in this document but not
   defined herein can be found in RFC 6378. In particular, this
   document shares the acronyms defined in Section 2.1 of RFC 6378.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Message Formatting and Error Handling

   This section covers message formatting as well as some recommended
   error checking.

2.1.  PSC TLV Format

   [RFC6378] provides the capability to carry TLVs in the PSC messages.
   All fields are encoded in network byte order. Each TLV contains
   three fields, as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Type              |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             Value                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Type field (T):

      A two-octet field that encodes a type value. The type values are
      recorded in the IANA registry "MPLS PSC TLV Registry".

   Length field (L):

      A two-octet field that encodes the length in octets of the Value
      field. The value of this field MUST be a multiple of 4.

   Value field (V):

      The payload of the TLV. The length of this field (which is the
      value of the Length field) MUST be a multiple of 4 octets, and so
      this field may contain explicit padding.

   The length of each single TLV is the sum of the lengths of its three
   fields: the length of the Value field + 4. The overall TLV Length
   field in the PSC message contains the total length of all TLVs in
   octets.

2.2.  Error Handling

   It is recommended to implement error and bounds checking to ensure
   that received messages, if improperly formatted, are handled in such
   a way as to minimize the impact of this formatting on the behavior
   of the network and its devices. This section covers two such areas:
   malformed messages and well-formed but unexpected TLVs. This text is
   not intended to limit the error or bounds checking a device
   performs. The recommendations herein should be taken as a starting
   point.

2.2.1.  Malformed Messages

   An implementation SHOULD:

   o  Ensure any fields prior to TLV Length are consistent with RFC
      6378, particularly Section 4.2 of that document.

   o  Ensure the overall length of the message matches the value in the
      TLV Length + 12.

   o  Check that the sum of the lengths of all TLVs matches the value
      in the TLV Length.

   If an implementation receives a message that fails any malformed
   message checks, it MUST drop the message and SHOULD alert the
   operator to the malformed message. The method(s) used to alert the
   operator are outside the scope of this document but may include
   things like syslog or console messages.

2.2.2.  Well-Formed but Unknown or Unexpected TLV

   If a message is deemed to be properly formed, an implementation
   SHOULD check all TLVs to ensure that it knows what to do with them.
   A well-formed but unknown or unexpected TLV value MUST be ignored,
   and the rest of the message processed as if the ignored TLV did not
   exist. An implementation detecting a malformed TLV SHOULD alert the
   operator as described in Section 2.2.1.

3.  Incorrect Local Status after Failure

   Issue #2 in the liaison statement identifies a case where a strict
   reading of RFC 6378 leaves a node reporting an inaccurate status:

   A node can end up sending incorrect status, NR(0,1), despite the
   failure of the protection LSP (P-LSP). This is clearly not correct,
   as a node should not be sending NR if it has a local failure. To
   address this issue, the fourth bullet in Section 4.3.3.3 of RFC 6378
   is replaced with the following three bullets:

   o  If the current state is due to a local or remote Manual Switch, a
      local Signal Fail indication on the protection path SHALL cause
      the LER to enter local Unavailable state and begin transmission
      of an SF(0,0) message.

   o  If the LER is in local Protecting Administrative state due to a
      local Forced Switch, a local Signal Fail indication on the
      protection path SHALL be ignored.

   o  If the LER is in remote Protecting Administrative state due to a
      remote Forced Switch, a local Signal Fail indication on the
      protection path SHALL cause the LER to remain in remote
      Protecting Administrative state and transmit an SF(0,1) message.

4.  Handling a Capabilities Mismatch

   PSC has no explicit facility to negotiate any properties of the
   protection domain. It does, however, have the ability to signal two
   properties of that domain, via the Protection Type (PT) and
   Revertive (R) bits. RFC 6378 specifies that if these bits do not
   match, an operator "SHALL [be notified]" (PT, Section 4.2.3) or
   "SHOULD be notified" (R, Section 4.2.4). However, there is no text
   that specifies the behavior of the end nodes of a protection domain
   in case of a mismatch. This section provides that text, as requested
   by issue #7 in the liaison statement.

4.1.  Protection Type Mismatch

   The behavior of the protection domain depends on the exact
   Protection Type (PT) mismatch. Section 4.2.3 of RFC 6378 specifies
   three protection types: bidirectional switching using a permanent
   bridge, bidirectional switching using a selector bridge, and
   unidirectional switching using a permanent bridge. They are
   abbreviated here as BP, BS, and UP.

   There are three possible mismatches: {BP, UP}, {BP, BS}, and
   {UP, BS}. The priority is:

      UP > BS > BP

   In other words:

   o  If the PT mismatch is {BP, UP}, the node transmitting BP MUST
      switch to UP mode if it is supported.

   o  If the PT mismatch is {BP, BS}, the node transmitting BP MUST
      switch to BS mode if it is supported.

   o  If the PT mismatch is {UP, BS}, the node transmitting BS MUST
      switch to UP mode if it is supported.

   If a node does not support a mode to which it is required to switch,
   then that node MUST behave as in Section 4.3.

4.2.  R Mismatch

   The R bit indicates whether the protection domain is in revertive or
   non-revertive behavior. If the R bits do not match, the node
   indicating non-revertive MUST switch to revertive if it is
   supported. If it is not supported, a node must behave as in
   Section 4.3.

4.3.  Unsupported Modes

   An implementation may not support all three PT modes and/or both R
   modes, and thus a pair of nodes may be unable to converge on a
   common mode. This creates a permanent mismatch, resolvable only by
   operator intervention. An implementation SHOULD alert the operator
   to an irreconcilable mismatch.

   It is desirable to allow the protection domain to function in a
   non-failure mode even if there is a mismatch, as the mismatches of
   PT or R have to do with how nodes recover from a failure. An
   implementation SHOULD allow traffic to be sent on the Working LSP as
   long as there is no failure (e.g., NR state) regardless of any PT or
   R mismatch. If there is a trigger that would cause the protection
   LSP to be used, such as SF or MS, a node MUST NOT use the protection
   LSP to carry traffic.

5.  Reversion Deadlock Due to a Race Condition

   Issue #8 in the liaison statement identifies a deadlock case where
   each node can end up sending NR(0,1) when it should instead be in
   the process of recovering from the failure (i.e., entering into WTR
   or DNR, as appropriate for the protection domain). The root of the
   issue is that a pair of nodes can simultaneously enter WTR state,
   receive an out-of-date SF-W indication, transition into a remotely
   triggered WTR, and remain in remotely triggered WTR waiting for the
   other end to trigger a change in status.

   In the case identified in issue #8, each node can end up sending
   NR(0,1), which is an indication that the transmitting node has no
   local failure, but is instead reacting to the remote SF-W.
   If a node that receives NR(0,1) is in fact not indicating a local
   error, the correct behavior for the receiving node is to take the
   received NR(0,1) as an indication that there is no error in the
   protection domain and to begin recovery procedures (WTR or DNR).

   This is addressed by adding the following text as the penultimate
   bullet in Section 4.3.3.4 of RFC 6378:

   o  If a node is in Protecting Failure state due to a remote SF-W and
      receives NR(0,1), this SHALL cause the node to begin recovery
      procedures. If the LER is configured for revertive behavior, it
      enters into Wait-to-Restore state, starts the WTR timer, and
      begins transmitting WTR(0,1). If the LER is configured for
      non-revertive behavior, it enters into Do-Not-Revert state and
      begins transmitting a DNR(0,1) message.

   Additionally, the penultimate bullet in Section 4.3.3.3 of RFC 6378
   is changed from

   o  A remote NR(0,0) message SHALL be ignored if in local Protecting
      administrative state.

   to

   o  A remote No Request message SHALL be ignored if in local
      Protecting administrative state.

   This indicates that a remote NR triggers the same behavior
   regardless of the value of FPath and Path. This change does not
   directly address issue #8, but it fixes a similar issue: if a node
   receives NR while in Remote administrative state, the values of
   FPath and Path have no bearing on the node's reaction to this NR.

6.  Clarifying PSC's Behavior in the Face of Multiple Inputs

   RFC 6378 describes the PSC state machine. Figure 1 in Section 3 of
   RFC 6378 shows two inputs into the PSC Control logic: Local Request
   logic and Remote PSC Request. When there is only one input into the
   PSC Control logic (a local request or a remote request but not
   both), the PSC Control logic decides what that input signifies and
   then takes one or more actions, as necessary.
   This is what the PSC State Machine in Section 4.3 of RFC 6378
   describes.

   RFC 6378 does not sufficiently describe the behavior in the face of
   multiple inputs into the PSC Control logic (one Local Request and
   one Remote Request). This section clarifies the expected behavior.

   There are two cases to think about when considering dual inputs into
   the PSC Control logic. The first is when the same request is
   presented from both local and remote sources. One example of this
   case is a Forced Switch (FS) configured on both ends of an LSP. This
   will result in the PSC Control logic receiving both a local FS and a
   remote FS. For convenience, this scenario is written as
   [L(FS), R(FS)]; that is, Local(Forced Switch) and Remote(Forced
   Switch).

   The second case, which is handled in exactly the same way as the
   first, is when the two inputs into the PSC Control logic describe
   different events. There are a number of variations on this case. One
   example is when there is a Lockout of Protection from the Local
   Request logic and a Signal Fail on the Working path from the Remote
   PSC Request. This is shortened to [L(LO), R(SF-W)].

   In both cases, the question is not how the PSC Control logic decides
   which of these is the one it acts upon. Section 4.3.2 of RFC 6378
   lists the priority order and prioritizes the local input over the
   remote input in case both inputs are of the same priority. So, in
   the first example it is the local FS that drives the PSC Control
   logic, and in the second example it is the local Lockout that drives
   the PSC Control logic.

   What this section clears up is what happens when the
   highest-priority input goes away. Consider the first case.
   Initially, the PSC Control logic has [L(FS), R(FS)], and L(FS) is
   driving PSC's behavior. When L(FS) is removed, but R(FS) remains,
   what does PSC do?
   A strict reading of the Finite State Machine (FSM) would suggest
   that PSC transition from PA:F:L into N, and at some future time
   (perhaps after the remote request refreshes), PSC would transition
   from N to PA:F:R. This is an unreasonable behavior, as there is no
   sensible justification for a node behaving as if things were normal
   (i.e., N state) when it is clear that they are not.

   The second case is similar. If a node starts with [L(LO), R(SF-W)]
   and the local lockout is removed, a strict reading of the state
   machine would suggest that the node transition from UA:LO:L to N,
   and then at some future time presumably notice the R(SF-W) and
   transition from N to PF:W:R. As with the first case, this is clearly
   not a useful behavior.

   In both cases, the request that was driving PSC's behavior was
   removed. What should happen is that the PSC Control logic should,
   upon removal of an input, immediately reevaluate all other inputs to
   decide on the next course of action. This requires an implementation
   to store the most recent local and remote inputs regardless of their
   eventual use as triggers for the PSC Control logic.

   There is also a third case. Consider a node with [L(FS), R(LO)]. At
   some point in time, the remote node replaces its Lockout request
   with a Signal Fail on Working, so that the inputs into the PSC
   Control logic on the receiving node go to [L(FS), R(SF-W)]. Similar
   to the first two cases, the node should immediately reevaluate both
   its local and remote inputs to determine the highest priority among
   them and act on that input accordingly.
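
   The reevaluation behavior described in the three cases above can be
   sketched roughly as follows. This is an illustrative sketch only,
   not part of the protocol specification: the names Request and
   PscInputs are hypothetical, and the priority table is an assumed
   subset covering just the requests used in this section's examples
   (Section 4.3.2 of RFC 6378 remains the authoritative order). The
   sketch stores the most recent local and remote inputs and
   reevaluates both whenever either one changes or is removed, so that
   there is no intermediate pass through Normal state.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Request(IntEnum):
    """Hypothetical subset of PSC requests; larger value = higher priority.

    Assumption: only the requests used in this section's examples appear
    here. The complete, authoritative order is in Section 4.3.2 of RFC 6378.
    """
    NR = 0    # No Request
    SF_W = 1  # Signal Fail on the working path
    FS = 2    # Forced Switch
    LO = 3    # Lockout of protection


@dataclass
class PscInputs:
    """Stores the most recent local and remote inputs, whether or not
    they are currently driving the PSC Control logic."""
    local: Request = Request.NR
    remote: Request = Request.NR

    def driving(self) -> Tuple[str, Request]:
        """Return the source and value of the highest-priority stored input.
        The local input wins when both have the same priority."""
        if self.local >= self.remote:
            return ("local", self.local)
        return ("remote", self.remote)

    def update(self, local: Optional[Request] = None,
               remote: Optional[Request] = None) -> Tuple[str, Request]:
        """Record a new or removed input, then reevaluate both inputs.
        Removal of a request is modeled as replacing it with NR."""
        if local is not None:
            self.local = local
        if remote is not None:
            self.remote = remote
        return self.driving()


# First case: [L(FS), R(FS)], then the local FS is removed.
psc = PscInputs(local=Request.FS, remote=Request.FS)
print(psc.driving())                 # the local FS drives
print(psc.update(local=Request.NR))  # the stored remote FS takes over at once

# Third case: [L(FS), R(LO)] becomes [L(FS), R(SF-W)].
psc = PscInputs(local=Request.FS, remote=Request.LO)
print(psc.update(remote=Request.SF_W))  # the local FS now drives
```

   The sketch deliberately stops at selecting the driving input. Mapping
   that input to a PSC state (for example PA:F:R or PF:W:R) and to the
   message to transmit is left to the state machine in Section 4.3 of
   RFC 6378.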

   For the third case, this is in fact what RFC 6378 already specifies
   in Section 4.3.3:

      When a LER is in a remote state, i.e., state transition in
      reaction to a PSC message received from the far-end LER, and
      receives a new PSC message from the far-end LER that indicates a
      contradictory state, e.g., in remote Unavailable state receiving
      a remote FS(1,1) message, then the PSC Control logic SHALL
      reevaluate all inputs (both the local input and the remote
      message) as if the LER is in the Normal state.

   This section extends that paragraph to handle the first two cases.
   The essence of the quoted paragraph is that when faced with multiple
   inputs, PSC must reevaluate any changes as if it were in Normal
   state. So, the quoted paragraph is replaced with the following text:

      The PSC Control logic may simultaneously have Local and Remote
      requests, and the highest priority of these requests ultimately
      drives the behavior of the PSC Control logic. When this
      highest-priority request is removed or is replaced with another
      input, then the PSC Control logic SHALL immediately reevaluate
      all inputs (both the local input and the remote message),
      transitioning into a new state only upon reevaluation of all
      inputs.

7.  Security Considerations

   These changes and clarifications raise no new security concerns. RFC
   6941 [RFC6941] provides the baseline security discussion for
   MPLS-TP, and PSC (as described in both RFC 6378 and this document)
   falls under that umbrella. Additionally, Section 2.2 clarifies how
   to react to malformed or unexpected messages.

8.  IANA Considerations

   IANA has marked the value 0 in the "MPLS PSC TLV Registry" as
   "Reserved, not to be allocated" and updated the references to show
   [RFC6378] and this document (RFC 7324). Note that this document
   provides documentation of an action already taken by IANA but not
   recorded in RFC 6378.

9.  Acknowledgements

   The author of this document thanks Taesik Cheung, Alessandro
   D'Alessandro, Annamaria Fulignoli, Sagar Soni, George Swallow, and
   Yaacov Weingarten for their contributions and review, and Adrian
   Farrel for the text of Section 2.

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6378]  Weingarten, Y., Bryant, S., Osborne, E., Sprecher, N.,
              and A. Fulignoli, "MPLS Transport Profile (MPLS-TP)
              Linear Protection", RFC 6378, October 2011.

10.2.  Informative References

   [LIAISON]  ITU-T SG15, "Liaison Statement: Recommendation ITU-T
              G.8131/Y.1382 revision - Linear protection switching for
              MPLS-TP networks",
              <https://datatracker.ietf.org/liaison/1205/>.

   [RFC6941]  Fang, L., Niven-Jenkins, B., Mansfield, S., and R.
              Graveman, "MPLS Transport Profile (MPLS-TP) Security
              Framework", RFC 6941, April 2013.

   [RFC7271]  Ryoo, J., Gray, E., van Helvoort, H., D'Alessandro, A.,
              Cheung, T., and E. Osborne, "MPLS Transport Profile
              (MPLS-TP) Linear Protection to Match the Operational
              Expectations of Synchronous Digital Hierarchy, Optical
              Transport Network, and Ethernet Transport Network
              Operators", RFC 7271, June 2014.

Author's Address

   Eric Osborne

   EMail: eric.osborne@notcom.com