Internet Engineering Task Force (IETF)                   T. Senevirathne
Request for Comments: 6905                                         Cisco
Category: Informational                                          D. Bond
ISSN: 2070-1721                                                      IBM
                                                               S. Aldrin
                                                                   Y. Li
                                                                  Huawei
                                                                R. Watve
                                                                   Cisco
                                                              March 2013


 Requirements for Operations, Administration, and Maintenance (OAM) in
         Transparent Interconnection of Lots of Links (TRILL)

Abstract

   Operations, Administration, and Maintenance (OAM) is a general term
   used to identify functions and toolsets to troubleshoot and monitor
   networks. This document presents OAM requirements applicable to the
   Transparent Interconnection of Lots of Links (TRILL).

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF). It represents the consensus of the IETF community. It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG). Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6905.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Scope
   2. Conventions Used in This Document
   3. Terminology
   4. OAM Requirements
      4.1. Data Plane
      4.2. Connectivity Verification
         4.2.1. Unicast
         4.2.2. Distribution Trees
      4.3. Continuity Check
      4.4. Path Tracing
      4.5. General Requirements
      4.6. Performance Monitoring
         4.6.1. Packet Loss
         4.6.2. Packet Delay
      4.7. ECMP Utilization
      4.8. Security and Operational Considerations
      4.9. Fault Indications
      4.10. Defect Indications
      4.11. Live Traffic Monitoring
   5. Security Considerations
   6. References
      6.1. Normative References
      6.2. Informative References
   7. Acknowledgments
   8. Contributors

1. Introduction

   Operations, Administration, and Maintenance (OAM) generally covers
   various production aspects of a network. In this document, we use
   the term OAM as defined in [RFC6291].

   The success of network operations depends on the ability to
   proactively monitor the network for faults, performance, etc., as
   well as the ability to efficiently and quickly troubleshoot defects
   and failures. A well-defined OAM toolset is a vital requirement for
   wider adoption of Transparent Interconnection of Lots of Links
   (TRILL) as the next generation data-forwarding technology in larger
   networks such as data centers.

   In this document, we define the requirements for TRILL OAM. It is
   assumed that the readers are familiar with the OAM concepts and
   terminologies defined in other OAM standards such as [8021ag] and
   [RFC5860]. This document does not attempt to redefine the terms and
   concepts specified elsewhere.

1.1. Scope

   The scope of this document is OAM between Routing Bridges (RBridges)
   of a TRILL campus over links selected by TRILL routing.

2. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
   Although this document is not a protocol specification, the use of
   this language clarifies the instructions to protocol designers
   producing solutions that satisfy the requirements set out in this
   document.

3. Terminology

   Section: This term refers to a segment of a path between any two
      given RBridges. As an example, consider the case where RB1 is
      connected to RBx via RB2, RB3, and RB4. The segment from RB2 to
      RB4 is referred to as a section of the path RB1 to RBx. More
      details of this definition can be found in [RFC5960].

   Flow: This term indicates a set of packets that share the same path
      and per-hop behavior (such as priority). A flow is typically
      identified by a portion of the inner payload that affects the
      hop-by-hop forwarding decisions. This may contain Layer 2 through
      Layer 4 information.

   All Selectable Least-Cost Paths: This term refers to a subset of all
      potentially available least-cost paths to a specified destination
      RBridge that are available (and usable) for forwarding of frames.
      It is important to note that in practice, due to limitations in
      implementations, not all available least-cost paths may be
      selectable for forwarding.

   Connectivity: This term indicates reachability between an arbitrary
      RBridge RB1 and any other RBridge RB2. The specific path can be
      either explicit (i.e., associated with a specific flow) or
      unspecified. Unspecified means that messages used for
      connectivity verification take whatever path the RBs happen to
      select. Please refer to [OAMOVER] for details.

   Continuity Verification: This term refers to proactive verification
      of liveness between two RBridges at periodic intervals and the
      generation of explicit notification when connectivity failures
      occur. Please refer to [OAMOVER] for details.

   Fault: This term refers to an inability to perform a required action,
      e.g., an unsuccessful attempt to deliver a packet. Please refer
      to [TERMTP] for definition.

   Defect: This term refers to an interruption in the normal operation,
      such that over a period of time no packets are delivered
      successfully. Please refer to [TERMTP] for definition.

   Failure: This term refers to the termination of the required function
      over a longer period of time. Persistence of a defect for a
      period of time is interpreted as a failure. Please refer to
      [TERMTP] for definition.

   Simulated Flow: This term refers to a sequence of OAM-generated
      packets designed to follow a specific path. The fields of the
      packets in the simulated flow may or may not be identical to the
      fields of data packets of an actual flow being simulated.
      However, the purpose of the simulated flow is to have OAM packets
      of the simulated flow follow a specific path.

4. OAM Requirements

4.1. Data Plane

   OAM frames, utilized for connectivity verification, continuity
   checks, performance measurements, etc., will by default take whatever
   path TRILL chooses based on the current topology and per-hop
   equal-cost path choices. In some cases, it may be required that the
   OAM frames utilize specific paths. Thus, it MUST be possible to
   arrange that OAM frames follow the path taken by a specific flow.

   RBridges MUST have the ability to identify frames that require OAM
   processing.

   TRILL OAM frames MUST remain within a TRILL campus and MUST NOT be
   egressed from a TRILL network as native frames.

   OAM MUST have the ability to include all Ethernet traffic types
   carried by TRILL.

4.2. Connectivity Verification

4.2.1. Unicast

   From an arbitrary RBridge RB1, OAM MUST have the ability to verify
   connectivity to any other RBridge RB2.

   From an arbitrary RBridge RB1, OAM MUST have the ability to verify
   connectivity to any other RBridge RB2 for a specific flow via the
   path associated with the specified flow.

4.2.2. Distribution Trees

   OAM MUST have the ability to verify connectivity from an arbitrary
   RBridge RB1 to either a specific set of RBridges or all member
   RBridges, for a specified distribution tree. This functionality is
   referred to as verification of the unpruned distribution tree.

   OAM MUST have the ability to verify connectivity from an arbitrary
   RBridge RB1 to either a specific set of RBridges or all member
   RBridges, for a specified distribution tree and for a specified flow.
   This functionality is referred to as verification of the pruned tree.

4.3. Continuity Check

   OAM MUST provide functions that allow any arbitrary RBridge RB1 to
   perform a Continuity Check to any other RBridge.

   OAM MUST provide functions that allow any arbitrary RBridge RB1 to
   perform a Continuity Check to any other RBridge using a path
   associated with a specified flow.

   OAM SHOULD provide functions that allow any arbitrary RBridge to
   perform a Continuity Check to any other RBridge over any section of
   any selectable least-cost path.

   OAM SHOULD provide the ability to perform a Continuity Check on
   sections of any selectable path within the network.

   OAM SHOULD provide the ability to perform a multicast Continuity
   Check for specified distribution tree(s), as well as specified
   combinations of distribution trees and flows. The former is referred
   to as an unpruned multi-destination tree Continuity Check and the
   latter is referred to as a pruned tree Continuity Check.

4.4. Path Tracing

   OAM MUST provide the ability to trace a path between any two RBridges
   per specified unicast flow.

   OAM SHOULD provide the ability to trace all selectable least-cost
   paths between any two RBridges.

   OAM SHOULD provide functionality to trace all branches of a specified
   distribution tree (unpruned tree).

   OAM SHOULD provide functionality to trace all branches of a specified
   distribution tree for a specified flow (pruned tree).

4.5. General Requirements

   OAM MUST provide the ability to initiate and maintain multiple
   concurrent sessions for multiple OAM functions between any arbitrary
   RBridge RB1 and any other RBridge. In general, multiple OAM
   operations will run concurrently. For example, proactive continuity
   checks may take place between RB1 and RB2 at the same time that an
   operator decides to test connectivity between the same two RBs.
   Multiple OAM functions and instances of those functions MUST be able
   to run concurrently without interfering with each other.

   OAM MUST provide a single OAM framework for all TRILL OAM functions
   within the scope of this document.

   OAM, as far as practical and possible, SHOULD reuse functional,
   operational, and semantic elements of existing OAM standards.

   OAM MUST maintain related error and operational counters. Such
   counters MUST be accessible via network management applications
   (e.g., SNMP).

   OAM functions related to continuity and connectivity checks MUST be
   able to be invoked either proactively or on demand.

   OAM MAY be required to provide the ability to specify a desired
   response mode for a specific OAM message. The desired response mode
   can be in-band, out-of-band, or none.

   The OAM Framework MUST be extensible to include new functionality.
   For example, the solution needs to include a version number to
   differentiate older and newer implementations and TLV structures for
   flexibility to include new information elements.

   OAM MAY provide methods to verify control-plane and forwarding-plane
   alignments.
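
   The extensibility requirement stated earlier in this section (a
   version number plus TLV structures) can be illustrated with a minimal
   sketch. Everything in it is an assumption made for illustration and
   is not defined by this document: the 1-octet version field, the
   1-octet type and 2-octet length layout, the type values, and the
   function names. The point it shows is that a receiver can skip TLV
   types it does not recognize, so newer information elements do not
   break older implementations.

```python
import struct

# Hypothetical layout (not specified by this document):
#   1 octet version, then zero or more TLVs of
#   1 octet type, 2 octets length (network byte order), value.
TLV_HEADER = struct.Struct("!BH")


def encode_message(version, tlvs):
    """Pack a version octet followed by (type, value) pairs as TLVs."""
    out = bytearray([version])
    for tlv_type, value in tlvs:
        out += TLV_HEADER.pack(tlv_type, len(value)) + value
    return bytes(out)


def decode_message(data, known_types):
    """Return (version, known TLVs by type, list of skipped unknown types)."""
    version = data[0]
    offset = 1
    known, skipped = {}, []
    while offset < len(data):
        tlv_type, length = TLV_HEADER.unpack_from(data, offset)
        offset += TLV_HEADER.size
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        offset += length
        if tlv_type in known_types:
            known[tlv_type] = value
        else:
            # Unknown type: step over it using the length field.
            skipped.append(tlv_type)
    return version, known, skipped


# A receiver that only understands type 1 still parses a message
# that also carries a newer, unknown type 99.
message = encode_message(1, [(1, b"RB1"), (99, b"future")])
version, known, skipped = decode_message(message, known_types={1})
```

   In this sketch the length field is what makes forward compatibility
   possible, and the version octet lets a receiver decide whether to
   process a message at all.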

   OAM SHOULD leverage existing OAM technologies, where practical.

4.6. Performance Monitoring

4.6.1. Packet Loss

   In this document, the term "packet loss" is used as defined in
   Section 2.4 of [RFC2680].

   OAM SHOULD provide the ability to measure packet loss statistics for
   a flow from any arbitrary RBridge RB1 to any other RBridge.

   OAM SHOULD provide the ability to measure packet loss statistics over
   a section for a flow between any arbitrary RBridge RB1 and any other
   RBridge.

   OAM SHOULD provide the ability to measure packet loss statistics
   between any two RBridges over all least-cost paths.

   An RBridge SHOULD be able to perform the above packet loss
   measurement functions either proactively or on demand.

4.6.2. Packet Delay

   There are two types of packet delays -- one-way delay and two-way
   delay (Round-Trip Delay).

   One-way delay is defined in [RFC2679] as the time elapsed from the
   start of transmission of the first bit of a packet by an RBridge
   until the reception of the last bit of the packet by the destination
   RBridge.

   Two-way delay is also referred to as Round-Trip Delay and is defined
   similarly to [RFC2681], i.e., as the time elapsed from the start of
   transmission of the first bit of a packet by RB1, through receipt of
   the packet at RB2 and RB2 sending a response packet back to RB1,
   until RB1 receives the last bit of that response packet.

   OAM SHOULD provide functions to measure two-way delay between two
   RBridges.

   OAM MAY provide functions to measure one-way delay between two
   RBridges for a specified flow.

   OAM MAY provide functions to measure one-way delay between two
   RBridges for a specified flow over a specific section.

4.7. ECMP Utilization

   OAM MAY provide functionality to monitor the effectiveness of per-hop
   Equal-Cost Multipath (ECMP) hashing. For example, individual
   RBridges could maintain counters that show how packets are being
   distributed across equal-cost next hops for a specified destination
   RBridge or RBridges as a result of ECMP hashing.

4.8. Security and Operational Considerations

   Methods MUST be provided to protect against exploitation of the OAM
   framework for security and denial-of-service attacks.

   Methods MUST be provided to prevent OAM messages from causing
   congestion in the networks. Periodically generated messages with
   high frequencies may lead to congestion; hence, methods such as
   shaping or rate limiting SHOULD be utilized.

   Certain OAM functions may be utilized to gather operational
   information such as topology of the network. Methods MUST be
   provided to prevent unauthorized users from accessing OAM functions
   to gather critical and sensitive information of the network.

   OAM packets MUST be limited to within the TRILL campus, and the
   implementation MUST provide methods to prevent leaking of OAM packets
   out of the TRILL campus. Additionally, methods MUST be provided to
   prevent accepting OAM packets from outside the TRILL campus.

4.9. Fault Indications

   OAM MUST provide a Fault Indication framework to notify the packet's
   ingress RBridge or other interested parties (such as syslog servers)
   about faults.

   OAM MUST provide functions to selectively enable or disable different
   types of Fault Indications.

4.10. Defect Indications

   OAM SHOULD provide a framework for Defect Detection and Indication.

   The OAM Defect Detection and Indication Framework SHOULD provide
   methods to selectively enable or disable Defect Detection per defect
   type.

   The OAM Defect Detection and Indication Framework SHOULD provide
   methods to configure Defect Detection thresholds per different types
   of defects.
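
   The two preceding requirements (per-type enable/disable and per-type
   thresholds) can be pictured with a minimal sketch. It rests on
   assumptions that this document does not make: a defect is modeled as
   a configurable number of consecutive faults, and the class name,
   method names, and the defect type "continuity_loss" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DefectDetector:
    """Per-defect-type detection with configurable thresholds.

    Assumption: a defect is indicated once the number of consecutive
    faults for a type reaches that type's threshold.
    """
    thresholds: dict                      # defect type -> consecutive faults
    enabled: set = field(default_factory=set)
    _streak: dict = field(default_factory=dict)

    def enable(self, defect_type):
        self.enabled.add(defect_type)

    def disable(self, defect_type):
        self.enabled.discard(defect_type)
        self._streak.pop(defect_type, None)  # forget history when disabled

    def report(self, defect_type, fault):
        """Record one observation; return True if a defect is indicated."""
        if defect_type not in self.enabled:
            return False
        if not fault:
            self._streak[defect_type] = 0    # a success ends the streak
            return False
        self._streak[defect_type] = self._streak.get(defect_type, 0) + 1
        # Types without a configured threshold trip on the first fault.
        return self._streak[defect_type] >= self.thresholds.get(defect_type, 1)


detector = DefectDetector(thresholds={"continuity_loss": 3})
detector.enable("continuity_loss")
# Three consecutive faults reach the threshold on the third report.
results = [detector.report("continuity_loss", True) for _ in range(3)]
```

   A time-based threshold (for example, no successful delivery within a
   configured interval) would fit the definition of a defect in Section
   3 equally well; the counter is used here only to keep the sketch
   short.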

   The OAM Defect Detection and Indication Framework SHOULD provide
   methods to log defect indications to a locally defined archive (such
   as a log buffer) or Simple Network Management Protocol (SNMP) traps.

   The OAM Defect Detection and Indication Framework SHOULD provide a
   Remote Defect Indication framework that facilitates notifying the
   originator/owner of the flow experiencing the defect, which is the
   ingress RBridge.

   Remote Defect Indication MAY be either in-band or out-of-band.

4.11. Live Traffic Monitoring

   OAM implementations MAY provide methods to utilize live traffic for
   troubleshooting and performance monitoring.

5. Security Considerations

   Security requirements are specified in Section 4.8. For general
   TRILL security considerations, please refer to [RFC6325].

6. References

6.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6291]  Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
              D., and S. Mansfield, "Guidelines for the Use of the "OAM"
              Acronym in the IETF", BCP 161, RFC 6291, June 2011.

6.2. Informative References

   [RFC6325]  Perlman, R., Eastlake 3rd, D., Dutt, D., Gai, S., and A.
              Ghanwani, "Routing Bridges (RBridges): Base Protocol
              Specification", RFC 6325, July 2011.

   [RFC5101]  Claise, B., Ed., "Specification of the IP Flow Information
              Export (IPFIX) Protocol for the Exchange of IP Traffic
              Flow Information", RFC 5101, January 2008.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [8021ag]   IEEE, "Virtual Bridged Local Area Networks Amendment 5:
              Connectivity Fault Management", IEEE Std 802.1ag-2007,
              2007.

   [8021Q]    IEEE, "Media Access Control (MAC) Bridges and Virtual
              Bridged Local Area Networks", IEEE Std 802.1Q-2011, August
              2011.

   [RFC4377]  Nadeau, T., Morrow, M., Swallow, G., Allan, D., and S.
              Matsushima, "Operations and Management (OAM) Requirements
              for Multi-Protocol Label Switched (MPLS) Networks", RFC
              4377, February 2006.

   [OAMOVER]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
              Weingarten, "An Overview of Operations, Administration,
              and Maintenance (OAM) Mechanisms", Work in Progress,
              January 2013.

   [RFC5860]  Vigoureux, M., Ed., Ward, D., Ed., and M. Betts, Ed.,
              "Requirements for Operations, Administration, and
              Maintenance (OAM) in MPLS Transport Networks", RFC 5860,
              May 2010.

   [TERMTP]   van Helvoort, H., Ed., Andersson, L., Ed., and N.
              Sprecher, Ed., "A Thesaurus for the Terminology used in
              Multiprotocol Label Switching Transport Profile (MPLS-TP)
              drafts/RFCs and ITU-T's Transport Network
              Recommendations", Work in Progress, February 2013.

   [RFC5960]  Frost, D., Ed., Bryant, S., Ed., and M. Bocci, Ed., "MPLS
              Transport Profile Data Plane Architecture", RFC 5960,
              August 2010.

7. Acknowledgments

   Special acknowledgments to the IEEE 802.1 chair, Tony Jeffree, for
   allowing us to solicit comments from the IEEE 802.1 group. Also
   recognized are the comments received from the IEEE group, IESG,
   Stewart Bryant, Ralph Droms, Adrian Farrel, Benoit Claise, Ayal Lior,
   and others.

8. Contributors

   Thomas Narten
   IBM Corporation
   3039 Cornwallis Avenue,
   PO Box 12195
   Research Triangle Park, NC 27709
   USA
   EMail: narten@us.ibm.com

   Donald Eastlake
   Huawei Technologies
   155 Beaver Street,
   Milford, MA 01757
   USA
   EMail: d3e3e3@gmail.com

   Anoop Ghanwani
   Dell
   350 Holger Way
   San Jose, CA 95134
   USA
   Phone: +1-408-571-3500
   EMail: Anoop@alumni.duke.edu

   Jon Hudson
   Brocade
   120 Holger Way
   San Jose, CA 95134
   USA
   EMail: jon.hudson@gmail.com

   Naveen Nimmu
   Broadcom
   9th Floor, Building no 9, Raheja Mind space
   Hi-Tec City, Madhapur,
   Hyderabad - 500 081
   India
   Phone: +1-408-218-8893
   EMail: naveen@broadcom.com

   Radia Perlman
   Intel Labs
   2700 156th Ave NE, Suite 300,
   Bellevue, WA 98007
   USA
   Phone: +1-425-881-4824
   EMail: radia.perlman@intel.com

   Tal Mizrahi
   Marvell
   6 Hamada St.
   Yokneam, 20692
   Israel
   EMail: talmi@marvell.com

Authors' Addresses

   Tissa Senevirathne
   Cisco Systems
   375 East Tasman Drive
   San Jose, CA 95134
   USA
   Phone: +1-408-853-2291
   EMail: tsenevir@cisco.com

   David Bond
   IBM
   4400 North 1st Street
   San Jose, CA 95134
   USA
   Phone: +1-603-339-7575
   EMail: mokon@mokon.net

   Sam Aldrin
   Huawei Technologies
   2330 Central Express Way
   Santa Clara, CA 95951
   USA
   EMail: aldrin.ietf@gmail.com

   Yizhou Li
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China
   Phone: +86-25-56625375
   EMail: liyizhou@huawei.com

   Rohit Watve
   Cisco Systems
   375 East Tasman Drive
   San Jose, CA 95134
   USA
   Phone: +1-408-424-2091
   EMail: rwatve@cisco.com