Internet Engineering Task Force (IETF)                          A. Karan
Request for Comments: 7431                                   C. Filsfils
Category: Informational                                IJ. Wijnands, Ed.
ISSN: 2070-1721                                      Cisco Systems, Inc.
                                                             B. Decraene
                                                                  Orange
                                                             August 2015


                      Multicast-Only Fast Reroute

Abstract

   As IPTV deployments grow in number and size, service providers are
   looking for solutions that minimize the service disruption due to
   faults in the IP network carrying the packets for these services.
   This document describes a mechanism for minimizing packet loss in a
   network when node or link failures occur.  Multicast-only Fast
   Reroute (MoFRR) works by making simple enhancements to multicast
   routing protocols such as Protocol Independent Multicast (PIM) and
   Multipoint LDP (mLDP).

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7431.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Conventions Used in This Document
      1.2. Terminology
   2. Basic Overview
   3. Determination of the Secondary UMH
      3.1. ECMP-Mode MoFRR
      3.2. Non-ECMP-Mode MoFRR
   4. Upstream Multicast Hop Selection
      4.1. PIM
      4.2. mLDP
   5. Detecting Failures
   6. MoFRR Applicability to Dual-Plane Topology
   7. Other Topologies
   8. Capacity Planning for MoFRR
   9. PE Nodes
   10. Other Applications
   11. Security Considerations
   12. References
      12.1. Normative References
      12.2. Informative References
   Acknowledgments
   Contributors
   Authors' Addresses

1.  Introduction

   Different solutions have been developed and deployed to improve
   service guarantees, both for multicast video traffic and Video on
   Demand traffic.  Most of these solutions are geared towards finding
   an alternate path around one or more failed network elements (link,
   node, or path failures).

   This document describes a mechanism for minimizing packet loss in a
   network when node or link failures occur.  Multicast-only Fast
   Reroute (MoFRR) works by making simple changes to the way selected
   routers use multicast protocols such as PIM and mLDP.  No changes to
   the protocols themselves are required.  With MoFRR, in many cases,
   multicast routing protocols don't necessarily have to depend on or
   have to wait on unicast routing protocols to detect network failures;
   see Section 5.

   On a Merge Point, MoFRR logic determines a primary Upstream Multicast
   Hop (UMH) and a secondary UMH and joins the tree via both
   simultaneously.  Data packets are received over the primary and
   secondary paths.  Only the packets from the primary UMH are accepted
   and forwarded down the tree; the packets from the secondary UMH are
   discarded.  The UMH determination is different for PIM and mLDP and
   is explained in Section 4.  When a failure is detected on the path to
   the primary UMH, the repair occurs by changing the secondary UMH into
   the primary and the primary into the secondary.  Since the repair is
   local, it is fast -- greatly improving convergence times in the event
   of node or link failures on the path to the primary UMH.

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.2.  Terminology

   MoFRR:  Multicast-only Fast Reroute.

   ECMP:  Equal-Cost Multipath.

   mLDP:  Multipoint Label Distribution Protocol.

   PIM:  Protocol Independent Multicast.

   UMH:  Upstream Multicast Hop.  A candidate next-hop that can be used
      to reach the root of the tree.

   tree:  Either a PIM (S,G)/(*,G) tree or an mLDP Point-to-Multipoint
      (P2MP) or Multipoint-to-Multipoint (MP2MP) LSP.

   OIF:  Outgoing interface.
      An interface used to forward multicast packets down the tree
      towards the receivers.  Either a PIM (S,G)/(*,G) tree or an mLDP
      P2MP or MP2MP LSP.

   LFA:  Loop-Free Alternate as defined in [RFC5286].  In unicast Fast
      Reroute, this is an alternate next-hop that can be used to reach a
      unicast destination without using the protected link or node.

   Merge Point:  A router that joins a multicast stream via two
      divergent upstream paths.

   RPF:  Reverse Path Forwarding.

   RP:  Rendezvous Point.

   LSP:  Label Switched Path.

   LSR:  Label Switching Router.

   BFD:  Bidirectional Forwarding Detection.

   IGP:  Interior Gateway Protocol.

   MVPN:  Multicast Virtual Private Network.

   POP:  Point Of Presence, an access point into the network.

2.  Basic Overview

   The basic idea of MoFRR is for a Merge Point router to join a
   multicast tree via two divergent upstream paths in order to get
   maximum redundancy.  The determination of this alternate upstream is
   defined in Section 3.

   In order to maximize robustness against any failure, the two paths
   should be as diverse as possible.  Ideally, they should not merge
   upstream.  Sometimes the topology guarantees maximal redundancy;
   other times additional configuration or techniques are needed to
   enforce it.  See Section 6 for more discussion on the applicability
   of MoFRR depending on the network topology.

   A Merge Point router should only accept and forward on one of the
   upstream paths at a time in order to avoid duplicate packet
   forwarding.  The selection of the primary and secondary UMH is done
   by the MoFRR logic and is normally based on unicast routing to find
   loop-free candidates.  This is described in Section 4.

   Note that the impact of the additional data on the network is
   mitigated when tree membership is densely populated.  When a part of
   the network has redundant data flowing, join latency for new joining
   members is reduced because it's likely a tree Merge Point is not far
   away.

3.  Determination of the Secondary UMH

   The secondary UMH is a Loop-Free Alternate (LFA) as per [RFC5286].

3.1.  ECMP-Mode MoFRR

   If the IGP installs two ECMP paths to the source, then as per
   [RFC5286] the LFA is a primary next-hop.  If the multicast tree is
   enabled for ECMP-mode MoFRR, the router installs the paths as primary
   and secondary UMHs.  Before the failure, only packets received from
   the primary UMH path are processed, while packets received from the
   secondary UMH are dropped.

   The selected primary UMH SHOULD be the same as if the MoFRR extension
   were not enabled.

   If more than two ECMP paths exist, one is selected as primary and
   another as secondary UMH.  The selection of the primary and secondary
   is a local decision.  Information from the IGP link-state topology
   could be leveraged to optimize this selection such that the primary
   and secondary paths are maximally divergent and don't lead to the
   same upstream node.

   Note that MoFRR does not restrict the number of UMH paths that are
   joined.  Implementations may use as many paths as are configured.

3.2.  Non-ECMP-Mode MoFRR

   A router X configured for non-ECMP-mode MoFRR for a multicast tree
   joins a primary path to its primary UMH and a secondary path to its
   LFA UMH.

   In order to prevent control-plane loops, a router MUST stop joining
   the secondary UMH if this UMH is the only member in the OIF list.

   To illustrate the reason for this rule, let's consider the example in
   Figure 3.  If two Provider Edge routers, PE1 and PE2, have received
   an IGMP request for a multicast tree, they will both join the primary
   path on their plane and a secondary path to the neighbor PE.
   If their receivers leave at the same time, it's possible for the
   multicast tree on PE1 and PE2 to never get deleted, as the PEs
   refresh each other via the secondary path joins (remember that a
   secondary path join is not distinguishable from a primary join).

4.  Upstream Multicast Hop Selection

   An Upstream Multicast Hop (UMH) is a candidate next-hop that can be
   used to reach the root of the tree.  This is normally based on
   unicast routing to find loop-free candidate(s).  With MoFRR
   procedures, we select a primary and a backup UMH.  The procedures for
   determining the UMH are different for PIM and mLDP.

4.1.  PIM

   The UMH selection in PIM is also known as the Reverse Path Forwarding
   (RPF) procedure.  Based on a unicast route lookup on either the
   source address or Rendezvous Point (RP) [RFC4601], an upstream
   interface is selected for sending the PIM Joins/Prunes and for
   accepting the multicast packets.  The interface the packets are
   received on is used to pass or fail the RPF check.  If packets are
   received on an interface that was not selected as the primary by the
   RPF procedure, the packets are discarded.

4.2.  mLDP

   The UMH selection in mLDP also depends on unicast routing, but the
   difference from PIM is that the acceptance of multicast packets is
   based on MPLS labels and is independent of the interface on which the
   packet is received.  Using the procedures as defined in [RFC6388], an
   upstream Label Switching Router (LSR) is elected.  The upstream LSR
   that was elected for a Label Switched Path (LSP) gets a unique local
   MPLS label allocated.  Multicast packets are only forwarded if the
   MPLS label matches the MPLS label that was allocated for that LSP's
   (primary) upstream LSR.

5.  Detecting Failures

   Once the two paths are established, the next step is detecting a
   failure on the primary path to know when to switch to the backup
   path.
   This is a local issue, but this section explores some possibilities.

   The first (and simplest) option is to detect the failure of the local
   interface as it's done for unicast Fast Reroute.  Detection can be
   performed using the loss of signal or the loss of probing packets
   (e.g., BFD).  This option can be used in combination with the other
   options as documented below.  Just like for unicast fast reroute,
   50 msec switchover is possible.

   A second option consists of comparing the packets received on the
   primary and secondary streams but only forwarding one of them -- the
   first one received, no matter which interface it is received on.
   Zero packet loss is possible for RTP-based streams.

   A third option assumes a minimum known packet rate for a given data
   stream.  If a packet is not received on the primary RPF within this
   time frame, the router assumes primary path failure and switches to
   the secondary RPF interface.  50 msec switchover may be possible for
   high-rate streams (e.g., IPTV where SD video has a continuous inter-
   packet gap of about 3 msec), but in general the delay is dependent on
   the rate of the multicast stream.

   A fourth option leverages the significant improvements of the IGP
   convergence speed.  When the primary path to the source is withdrawn
   by the IGP, the MoFRR-enabled router switches over to the backup
   path, and the UMH is changed to the secondary UMH.  Since the
   secondary path is already in place, and assuming it is disjoint from
   the primary path, convergence times would not include the time
   required to build a new tree and hence are smaller.  Sub-second to
   sub-200 msec switchover should be possible.

6.  MoFRR Applicability to Dual-Plane Topology

   MoFRR applicability is topology dependent.  The applicability is the
   same as LFA FRR, which is discussed in [RFC6571].
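
   As a non-normative illustration of the third detection option in
   Section 5, the sketch below declares the primary path failed when no
   packet has arrived within an assumed maximum inter-packet gap.  The
   class name StreamMonitor, its fields, and the 50 msec budget are
   hypothetical choices made for this example; they are not defined by
   this document or by any PIM or mLDP implementation.

```python
from dataclasses import dataclass


@dataclass
class StreamMonitor:
    """Tracks packet arrivals on the primary path of one multicast stream.

    Hypothetical helper for illustration only.
    """

    max_gap_s: float          # assumed longest gap between packets, in seconds
    last_seen_s: float = 0.0  # arrival time of the most recent primary packet

    def packet_received(self, now_s: float) -> None:
        # Record the arrival time of a packet on the primary path.
        self.last_seen_s = now_s

    def primary_failed(self, now_s: float) -> bool:
        # The primary path is assumed failed once the silence exceeds the gap.
        return (now_s - self.last_seen_s) > self.max_gap_s


# Assumed budget of 50 msec for a stream with roughly 3 msec packet spacing.
monitor = StreamMonitor(max_gap_s=0.050)
monitor.packet_received(now_s=1.000)
still_up = not monitor.primary_failed(now_s=1.003)
timed_out = monitor.primary_failed(now_s=1.060)
```

   The detection delay in such a scheme is bounded below by the chosen
   gap, which is why the achievable switchover time depends on the rate
   of the multicast stream.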

   The following section will discuss MoFRR applicability to dual-plane
   network topologies.

   MoFRR works best in dual-plane topologies as illustrated in the
   figures below.  MoFRR may be enabled on any router in the network.
   In the figures below, MoFRR is shown enabled on the Provider Edge
   (PE) routers to illustrate one way in which the technology may be
   deployed.

                        S
                    P /   \ P
                     /     \
                ^  G1       R1  ^
                P  /         \  P
                  /           \
                 G2----------R2
              ^  | \         | \    P
              ^  |  \        |  \   P
                 |    G3----------R3
                 |    |      |    |
                 |    |      |    |
              ^  G4---|------R4   |
              P ^  \  |        \  |  P
                    \ |         \ |
                      G5----------R5
                   ^  |           |  ^
                   P  |           |  P
                      |           |
                      Gi          Ri
                       \ \__    ^ /|
                        \   \ S1/ |  ^
                       ^ \  ^\ /  |P2
                      P1  \ S2\_/__ |
                           \ /    \|
                           PE1    PE2

                 P = Primary path
                 S = Secondary path

                  Figure 1: Two-Plane Network Design

   The topology has two planes, a primary plane and a secondary plane
   that are fully disjoint from each other all the way into the POPs.
   This two-plane design is common in service provider networks as it
   eliminates single points of failure in their core network.  The links
   marked P indicate the normal (primary) path of how the PIM Joins flow
   from the POPs towards the source of the network.  Multicast streams,
   especially for the densely watched channels, typically flow along
   both the planes in the network anyway.

   The only change MoFRR adds to this is on the links marked S where the
   PE routers join a secondary path to their secondary ECMP UMH.  As a
   result of this, each PE router receives two copies of the same
   stream, one from the primary plane and the other from the secondary
   plane.  As a result of normal UMH behavior, the multicast stream
   received over the primary path is accepted and forwarded to the
   downstream receivers.  The copy of the stream received from the
   secondary UMH is discarded.

   When a router detects a routing failure on the path to its primary
   UMH, it will switch to the secondary UMH and accept packets for that
   stream.  If the failure is repaired, the router may switch back.
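
   The accept, discard, and switch behavior described above can be
   sketched as follows.  This is a minimal illustration under stated
   assumptions: the class MergePoint, its method names, and the UMH
   labels "plane1" and "plane2" are hypothetical and do not correspond
   to any router implementation or to names used elsewhere in this
   document.

```python
class MergePoint:
    """Hypothetical per-tree MoFRR state on one router (illustration only)."""

    def __init__(self, primary_umh: str, secondary_umh: str) -> None:
        self.primary_umh = primary_umh
        self.secondary_umh = secondary_umh

    def accept(self, from_umh: str) -> bool:
        # Only packets arriving from the current primary UMH are forwarded;
        # the copy arriving from the secondary UMH is discarded.
        return from_umh == self.primary_umh

    def on_primary_failure(self) -> None:
        # Local repair: the secondary becomes the primary and vice versa.
        # No new tree has to be built, since both joins already exist.
        self.primary_umh, self.secondary_umh = (
            self.secondary_umh,
            self.primary_umh,
        )


# A PE that joined via both planes; which plane is primary is an assumption.
pe1 = MergePoint(primary_umh="plane1", secondary_umh="plane2")
before = (pe1.accept("plane1"), pe1.accept("plane2"))
pe1.on_primary_failure()
after = (pe1.accept("plane1"), pe1.accept("plane2"))
```

   Calling on_primary_failure() a second time restores the original
   roles, which corresponds to the optional switch back after the
   failure is repaired.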
   The primary and secondary UMHs have only local context and not
   end-to-end context.

   As one can see, MoFRR achieves the faster convergence by pre-building
   the secondary multicast tree and receiving the traffic on that
   secondary path.  The example discussed above is a simple case where
   there are two ECMP paths from each PE device towards the source, one
   along the primary plane and one along the secondary.  In cases where
   the topology is asymmetric or is a ring, this ECMP nature does not
   hold, and additional rules have to be taken into account to choose
   when and where to join the secondary path.

   MoFRR is appealing in such topologies for the following reasons:

   1.  Ease of deployment and simplicity: the functionality is only
       required on the PE devices, although it may be configured on all
       routers in the topology.  Furthermore, each PE device can be
       enabled separately; there is no need for network-wide
       coordination in order to deploy MoFRR.  Interoperability testing
       is not required as there are no PIM or mLDP protocol changes.

   2.  End-to-end failure detection and recovery: any failure along the
       path from the source to the PE can be detected and repaired with
       the secondary disjoint stream.  (See the second, third, and
       fourth options in Section 5.)

   3.  Capacity efficiency: as illustrated in the previous example, the
       multicast trees corresponding to IPTV channels cover the backbone
       and distribution topology in a very dense manner.  As a
       consequence, the secondary path grafts onto the normal multicast
       trees (i.e., trees signaled by PIM or mLDP without the MoFRR
       extension) at the aggregation level and hence does not demand any
       extra capacity either on the distribution links or in the
       backbone.  The secondary path simply uses the capacity that is
       normally used, without any duplication.
       This is different from conventional FRR mechanisms that often
       duplicate the capacity requirements when the backup path crosses
       links/nodes that already carry the primary/normal tree, and thus
       twice as much capacity is required.

   4.  Loop-free: the secondary path join is sent on an ECMP disjoint
       path.  By definition, the neighbor receiving this request is
       closer to the source and hence will not cause a loop.

   The topology we just analyzed is very frequent and can be modeled as
   per Figure 2.  The PE has two ECMP disjoint paths to the source.
   Each ECMP path uses a disjoint plane of the network.

                    Source
                    /    \
                Plane1  Plane2
                  |      |
                  A1     A2
                    \   /
                     PE

        Figure 2: PE Is Dual-Homed to Dual-Plane Backbone

   Another frequent topology is described in Figure 3.  PEs are grouped
   by pairs.  In each pair, each PE is connected to a different plane.
   Each PE has one single shortest-path to a source (via its connected
   plane).  There is no ECMP like in Figure 2.  However, there is
   clearly a way to provide MoFRR benefits as each PE can offer a
   disjoint secondary path to the PE in the other plane (via the
   disjoint path).

   The MoFRR secondary neighbor selection process needs to be extended
   in this case as one cannot simply rely on using an ECMP path as
   secondary neighbor.  This extension is referred to as non-ECMP-mode
   MoFRR and is described in Section 3.2.

                    Source
                    /    \
                Plane1  Plane2
                  |      |
                  A1     A2
                  |      |
                 PE1----PE2

     Figure 3: PEs Are Connected in Pairs to Dual-Plane Backbone

7.  Other Topologies

   As mentioned in Section 6, MoFRR works best in dual-plane topologies.
   If MoFRR is applied to non-dual-plane networks, it's possible that
   the secondary path is affected by the same failure that affected the
   primary path.  In that case, there is no guarantee that the backup
   path will provide an uninterrupted traffic flow of packets without
   loss or duplication.

8.  Capacity Planning for MoFRR

   The previous section has described two very frequent designs
   (Figures 2 and 3) that provide maximum MoFRR benefits.

   Designers with topologies that differ from Figures 2 and 3 can still
   benefit from MoFRR, thanks to the use of capacity planning tools.

   Such tools are able to simulate the ability of each PE to build two
   disjoint branches of the same tree.  This simulation could be for
   hundreds of PEs and hundreds of sources.

   This allows an assessment of the MoFRR protection coverage of a given
   network, for a set of sources.

   If the protection coverage is deemed insufficient, the designer can
   use such a tool to optimize the topology (add links, change IGP
   metrics).

9.  PE Nodes

   Many Service Providers devise their topology such that PEs have
   disjoint paths to the multicast sources.  MoFRR leverages the
   existence of these disjoint paths without any PIM or mLDP protocol
   modification.  Interoperability testing is thus not required.  In
   such topologies, MoFRR only needs to be deployed on the PE devices.
   Each PE device can be enabled one by one.

10.  Other Applications

   While all the examples in this document show the MoFRR applicability
   on PE devices, it is clear that MoFRR could be enabled on aggregation
   or core routers.

   MoFRR can be popular in data center network configurations.  With the
   advent of lower-cost Ethernet and increasing port density in routers,
   there is more meshed connectivity than ever before.  When using
   three-level access, distribution, and core layers in a data center,
   there is a lot of inexpensive bandwidth connecting the layers.  This
   will lend itself to more opportunities for ECMP paths at multiple
   layers.  This allows for multiple layers of redundancy protecting
   link and node failure at each layer with minimal redundancy cost.
   Redundancy costs are reduced because only one packet is forwarded at
   every link along the primary and secondary data paths, so there is no
   duplication of data on any link, thereby providing make-before-break
   protection at a very small cost.

   A MoFRR router only accepts packets from the primary path and
   discards packets from the secondary path.  For that reason,
   management applications (like ping and mtrace) will not work when
   verifying the secondary path.

   The MoFRR principle may be applied to MVPNs.

11.  Security Considerations

   There are no security considerations for this design other than what
   is already in the main PIM specification [RFC4601] and mLDP
   specification [RFC6388].

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC5286]  Atlas, A., Ed., and A. Zinin, Ed., "Basic Specification
              for IP Fast Reroute: Loop-Free Alternates", RFC 5286,
              DOI 10.17487/RFC5286, September 2008,
              <http://www.rfc-editor.org/info/rfc5286>.

12.2.  Informative References

   [RFC4601]  Fenner, B., Handley, M., Holbrook, H., and I. Kouvelas,
              "Protocol Independent Multicast - Sparse Mode (PIM-SM):
              Protocol Specification (Revised)", RFC 4601,
              DOI 10.17487/RFC4601, August 2006,
              <http://www.rfc-editor.org/info/rfc4601>.

   [RFC6388]  Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B.
              Thomas, "Label Distribution Protocol Extensions for
              Point-to-Multipoint and Multipoint-to-Multipoint Label
              Switched Paths", RFC 6388, DOI 10.17487/RFC6388,
              November 2011, <http://www.rfc-editor.org/info/rfc6388>.

   [RFC6571]  Filsfils, C., Ed., Francois, P., Ed., Shand, M.,
              Decraene, B., Uttaro, J., Leymann, N., and M. Horneffer,
              "Loop-Free Alternate (LFA) Applicability in Service
              Provider (SP) Networks", RFC 6571, DOI 10.17487/RFC6571,
              June 2012, <http://www.rfc-editor.org/info/rfc6571>.

Acknowledgments

   Thanks to Dave Oran and Alvaro Retana for their review and comments
   on this document.

   The authors would like to especially acknowledge Dino Farinacci, John
   Zwiebel, and Greg Shepherd for the genesis of the MoFRR concept.

Contributors

   Below is a list of the contributors in alphabetical order:

   Dino Farinacci
   Email: farinacci@gmail.com

   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   Antwerp  2018
   Belgium
   Email: wim.henderickx@alcatel-lucent.com

   Uwe Joorde
   Deutsche Telekom
   Dahlweg 100
   D-48153 Muenster
   Germany
   Email: Uwe.Joorde@telekom.de

   Nicolai Leymann
   Deutsche Telekom
   Winterfeldtstrasse 21
   Berlin  10781
   Germany
   Email: N.Leymann@telekom.de

   Jeff Tantsura
   Ericsson
   300 Holger Way
   San Jose, CA  95134
   United States
   Email: jeff.tantsura@ericsson.com

Authors' Addresses

   Apoorva Karan
   Cisco Systems, Inc.
   3750 Cisco Way
   San Jose, CA  95134
   United States
   Email: apoorva@cisco.com

   Clarence Filsfils
   Cisco Systems, Inc.
   De kleetlaan 6a
   Diegem  BRABANT 1831
   Belgium
   Email: cfilsfil@cisco.com

   IJsbrand Wijnands (editor)
   Cisco Systems, Inc.
   De Kleetlaan 6a
   Diegem  1831
   Belgium
   Email: ice@cisco.com

   Bruno Decraene
   Orange
   38-40 rue du General Leclerc
   Issy Moulineaux  Cedex 9, 92794
   France
   Email: bruno.decraene@orange.com