Internet Engineering Task Force (IETF)                 S. Litkowski, Ed.
Request for Comments: 7916                                   B. Decraene
Category: Standards Track                                         Orange
ISSN: 2070-1721                                              C. Filsfils
                                                                 K. Raza
                                                           Cisco Systems
                                                            M. Horneffer
                                                        Deutsche Telekom
                                                               P. Sarkar
                                                  Individual Contributor
                                                               June 2016


            Operational Management of Loop-Free Alternates

Abstract

   Loop-Free Alternates (LFAs), as defined in RFC 5286, constitute an IP
   Fast Reroute (IP FRR) mechanism enabling traffic protection for IP
   traffic (and, by extension, MPLS LDP traffic). Following early
   deployment experiences, this document provides operational feedback
   on LFAs, highlights some limitations, and proposes a set of
   refinements to address those limitations. It also proposes required
   management specifications.

   This proposal is also applicable to remote-LFA solutions.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF). It represents the consensus of the IETF community. It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG). Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc7916.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.
Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Definitions
   3.  Operational Issues with Default LFA Tiebreakers
     3.1.  Case 1: PE Router Protecting against Failures within Core
           Network
     3.2.  Case 2: PE Router Chosen to Protect against Core Failures
           while P Router LFA Exists
     3.3.  Case 3: Suboptimal P Router Alternate Choice
     3.4.  Case 4: No-Transit LFA Computing Node
   4.  Need for Coverage Monitoring
   5.  Need for LFA Activation Granularity
   6.  Configuration Requirements
     6.1.  LFA Enabling/Disabling Scope
     6.2.  Policy-Based LFA Selection
       6.2.1.  Connected versus Remote Alternates
       6.2.2.  Mandatory Criteria
       6.2.3.  Additional Criteria
       6.2.4.  Evaluation of Criteria
       6.2.5.  Retrieving Alternate Path Attributes
       6.2.6.  ECMP LFAs
   7.  Operational Aspects
     7.1.  No-Transit Condition on LFA Computing Node
     7.2.  Manual Triggering of FRR
     7.3.  Required Local Information
     7.4.  Coverage Monitoring
     7.5.  LFAs and Network Planning
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Contributors
   Authors' Addresses

1.  Introduction

   Following the first deployments of Loop-Free Alternates (LFAs), this
   document provides feedback to the community about the management of
   LFAs.

   o  Section 3 provides real use cases illustrating some limitations
      and suboptimal behavior.

   o  Section 4 provides requirements for LFA simulations.

   o  Section 5 proposes requirements for activation granularity and
      policy-based selection of the alternate.

   o  Section 6 expresses requirements for the operational management of
      LFAs and, in particular, a policy framework to manage alternates.

   o  Section 7 details some operational considerations of LFAs, such as
      IS-IS overload bit management and troubleshooting information.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Definitions

   o  Per-prefix LFA computation: Evaluation for the best alternate is
      done for each destination prefix, as opposed to the "per-next-hop"
      simplification technique proposed in Section 3.8 of [RFC5286].

   o  PE router: Provider Edge router. These routers connect customers
      to each other.

   o  P router: Provider router. These routers are core routers without
      customer connections. They provide transit between PE routers,
      and they form the core network.

   o  Core network: subset of the network composed of P routers and
      links between them.

   o  Core link: network link part of the core network, i.e., a link
      between P routers.

   o  Link-protecting LFA: alternate providing protection against link
      failure.

   o  Node-protecting LFA: alternate providing protection against node
      failure.

   o  Connected alternate: alternate adjacent (at the IGP level) to the
      Point of Local Repair (PLR) (i.e., an IGP neighbor).
o Remote alternate: alternate that does not share an IGP adjacency with the PLR. 3. Operational Issues with Default LFA Tiebreakers [RFC5286] introduces the notion of tiebreakers when selecting the LFA among multiple candidate alternate next hops. When multiple LFAs exist, [RFC5286] has favored the selection of the LFA that provides the best coverage against the failure cases. While this is indeed a goal, it is one among multiple goals, and in some deployments this leads to the selection of a suboptimal LFA. The following sections detail real use cases related to such limitations. Note that the use case for LFA computation per destination (per-prefix LFA) is assumed throughout this analysis. We also assume in the network figures that all IP prefixes are advertised with zero cost. 3.1. Case 1: PE Router Protecting against Failures within Core Network P1 --------- P2 ---------- P3 --------- P4 | 1 100 1 | | | | 100 | 100 | | | 1 100 1 | 1 5k P5 --------- P6 ---------- P7 --------- P8 --- P9 -- PE1 | | | | | | 5k| |5k 5k| |5k | 5k | 5k | | | | | | | +-- PE4 --+ | +---- PE2 ----+ | | | +---- PE5 ----+ | 5k | PE3 Px routers are P routers using n * 10 Gbps links. PEs are connected using links with lower bandwidth. Figure 1 In Figure 1, let us consider the traffic flowing from PE1 to PE4. The nominal path is P9-P8-P7-P6-PE4. Let us now consider the failure of link P7-P8. As the P4 primary path to PE4 is P8-P7-P6-PE4, P4 is not an LFA for P8 (because P4 will loop traffic back to P8), and the only available LFA is PE2. When the core link P8-P7 fails, P8 switches all traffic destined to PE4/PE5 towards the node PE2. Hence, a PE node and PE links are used to protect against the failure of a core link. Typically, PE links have less capacity than core links, and congestion may occur on PE2 links. Note that although PE2 is not directly affected by the failure, its links become congested, and its traffic will suffer from the congestion. 
In summary, in the case of P8-P7 link failure, the impact on customer traffic is: o From PE2's point of view: * without LFA: no impact. * with LFA: traffic is partially dropped (but possibly prioritized by a QoS mechanism). It must be highlighted that in such a situation, traffic not affected by the failure may be affected by the congestion. o From P8's point of view: * without LFA: traffic is totally dropped until convergence occurs. * with LFA: traffic is partially dropped (but possibly prioritized by a QoS mechanism). Besides the congestion aspects of using a PE router as an alternate to protect against a core failure, a service provider may consider this to be a bad routing design and would want to prevent it. 3.2. Case 2: PE Router Chosen to Protect against Core Failures while P Router LFA Exists P1 --------- P2 ------------ P3 ------- P4 | 1 100 | 1 | | | | | 100 | 30 | 30 | | | | 1 50 50 | 10 | 1 5k P5 --------- P6 --- P10 ---- P7 ------- P8 --- P9 -- PE1 | | | | \ | 5k| |5k 5k| |5k \ 5k | 5k | | | | \ | | +-- PE4 --+ | +---- PE2 ----+ | | | +---- PE5 ----+ | 5k | PE3 Px routers are P routers meshed with n * 10 Gbps links. PEs are meshed using links with lower bandwidth. Figure 2 In Figure 2, let us consider the traffic coming from PE1 to PE4. The nominal path is P9-P8-P7-P10-P6-PE4. Let us now consider the failure of the link P7-P8. For P8, P4 is a link-protecting LFA and PE2 is a node-protecting LFA. PE2 is chosen as the best LFA, due to the better type of protection that it provides. Just as in case 1, this may lead to congestion on PE2 links upon LFA activation. 3.3. Case 3: Suboptimal P Router Alternate Choice +--- PE3 ---+ / \ 1000 / \ 1000 / \ +----- P1 ---------------- P2 ----+ | | 500 | | | 10 | | | 10 | | | | R5 | 10 | 10 R7 | | | | | 10 | | | 10 | | 500 | | +---- P3 ----------------- P4 ----+ \ / 1000 \ / 1000 \ / +--- PE1 ---+ Px routers are P routers. P1-P2 and P3-P4 links are 1 Gbps links. All other inter-Px links are 10 Gbps links. 
Figure 3 In Figure 3, let us consider the failure of link P1-P3. For destination PE3, P3 has two possible alternates: o P4, which is node-protecting o R5, which is link-protecting P4 is chosen as the best LFA, due to the better type of protection that it provides. However, for bandwidth capacity reasons, it may not be desirable to use P4. A service provider may prefer to use high-bandwidth links as the preferred LFA. In this example, preferring the shortest path over the type of protection may achieve the expected behavior, but in cases where metrics do not reflect the bandwidth, this technique would not work and some other criteria would need to be involved when selecting the best LFA. 3.4. Case 4: No-Transit LFA Computing Node P1 P2 | \ / | 50 | 50 \/ 50 | 50 | /\ | PE1-+ +-- PE2 \ / 45 \ / 45 -PE3- (No-transit condition set) Figure 4 The IS-IS and OSPF protocols define some way to prevent a router from being used for transit. The IS-IS overload bit is defined in [ISO10589], and the OSPF R-bit is defined in [RFC5340]. Also, the OSPF stub router is defined in [RFC6987] as a method to prevent transit on a node by advertising MaxLinkMetric on all non-stub links. In Figure 4, PE3 has its no-transit condition set (permanently, for design reasons) and wants to protect traffic using an LFA for destination PE2. On PE3, the loop-free condition is not satisfied: 100 !< 45 + 45. PE1 is thus not considered as an LFA. However, thanks to the no-transit condition on PE3, we know that PE1 will not loop the traffic back to PE3. So, PE1 is an LFA to reach PE2. In the case of a no-transit condition set on a node, LFA behavior must be clarified. 4. Need for Coverage Monitoring As per [RFC6571], LFA coverage depends strongly on the network topology that is in use. Even if the remote-LFA mechanism [RFC7490] significantly extends the coverage of the basic LFA specification, there are still some cases where protection would not be available. 
   As network topologies are constantly evolving (network extension,
   additional capacity, latency optimization, etc.), the protection
   coverage may change. Fast Reroute (FRR) functionality may be
   critical for some services supported by the network; a service
   provider must always know what type of protection coverage is
   currently available on the network. Moreover, predicting protection
   coverage in the event of network topology changes is mandatory.

   Today, network simulation tools associated with "what if" scenarios
   are often used by service providers for the overall network design
   (capacity, path optimization, etc.). Sections 7.3, 7.4, and 7.5 of
   this document propose the addition of LFA information into such tools
   and within routers, so that a service provider may be able to:

   o  evaluate protection coverage after a topology change.

   o  adjust the topology change to cover the primary need (e.g.,
      latency optimization, bandwidth increase) as well as LFA
      protection.

   o  constantly monitor the LFA coverage in the live network and
      receive alerts.

   Documentation of LFA selection algorithms by implementers (default
   and tuning options) is important in order to make it possible for
   third-party modules to model these policy-based LFA selection
   algorithms.

5.  Need for LFA Activation Granularity

   As in all FRR mechanisms, an LFA installs backup paths in the
   Forwarding Information Base (FIB). Depending on the hardware used by
   a service provider, FIB resources may be critical. Activating LFAs
   by default on all available components (IGP topologies, interfaces,
   address families, etc.) may lead to a waste of FIB resources, as
   generally only a few destinations in a network should be protected
   (e.g., loopback addresses supporting MPLS services) compared to the
   number of destinations in the Routing Information Base (RIB).

   Moreover, a service provider may implement multiple different FRR
   mechanisms in its networks for different applications (e.g.,
   Maximally Redundant Trees (MRTs), TE FRR).
   In this scenario, an implementation MAY allow the computation of
   alternates for a specific destination even if the destination is
   already protected by another mechanism. This will provide redundancy
   and permit the operator to select the best option for FRR, using a
   policy language.

   Section 6 provides some implementation guidelines.

6.  Configuration Requirements

   Controlling the selection of the best alternate and the granularity
   of LFA activation is a requirement for service providers. This
   section defines configuration requirements for LFAs.

6.1.  LFA Enabling/Disabling Scope

   The granularity of LFA activation SHOULD be controlled (as alternate
   next hops consume memory in the forwarding plane).

   An implementation of an LFA SHOULD allow its activation, with the
   following granularities:

   o  Per routing context: Virtual Routing and Forwarding (VRF),
      virtual/logical router, global routing table, etc.

   o  Per interface.

   o  Per protocol instance, topology, area.

   o  Per prefix: Prefix protection SHOULD have a higher priority
      compared to interface protection. This means that if a specific
      prefix must be protected due to a configuration request, an LFA
      MUST be computed and installed for that prefix even if the primary
      outgoing interface is not configured for protection.

   An implementation of an LFA MAY allow its activation, with the
   following criteria:

   o  Per address family: IPv4 unicast, IPv6 unicast.

   o  Per MPLS control plane: For MPLS control planes that inherit
      routing decisions from the IGP routing protocol, the MPLS data
      plane may be protected by an LFA. The implementation may allow an
      operator to control this inheritance of protection from the IP
      prefix to the MPLS label bound to this prefix. The inheritance of
      protection will concern IP-to-MPLS, MPLS-to-MPLS, and MPLS-to-IP
      entries. As an example, LDP and Segment Routing extensions
      [SEG-RTG-ARCH] for IS-IS and OSPF are control-plane eligible for
      this inheritance of protection.

6.2.  Policy-Based LFA Selection

   When multiple alternates exist, the LFA selection algorithm is based
   on tiebreakers. Current tiebreakers do not provide sufficient
   control regarding how the best alternate is chosen. This document
   proposes an enhanced tiebreaker allowing service providers to manage
   all specific cases:

   1.  An LFA implementation SHOULD support policy-based decisions for
       determining the best LFA.

   2.  Policy-based decisions SHOULD be based on multiple criteria, with
       each criterion having a level of preference.

   3.  If the defined policy does not allow the determination of a
       unique best LFA, an implementation SHOULD pick only one based on
       its own decision. For load-balancing purposes, an implementation
       SHOULD also support the election of multiple LFAs.

   4.  The policy SHOULD be applicable to a protected interface or a
       specific set of destinations. In the case of applicability to
       the protected interface, all destinations primarily routed on
       that interface SHOULD use the policy for that interface.

   5.  The choice of whether or not to dynamically re-evaluate policy
       (in the event of a policy change) is left to the implementation.
       If a dynamic approach is chosen, the implementation SHOULD
       recompute the best LFAs and reinstall them in the FIB without
       service disruption. If a non-dynamic approach is chosen, the
       policy would be taken into account upon the next IGP event. In
       this case, the implementation SHOULD support a command to
       manually force the recomputation/reinstallation of LFAs.

6.2.1.  Connected versus Remote Alternates

   In addition to connected LFAs, tunnels (e.g., IP, LDP, RSVP-TE,
   Segment Routing) to distant routers may be used to complement LFA
   coverage (tunnel tail used as virtual neighbor). When a router has
   multiple alternate candidates for a specific destination, it may have
   connected alternates and remote alternates (reachable via a tunnel).
   Connected alternates may not always provide an optimal routing path,
   and it may be preferable to select a remote alternate over a
   connected alternate. Some uses of tunnels to extend LFA [RFC5286]
   coverage are described in [RFC7490] and [TI-LFA]. [RFC7490] and
   [TI-LFA] present some use cases for LDP tunnels and Segment Routing
   tunnels, respectively. This document considers any type of tunneling
   technique to reach remote alternates (IP, Generic Routing
   Encapsulation (GRE), LDP, RSVP-TE, the Layer 2 Tunneling Protocol
   (L2TP), Segment Routing, etc.) and does not restrict the remote
   alternates to the uses presented in these other documents.

   In Figure 1, there is no P router alternate for P8 to reach PE4 or
   PE5, so P8 is using PE2 as an alternate; this may generate congestion
   when FRR is activated. Instead, we could have a remote alternate for
   P8 to protect traffic to PE4 and PE5. For example, a tunnel from P8
   to P3 (following the shortest path) can be set up, and P8 would be
   able to use P3 as a remote alternate to protect traffic to PE4 and
   PE5. In this scenario, traffic will not use a PE link during FRR
   activation.

   When selecting the best alternate, the selection algorithm MUST
   consider all available alternates (connected or tunnel). For
   example, with remote LFAs, computation of PQ sets [RFC7490] SHOULD be
   performed before the selection of the best alternate.

6.2.2.  Mandatory Criteria

   An LFA implementation MUST support the following criteria:

   o  Non-candidate link: A link marked as "non-candidate" will never be
      used as an LFA.

   o  A primary next hop being protected by another primary next hop of
      the same prefix (ECMP case).

   o  Type of protection provided by the alternate: link protection or
      node protection. In the case of preference for node protection,
      an implementation SHOULD support fallback to link protection if
      node protection is not available.

   o  Shortest path: lowest IGP metric used to reach the destination.

   o  Shared Risk Link Groups (SRLGs) (as defined in Section 3 of
      [RFC5286]; see also Section 6.2.4.1 for more details).

6.2.3.  Additional Criteria

   An LFA implementation SHOULD support the following criteria:

   o  A downstream alternate: Preference for a downstream path over a
      non-downstream path SHOULD be configurable.

   o  Link coloring with "include", "exclude", and preference-based
      systems (see Section 6.2.4.2).

   o  Link bandwidth (see Section 6.2.4.3).

   o  Alternate preference / node coloring (see Section 6.2.4.4).

6.2.4.  Evaluation of Criteria

6.2.4.1.  SRLGs

   Section 3 of [RFC5286] proposes the reuse of GMPLS IGP extensions to
   encode SRLGs [RFC5307] [RFC4203]. Section 3 of [RFC5286] also
   describes the algorithm to compute SRLG protection.

   When SRLG protection is computed, an implementation SHOULD allow the
   following:

   o  Exclusion of alternates in violation of SRLGs.

   o  Maintenance of a preference system between alternates based on
      SRLG violations. How the preference system is implemented is out
      of scope for this document, but here are two examples:

      *  Preference based on the number of violations. In this case,
         more violations = less preferred.

      *  Preference based on violation cost. In this case, each SRLG
         violation has an associated cost. The lower violation costs
         are preferred.

   When applying SRLG criteria, the SRLG violation check SHOULD be
   performed on source-to-alternate paths as well as on
   alternate-to-destination paths, based on the SRLG set of the primary
   path. In the case of remote LFAs, PQ-to-destination path attributes
   would be retrieved from the Shortest Path Tree (SPT) rooted at the
   PQ.

6.2.4.2.  Link Coloring

   Link coloring is a powerful system to control the choice of
   alternates. Link colors are markers that will allow the encoding of
   properties of a particular link. Protecting interfaces are tagged
   with colors. Protected interfaces are configured to include some
   colors with a preference level and exclude others.
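   The include/exclude/preference evaluation described above can be
   sketched as follows. This is a minimal illustration only, not a
   specified algorithm: the class and function names, the color names,
   and the tie-breaking rules (a candidate carrying any excluded color
   is dropped, a candidate matching no included color is dropped, and
   the highest matching preference wins) are assumptions made for the
   example.

```python
from dataclasses import dataclass, field


@dataclass
class ColorPolicy:
    """Hypothetical policy attached to one protected interface."""
    include: dict = field(default_factory=dict)  # color -> preference level
    exclude: set = field(default_factory=set)    # colors that disqualify


def rank_by_color(policy, candidates):
    """Rank candidate alternates, best first.

    `candidates` maps an alternate name to the set of colors configured
    on its protecting interface. Assumed semantics (not from the RFC):
    any excluded color removes the candidate, a candidate must match at
    least one included color, and its score is the highest preference
    among the colors it matches.
    """
    scored = []
    for name, colors in candidates.items():
        if colors & policy.exclude:
            continue  # carries an excluded color
        matched = [policy.include[c] for c in colors if c in policy.include]
        if not matched:
            continue  # matches no included color
        scored.append((max(matched), name))
    # Higher preference first; name breaks ties deterministically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


# Illustrative data: color names are placeholders.
policy = ColorPolicy(include={"CORE_FAST": 200, "CORE_SLOW": 100},
                     exclude={"EDGE"})
candidates = {
    "n1": {"CORE_SLOW"},
    "n2": {"CORE_FAST"},
    "n3": {"EDGE"},
    "n4": {"CORE_FAST", "EDGE"},
}
print(rank_by_color(policy, candidates))
```

   Keeping the policy as data (a preference map plus an exclusion set)
   reflects the point made in this section that the same configuration
   can be reused on other interfaces and nodes once the color system is
   defined.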
Link color information SHOULD be signaled in the IGP, and administrative-group IGP extensions [RFC5305] [RFC3630] that are already standardized, implemented, and widely used SHOULD be used for encoding and signaling link colors. PE2 | +---- P4 | / PE1 ---- P1 --------- P2 | 10 Gbps 1 Gbps | | P3 Figure 5 In the example in Figure 5, the P1 router is connected to three P routers and two PEs. P1 is configured to protect the P1-P4 link. We assume that, given the topology, all neighbors are candidate LFAs. We would like to enforce a policy in the network where only a core router may protect against the failure of a core link and where high-capacity links are preferred. In this example, we can use the proposed link coloring by: o Marking the PE links with the color RED. o Marking the 10 Gbps core link with the color BLUE. o Marking the 1 Gbps core link with the color YELLOW. o Configuring the protected interface P1->P4 as follows: * Include BLUE, preference 200. * Include YELLOW, preference 100. * Exclude RED. Using this, PE links will never be used to protect against P1-P4 link failure, and the 10 Gbps link will be preferred. The main advantage of this solution is that it can easily be duplicated on other interfaces and other nodes without change. A service provider has only to define the color system (associate a color with a level of significance), as it is done already for TE affinities or BGP communities. An implementation of link coloring: o SHOULD support multiple "include" and "exclude" colors on a single protected interface. o SHOULD provide a level of preference between included colors. o SHOULD support the configuration of multiple colors on a single protecting interface. 6.2.4.3. Bandwidth As mentioned in previous sections, not taking into account the bandwidth of an alternate could lead to congestion during FRR activation. 
   We propose that the bandwidth criteria be based on the link speed
   information, for the following reasons:

   o  If a router S has a set of X destinations primarily forwarded to
      N, using per-prefix LFAs may lead to having a subset of X
      protected by a neighbor N1, another subset by N2, another subset
      by Nx, etc.

   o  S is not aware of traffic flows to each destination, so in the
      case of FRR activation, S is not able to evaluate how much traffic
      will be sent to N1, N2, Nx, etc.

   Based on this, it is not useful to gather available bandwidth on
   alternate paths, as the router does not know how much bandwidth it
   requires for protection. The proposed link speed approach provides a
   good approximation at low cost, as information is easily available.

   The bandwidth criteria of the policy framework SHOULD work in at
   least the following two ways:

   o  Prune: Exclude an LFA if the link speed to reach it is lower than
      the link speed of the primary next-hop interface.

   o  Prefer: Prefer an LFA based on the link speed used to reach it,
      compared to the link speed of the primary next-hop interface.

6.2.4.4.  Alternate Preference / Node Coloring

   Rather than tagging interfaces on each node (using link colors) to
   identify the types of alternate nodes (as an example), it would be
   helpful if routers could be identified in the IGP. This would allow
   grouped processing on multiple nodes. As an implementation needs to
   exclude some specific alternates (see Section 6.2.3), an
   implementation SHOULD be able to:

   o  give preference to a specific alternate.

   o  give preference to a group of alternates.

   o  exclude a specific alternate.

   o  exclude a group of alternates.

   A specific alternate may be identified by its interface, IP address,
   or router ID, and a group of alternates may be identified by a marker
   (tag) advertised in IGP. The IGP encoding and signaling for marking
   groups of alternates SHOULD be done according to [RFC7917] and
   [RFC7777].
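   As a rough sketch of the four capabilities listed above, the code
   below excludes alternates by router ID or by tag, then orders the
   remaining ones by preferred tag, using link speed (the "prefer"
   bandwidth mode of Section 6.2.4.3) as a secondary key. The function
   name, the dictionary layout, and all router IDs, tag values, and
   speeds are hypothetical; combining the two criteria in this order is
   an assumption made for the example.

```python
def select_alternates(candidates, excluded_ids=(), excluded_tags=(),
                      preferred_tags=()):
    """Filter and order candidate alternates (illustrative only).

    Each candidate is a dict with:
      'router_id'       -- identifier of the alternate
      'tags'            -- set of node tags advertised for that router
      'link_speed_gbps' -- speed of the link used to reach it
    """
    excluded_ids = set(excluded_ids)
    excluded_tags = set(excluded_tags)
    preferred_tags = set(preferred_tags)

    kept = [
        c for c in candidates
        if c["router_id"] not in excluded_ids      # exclude a specific alternate
        and not (c["tags"] & excluded_tags)        # exclude a group of alternates
    ]
    # Preferred group first (False sorts before True), then faster links,
    # then router ID for a deterministic order.
    return sorted(
        kept,
        key=lambda c: (not (c["tags"] & preferred_tags),
                       -c["link_speed_gbps"],
                       c["router_id"]),
    )


# Placeholder data, not taken from any figure in this document.
candidates = [
    {"router_id": "A", "tags": {50}, "link_speed_gbps": 1},
    {"router_id": "B", "tags": {50}, "link_speed_gbps": 10},
    {"router_id": "C", "tags": {200}, "link_speed_gbps": 10},
    {"router_id": "D", "tags": {100}, "link_speed_gbps": 10},
    {"router_id": "E", "tags": set(), "link_speed_gbps": 40},
]
ordered = select_alternates(candidates, excluded_ids={"D"},
                            excluded_tags={200}, preferred_tags={50})
print([c["router_id"] for c in ordered])
```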
Using a tag/marker is referred to as "node coloring", as compared to the link coloring option presented in Section 6.2.4.2. Consider the following network: PE3 | | PE2 | +---- P4 | / PE1 ---- P1 -------- P2 | 10 Gbps 1 Gbps | | P3 Figure 6 In the example above, each node is configured with a specific tag flooded through the IGP. o PE1,PE3: 200 (non-candidate). o PE2: 100 (edge/core). o P1,P2,P3: 50 (core). A simple policy could be configured on P1 to choose the best alternate for P1->P4 based on the function or role of the router, as follows: o criterion 1 -> alternate preference: exclude tags 100 and 200. o criterion 2 -> bandwidth. 6.2.5. Retrieving Alternate Path Attributes 6.2.5.1. Alternate Path The alternate path is composed of two distinct parts: PLR to alternate and alternate to destination. N1 -- R1 ---- R2 /50 \ \ / R3 --- R4 / \ S -------- E ------- D \\ // \\ // N2 ---- PQ ---- R5 Figure 7 In Figure 7, we consider a primary path from S to D, with S using E as the primary next hop. All metrics are 1, except that {S,N1} = 50. Two alternate paths are available: o {S,N1,R1,R2|R3,R4,D}, where N1 is a connected alternate. This consists of two sub-paths: * {S,N1}: path from the PLR to the alternate. * {N1,R1,R2|R3,R4,D}: path from the alternate to the destination. o {S,N2,PQ,R5,D}, where the PQ is a remote alternate. Again, the path consists of two sub-paths: * {S,N2,PQ}: path from the PLR to the alternate. * {PQ,R5,D}: path from the alternate to the destination. As displayed in Figure 7, some parts of the alternate path may fan out to multiple paths due to ECMP. 6.2.5.2. Alternate Path Attributes Some criteria listed in the previous sections require the retrieval of some characteristics of the alternate path (SRLG, bandwidth, color, tag, etc.). We call these characteristics "path attributes". A path attribute can record a list of node properties (e.g., node tag) or link properties (e.g., link color). 
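   To make the path notation of Section 6.2.5.1 concrete, the sketch
   below expands an ECMP fan-out such as {S,N1,R1,R2|R3,R4,D} into its
   individual paths, lists the links on which link properties would be
   recorded, and splits a path at the alternate into its two sub-paths.
   The function names and the tuple-based encoding of a fan-out are
   assumptions made for the example.

```python
from itertools import product


def expand_fanout(hops):
    """Expand a path description into concrete node lists.

    `hops` is a list whose elements are either a node name or a tuple of
    node names that are ECMP alternatives at that position.
    """
    choices = [h if isinstance(h, tuple) else (h,) for h in hops]
    return [list(path) for path in product(*choices)]


def links_of(path):
    """Return the ordered list of links (node pairs) along a path."""
    return list(zip(path, path[1:]))


def split_at_alternate(path, alternate):
    """Split a path into PLR-to-alternate and alternate-to-destination."""
    i = path.index(alternate)
    return path[: i + 1], path[i:]


# First alternate path of Figure 7: {S,N1,R1,R2|R3,R4,D}.
for path in expand_fanout(["S", "N1", "R1", ("R2", "R3"), "R4", "D"]):
    print(path, links_of(path))

# Second alternate path of Figure 7, split at the remote alternate PQ.
print(split_at_alternate(["S", "N2", "PQ", "R5", "D"], "PQ"))
```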
This document defines two types of path attributes: o Cumulative attribute: When a path attribute is cumulative, the implementation SHOULD record the value of the attribute on each element (link and node) along the alternate path. SRLG, link color, and node color are cumulative attributes. o Unitary attribute: When a path attribute is unitary, the implementation SHOULD record the value of the attribute only on the first element along the alternate path (first node, or first link). Bandwidth is a unitary attribute. N1 -- R1 ---- R2 / \ / 50 R4 / \ S -------- E ------- D Figure 8 In Figure 8, N1 is a connected alternate to reach D from S. We consider that all links have a RED color except {R1,R2}, which is BLUE. We consider all links to be 10 Gbps except {N1,R1}, which is 2.5 Gbps. The bandwidth attribute collected for the alternate path will be 10 Gbps. As the attribute is unitary, only the link speed of the first link {S,N1} is recorded. The link color attribute collected for the alternate path will be {RED,RED,BLUE,RED,RED}. As the attribute is cumulative, the value of the attribute on each link along the path is recorded. 6.2.5.3. Connected Alternate For an alternate path using a connected alternate: o Attributes from the PLR to the alternate are retrieved from the interface connected to the alternate. If the alternate is connected through multiple interfaces, the evaluation of attributes SHOULD be done once per interface (each interface is considered as a separate alternate) and once per ECMP group of interfaces (Layer 3 bundle). o Path attributes from the alternate to the destination are retrieved from the SPT rooted at the alternate. As the alternate is a connected alternate, the SPT has already been computed to find the alternate, so there is no need for additional computation. N1 -- R1 ---- R2 50//50 \ // \ i1//i2 \ S -------- E -------- D Figure 9 In Figure 9, we consider a primary path from S to D, with S using E as the primary next hop. 
   All metrics are considered as 1 except {S,N1} links, which are using
   a metric of 50. We consider the following SRLGs on links:

   o  {S,N1} using i1: SRLG1,SRLG10.

   o  {S,N1} using i2: SRLG2,SRLG20.

   o  {N1,R1}: SRLG3.

   o  {R1,R2}: SRLG4.

   o  {R2,D}: SRLG5.

   o  {S,E}: SRLG10.

   o  {E,D}: SRLG6.

   S is connected to the alternate using two interfaces: i1 and i2. If
   i1 and i2 are not part of an ECMP group, the evaluation of attributes
   is done once per interface, and each interface is considered as a
   separate alternate path. Two alternate paths will be available with
   the associated SRLG attributes:

   o  Alternate path #1: {S,N1 using i1,R1,R2,D}:
      SRLG1,SRLG10,SRLG3,SRLG4,SRLG5.

   o  Alternate path #2: {S,N1 using i2,R1,R2,D}:
      SRLG2,SRLG20,SRLG3,SRLG4,SRLG5.

   Alternate path #1 is sharing risks with the primary path (SRLG10 on
   {S,E}) and may be pruned, or its preference may be revoked, per
   user-defined policy.

   If i1 and i2 are part of an ECMP group, the evaluation of attributes
   is done once per ECMP group, and the implementation considers a
   single alternate path {S,N1 using i1|i2,R1,R2,D} with the following
   SRLG attributes: SRLG1,SRLG10,SRLG2,SRLG20,SRLG3,SRLG4,SRLG5. The
   alternate path is sharing risks with the primary path and may be
   pruned, or its preference may be revoked, per user-defined policy.

6.2.5.4.  Remote Alternate

   For an alternate path using a remote alternate (tunnel):

   o  Attributes on the path from the PLR to the alternate are retrieved
      using the PLR's primary SPT (when using a PQ node from the
      P-space) or the immediate neighbor's SPT (when using a PQ from the
      extended P-space). These are then combined with the attributes of
      the link(s) to reach the immediate neighbor. In both cases, no
      additional SPT is required.

   o  Attributes from the remote alternate to the destination path may
      be retrieved from the SPT rooted at the remote alternate. An
      additional forward SPT is required for each remote alternate (PQ
      node), as indicated in Section 2.3.2 of [REMOTE-LFA-NODE].
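   The SRLG evaluation for the Figure 9 example in Section 6.2.5.3 can
   be checked with a few lines of code. The sketch below uses the SRLG
   values given in that example; the variable and function names are
   hypothetical, and treating "sharing risks" as a non-empty set
   intersection with the primary path's SRLGs is the interpretation
   assumed here.

```python
# SRLGs of the primary path {S,E},{E,D}, from the Figure 9 example.
PRIMARY_SRLGS = {"SRLG10", "SRLG6"}

# SRLGs of the {S,N1} link, per interface.
FIRST_HOP_SRLGS = {
    "i1": {"SRLG1", "SRLG10"},
    "i2": {"SRLG2", "SRLG20"},
}

# SRLGs of {N1,R1}, {R1,R2}, and {R2,D}: the alternate-to-destination part.
ALT_TO_DEST_SRLGS = {"SRLG3", "SRLG4", "SRLG5"}


def path_srlgs(interfaces):
    """Cumulative SRLG attribute of the alternate path.

    With one interface this is a per-interface alternate path; with
    several, it models one evaluation per ECMP group (union of SRLGs).
    """
    srlgs = set(ALT_TO_DEST_SRLGS)
    for name in interfaces:
        srlgs |= FIRST_HOP_SRLGS[name]
    return srlgs


def shared_risks(interfaces):
    """SRLGs that the alternate path has in common with the primary path."""
    return path_srlgs(interfaces) & PRIMARY_SRLGS


print(sorted(shared_risks(["i1"])))        # alternate path #1
print(sorted(shared_risks(["i2"])))        # alternate path #2
print(sorted(shared_risks(["i1", "i2"])))  # single path for the ECMP group
```

   A policy could then prune any alternate for which the result is
   non-empty, or rank alternates by the size of that set, in line with
   the preference examples of Section 6.2.4.1.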
In some remote-alternate scenarios, like [TI-LFA], alternate-to- destination path attributes may be obtained using a different technique. The number of remote alternates may be very high. In the case of remote LFAs, simulations of real-world network topologies have shown that as many as hundreds of PQs are possible. The computational overhead of collecting all path attributes of all such PQs to destination paths could grow beyond reasonable levels. To handle this situation, implementations need to limit the number of remote alternates to be evaluated to a finite number before collecting alternate path attributes and running the policy evaluation. Section 2.3.3 of [REMOTE-LFA-NODE] provides a way to reduce the number of PQs to be evaluated. Some other remote alternate techniques using static or dynamic tunnels may not require this pruning. Link Remote Remote alternate alternate alternate ------------- ------------------ ------------- Alternates | LFA | | rLFA (PQs) | | Static/ | | | | | | Dynamic | sources | | | | | tunnels | ------------- ------------------ ------------- | | | | | | | -------------------------- | | | Prune some alternates | | | | (sorting strategy) | | | -------------------------- | | | | | | | ------------------------------------------------ | Collect alternate attributes | ------------------------------------------------ | | ------------------------- | Evaluate policy | ------------------------- | | Best alternates Figure 10 6.2.5.5. Collecting Attributes in the Case of Multiple Paths As described in Section 6.2.5, there may be some situations where an alternate path or part of an alternate path fans out to multiple paths (e.g., ECMP). When collecting path attributes in such a case, an implementation SHOULD consider the union of attributes of each sub-path. In Figure 7 (in Section 6.2.5.1), S has two alternate paths to reach D. Each alternate path fans out to multiple paths due to ECMP. 
Consider the following link color attributes: all links are RED except {R1,R3}, which is BLUE. The user wants to use an alternate path with only RED links. The first alternate path {S,N1,R1,R2|R3,R4,D} does not fit the constraint, as {R1,R3} is BLUE. The second alternate path {S,N2,PQ,R5,D} fits the constraint and will be preferred, as it uses only RED links.

6.2.6. ECMP LFAs

                              10
                         PE2 - PE3
                          |     |
                       50 |  5  | 50
                          P1----P2
                          \\    //
                       50  \\  // 50
                            PE1

      Links between P1 and PE1 are L1 and L2.
      Links between P2 and PE1 are L3 and L4.

                             Figure 11

In Figure 11, the primary path from PE1 to PE2 is through P1, using ECMP on two parallel links: L1 and L2. In the case of standard ECMP behavior, if L1 fails, the post-convergence next hop would become L2 and ECMP would no longer be in use. If an LFA is activated, as stated in Section 3.4 of [RFC5286], "alternate next-hops may themselves also be primary next-hops, but need not be" and "alternate next-hops should maximize the coverage of the failure cases." In this scenario, there is no alternate providing node protection, so PE1 will prefer L2 as the alternate to protect L1; this makes sense compared to post-convergence behavior.

Consider a different scenario, again referring to Figure 11, where L1 and L2 are configured as a Layer 3 bundle using a local feature and L3/L4 comprise a second Layer 3 bundle. Layer 3 bundles are configured so that, if a link in the bundle fails, the traffic must be rerouted out of the bundle. Layer 3 bundles are generally introduced to increase bandwidth between nodes. In a nominal situation, ECMP is still available from PE1 to PE2, but if L1 fails, the post-convergence next hop would become the ECMP on L3 and L4. In this case, LFA behavior SHOULD be adapted in order to reflect the bandwidth requirement.
We would expect the following FIB entry on PE1:

   On PE1: PE2 +--> ECMP -> L1
            |         |
            |         +----> L2
            |
            +--> LFA (ECMP) -> L3
                       |
                       +----------> L4

                             Figure 12

If L1 or L2 fails, traffic must be switched to the LFA ECMP bundle rather than using the other primary next hop.

As mentioned in Section 3.4 of [RFC5286], protecting a link within an ECMP by another primary next hop is not a MUST. Moreover, as already discussed in this document, maximizing coverage against the failure cases may not be the right approach, and a policy-based choice of an alternate may be preferred.

An implementation SHOULD allow setting a preference to protect a primary next hop with another primary next hop. An implementation SHOULD also allow setting a preference to protect a primary next hop with a NON-primary next hop. An implementation SHOULD allow the use of an ECMP bundle as an LFA.

7. Operational Aspects

7.1. No-Transit Condition on LFA Computing Node

In Section 3.5 of [RFC5286], the setting of the no-transit condition (through the IS-IS overload bit or the OSPF R-bit) in an LFA computation is only taken into account for the case where a neighbor has the no-transit condition set.

In addition to Inequality 1 (Loop-Free Criterion) of [RFC5286]:

   Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)

the IS-IS overload bit or the OSPF R-bit of the LFA computing node (S) SHOULD be taken into account. Indeed, if S has the IS-IS overload bit set or the OSPF R-bit clear, no neighbor will loop traffic back to S.

An OSPF router acting as a stub router [RFC6987] SHOULD behave as if the R-bit were clear regarding the LFA computation.

7.2. Manual Triggering of FRR

Service providers often perform manual link shutdown (using a router's command-line interface (CLI)) to perform network changes/tests. A manual link shutdown may be done at multiple levels: physical interface, logical interface, IGP interface, Bidirectional Forwarding Detection (BFD) session, etc.
In particular, testing or troubleshooting FRR requires that manual shutdown be performed on the remote end of the link, as a local shutdown would not generally trigger FRR.

To permit such testing, an implementation SHOULD support triggering/activating LFA FRR for a given link when a manual shutdown is done on a component that currently supports FRR activation.

An implementation MAY also support FRR activation for a specific interface or a specific prefix on a primary next-hop interface and revert without any action on any running component of the node (links or protocols). In this use case, the FRR activation time needs to be controlled by a timer in case the operator forgets to revert the traffic to the primary path. When the timer expires, the traffic is automatically reverted to the primary path. This will simplify the testing of the FRR path; traffic can then be reverted to the primary path without causing a global network convergence.

For example:

o  If an implementation supports FRR activation upon a BFD session-down event, that implementation SHOULD support FRR activation when a manual shutdown is done on the BFD session. But if an implementation does not support FRR activation upon a BFD session-down event, there is no need for that implementation to support FRR activation upon manual shutdown of a BFD session.

o  If an implementation supports FRR activation upon a physical link-down event (e.g., Rx laser "off" detection, error threshold raised), that implementation SHOULD support FRR activation when a manual shutdown of a physical interface is done. But if an implementation does not support FRR activation upon a physical link-down event, there is no need for that implementation to support FRR activation upon manual shutdown of a physical link.

o  A CLI command may allow switching from the primary path to the FRR path to test the FRR path for a specific interface or prefix.
   There is no impact on the control plane; only the data plane of the local node may be changed. A similar command may allow switching traffic back from the FRR path to the primary path.

7.3. Required Local Information

The introduction of LFAs in a network requires some enhancements to standard routing information provided by implementations. Moreover, because LFA coverage is not always 100%, coverage information is also required.

Hence, an implementation:

o  MUST be able to display, for every prefix, the primary next hop as well as the alternate next-hop information.

o  MUST provide coverage information per LFA activation domain (area, level, topology, instance, virtual router, address family, etc.).

o  MUST provide the number of protected prefixes as well as non-protected prefixes globally.

o  SHOULD provide the number of protected prefixes as well as non-protected prefixes per link.

o  MAY provide the number of protected prefixes as well as non-protected prefixes per priority if the implementation supports prefix-priority insertion in the RIB/FIB.

o  SHOULD provide a reason for choosing an alternate (policy and criteria) and for excluding an alternate.

o  SHOULD provide the list of non-protected prefixes and the reason why they are not protected (e.g., no protection required, no alternate available).

7.4. Coverage Monitoring

It is fairly easy to evaluate the coverage of a network in a nominal situation, but topology changes may change the level of coverage. In some situations, the network may no longer be able to provide the required level of protection. Hence, it becomes very important for service providers to receive alerts regarding changes in coverage.

An implementation SHOULD:

o  provide an alert system if total coverage (for a node) is below a defined threshold or when coverage returns to normal.

o  provide an alert system if coverage for a specific link is below a defined threshold or when coverage returns to normal.
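
The threshold behavior in the two bullets above can be sketched as follows. This is only an illustration under stated assumptions: the class name `CoverageMonitor`, the message strings, and the choice to express coverage as the fraction of protected prefixes are hypothetical, and the delivery of the alert (e.g., syslog or SNMP trap) is left out. One monitor instance could be kept per node and one per link.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoverageMonitor:
    """Tracks coverage against an operator-defined threshold (hypothetical)."""

    threshold: float          # minimum acceptable fraction of protected prefixes
    degraded: bool = False    # True while coverage is below the threshold

    def update(self, protected: int, unprotected: int) -> Optional[str]:
        """Return an alert string on a state change, otherwise None."""
        total = protected + unprotected
        # With no prefixes at all, nothing is left unprotected.
        coverage = 1.0 if total == 0 else protected / total

        if coverage < self.threshold and not self.degraded:
            # Coverage just dropped below the threshold: raise an alert once.
            self.degraded = True
            return f"ALERT: coverage {coverage:.1%} below threshold {self.threshold:.1%}"
        if coverage >= self.threshold and self.degraded:
            # Coverage is back to normal: clear the alert once.
            self.degraded = False
            return f"CLEAR: coverage {coverage:.1%} back to normal"
        return None


# Example: a topology change lowers coverage, then it recovers.
node = CoverageMonitor(threshold=0.9)
events = [
    node.update(95, 5),    # nominal situation, no alert
    node.update(80, 20),   # coverage drops, alert raised
    node.update(80, 20),   # still degraded, no repeated alert
    node.update(95, 5),    # coverage restored, alert cleared
]
```

Reporting only on state changes, rather than on every recomputation, keeps the operator from being flooded with repeated alerts while coverage stays below the threshold.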
An implementation MAY:

o  trigger an alert if a specific destination is not protected anymore or when protection comes back up for this destination.

Although the procedures for providing alerts are beyond the scope of this document, we recommend that implementations consider standard and well-used mechanisms like syslog or SNMP traps.

7.5. LFAs and Network Planning

The operator may choose to run simulations in order to ensure a certain type of full coverage for the whole network or a given subset of the network. This is particularly likely if the operator runs the network in the sense of the third backbone profile described in Section 4 of [RFC6571]; that is, the operator seeks to design and engineer the network topology in such a way that a certain level of coverage is always achieved. Obviously, a complete and exact simulation of the IP FRR coverage can only be achieved if the behavior is deterministic and the algorithm used is available to the simulation tool.

Thus, an implementation SHOULD:

o  Behave deterministically in its LFA selection process. That is, in the same topology and with the same policy configuration, the implementation MUST always choose the same alternate for a given prefix.

o  Document its behavior. The implementation SHOULD provide enough documentation regarding its behavior to allow an implementer of a simulation tool to foresee the exact choice of the LFA implementation for every prefix in a given topology. This SHOULD take into account all possible policy configuration options. One possible way to document this behavior is to disclose the algorithm used to choose alternates.

8. Security Considerations

The policy mechanism introduced in this document allows the tuning of the selection of the alternate. This is not seen as a security threat, because:

o  all candidates are already eligible as per [RFC5286] and considered usable.

o  the policy is based on information from the router's own configuration and from the IGP, both of which are considered trusted.
Hence, this document does not introduce any new security considerations as compared to [RFC5286].

As noted above, the policy mechanism introduced in this document allows the tuning of the selection of the best alternate but does not change the list of alternates that are eligible. As described in Section 7 of [RFC5286], this best alternate "can be used anyway when a different topological change occurs, and hence this can't be viewed as a new security threat."

9. References

9.1. Normative References

[ISO10589] International Organization for Standardization, "Intermediate System to Intermediate System intra-domain routeing information exchange protocol for use in conjunction with the protocol for providing the connectionless-mode network service (ISO 8473)", ISO Standard 10589, 2002.

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <http://www.rfc-editor.org/info/rfc2119>.

[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, DOI 10.17487/RFC3630, September 2003, <http://www.rfc-editor.org/info/rfc3630>.

[RFC4203] Kompella, K., Ed. and Y. Rekhter, Ed., "OSPF Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203, DOI 10.17487/RFC4203, October 2005, <http://www.rfc-editor.org/info/rfc4203>.

[RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for IP Fast Reroute: Loop-Free Alternates", RFC 5286, DOI 10.17487/RFC5286, September 2008, <http://www.rfc-editor.org/info/rfc5286>.

[RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic Engineering", RFC 5305, DOI 10.17487/RFC5305, October 2008, <http://www.rfc-editor.org/info/rfc5305>.

[RFC5307] Kompella, K., Ed. and Y. Rekhter, Ed., "IS-IS Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 5307, DOI 10.17487/RFC5307, October 2008, <http://www.rfc-editor.org/info/rfc5307>.
[RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008, <http://www.rfc-editor.org/info/rfc5340>.

[RFC6571] Filsfils, C., Ed., Francois, P., Ed., Shand, M., Decraene, B., Uttaro, J., Leymann, N., and M. Horneffer, "Loop-Free Alternate (LFA) Applicability in Service Provider (SP) Networks", RFC 6571, DOI 10.17487/RFC6571, June 2012, <http://www.rfc-editor.org/info/rfc6571>.

[RFC6987] Retana, A., Nguyen, L., Zinin, A., White, R., and D. McPherson, "OSPF Stub Router Advertisement", RFC 6987, DOI 10.17487/RFC6987, September 2013, <http://www.rfc-editor.org/info/rfc6987>.

[RFC7490] Bryant, S., Filsfils, C., Previdi, S., Shand, M., and N. So, "Remote Loop-Free Alternate (LFA) Fast Reroute (FRR)", RFC 7490, DOI 10.17487/RFC7490, April 2015, <http://www.rfc-editor.org/info/rfc7490>.

[RFC7777] Hegde, S., Shakir, R., Smirnov, A., Li, Z., and B. Decraene, "Advertising Node Administrative Tags in OSPF", RFC 7777, DOI 10.17487/RFC7777, March 2016, <http://www.rfc-editor.org/info/rfc7777>.

[RFC7917] Sarkar, P., Ed., Gredler, H., Hegde, S., Litkowski, S., and B. Decraene, "Advertising Node Administrative Tags in IS-IS", RFC 7917, DOI 10.17487/RFC7917, June 2016, <http://www.rfc-editor.org/info/rfc7917>.

9.2. Informative References

[REMOTE-LFA-NODE] Sarkar, P., Ed., Hegde, S., Bowers, C., Gredler, H., and S. Litkowski, "Remote-LFA Node Protection and Manageability", Work in Progress, draft-ietf-rtgwg-rlfa-node-protection-05, December 2015.

[SEG-RTG-ARCH] Filsfils, C., Ed., Previdi, S., Ed., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", Work in Progress, draft-ietf-spring-segment-routing-08, May 2016.

[TI-LFA] Francois, P., Filsfils, C., Bashandy, A., Decraene, B., and S. Litkowski, "Topology Independent Fast Reroute using Segment Routing", Work in Progress, draft-francois-segment-routing-ti-lfa-00, November 2013.
Contributors

Significant contributions were made by Pierre Francois, Hannes Gredler, Chris Bowers, Jeff Tantsura, Uma Chunduri, Acee Lindem, and Mustapha Aissaoui, whom the authors would like to acknowledge.

Authors' Addresses

Stephane Litkowski (editor)
Orange
Email: stephane.litkowski@orange.com

Bruno Decraene
Orange
Email: bruno.decraene@orange.com

Clarence Filsfils
Cisco Systems
Email: cfilsfil@cisco.com

Kamran Raza
Cisco Systems
Email: skraza@cisco.com

Martin Horneffer
Deutsche Telekom
Email: Martin.Horneffer@telekom.de

Pushpasis Sarkar
Individual Contributor
Email: pushpasis.ietf@gmail.com