<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" docName="draft-ietf-spring-segment-routing-central-epe-10" number="9087" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" category="info" consensus="true" xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="true" version="3">
<front>
<title abbrev="Segment Routing Centralized EPE">Segment Routing Centralized BGP Egress Peer Engineering</title>
<seriesInfo name="RFC" value="9087"/>
<author fullname="Clarence Filsfils" initials="C." role="editor" surname="Filsfils">
<organization>Cisco Systems, Inc.</organization>
<address> <postal> <street/> <city>Brussels</city> <region/> <code/> <country>Belgium</country> </postal> <email>cfilsfil@cisco.com</email> </address>
</author>
<author fullname="Stefano Previdi" initials="S." surname="Previdi">
<organization>Cisco Systems, Inc.</organization>
<address> <postal> <street/> <city/> <code/> <country>Italy</country> </postal> <email>stefano@previdi.net</email> </address>
</author>
<author fullname="Gaurav Dawra" initials="G." role="editor" surname="Dawra">
<organization>Cisco Systems, Inc.</organization>
<address> <postal> <street/> <city/> <code/> <country>United States of America</country> </postal> <email>gdawra.ietf@gmail.com</email> </address>
</author>
<author fullname="Ebben Aries" initials="E."
surname="Aries">
<organization>Juniper Networks</organization>
<address> <postal> <street>1133 Innovation Way</street> <city>Sunnyvale</city> <region>CA</region> <code>94089</code> <country>United States of America</country> </postal> <email>exa@juniper.net</email> </address>
</author>
<author fullname="Dmitry Afanasiev" initials="D." surname="Afanasiev">
<organization>Yandex</organization>
<address> <postal> <street/> <city/> <code/> <country>Russian Federation</country> </postal> <email>fl0w@yandex-team.ru</email> </address>
</author>
<date year="2021" month="August"/>
<area>RTG</area>
<workgroup>SPRING</workgroup>
<abstract>
<t>Segment Routing (SR) leverages source routing. A node steers a packet through a controlled set of instructions, called segments, by prepending the packet with an SR header. A segment can represent any instruction, topological or service based. SR allows for the enforcement of a flow through any topological path while maintaining per-flow state only at the ingress node of the SR domain.</t>
<t>The Segment Routing architecture can be directly applied to the MPLS data plane with no change on the forwarding plane. It requires a minor extension to the existing link-state routing protocols.</t>
<t>This document illustrates the application of Segment Routing to solve the BGP Egress Peer Engineering (BGP-EPE) requirement.
The SR-based BGP-EPE solution allows a centralized (Software-Defined Networking, or SDN) controller to program any egress peer policy at ingress border routers or at hosts within the domain.</t>
</abstract>
</front>
<middle>
<section anchor="INTRO" numbered="true" toc="default">
<name>Introduction</name>
<t>The document is structured as follows:</t>
<ul>
<li><xref target="INTRO" format="default"/> states the BGP-EPE problem statement and provides the key references.</li>
<li><xref target="BGPSEGMENTS" format="default"/> defines the different BGP Peering Segments and the semantics associated with them.</li>
<li><xref target="TOPOBGPLS" format="default"/> describes the automated allocation of BGP Peering Segment-IDs (SIDs) by the BGP-EPE-enabled egress border router and the automated signaling of the external peering topology and the related BGP Peering SIDs to the collector <xref target="RFC9086" format="default"/>.</li>
<li><xref target="BGPPECTRL" format="default"/> overviews the components of a centralized BGP-EPE controller. The definition of the BGP-EPE controller is outside the scope of this document.</li>
<li><xref target="PROGRINPUTPOL" format="default"/> overviews the methods that could be used by the centralized BGP-EPE controller to implement a BGP-EPE policy at an ingress border router or at a source host within the domain.
The exhaustive definition of all the means to program a BGP-EPE input policy is outside the scope of this document.</li>
</ul>
<t>For editorial reasons, the solution is described with IPv6 addresses and MPLS SIDs. This solution is equally applicable to IPv4 with MPLS SIDs and also to IPv6 with native IPv6 SIDs.</t>
<section anchor="PROBSTATE" numbered="true" toc="default">
<name>Problem Statement</name>
<t>The BGP-EPE problem statement is defined in <xref target="RFC7855" format="default"/>.</t>
<t>A centralized controller should be able to instruct an ingress Provider Edge (PE) router or a content source within the domain to use a specific egress PE and a specific external interface/neighbor to reach a particular destination.</t>
<t>Let's call this solution "BGP-EPE" for "BGP Egress Peer Engineering". The centralized controller is called the "BGP-EPE controller". The egress border router where the BGP-EPE traffic steering functionality is implemented is called a BGP-EPE-enabled border router. The input policy programmed at an ingress border router or at a source host is called a BGP-EPE policy.</t>
<t>The requirements that have motivated the solution described in this document are listed below:</t>
<ul spacing="normal">
<li>The solution <bcp14>MUST</bcp14> apply to the Internet use case where the Internet routes are assumed to use IPv4 unlabeled or IPv6 unlabeled. It is not required to place the Internet routes in a VPN Routing and Forwarding (VRF) instance and allocate labels on a per-route or per-path basis.
</li>
<li>The solution <bcp14>MUST</bcp14> support any deployed Internal BGP (iBGP) schemes (Route Reflectors (RRs), confederations, or iBGP full meshes).</li>
<li>The solution <bcp14>MUST</bcp14> be applicable to both routers with external and internal peers.</li>
<li>The solution should minimize the need for new BGP capabilities at the ingress PEs.</li>
<li>The solution <bcp14>MUST</bcp14> accommodate an ingress BGP-EPE policy at an ingress PE or directly at a source within the domain.</li>
<li>The solution <bcp14>MAY</bcp14> support automated Fast Reroute (FRR) and fast convergence mechanisms.</li>
</ul>
<t>The following reference diagram is used throughout this document.</t>
<figure anchor="REFDIAGRAMFIG">
<name>Reference Diagram</name>
<artwork name="" type="" align="left" alt=""><![CDATA[+---------+      +------+
|         |      |      |
|    H    B------D      G
|         | +---/| AS 2 |\  +------+
|         |/     +------+ \ |      |---L/8
A   AS1   C---+            \|      |
|         |\\  \  +------+ /| AS 4 |---M/8
|         | \\  +-E      |/ +------+
|    X    |  \\   |      K
|         |   +===F AS 3 |
+---------+       +------+]]></artwork>
</figure>
<t>IP addressing:</t>
<ul spacing="normal">
<li>C's interface to D: 2001:db8:cd::c/64, D's interface: 2001:db8:cd::d/64</li>
<li>C's interface to E: 2001:db8:ce::c/64, E's interface: 2001:db8:ce::e/64</li>
<li>C's upper interface to F: 2001:db8:cf1::c/64, F's interface: 2001:db8:cf1::f/64</li>
<li>C's lower interface to F: 2001:db8:cf2::c/64, F's interface: 2001:db8:cf2::f/64</li>
<li>BGP router-ID of C: 192.0.2.3</li>
<li>BGP router-ID of D: 192.0.2.4</li>
<li>BGP router-ID of E: 192.0.2.5</li>
<li>BGP router-ID of
F: 192.0.2.6</li>
<li>Loopback of F used for External BGP (eBGP) multi-hop peering to C: 2001:db8:f::f/128</li>
<li>C's loopback is 2001:db8:c::c/128 with SID 64</li>
</ul>
<t>C's BGP peering:</t>
<ul spacing="normal">
<li>Single-hop eBGP peering with neighbor 2001:db8:cd::d (D)</li>
<li>Single-hop eBGP peering with neighbor 2001:db8:ce::e (E)</li>
<li>Multi-hop eBGP peering with F on IP address 2001:db8:f::f (F)</li>
</ul>
<t>C's resolution of the multi-hop eBGP session to F:</t>
<ul spacing="normal">
<li>Static route to 2001:db8:f::f/128 via 2001:db8:cf1::f</li>
<li>Static route to 2001:db8:f::f/128 via 2001:db8:cf2::f</li>
</ul>
<t>C is configured with a local policy that defines a BGP PeerSet as the set of peers (2001:db8:ce::e for E and 2001:db8:f::f for F).</t>
<t>X is the BGP-EPE controller within the AS1 domain.</t>
<t>H is a content source within the AS1 domain.</t>
</section>
<section>
<name>Requirements Language</name>
<t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.
</t>
</section>
</section>
<section anchor="BGPSEGMENTS" numbered="true" toc="default">
<name>BGP Peering Segments</name>
<t>As defined in <xref target="RFC8402" format="default"/>, certain segments are defined by a BGP-EPE-capable node and correspond to their attached peers. These segments are called BGP Peering Segments or BGP Peering SIDs. They enable the expression of source-routed inter-domain paths.</t>
<t>An ingress border router of an AS may compose a list of segments to steer a flow along a selected path within the AS, towards a selected egress border router C of the AS and through a specific peer. At minimum, a BGP Egress Peer Engineering policy applied at an ingress EPE involves two segments: the Node SID of the chosen egress EPE and then the BGP Peering Segment for the chosen egress EPE peer or peering interface.</t>
<t><xref target="RFC8402" format="default"/> defines three types of BGP Peering Segments/SIDs: PeerNode SID, PeerAdj SID, and PeerSet SID.</t>
<ul empty="true">
<li>
<dl newline="false">
<dt>Peer Node Segment:</dt>
<dd>A segment describing a peer, including the SID (PeerNode SID) allocated to it</dd>
<dt>Peer Adjacency Segment:</dt>
<dd>A segment describing a link, including the SID (PeerAdj SID) allocated to it</dd>
<dt>Peer Set Segment:</dt>
<dd>A segment describing a link or a node that is part of the set, including the SID (PeerSet SID) allocated to the set</dd>
</dl>
</li>
</ul>
</section>
<section anchor="TOPOBGPLS" numbered="true" toc="default">
<name>Distribution of Topology and TE Information Using BGP-LS</name>
<t>In ships-in-the-night mode with respect to the pre-existing iBGP design, a Border Gateway Protocol - Link State (BGP-LS)
<xref target="RFC7752" format="default"/> session is established between the BGP-EPE-enabled border router and the BGP-EPE controller.</t>
<t>As a result of its local configuration and according to the behavior described in <xref target="RFC9086" format="default"/>, Node C allocates the following BGP Peering Segments <xref target="RFC8402" format="default"/>:</t>
<ul spacing="normal">
<li>A PeerNode segment for each of its defined peers (D: 1012, E: 1022, and F: 1052).</li>
<li>A PeerAdj segment for each recursing interface to a multi-hop peer (e.g., the upper and lower interfaces from C to F in <xref target="REFDIAGRAMFIG"/>).</li>
<li>A PeerSet segment to the set of peers (E and F). In this case, the PeerSet represents a set of peers (E, F) belonging to the same AS (AS 3).</li>
</ul>
<t>C programs its forwarding table accordingly:</t>
<table anchor="c-table">
<thead>
<tr> <th>Incoming Label</th> <th>Operation</th> <th>Outgoing Interface</th> </tr>
</thead>
<tbody>
<tr> <td>1012</td> <td>POP</td> <td>link to D</td> </tr>
<tr> <td>1022</td> <td>POP</td> <td>link to E</td> </tr>
<tr> <td>1032</td> <td>POP</td> <td>upper link to F</td> </tr>
<tr> <td>1042</td> <td>POP</td> <td>lower link to F</td> </tr>
<tr> <td>1052</td> <td>POP</td> <td>load balance on any link to F</td> </tr>
<tr> <td>1060</td> <td>POP</td> <td>load balance on any link to E or to F</td> </tr>
</tbody>
</table>
<t>C signals each related BGP-LS instance of Network Layer Reachability Information (NLRI) to the BGP-EPE controller.
Each such BGP-LS route is described in the following subsections according to the encoding details defined in <xref target="RFC9086" format="default"/>.</t>
<section anchor="PEERNODED" numbered="true" toc="default">
<name>PeerNode SID to D</name>
<t>Descriptors:</t>
<ul spacing="normal">
<li>Local Node Descriptors (BGP router-ID, ASN, BGP-LS Identifier): 192.0.2.3, AS1, 1000</li>
<li>Remote Node Descriptors (BGP router-ID, ASN): 192.0.2.4, AS2</li>
<li>Link Descriptors (IPv6 Interface Address, IPv6 Neighbor Address): 2001:db8:cd::c, 2001:db8:cd::d</li>
</ul>
<t>Attributes:</t>
<ul spacing="normal">
<li>PeerNode SID: 1012</li>
</ul>
</section>
<section anchor="PEERNODEE" numbered="true" toc="default">
<name>PeerNode SID to E</name>
<t>Descriptors:</t>
<ul spacing="normal">
<li>Local Node Descriptors (BGP router-ID, ASN, BGP-LS Identifier): 192.0.2.3, AS1, 1000</li>
<li>Remote Node Descriptors (BGP router-ID, ASN): 192.0.2.5, AS3</li>
<li>Link Descriptors (IPv6 Interface Address, IPv6 Neighbor Address): 2001:db8:ce::c, 2001:db8:ce::e</li>
</ul>
<t>Attributes:</t>
<ul spacing="normal">
<li>PeerNode SID: 1022</li>
<li>PeerSet SID: 1060</li>
<li>Link Attributes: see <xref target="RFC7752" sectionFormat="of" section="3.3.2"/></li>
</ul>
</section>
<section anchor="PEERNODEF" numbered="true" toc="default">
<name>PeerNode SID to F</name>
<t>Descriptors:</t>
<ul spacing="normal">
<li>Local Node Descriptors (BGP router-ID, ASN, BGP-LS Identifier): 192.0.2.3, AS1, 1000</li>
<li>Remote Node Descriptors (BGP router-ID, ASN): 192.0.2.6, AS3</li>
<li>Link Descriptors (IPv6 Interface Address, IPv6 Neighbor Address): 2001:db8:c::c, 2001:db8:f::f</li>
</ul>
<t>Attributes:</t>
<ul spacing="normal">
<li>PeerNode SID: 1052</li>
<li>PeerSet SID: 1060</li>
</ul>
</section>
<section anchor="PEERNODEFLINK1" numbered="true" toc="default">
<name>First PeerAdj to F</name>
<t>Descriptors:</t>
<ul spacing="normal">
<li>Local Node Descriptors (BGP router-ID, ASN, BGP-LS Identifier): 192.0.2.3, AS1, 1000</li>
<li>Remote Node Descriptors (BGP router-ID, ASN): 192.0.2.6, AS3</li>
<li>Link Descriptors (IPv6 Interface Address, IPv6 Neighbor Address): 2001:db8:cf1::c, 2001:db8:cf1::f</li>
</ul>
<t>Attributes:</t>
<ul spacing="normal">
<li>PeerAdj SID: 1032</li>
<li>Link Attributes: see <xref target="RFC7752" sectionFormat="of" section="3.3.2"/></li>
</ul>
</section>
<section anchor="PEERNODEFLINK2" numbered="true" toc="default">
<name>Second PeerAdj to F</name>
<t>Descriptors:</t>
<ul spacing="normal">
<li>Local Node Descriptors (BGP router-ID, ASN, BGP-LS Identifier): 192.0.2.3, AS1, 1000</li>
<li>Remote Node Descriptors (peer router-ID, peer ASN): 192.0.2.6, AS3</li>
<li>Link Descriptors (IPv6 Interface Address, IPv6 Neighbor Address): 2001:db8:cf2::c, 2001:db8:cf2::f</li>
</ul>
<t>Attributes:</t>
<ul spacing="normal">
<li>PeerAdj SID: 1042</li>
<li>Link Attributes: see <xref target="RFC7752" sectionFormat="of" section="3.3.2"/></li>
</ul>
</section>
<section anchor="FRR" numbered="true" toc="default">
<name>Fast Reroute (FRR)</name>
<t>A BGP-EPE-enabled border router <bcp14>MAY</bcp14> allocate an FRR backup entry on a per-BGP-Peering-SID basis. One example is as follows:</t>
<ul spacing="normal">
<li>
<t>PeerNode SID</t>
<ol spacing="normal" type="1">
<li>If multi-hop, back up via the remaining PeerAdj SIDs (if available) to the same peer.</li>
<li>Else, back up via another PeerNode SID to the same AS.</li>
<li>Else, pop the PeerNode SID and perform an IP lookup.</li>
</ol>
</li>
<li>
<t>PeerAdj SID</t>
<ol spacing="normal" type="1">
<li>If to a multi-hop peer, back up via the remaining PeerAdj SIDs (if available) to the same peer.</li>
<li>Else, back up via a PeerNode SID to the same AS.</li>
<li>Else, pop the PeerNode SID and perform an IP lookup.</li>
</ol>
</li>
<li>
<t>PeerSet SID</t>
<ol spacing="normal" type="1">
<li>Back up via remaining PeerNode SIDs in the same PeerSet.</li>
<li>Else, pop the PeerNode SID and perform an IP lookup.</li>
</ol>
</li>
</ul>
<t>Let's illustrate different types of possible backups using the reference diagram and considering the Peering SIDs allocated by C.</t>
<t>PeerNode SID 1052, allocated by C for peer F:</t>
<ul spacing="normal">
<li>Upon the failure of the upper connected link CF, C can reroute all the traffic onto the lower CF link to the same peer (F).</li>
</ul>
<t>PeerNode SID 1022, allocated by C for peer E:</t>
<ul spacing="normal">
<li>Upon the failure of the
connected link CE, C can reroute all the traffic onto the link to PeerNode SID 1052 (F).</li>
</ul>
<t>PeerNode SID 1012, allocated by C for peer D:</t>
<ul spacing="normal">
<li>Upon the failure of the connected link CD, C can pop the PeerNode SID and look up the IP destination address in its FIB and route accordingly.</li>
</ul>
<t>PeerSet SID 1060, allocated by C for the set of peers E and F:</t>
<ul spacing="normal">
<li>Upon the failure of a connected link in the group, the traffic to PeerSet SID 1060 is rerouted on any other member of the group.</li>
</ul>
<t>For specific business reasons, the operator might not want the default FRR behavior applied to a PeerNode SID or any of its dependent PeerAdj SIDs.</t>
<t>The operator should be able to associate a specific backup PeerNode SID with a PeerNode SID; e.g., 1022 (E) must be backed up by 1012 (D), which overrules the default behavior that would have preferred F as a backup for E.</t>
</section>
</section>
<section anchor="BGPPECTRL" numbered="true" toc="default">
<name>BGP-EPE Controller</name>
<t>In this section, let's provide a non-exhaustive set of inputs that a BGP-EPE controller would likely collect in order to perform the BGP-EPE policy decision.</t>
<t>The exhaustive definition is outside the scope of this document.</t>
<section anchor="PATHSFROMPEERS" numbered="true" toc="default">
<name>Valid Paths from Peers</name>
<t>The BGP-EPE controller should collect all the BGP paths (i.e., IP destination prefixes) advertised by all the BGP-EPE-enabled border routers.</t>
<t>This could be realized by setting up an iBGP session with the BGP-EPE-enabled border router, with the router configured to advertise all paths using BGP ADD-PATH
<xref target="RFC7911" format="default"/> and the original next hop preserved.</t>
<t>In this case, C would advertise the following Internet routes to the BGP-EPE controller:</t>
<ul spacing="normal">
<li>
<t>NLRI &lt;2001:db8:abcd::/48&gt;, next hop 2001:db8:cd::d, AS Path {AS 2, 4}</t>
<ul spacing="normal">
<li>X (i.e., the BGP-EPE controller) knows that C receives a path to 2001:db8:abcd::/48 via neighbor 2001:db8:cd::d of AS2.</li>
</ul>
</li>
<li>
<t>NLRI &lt;2001:db8:abcd::/48&gt;, next hop 2001:db8:ce::e, AS Path {AS 3, 4}</t>
<ul spacing="normal">
<li>X knows that C receives a path to 2001:db8:abcd::/48 via neighbor 2001:db8:ce::e of AS3.</li>
</ul>
</li>
<li>
<t>NLRI &lt;2001:db8:abcd::/48&gt;, next hop 2001:db8:f::f, AS Path {AS 3, 4}</t>
<ul spacing="normal">
<li>X knows that C has an eBGP path to 2001:db8:abcd::/48 via AS3 via neighbor 2001:db8:f::f.</li>
</ul>
</li>
</ul>
<t>An alternative option would be for a BGP-EPE collector to use the BGP Monitoring Protocol (BMP) <xref target="RFC7854" format="default"/> to track the Adj-RIB-In of BGP-EPE-enabled border routers.</t>
</section>
<section anchor="INTRATOPO" numbered="true" toc="default">
<name>Intra-Domain Topology</name>
<t>The BGP-EPE controller should collect the internal topology and the related IGP SIDs.</t>
<t>This could be realized by collecting the IGP Link-State Database (LSDB) of each area or running a BGP-LS session with a node in each IGP area.</t>
</section>
<section anchor="EXTRATOPO" numbered="true" toc="default">
<name>External Topology</name>
<t>Thanks to the collected BGP-LS routes described in <xref target="TOPOBGPLS"/>, the BGP-EPE controller is able to maintain an accurate description of the egress topology
of Node C. Furthermore, the BGP-EPE controller is able to associate BGP Peering SIDs to the various components of the external topology.</t>
</section>
<section anchor="SLA" numbered="true" toc="default">
<name>SLA Characteristics of Each Peer</name>
<t>The BGP-EPE controller might collect Service Level Agreement (SLA) characteristics across peers. This requires a BGP-EPE solution, as the SLA probes need to be steered via non-best-path peers.</t>
<t>Unidirectional SLA monitoring of the desired path is likely required. This might be possible when the application is controlled at the source and the receiver side. Unidirectional monitoring dissociates the SLA characteristic of the return path (which cannot usually be controlled) from the forward path (the one of interest for pushing content from a source to a consumer and the one that can be controlled).</t>
<t>Alternatively, Metric Extensions, as defined in <xref target="RFC8570" format="default"/>, could also be advertised using BGP-LS <xref target="RFC8571" format="default"/>.</t>
</section>
<section anchor="MATRIX" numbered="true" toc="default">
<name>Traffic Matrix</name>
<t>The BGP-EPE controller might collect the traffic matrix to its peers or the final destinations. IP Flow Information Export (IPFIX) <xref target="RFC7011" format="default"/> is a likely option.</t>
<t>An alternative option consists of collecting the link utilization statistics of each of the internal and external links, also available in the current definition in <xref target="RFC7752" format="default"/>.</t>
</section>
<section anchor="BUSINESS" numbered="true" toc="default">
<name>Business Policies</name>
<t>The BGP-EPE controller should be configured with, or should collect, business policies through any desired mechanisms.
The mechanisms by which these policies are configured or collected are outside the scope of this document.</t>
</section>
<section anchor="BGPPOLICY" numbered="true" toc="default">
<name>BGP-EPE Policy</name>
<t>On the basis of all these inputs (and likely others), the BGP-EPE controller decides to steer some demands away from their best BGP path.</t>
<t>The BGP-EPE policy is likely expressed as a two-entry segment list where the first element is the IGP Prefix-SID of the selected egress border router and the second element is a BGP Peering SID at the selected egress border router.</t>
<t>A few examples are provided hereafter:</t>
<ul spacing="normal">
<li>Prefer egress PE C and peer AS AS2: {64, 1012}, where 64 is the SID of PE C as defined in <xref target="PROBSTATE" format="default"/>.</li>
<li>Prefer egress PE C and peer AS AS3 via eBGP peer 2001:db8:ce::e: {64, 1022}.</li>
<li>Prefer egress PE C and peer AS AS3 via eBGP peer 2001:db8:f::f: {64, 1052}.</li>
<li>Prefer egress PE C and peer AS AS3 via interface 2001:db8:cf2::f of multi-hop eBGP peer 2001:db8:f::f: {64, 1042}.</li>
<li>Prefer egress PE C and any interface to any peer in the group 1060: {64, 1060}.</li>
</ul>
<t>Note that the first SID could be replaced by a list of segments. This is useful when an explicit path within the domain is required for traffic-engineering purposes.
For example, if the Prefix-SID of Node B is 60 and the BGP-EPE controller would like to steer the traffic from A to C via B and then through the external link to peer D, the segment list would be {60, 64, 1012}.</t>
</section>
</section>
<section anchor="PROGRINPUTPOL" numbered="true" toc="default">
<name>Programming an Input Policy</name>
<t>The detailed/exhaustive description of all the means to implement a BGP-EPE policy is outside the scope of this document. A few examples are provided in this section.</t>
<section anchor="ATHOST" numbered="true" toc="default">
<name>At a Host</name>
<t>A static IP/MPLS route can be programmed at the host H. The static route would define a destination prefix, a next hop, and a label stack to push. Assuming the same Segment Routing Global Block (SRGB), at least on all access routers connecting the hosts, the same policy can be programmed across all hosts, which is convenient.</t>
</section>
<section anchor="ATROUTER" numbered="true" toc="default">
<name>At a Router - SR Traffic-Engineering Tunnel</name>
<t>The BGP-EPE controller can configure the ingress border router with an SR traffic-engineering tunnel T1 and a steering policy S1, which causes a certain class of traffic to be mapped on the tunnel T1.</t>
<t>The tunnel T1 would be configured to push the required segment list.</t>
<t>The tunnel and the steering policy could be configured via multiple means.
A few examples are given below:</t>
<ul spacing="normal">
<li>The Path Computation Element Communication Protocol (PCEP) according to <xref target="RFC8664" format="default"/> and <xref target="RFC8281" format="default"/></li>
<li>NETCONF <xref target="RFC6241" format="default"/></li>
<li>Other static or ephemeral APIs</li>
</ul>
<t>Example: at router A (<xref target="REFDIAGRAMFIG" format="default"/>).</t>
<sourcecode>
Tunnel T1: push {64, 1042}
IP route L/8 set next-hop T1
</sourcecode>
</section>
<section anchor="ATROUTER8277" numbered="true" toc="default">
<name>At a Router - Unicast Route Labeled Using BGP (RFC 8277)</name>
<t>The BGP-EPE controller could build a unicast route labeled using BGP <xref target="RFC8277"/> (from scratch) and send it to the ingress router.</t>
<t>Such a route would require the following:</t>
<dl newline="true">
<dt>NLRI</dt>
<dd>the destination prefix to engineer (e.g., L/8)</dd>
<dt>Next Hop</dt>
<dd>the selected egress border router: C</dd>
<dt>Label</dt>
<dd>the selected egress peer: 1042</dd>
<dt>Autonomous System (AS) path</dt>
<dd>the selected valid AS path</dd>
</dl>
<t>Some BGP policy is also needed to ensure that the route is selected as best by the ingress router.
Note that as discussed in <xref target="RFC8277" sectionFormat="of" section="5"/>, the comparison of a labeled and unlabeled unicast BGP route is implementation dependent and hence may require an implementation-specific policy on each ingress router.</t>
<t>This unicast route labeled using BGP <xref target="RFC8277"/> "overwrites" an equivalent or less-specific "best path". As the best path is changed, this BGP-EPE input policy option may influence the path propagated to the upstream peers/customers. Indeed, implementations treating the SAFI-1 and SAFI-4 routes for a given prefix as comparable would trigger a BGP WITHDRAW of the SAFI-1 route to their BGP upstream peers.</t>
</section>
<section anchor="ATROUTERVPN" numbered="true" toc="default">
<name>At a Router - VPN Policy Route</name>
<t>The BGP-EPE controller could build a VPNv4 route (from scratch) and send it to the ingress router.</t>
<t>Such a route would require the following:</t>
<dl newline="true">
<dt>NLRI</dt>
<dd>the destination prefix to engineer (e.g., L/8)</dd>
<dt>Next Hop</dt>
<dd>the selected egress border router: C</dd>
<dt>Label</dt>
<dd>the selected egress peer: 1042</dd>
<dt>Route-Target</dt>
<dd>the selected appropriate VRF instance at the ingress router</dd>
<dt>AS path</dt>
<dd>the selected valid AS path</dd>
</dl>
<t>Some BGP policy is also needed to ensure that the route is selected as best by the ingress router in the related VRF instance.</t>
<t>The related VRF instance must be preconfigured.
A VRF fallback to the main FIB might be beneficial to avoid replicating all the "normal" Internet paths in each VRF instance.</t>
</section>
</section>
<section anchor="IPv6" numbered="true" toc="default">
  <name>IPv6 Data Plane</name>
  <t>The described solution is applicable to IPv6, either with MPLS-based or IPv6-native segments. In both cases, the same three steps of the solution are applicable:</t>
  <ul spacing="normal">
    <li>BGP-LS-based signaling of the external topology and BGP Peering Segments to the BGP-EPE controller.</li>
    <li>Collection of various inputs by the BGP-EPE controller to come up with a policy decision.</li>
    <li>Programming of the desired BGP-EPE policy at an ingress router or source host; the policy consists of a list of segments to push on a defined traffic class.</li>
  </ul>
</section>
<section anchor="BENEFITS" numbered="true" toc="default">
  <name>Benefits</name>
  <t>The BGP-EPE solutions described in this document have the following benefits:</t>
  <ul spacing="normal">
    <li>No assumption on the iBGP design within AS1.</li>
    <li>Next-hop-self on the Internet routes propagated to the ingress border routers is possible.
This is a common design rule to minimize the number of IGP routes and to avoid importing external churn into the internal routing domain.</li>
    <li>Consistent support for traffic engineering within the domain and at the external edge of the domain.</li>
    <li>Support for both host and ingress border router BGP-EPE policy programming.</li>
    <li>BGP-EPE functionality is only required on the BGP-EPE-enabled egress border router and the BGP-EPE controller; an ingress policy can be programmed at the ingress border router without any new functionality.</li>
    <li>Ability to deploy the same input policy across hosts connected to different routers (assuming the global property of IGP Prefix-SIDs).</li>
  </ul>
</section>
<section anchor="IANA" numbered="true" toc="default">
  <name>IANA Considerations</name>
  <t>This document has no IANA actions.</t>
</section>
<section anchor="Manageability" numbered="true" toc="default">
  <name>Manageability Considerations</name>
  <t>The BGP-EPE use case described in this document requires BGP-LS <xref target="RFC7752" format="default"/> extensions that are described in <xref target="RFC9086" format="default"/> and that consist of additional BGP-LS descriptors and TLVs. Manageability functions of BGP-LS, described in <xref target="RFC7752" format="default"/>, also apply to the extensions required by the EPE use case.
  </t>
  <t>Additional manageability considerations are described in <xref target="RFC9086" format="default"/>.</t>
</section>
<section anchor="Security" numbered="true" toc="default">
  <name>Security Considerations</name>
  <t><xref target="RFC7752" format="default"/> defines BGP-LS NLRI instances and their associated security aspects.</t>
  <t><xref target="RFC9086" format="default"/> defines the BGP-LS extensions required by the BGP-EPE mechanisms described in this document. BGP-EPE BGP-LS extensions also include the related security.</t>
</section>
</middle>
<back>
<references>
  <name>References</name>
  <references>
    <name>Normative References</name>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7752.xml"/>
    <reference anchor="RFC9086" target="https://www.rfc-editor.org/info/rfc9086">
      <front>
        <title>Border Gateway Protocol - Link State (BGP-LS) Extensions for Segment Routing BGP Egress Peer Engineering</title>
        <author initials="S." surname="Previdi" fullname="Stefano Previdi">
          <organization>Individual</organization>
        </author>
        <author initials="K." surname="Talaulikar" fullname="Ketan Talaulikar" role="editor">
          <organization>Cisco Systems</organization>
        </author>
        <author initials="C." surname="Filsfils" fullname="Clarence Filsfils">
          <organization>Cisco Systems</organization>
        </author>
        <author initials="K." surname="Patel" fullname="Keyur Patel">
          <organization>Arrcus, Inc.</organization>
        </author>
        <author initials="S."
surname="Ray" fullname="Saikat Ray">
          <organization>Individual Contributor</organization>
        </author>
        <author initials="J." surname="Dong" fullname="Jie Dong">
          <organization>Huawei Technologies</organization>
        </author>
        <date month="August" year="2021"/>
      </front>
      <seriesInfo name="RFC" value="9086"/>
      <seriesInfo name="DOI" value="10.17487/RFC9086"/>
    </reference>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8402.xml"/>
  </references>
  <references>
    <name>Informative References</name>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7855.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7911.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8570.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7854.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7011.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6241.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8277.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8664.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8281.xml"/>
    <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8571.xml"/>
  </references>
</references>
<section anchor="Acknowledgements" numbered="false" toc="default">
  <name>Acknowledgements</name>
  <t>The authors would like to thank <contact fullname="Acee Lindem"/> for his comments and contribution.</t>
</section>
<section anchor="Contributors" numbered="false" toc="default">
  <name>Contributors</name>
  <t><contact fullname="Daniel Ginsburg"/> substantially contributed to the content of this document.</t>
</section>
</back>
</rfc>