Internet Engineering Task Force (IETF)                   J. Rabadan, Ed.
Request for Comments: 8388                               S. Palislamovic
Category: Informational                                    W. Henderickx
ISSN: 2070-1721                                                    Nokia
                                                              A. Sajassi
                                                                   Cisco
                                                               J. Uttaro
                                                                    AT&T
                                                              April 2018

         Usage and Applicability of BGP MPLS-Based Ethernet VPN

Abstract

This document discusses the usage and applicability of BGP MPLS-based Ethernet VPN (EVPN) in a simple and fairly common deployment scenario. The different EVPN procedures are explained in the example scenario along with the benefits and trade-offs of each option. This document is intended to provide a simplified guide for the deployment of EVPN networks.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8388.

Copyright Notice

Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
2.  Terminology
3.  Use Case Scenario Description and Requirements
  3.1.  Service Requirements
  3.2.  Why EVPN Is Chosen to Address This Use Case
4.  Provisioning Model
  4.1.  Common Provisioning Tasks
    4.1.1.  Non-Service-Specific Parameters
    4.1.2.  Service-Specific Parameters
  4.2.  Service-Interface-Dependent Provisioning Tasks
    4.2.1.  VLAN-Based Service Interface EVI
    4.2.2.  VLAN Bundle Service Interface EVI
    4.2.3.  VLAN-Aware Bundling Service Interface EVI
5.  BGP EVPN NLRI Usage
6.  MAC-Based Forwarding Model Use Case
  6.1.  EVPN Network Startup Procedures
  6.2.  VLAN-Based Service Procedures
    6.2.1.  Service Startup Procedures
    6.2.2.  Packet Walk-Through
  6.3.  VLAN Bundle Service Procedures
    6.3.1.  Service Startup Procedures
    6.3.2.  Packet Walk-Through
  6.4.  VLAN-Aware Bundling Service Procedures
    6.4.1.  Service Startup Procedures
    6.4.2.  Packet Walk-Through
7.  MPLS-Based Forwarding Model Use Case
  7.1.  Impact of MPLS-Based Forwarding on the EVPN Network Startup
  7.2.  Impact of MPLS-Based Forwarding on the VLAN-Based Service Procedures
  7.3.  Impact of MPLS-Based Forwarding on the VLAN Bundle Service Procedures
  7.4.  Impact of MPLS-Based Forwarding on the VLAN-Aware Service Procedures
8.  Comparison between MAC-Based and MPLS-Based Egress Forwarding Models
9.  Traffic Flow Optimization
  9.1.  Control-Plane Procedures
    9.1.1.  MAC Learning Options
    9.1.2.  Proxy ARP/ND
    9.1.3.  Unknown Unicast Flooding Suppression
    9.1.4.  Optimization of Inter-Subnet Forwarding
  9.2.  Packet Walk-Through Examples
    9.2.1.  Proxy ARP Example for CE2 to CE3 Traffic
    9.2.2.  Flood Suppression Example for CE1-to-CE3 Traffic
    9.2.3.  Optimization of Inter-Subnet Forwarding Example for CE3 to CE2 Traffic
10.  Security Considerations
11.  IANA Considerations
12.  References
  12.1.  Normative References
  12.2.  Informative References
Acknowledgments
Contributors
Authors' Addresses

1.  Introduction

This document complements [RFC7432] by discussing the applicability of the technology in a simple and fairly common deployment scenario, which is described in Section 3.

After describing the topology and requirements of the use case scenario, Section 4 will describe the provisioning model.
Once the provisioning model is analyzed, Sections 5, 6, and 7 will describe the control-plane and data-plane procedures in the example scenario for the two potential disposition/forwarding models: MAC-based and MPLS-based models. While both models can interoperate in the same network, each one has different trade-offs that are analyzed in Section 8.

Finally, EVPN provides some potential traffic flow optimization tools that are also described in Section 9 in the context of the example scenario.

2.  Terminology

The following terminology is used:

VID:  VLAN Identifier

CE:  Customer Edge (device)

EVI:  EVPN Instance

MAC-VRF:  A Virtual Routing and Forwarding (VRF) table for Media Access Control (MAC) addresses on a Provider Edge (PE) router.

ES:  An Ethernet Segment is a set of links through which a CE is connected to one or more PEs. Each ES is identified by an Ethernet Segment Identifier (ESI) in the control plane.

CE-VIDs:  The VLAN Identifier tags being used at CE1, CE2, and CE3 to tag customer traffic sent to the service provider EVPN network.

CE1-MAC, CE2-MAC, and CE3-MAC:  The source MAC addresses "behind" each CE, respectively. These MAC addresses can belong to the CEs themselves or to devices connected to the CEs.

CE1-IP, CE2-IP, and CE3-IP:  The IP addresses associated with the above MAC addresses

LACP:  Link Aggregation Control Protocol

RD:  Route Distinguisher

RT:  Route Target

PE:  Provider Edge (router)

AS:  Autonomous System

PE-IP:  The IP address of a given PE

3.  Use Case Scenario Description and Requirements

Figure 1 depicts the scenario that will be referenced throughout the rest of the document.

                     +--------------+
                     |              |
   +----+     +----+ |              | +----+   +----+
   | CE1|-----|    | |              | |    |---| CE3|
   +----+    /| PE1| |   IP/MPLS    | | PE3|   +----+
            / +----+ |   Network    | +----+
           /         |              |
          /   +----+ |              |
   +----+/    |    | |              |
   | CE2|-----| PE2| |              |
   +----+     +----+ |              |
                     +--------------+

               Figure 1: EVPN Use Case Scenario

There are three PEs and three CEs considered in this example: PE1, PE2, and PE3, as well as CE1, CE2, and CE3. Broadcast domains must be extended among the three CEs.

3.1.  Service Requirements

The following service requirements are assumed in this scenario:

o  Redundancy requirements:

   -  CE2 requires multihoming connectivity to PE1 and PE2, not only for redundancy purposes but also for adding more upstream/downstream connectivity bandwidth to/from the network.

   -  Fast convergence. For example, if the link between CE2 and PE1 goes down, a fast convergence mechanism must be supported so that PE3 can immediately send the traffic to PE2, irrespective of the number of affected services and MAC addresses.

o  Service interface requirements:

   -  The service definition must be flexible in terms of CE-VID-to-broadcast-domain assignment in the core.

   -  The following three EVI services are required in this example:

      EVI100 uses VLAN-based service interfaces in the three CEs with a 1:1 VLAN-to-EVI mapping. The CE-VIDs at the three CEs can be the same (for example, VID 100) or different at each CE (for instance, VID 101 in CE1, VID 102 in CE2, and VID 103 in CE3). A single broadcast domain needs to be created for EVI100 in any case; therefore, CE-VIDs will require translation at the egress PEs if they are not consistent across the three CEs. The case when the same CE-VID is used across the three CEs for EVI100 is referred to in [RFC7432] as the "Unique VLAN" EVPN case. This term will be used throughout this document too.

      EVI200 uses VLAN bundle service interfaces in CE1, CE2, and CE3 based on an N:1 VLAN-to-EVI mapping. The operator needs to preconfigure a range of CE-VIDs and its mapping to the EVI, and this mapping should be consistent in all the PEs (no translation is supported). A single broadcast domain is created for the customer. The customer is responsible for keeping the separation between users in different CE-VIDs.

      EVI300 uses VLAN-aware bundling service interfaces in CE1, CE2, and CE3. As in the EVI200 case, an N:1 VLAN-to-EVI mapping is created at the ingress PEs; however, in this case, a separate broadcast domain is required per CE-VID. The CE-VIDs can be different (hence, CE-VID translation is required).

      Note that in Section 4.2.1, only EVI100 is used as an example of VLAN-based service provisioning. In Sections 6.2 and 7.2, 4k VLAN-based EVIs (EVI1 to EVI4k) are used so that the impact of MAC versus MPLS disposition models in the control plane can be evaluated. In the same way, EVI200 and EVI300 will be described with a 4k:1 mapping (CE-VIDs-to-EVI mapping) in Sections 6.3, 6.4, 7.3, and 7.4.

o  Broadcast, Unknown Unicast, Multicast (BUM) optimization requirements:

   -  The solution must support ingress replication or P2MP MPLS LSPs on a per-EVI basis. For example, we can use ingress replication for EVI100 and EVI200, assuming those EVIs will not carry much BUM traffic. On the contrary, if EVI300 is presumably carrying a significant amount of multicast traffic, P2MP MPLS LSPs can be used for this service.

   -  The benefit of ingress replication compared to P2MP LSPs is that the core routers will not need to maintain any multicast states.

3.2.  Why EVPN Is Chosen to Address This Use Case

Virtual Private LAN Service (VPLS) solutions based on [RFC4761], [RFC4762], and [RFC6074] cannot meet the requirements in Section 3, whereas EVPN can.

For example:

o  If CE2 has a single CE-VID (or a few CE-VIDs), the current VPLS multihoming solutions (based on load-balancing per CE-VID or service) do not provide the optimized link utilization required in this example. EVPN provides the flow-based, load-balancing, multihoming solution required in this scenario to optimize the upstream/downstream link utilization between CE2 and PE1-PE2.

o  EVPN provides a fast convergence solution that is independent of the CE-VIDs in the multihomed PEs. Upon failure on the link between CE2 and PE1, PE3 can immediately send the traffic to PE2 based on a single notification message being sent by PE1. This is not possible with VPLS solutions.

o  With regard to service interfaces and mapping to broadcast domains, while VPLS might meet the requirements for EVI100 and EVI200, the VLAN-aware bundling service interfaces required by EVI300 are not supported by the current VPLS tools.

The rest of the document will describe how EVPN can be used to meet the service requirements described in Section 3 and even optimize the network further by:

o  providing the user with an option to reduce (and even suppress) ARP (Address Resolution Protocol) flooding; and

o  supporting ARP termination and inter-subnet forwarding.

4.  Provisioning Model

One of the requirements stated in [RFC7209] is the ease of provisioning. BGP parameters and service context parameters should be auto-provisioned so that the addition of a new MAC-VRF to the EVI requires a minimum number of single-sided provisioning touches. However, this is possible only in a limited number of cases.

This section describes the provisioning tasks required for the services described in Section 3, i.e., EVI100 (VLAN-based service interfaces), EVI200 (VLAN bundle service interfaces), and EVI300 (VLAN-aware bundling service interfaces).

4.1.  Common Provisioning Tasks

Regardless of the service interface type (VLAN-based, VLAN bundle, or VLAN-aware), the following subsections describe the parameters to be provisioned in the three PEs.

4.1.1.  Non-Service-Specific Parameters

The multihoming function in EVPN requires the provisioning of certain parameters that are not service specific and that are shared by all the MAC-VRFs in the node using the multihoming capabilities. In our use case, these parameters are only provisioned or auto-derived in PE1 and PE2 and are listed below:

o  Ethernet Segment Identifier (ESI): Only the ESI associated with CE2 needs to be considered in our example. Single-homed CEs such as CE1 and CE3 do not require the provisioning of an ESI (the ESI will be coded as zero in the BGP Network Layer Reachability Information (NLRI)). In our example, a Link Aggregation Group (LAG) is used between CE2 and PE1-PE2 (since all-active multihoming is a requirement); therefore, the ESI can be auto-derived from the LACP information as described in [RFC7432]. Note that the ESI must be unique across all the PEs in the network; therefore, the auto-provisioning of the ESI is recommended only in case the CEs are managed by the operator. Otherwise, the ESI should be manually provisioned (Type 0, as in [RFC7432]) in order to avoid potential conflicts.

o  ES-Import Route Target (ES-Import RT): This is the RT that will be sent by PE1 and PE2, along with the ES route.
   Regardless of how the ESI is provisioned in PE1 and PE2, the ES-Import RT must always be auto-derived from the 6-byte MAC address portion of the ESI value.

o  Ethernet Segment Route Distinguisher (ES RD): This is the RD to be encoded in the ES route and in the Ethernet Auto-Discovery (A-D) route to be sent by PE1 and PE2 for the CE2 ESI. This RD should always be auto-derived from the PE-IP address, as described in [RFC7432].

o  Multihoming type: The user must be able to provision the multihoming type to be used in the network. In our use case, the multihoming type will be set to all-active for the CE2 ESI. This piece of information is encoded in the ESI Label extended community flags and is sent by PE1 and PE2 along with the Ethernet A-D route for the CE2 ESI.

In addition, the same LACP parameters will be configured in PE1 and PE2 for the ES so that CE2 can send frames to PE1 and PE2 as though they were forming a single system.

4.1.2.  Service-Specific Parameters

The following parameters must be provisioned in PE1, PE2, and PE3 per EVI service:

o  EVI Identifier: This is the global identifier per EVI that is shared by all the PEs that are part of the EVI; i.e., PE1, PE2, and PE3 will be provisioned with EVI100, EVI200, and EVI300. The EVI identifier can be associated with (or be the same value as) the EVI default Ethernet Tag (4-byte default broadcast domain identifier for the EVI). The Ethernet Tag is different from zero in the EVPN BGP routes only if the service interface type (of the source PE) is a VLAN-aware bundle.

o  EVI Route Distinguisher (EVI RD): This RD is a unique value across all the MAC-VRFs in a PE. Auto-derivation of this RD might be possible depending on the service interface type being used in the EVI. The next section discusses the specifics of each service interface type.

o  EVI Route Target(s) (EVI RT): One or more RTs can be provisioned per MAC-VRF.
   The RT(s) imported and exported can be equal or different, just as the RT(s) in IP-VPNs. Auto-derivation of the RT(s) might be possible depending on the service interface type being used in the EVI. The next section discusses the specifics of each service interface type.

o  CE-VID and port/LAG binding to EVI identifier or Ethernet Tag: For more information, please see Section 4.2.

4.2.  Service-Interface-Dependent Provisioning Tasks

Depending on the service interface type being used in the EVI, a specific CE-VID binding must be provisioned.

4.2.1.  VLAN-Based Service Interface EVI

In our use case, EVI100 is a VLAN-based service interface EVI.

EVI100 can be a "unique-VLAN" service if the CE-VID being used for this service in CE1, CE2, and CE3 is identical (for example, VID 100). In that case, the VID 100 binding must be provisioned in PE1, PE2, and PE3 for EVI100 and the associated port or LAG. The MAC-VRF RD and RT can be auto-derived from the CE-VID:

o  The auto-derived MAC-VRF RD will be a Type 1 RD, as recommended in [RFC7432], and it will be composed of [PE-IP]:[zero-padded-VID], where [PE-IP] is the IP address of the PE (a loopback address) and [zero-padded-VID] is a 2-byte value where the low-order 12 bits are the VID (VID 100 in our example) and the high-order 4 bits are zero.

o  The auto-derived MAC-VRF RT will be composed of [AS]:[zero-padded-VID], where [AS] is the Autonomous System that the PE belongs to and [zero-padded-VID] is a 2- or 4-byte value where the low-order 12 bits are the VID (VID 100 in our example) and the high-order bits are zero. Note that auto-deriving the RT implies supporting a basic any-to-any topology in the EVI and using the same import and export RT in the EVI.
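
The auto-derivation rules above can be sketched as follows. This is only an illustration under stated assumptions, not a procedure taken from [RFC7432]: the function names, the example loopback 192.0.2.1, and AS 64500 are hypothetical, and the binary form assumes the usual Type 1 RD layout (2-byte type field, 4-byte IPv4 address, 2-byte assigned number). Only the textual [AS]:[VID] form of the RT is shown.

```python
import ipaddress
import struct

MAX_VID = 0x0FFF  # a VID occupies the low-order 12 bits


def _check_vid(vid: int) -> int:
    """Return vid unchanged if it fits in 12 bits, else raise ValueError."""
    if not 0 <= vid <= MAX_VID:
        raise ValueError(f"VID {vid} does not fit in 12 bits")
    return vid


def derive_mac_vrf_rd(pe_ip: str, vid: int) -> str:
    """Textual form of the auto-derived RD: [PE-IP]:[zero-padded-VID]."""
    addr = ipaddress.IPv4Address(pe_ip)  # raises ValueError on bad input
    return f"{addr}:{_check_vid(vid)}"


def encode_type1_rd(pe_ip: str, vid: int) -> bytes:
    """8-byte form (assumed layout): type=1, IPv4 address, 2-byte number."""
    addr = ipaddress.IPv4Address(pe_ip)
    # The high-order 4 bits of the 2-byte field stay zero because vid <= 0x0FFF.
    return struct.pack("!H4sH", 1, addr.packed, _check_vid(vid))


def derive_mac_vrf_rt(asn: int, vid: int) -> str:
    """Textual form of the auto-derived RT: [AS]:[zero-padded-VID]."""
    return f"{asn}:{_check_vid(vid)}"


if __name__ == "__main__":
    # Hypothetical values: PE loopback 192.0.2.1, AS 64500, unique VLAN 100.
    print(derive_mac_vrf_rd("192.0.2.1", 100))
    print(encode_type1_rd("192.0.2.1", 100).hex())
    print(derive_mac_vrf_rt(64500, 100))
```

Because both values depend only on the PE address, the AS, and the CE-VID, every PE in a "unique-VLAN" EVI arrives at the same RT without any per-service configuration, which is what makes the any-to-any topology assumption necessary.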

If EVI100 is not a "unique-VLAN" instance, each individual CE-VID must be configured in each PE, and MAC-VRF RDs and RTs cannot be auto-derived; hence, they must be provisioned by the user.

4.2.2.  VLAN Bundle Service Interface EVI

Assuming EVI200 is a VLAN bundle service interface EVI, and VIDs 200-250 are assigned to EVI200, the CE-VID bundle 200-250 must be provisioned on PE1, PE2, and PE3. Note that this model does not allow CE-VID translation and the CEs must use the same CE-VIDs for EVI200. No auto-derived EVI RDs or EVI RTs are possible.

4.2.3.  VLAN-Aware Bundling Service Interface EVI

If EVI300 is a VLAN-aware bundling service interface EVI, CE-VID binding to EVI300 does not have to match on the three PEs (only on PE1 and PE2, since they are part of the same ES). For example, PE1 and PE2 CE-VID binding to EVI300 can be set to the range 300-310 and PE3 to 321-330. Note that each individual CE-VID will be assigned to a different broadcast domain, which will be represented by an Ethernet Tag in the control plane.

Therefore, besides the CE-VID bundle range bound to EVI300 in each PE, associations between each individual CE-VID and the corresponding EVPN Ethernet Tag must be provisioned by the user. No auto-derived EVI RDs/RTs are possible.

5.  BGP EVPN NLRI Usage

[RFC7432] defines four different route types and four different extended communities. However, not all the PEs in an EVPN network must generate and process all the different routes and extended communities. Table 1 shows the routes that must be exported and imported in the use case described in this document. "Export", in this context, means that the PE must be capable of generating and exporting a given route, assuming there are no BGP policies to prevent it. In the same way, "Import" means the PE must be capable of importing and processing a given route, assuming the right RTs and policies. "N/A" means neither import nor export actions are required.

   +-----------------+---------------+---------------+
   | BGP EVPN Routes | PE1-PE2       | PE3           |
   +-----------------+---------------+---------------+
   | ES              | Export/Import | N/A           |
   | A-D per ESI     | Export/Import | Import        |
   | A-D per EVI     | Export/Import | Import        |
   | MAC             | Export/Import | Export/Import |
   | Inclusive Mcast | Export/Import | Export/Import |
   +-----------------+---------------+---------------+

       Table 1: Base EVPN Routes and Export/Import Actions

PE3 is required to export only MAC and Inclusive Multicast (Mcast) routes and be able to import and process A-D routes as well as MAC and Inclusive Multicast routes. If PE3 did not support importing and processing A-D routes per ESI and per EVI, fast convergence and aliasing functions (respectively) would not be possible in this use case.

6.  MAC-Based Forwarding Model Use Case

This section describes how the BGP EVPN routes are exported and imported by the PEs in our use case as well as how traffic is forwarded assuming that PE1, PE2, and PE3 support a MAC-based forwarding model. In order to compare the control- and data-plane impact in the two forwarding models (MAC-based and MPLS-based) and different service types, we will assume that CE1, CE2, and CE3 need to exchange traffic for up to 4k CE-VIDs.

6.1.  EVPN Network Startup Procedures

Before any EVI is provisioned in the network, the following procedures are required:

o  Infrastructure setup: The proper MPLS infrastructure must be set up among PE1, PE2, and PE3 so that the EVPN services can make use of Point-to-Point (P2P) and P2MP LSPs.
   In addition to the MPLS transport, PE1 and PE2 must be properly configured with the same LACP configuration to CE2. Details are provided in [RFC7432]. Once the LAG is properly set up, the ESI for the CE2 Ethernet Segment (for example, ESI12) can be auto-generated by PE1 and PE2 from the LACP information exchanged with CE2 (ESI Type 1), as discussed in Section 4.1. Alternatively, the ESI can also be manually provisioned on PE1 and PE2 (ESI Type 0). PE1 and PE2 will auto-configure a BGP policy that will import any ES route matching the auto-derived ES-Import RT for ESI12.

o  Ethernet Segment route exchange and Designated Forwarder (DF) election: PE1 and PE2 will advertise a BGP Ethernet Segment route for ESI12, where the ES RD and ES-Import RT will be auto-generated as discussed in Section 4.1.1. PE1 and PE2 will import the ES routes of each other and will run the DF election algorithm for any existing EVI (if any, at this point). PE3 will simply discard the route. Note that the DF election algorithm can support service carving so that the downstream BUM traffic from the network to CE2 can be load-balanced across PE1 and PE2 on a per-service basis.

At the end of this process, the network infrastructure is ready to start deploying EVPN services. PE1 and PE2 are aware of the existence of a shared Ethernet Segment, i.e., ESI12.

6.2.  VLAN-Based Service Procedures

Assuming that the EVPN network must carry traffic among CE1, CE2, and CE3 for up to 4k CE-VIDs, the service provider can decide to implement VLAN-based service interface EVIs to accomplish it. In this case, each CE-VID will be individually mapped to a different EVI.
While this means that a total of 4k MAC-VRFs are required per PE, the advantages of this approach are the auto-provisioning of most of the service parameters if no VLAN translation is needed (see Section 4.2.1) and great control over each individual customer broadcast domain. We assume in this section that the range of EVIs from 1 to 4k is provisioned in the network.

6.2.1.  Service Startup Procedures

As soon as the EVIs are created in PE1, PE2, and PE3, the following control-plane actions are carried out:

o  Flooding tree setup per EVI (4k routes): Each PE will send one Inclusive Multicast Ethernet Tag route per EVI (up to 4k routes per PE) so that the flooding tree per EVI can be set up. Note that ingress replication or P2MP LSPs can be optionally signaled in the Provider Multicast Service Interface (PMSI) Tunnel attribute and the corresponding tree can be created.

o  Ethernet A-D routes per ESI (a set of routes for ESI12): A set of A-D routes with a total list of 4k RTs (one per EVI) for ESI12 will be issued from PE1 and PE2 (it has to be a set of routes so that the total number of RTs can be conveyed). As per [RFC7432], each Ethernet A-D route per ESI is differentiated from the other routes in the set by a different Route Distinguisher (ES RD). This set will also include ESI Label extended communities with the active-standby flag set to zero (all-active multihoming type) and an ESI Label different from zero (used for split-horizon functions). These routes will be imported by the three PEs, since the RTs match the locally configured EVI RTs. The A-D routes per ESI will be used for fast convergence and split-horizon functions, as discussed in [RFC7432].

o  Ethernet A-D routes per EVI (4k routes): An A-D route per EVI will be sent by PE1 and PE2 for ESI12.
   Each individual route includes the corresponding EVI RT and an MPLS Label to be used by PE3 for the aliasing function. These routes will be imported by the three PEs.

6.2.2.  Packet Walk-Through

Once the services are set up, the traffic can start flowing. Assuming there are no MAC addresses learned yet and that MAC learning at the access is performed in the data plane in our use case, this is the process followed upon receiving frames from each CE (for example, EVI1).

BUM frame example from CE1:

a.  An ARP request with CE-VID=1 is issued from source MAC CE1-MAC (MAC address coming from CE1 or from a device connected to CE1) to find the MAC address of CE3-IP.

b.  Based on the CE-VID, the frame is identified to be forwarded in the MAC-VRF-1 (EVI1) context. A source MAC lookup is done in the MAC FIB, and the sender's CE1-IP is looked up in the proxy ARP table within the MAC-VRF-1 (EVI1) context. If CE1-MAC/CE1-IP are unknown in both tables, three actions are carried out (assuming the source MAC is accepted by PE1):

    1.  the forwarding state is added for the CE1-MAC associated with the corresponding port and CE-VID;

    2.  the ARP request is snooped and the tuple CE1-MAC/CE1-IP is added to the proxy ARP table; and

    3.  a BGP MAC Advertisement route is triggered from PE1 containing the EVI1 RD and RT, ESI=0, Ethernet-Tag=0, and CE1-MAC/CE1-IP, along with an MPLS Label assigned to MAC-VRF-1 from the PE1 Label space.

    Note that depending on the implementation, the MAC FIB and proxy ARP learning processes can independently send two BGP MAC advertisements instead of one (one containing only the CE1-MAC and another one containing CE1-MAC/CE1-IP).

    Since we assume a MAC forwarding model, a label per MAC-VRF is normally allocated and signaled by the three PEs for MAC Advertisement routes.
Based on the RT, the route is imported by PE2 andPE3PE3, and the forwarding state plus the ARP entry are added to their MAC-VRF-1 context. From this moment on, any ARP request from CE2 or CE3 destined toCE1-IP,CE1-IP can be directly replied to by PE1,PE2PE2, orPE3PE3, and ARP flooding for CE1-IP is not needed in the core.c)c. Since the ARP frame is a broadcast frame, it is forwarded by PE1 using the Inclusivemulticast treeMulticast Tree for EVI1 (CE-VID=1 tag should be kept if translation is required). Depending on the type of tree, the label stack may vary. Forexampleexample, assuming ingress replication, the packet is replicated to PE2 and PE3 with the downstream allocated labels and the P2P LSP transport labels. No other labels are added to the stack.d)d. Assuming PE1 is the DF for EVI1 on ESI12, the frame is locally replicated to CE2.e)e. The MPLS-encapsulated frame gets to PE2 and PE3. Since PE2 isnon- DFnon-DF for EVI1 on ESI12, and there is no other CE connected to PE2, the frame is discarded. At PE3, the frame isde-encapsulated, CE- VID translatedde- encapsulated and the CE-VID is translated, ifneededneeded, and forwarded to CE3. Any other type of BUM frame from CE1 would follow the same procedures. BUM frames from CE3 would follow the same procedures too.(2)BUM frame example from CE2:a)a. AnARP-requestARP request with CE-VID=1 is issued from source MAC CE2-MAC to find the MAC address of CE3-IP.b)b. CE2 will hash the frame and will forward ittoto, forexampleexample, PE2. Based on the CE-VID, the frame is identified to be forwarded in the EVI1 context. A source MAC lookup is done in the MAC FIB and the sender's CE2-IP is looked up in theproxy-ARPproxy ARP table within the MAC-VRF-1 context. If both are unknown, three actions are carried out (assuming the source MAC is accepted by PE2):(1) Forwarding1. the forwarding state is added for the CE2-MAC associatedtowith the corresponding LAG/ESI andCE-VID, (2)CE-VID; 2. 
the ARP request is snooped and the tuple CE2-MAC/CE2-IP is added to the proxy ARP table; and 3. a BGP MAC Advertisement route is triggered from PE2 containing the EVI1 RD and RT, ESI=12, Ethernet-Tag=0, and CE2-MAC/CE2-IP, along with an MPLS Label assigned from the PE2 label space (one label per MAC-VRF). Again, depending on the implementation, the MAC FIB and proxy ARP learning processes can independently send two BGP MAC advertisements instead of one.

   Note that since PE3 is not part of ESI12, it will install the forwarding state for CE2-MAC as long as the A-D routes for ESI12 are also active on PE3. On the contrary, PE1 is part of ESI12; therefore, PE1 will not modify the forwarding state for CE2-MAC if it has previously learned CE2-MAC locally attached to ESI12. Otherwise, it will add the forwarding state for CE2-MAC associated with the local ESI12 port.

c. Assuming PE2 does not have the ARP information for CE3-IP yet, and since the ARP is a broadcast frame and PE2 is the non-DF for EVI1 on ESI12, the frame is forwarded by PE2 in the Inclusive Multicast Tree for EVI1, thus adding the ESI Label for ESI12 at the bottom of the stack. The ESI Label has been previously allocated and signaled by the A-D routes for ESI12. Note that, as per [RFC7432], if the result of the CE2 hashing is different and the frame is sent to PE1, PE1 should add the ESI Label too (PE1 is the DF for EVI1 on ESI12).

d. The MPLS-encapsulated frame gets to PE1 and PE3. PE1 de-encapsulates the Inclusive Multicast Tree Label(s) and, based on the ESI Label at the bottom of the stack, decides not to forward the frame to ESI12. It will pop the ESI Label and will replicate the frame to CE1, since CE1 is not part of the ESI identified by the ESI Label.
   At PE3, the Inclusive Multicast Tree Label is popped and the frame is forwarded to CE3. If a P2MP LSP is used as the Inclusive Multicast Tree for EVI1, PE3 will find an ESI Label after popping the P2MP LSP Label. The ESI Label will simply be popped, since CE3 is not part of ESI12.

Unicast frame example from CE3 to CE1:

a. A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC and destination MAC CE1-MAC (we assume PE3 has previously resolved an ARP request from CE3 to find the MAC of CE1-IP and has added CE3-MAC/CE3-IP to its proxy ARP table).

b. Based on the CE-VID, the frame is identified to be forwarded in the EVI1 context. A source MAC lookup is done in the MAC FIB within the MAC-VRF-1 context and this time, since we assume CE3-MAC is known, no further actions are carried out as a result of the source lookup. A destination MAC lookup is performed next and the label stack associated with the MAC CE1-MAC is found (including the label associated with MAC-VRF-1 in PE1 and the P2P LSP Label to get to PE1). The unicast frame is then encapsulated and forwarded to PE1.

c. At PE1, the packet is identified to be part of EVI1 and a destination MAC lookup is performed in the MAC-VRF-1 context. The labels are popped and the frame is forwarded to CE1 with CE-VID=1.

Unicast frames from CE1 to CE3 or from CE2 to CE3 follow the same procedures described above.

Unicast frame example from CE3 to CE2:

a. A unicast frame with CE-VID=1 is issued from source MAC CE3-MAC and destination MAC CE2-MAC (we assume PE3 has previously resolved an ARP request from CE3 to find the MAC of CE2-IP).

b. Based on the CE-VID, the frame is identified to be forwarded in the MAC-VRF-1 context. We assume CE3-MAC is known.
   A destination MAC lookup is performed next and PE3 finds CE2-MAC associated with PE2 on ESI12, an Ethernet Segment for which PE3 has two active A-D routes per ESI (from PE1 and PE2) and two active A-D routes for EVI1 (from PE1 and PE2). Based on a hashing function for the frame, PE3 may decide to forward the frame using the label stack associated with PE2 (label received from the MAC Advertisement route) or the label stack associated with PE1 (label received from the A-D route per EVI for EVI1). Either way, the frame is encapsulated and sent to the remote PE.

c. At PE2 (or PE1), the packet is identified to be part of EVI1 based on the bottom label, and a destination MAC lookup is performed. At either PE (PE2 or PE1), the FIB lookup yields a local ESI12 port to which the frame is sent.

Unicast frames from CE1 to CE2 follow the same procedures.

6.3. VLAN Bundle Service Procedures

Instead of using VLAN-based interfaces, the operator can choose to implement VLAN bundle interfaces to carry the traffic for the 4k CE-VIDs among CE1, CE2, and CE3. If that is the case, the 4k CE-VIDs can be mapped to the same EVI (for example, EVI200) at each PE. The main advantages of this approach are the low control-plane overhead (reduced number of routes and labels) and the ease of provisioning. They come at the expense of no control over the customer broadcast domains, i.e., a single Inclusive Multicast Tree for all the CE-VIDs and no CE-VID translation in the provider network.

6.3.1. Service Startup Procedures

As soon as the EVI200 is created in PE1, PE2, and PE3, the following control-plane actions are carried out:

o Flooding tree setup per EVI (one route): Each PE will send one Inclusive Multicast Ethernet Tag route per EVI (hence, only one route per PE) so that the flooding tree per EVI can be set up.
  Note that ingress replication or P2MP LSPs can optionally be signaled in the PMSI Tunnel attribute and the corresponding tree can be created.

o Ethernet A-D routes per ESI (one route for ESI12): A single A-D route for ESI12 will be issued from PE1 and PE2. This route will include a single RT (RT for EVI200), an ESI Label extended community with the active-standby flag set to zero (all-active multihoming type), and an ESI Label different from zero (used by the non-DF for split-horizon functions). This route will be imported by the three PEs, since the RT matches the locally configured EVI200 RT. The A-D routes per ESI will be used for fast convergence and split-horizon functions, as described in [RFC7432].

o Ethernet A-D routes per EVI (one route): An A-D route (EVI200) will be sent by PE1 and PE2 for ESI12. This route includes the EVI200 RT and an MPLS Label to be used by PE3 for the aliasing function. This route will be imported by the three PEs.

6.3.2. Packet Walk-Through

The packet walk-through for the VLAN bundle case is similar to the one described for EVI1 in the VLAN-based case except for the way the CE-VID is handled by the ingress PE and the egress PE:

o No VLAN translation is allowed and the CE-VIDs are kept untouched from CE to CE, i.e., the ingress CE-VID must be kept at the imposition PE and at the disposition PE.

o The frame is identified to be forwarded in the MAC-VRF-200 context as long as its CE-VID belongs to the VLAN bundle defined in the PE1/PE2/PE3 port to CE1/CE2/CE3. Our example is a special VLAN bundle case since the entire CE-VID range is defined in the ports; therefore, any CE-VID would be part of EVI200.

Please refer to Section 6.2.2 for more information about the control-plane and forwarding-plane interaction for BUM and unicast traffic from the different CEs.
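
The difference in ingress classification between the VLAN-based and VLAN bundle interfaces can be illustrated with a minimal sketch. The function names, the dictionary-based mapping, and the assumed usable CE-VID range (1 to 4094) are hypothetical and chosen only for illustration; they are not taken from [RFC7432] or from any implementation.

```python
from typing import Dict, Optional, Tuple


def classify_vlan_based(ce_vid: int, vid_to_evi: Dict[int, int]) -> Optional[int]:
    """VLAN-based interface: each CE-VID is mapped to its own EVI (1:1).

    Returns the EVI, or None if the CE-VID is not provisioned on the port.
    """
    return vid_to_evi.get(ce_vid)


def classify_vlan_bundle(
    ce_vid: int, bundle: range, evi: int
) -> Optional[Tuple[int, int]]:
    """VLAN bundle interface: every CE-VID in the bundle maps to one EVI.

    Returns (evi, ce_vid). The CE-VID is returned unchanged because no
    translation is allowed. Returns None if the CE-VID is outside the bundle.
    """
    if ce_vid not in bundle:
        return None
    return evi, ce_vid


# Assumption: the whole usable CE-VID range is defined on the port.
ALL_CE_VIDS = range(1, 4095)

print(classify_vlan_based(1, {1: 1, 2: 2}))          # CE-VID 1 -> its own EVI
print(classify_vlan_bundle(1, ALL_CE_VIDS, 200))     # any CE-VID -> EVI200
print(classify_vlan_bundle(4094, ALL_CE_VIDS, 200))  # CE-VID kept untouched
```

In the bundle case the lookup only decides membership; the CE-VID itself is carried end to end, which is why a single MAC-VRF and a single flooding tree suffice.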
6.4. VLAN-Aware Bundling Service Procedures

The last potential service type analyzed in this document is VLAN-aware bundling. When this type of service interface is used to carry the 4k CE-VIDs among CE1, CE2, and CE3, all the CE-VIDs will be mapped to the same EVI (for example, EVI300). The difference, compared to the VLAN bundle service type in the previous section, is that each incoming CE-VID will also be mapped to a different "normalized" Ethernet Tag in addition to EVI300. If no translation is required, the Ethernet Tag will match the CE-VID. Otherwise, a translation between CE-VID and Ethernet Tag will be needed at the imposition PE and at the disposition PE. The main advantage of this approach is the ability to control customer broadcast domains while providing a single EVI to the customer.

6.4.1. Service Startup Procedures

As soon as the EVI300 is created in PE1, PE2, and PE3, the following control-plane actions are carried out:

o Flooding tree setup per EVI per Ethernet Tag (4k routes): Each PE will send one Inclusive Multicast Ethernet Tag route per EVI and per Ethernet Tag (hence, 4k routes per PE) so that the flooding tree per customer broadcast domain can be set up. Note that ingress replication or P2MP LSPs can optionally be signaled in the PMSI Tunnel attribute and the corresponding tree can be created. In the described use case, since all the CE-VIDs and Ethernet Tags are defined on the three PEs, multicast tree aggregation might make sense in order to save forwarding states.

o Ethernet A-D routes per ESI (one route for ESI12): A single A-D route for ESI12 will be issued from PE1 and PE2.
  This route will include a single RT (RT for EVI300), an ESI Label extended community with the active-standby flag set to zero (all-active multihoming type), and an ESI Label different from zero (used by the non-DF for split-horizon functions). This route will be imported by the three PEs, since the RT matches the locally configured EVI300 RT. The A-D routes per ESI will be used for fast convergence and split-horizon functions, as described in [RFC7432].

o Ethernet A-D routes per EVI: A single A-D route (EVI300) may be sent by PE1 and PE2 for ESI12 in case no CE-VID translation is required. This route includes the EVI300 RT and an MPLS Label to be used by PE3 for the aliasing function. This route will be imported by the three PEs. Note that if CE-VID translation is required, an A-D per EVI route is required per Ethernet Tag (4k).

6.4.2. Packet Walk-Through

The packet walk-through for the VLAN-aware case is similar to the one described before. Compared to the other two cases, VLAN-aware services allow for CE-VID translation and for an N:1 CE-VID to EVI mapping. Neither of the other two service interfaces supports both at once. Some differences compared to the packet walk-through described in Section 6.2.2 are as follows:

o At the ingress PE, the frames are identified to be forwarded in the EVI300 context as long as their CE-VID belongs to the range defined in the PE port to the CE. In addition, CE-VID=x is mapped to a "normalized" Ethernet-Tag=y at the MAC-VRF-300 (where x and y might be equal if no translation is needed). Qualified learning is now required (a different bridge table is allocated within MAC-VRF-300 for each Ethernet Tag). Potentially, the same MAC could be learned in two different Ethernet Tag bridge tables of the same MAC-VRF.
o Any new locally learned MAC on the MAC-VRF-300/Ethernet-Tag=y interface is advertised by the ingress PE in a MAC Advertisement route, now using the Ethernet Tag field (Ethernet-Tag=y) so that the remote PE learns the MAC associated with the MAC-VRF-300/Ethernet-Tag=y FIB. Note that the Ethernet Tag field is not used in advertisements of MACs learned on VLAN-based or VLAN bundle service interfaces.

o At the ingress PE, BUM frames are sent to the corresponding flooding tree for the particular Ethernet Tag they are mapped to. Each individual Ethernet Tag can have a different flooding tree within the same EVI300. For instance, Ethernet-Tag=y can use ingress replication to get to the remote PEs, whereas Ethernet-Tag=z can use a P2MP LSP.

o At the egress PE, Ethernet-Tag=y (for a given broadcast domain within MAC-VRF-300) can be translated to egress CE-VID=x. That is not possible for VLAN bundle interfaces. It is possible for VLAN-based interfaces, but it requires a separate MAC-VRF per CE-VID.

7. MPLS-Based Forwarding Model Use Case

EVPN supports an alternative forwarding model, usually referred to as the MPLS-based forwarding or disposition model, as opposed to the MAC-based forwarding or disposition model described in Section 6. Using the MPLS-based forwarding model instead of the MAC-based model might have an impact on the following:

o the number of forwarding states required; and

o the FIB where the forwarding states are handled (MAC FIB or MPLS Label FIB (LFIB)).

The MPLS-based forwarding model avoids the destination MAC lookup at the egress PE MAC FIB at the expense of increasing the number of next-hop forwarding states at the egress MPLS LFIB.
This also has an impact on the control plane and the label allocation model, since an MPLS-based disposition PE must send as many routes and labels as required next-hops in the egress MAC-VRF. This concept is equivalent to the forwarding models supported in IP-VPNs at the egress PE, where an IP lookup in the IP-VPN FIB may or may not be necessary depending on the available next-hop forwarding states in the LFIB.

The following subsections highlight the impact on the control-plane and data-plane procedures described in Section 6 when an MPLS-based forwarding model is used.

Note that both forwarding models are compatible and interoperable in the same network. The implementation of either model in each PE is a decision local to the PE node.

7.1. Impact of MPLS-Based Forwarding on the EVPN Network Startup

The MPLS-based forwarding model has no impact on the procedures explained in Section 6.1.

7.2. Impact of MPLS-Based Forwarding on the VLAN-Based Service Procedures

Compared to the MAC-based forwarding model, the MPLS-based forwarding model has no impact in terms of the number of routes when all the service interfaces are VLAN-based. The differences for the use case described in this document are summarized in the following list:

o Flooding tree setup per EVI (4k routes per PE): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per ESI (one set of routes for ESI12 per PE): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per EVI (4k routes per PE/ESI): There is no impact compared to the MAC-based model.
o MAC Advertisement routes: Instead of allocating and advertising the same MPLS Label for all the new MACs locally learned on the same MAC-VRF, a different label must be advertised per CE next-hop or MAC so that no MAC FIB lookup is needed at the egress PE. In general, this means that a different label (at least per CE) must be advertised, although the PE can decide to implement a label per MAC if more granularity (hence, less scalability) is required in terms of forwarding states. For example, if CE2 sends traffic from two different MACs (CE2-MAC1 and CE2-MAC2) to PE1, the same MPLS Label=x can be reused for both MAC advertisements, since they both share the same source ESI12. It is up to the PE1 implementation to use a different label per individual MAC within the same ES (even if only one label per ESI is enough).

o PE1, PE2, and PE3 will not add forwarding states to the MAC FIB upon learning new local CE MAC addresses on the data plane but will rather add forwarding states to the MPLS LFIB.

7.3. Impact of MPLS-Based Forwarding on the VLAN Bundle Service Procedures

Compared to the MAC-based forwarding model, the MPLS-based forwarding model has no impact in terms of the number of routes when all the service interfaces are of the VLAN bundle type. The differences for the use case described in this document are summarized in the following list:

o Flooding tree setup per EVI (one route): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per ESI (one route for ESI12 per PE): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per EVI (one route per PE/ESI): There is no impact compared to the MAC-based model since no VLAN translation is required.
o MAC Advertisement routes: Instead of allocating and advertising the same MPLS Label for all the new MACs locally learned on the same MAC-VRF, a different label must be advertised per CE next-hop or MAC so that no MAC FIB lookup is needed at the egress PE. In general, this means that a different label (at least per CE) must be advertised, although the PE can decide to implement a label per MAC if more granularity (hence, less scalability) is required in terms of forwarding states. It is up to the PE1 implementation to use a different label per individual MAC within the same ES (even if only one label per ESI is enough).

o PE1, PE2, and PE3 will not add forwarding states to the MAC FIB upon learning new local CE MAC addresses on the data plane but will rather add forwarding states to the MPLS LFIB.

7.4. Impact of MPLS-Based Forwarding on the VLAN-Aware Service Procedures

Compared to the MAC-based forwarding model, the MPLS-based forwarding model has no impact in terms of the number of A-D routes when all the service interfaces are of the VLAN-aware bundle type. The differences for the use case described in this document are summarized in the following list:

o Flooding tree setup per EVI (4k routes per PE): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per ESI (one route for ESI12 per PE): There is no impact compared to the MAC-based model.

o Ethernet A-D routes per EVI (1 route per ESI or 4k routes per PE/ESI): PE1 and PE2 may send one route per ESI if no CE-VID translation is needed. However, 4k routes are normally sent for EVI300, one per <ESI, Ethernet Tag ID> tuple. This allows the egress PE to find all the forwarding information in the MPLS LFIB and even support Ethernet Tag to CE-VID translation at the egress.
o MAC Advertisement routes: Instead of allocating and advertising the same MPLS Label for all the new MACs locally learned on the same MAC-VRF, a different label must be advertised per CE next-hop or MAC so that no MAC FIB lookup is needed at the egress PE. In general, this means that a different label (at least per CE) must be advertised, although the PE can decide to implement a label per MAC if more granularity (hence, less scalability) is required in terms of forwarding states. It is up to the PE1 implementation to use a different label per individual MAC within the same ES. Note that the Ethernet Tag will be set to a non-zero value for the MAC Advertisement routes. The same MAC address can be announced with a different Ethernet Tag value. This will make the advertising PE install two different forwarding states in the MPLS LFIB.

o PE1, PE2, and PE3 will not add forwarding states to the MAC FIB upon learning new local CE MAC addresses on the data plane but will rather add forwarding states to the MPLS LFIB.

8. Comparison between MAC-Based and MPLS-Based Egress Forwarding Models

Both forwarding models are possible in a network deployment, and each one has its own trade-offs.

Both forwarding models can save A-D routes per EVI when VLAN-aware bundling services are deployed and no CE-VID translation is required. While this saves a significant number of routes, customers normally require CE-VID translation; hence, we assume an A-D per EVI route per <ESI, Ethernet Tag> is needed.

The MAC-based model saves a significant number of MPLS Labels compared to the MPLS-based forwarding model. All the MACs and A-D routes for the same EVI can signal the same MPLS Label, saving labels from the local PE space. A MAC FIB lookup at the egress PE is required in order to do so.
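
The label consumption of the two models, as described in Section 7 and in the paragraph above, can be sketched as follows. This is only an illustration under stated assumptions: the LabelAllocator class, its method names, and the starting label value are hypothetical, and the MPLS-based variant shown allocates one label per CE per MAC-VRF (the coarsest granularity the text allows), not one label per MAC.

```python
from typing import Dict, Tuple


class LabelAllocator:
    """Hypothetical per-PE label allocator used only for illustration."""

    def __init__(self, first_label: int = 1000) -> None:
        self._next = first_label
        self._by_key: Dict[Tuple, int] = {}

    def _get(self, key: Tuple) -> int:
        # Allocate a new label only the first time a key is seen.
        if key not in self._by_key:
            self._by_key[key] = self._next
            self._next += 1
        return self._by_key[key]

    def mac_based(self, mac_vrf: int, ce: str, mac: str) -> int:
        """MAC-based model: one label per MAC-VRF, shared by all MACs."""
        return self._get(("mac-vrf", mac_vrf))

    def mpls_based(self, mac_vrf: int, ce: str, mac: str) -> int:
        """MPLS-based model: one label per next-hop CE per MAC-VRF."""
        return self._get(("ce", mac_vrf, ce))

    def consumed(self) -> int:
        return len(self._by_key)


# Three MACs learned locally in MAC-VRF-1: one behind CE1, two behind CE2.
learned = [(1, "CE1", "CE1-MAC"), (1, "CE2", "CE2-MAC1"), (1, "CE2", "CE2-MAC2")]

mac_model = LabelAllocator()
mpls_model = LabelAllocator()
for entry in learned:
    mac_model.mac_based(*entry)
    mpls_model.mpls_based(*entry)

print(mac_model.consumed(), mpls_model.consumed())
```

With this input the MAC-based allocator hands out a single label, while the MPLS-based allocator hands out one label per CE, and CE2-MAC1 and CE2-MAC2 share the label allocated for CE2.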
The MPLS-based forwarding model can save forwarding states at the egress PEs if labels per next-hop CE (as opposed to per MAC) are implemented. No egress MAC lookup is required. In exchange, a different label per next-hop CE per MAC-VRF is consumed, as opposed to a single label per MAC-VRF.

Table 2 summarizes the resource implementation details of both models.

+-----------------------------+-----------------+------------------+
| Resources                   | MAC-Based Model | MPLS-Based Model |
+-----------------------------+-----------------+------------------+
| MPLS Labels Consumed        | 1 per MAC-VRF   | 1 per CE/EVI     |
| Egress PE Forwarding States | 1 per MAC       | 1 per Next-Hop   |
| Egress PE Lookups           | 2 (MPLS+MAC)    | 1 (MPLS)         |
+-----------------------------+-----------------+------------------+

Table 2: Resource Comparison between MAC-Based and MPLS-Based Models

The egress forwarding model is an implementation local to the egress PE and is independent of the model supported on the rest of the PEs; i.e., in our use case, PE1, PE2, and PE3 could have either egress forwarding model without any dependencies.

9. Traffic Flow Optimization

In addition to the procedures described across Sections 3 through 8, EVPN [RFC7432] procedures allow for optimized traffic handling in order to minimize unnecessary flooding across the entire infrastructure. Optimization is provided through specific ARP termination and the ability to block unknown unicast flooding. Additionally, EVPN procedures allow for intelligent, close-to-the-source inter-subnet forwarding and solve the commonly known suboptimal routing problem.
Besides the traffic efficiency, ingress-based inter-subnet forwarding also optimizes packet forwarding rules and implementation at the egress nodes. Details of these procedures are outlined in Sections 9.1 and 9.2.

9.1. Control-Plane Procedures

9.1.1. MAC Learning Options

The fundamental premise of [RFC7432] is the notion of a different approach to MAC address learning compared to traditional IEEE 802.1 bridge learning methods; specifically, EVPN differentiates between data-driven and control-plane-driven learning mechanisms.

Data-driven learning implies that there is no separate communication channel used to advertise and propagate MAC addresses. Rather, MAC addresses are learned through IEEE-defined bridge learning procedures as well as by snooping on DHCP and ARP requests. As different MAC addresses show up on different ports, the Layer 2 (L2) FIB is populated with the appropriate MAC addresses.

Control-plane-driven learning implies a communication channel that could be either a control-plane protocol or a management-plane mechanism. In the context of EVPN, two different learning procedures are defined: local and remote procedures.

o Local learning defines the procedures used for learning the MAC addresses of network elements locally connected to a MAC-VRF. Local learning could be implemented through all three learning procedures: control plane, management plane, and data plane. However, the expectation is that for most of the use cases, local learning through the data plane should be sufficient.

o Remote learning defines the procedures used for learning MAC addresses of network elements remotely connected to a MAC-VRF, i.e., far-end PEs. Remote learning procedures defined in [RFC7432] advocate using only control-plane learning, BGP specifically.
  Through the use of BGP EVPN NLRIs, the remote PE has the capability of advertising all the MAC addresses present in its local FIB.

9.1.2. Proxy ARP/ND

In EVPN, MAC addresses are advertised via the MAC/IP Advertisement route, as discussed in [RFC7432]. Optionally, an IP address can be advertised along with the MAC address advertisement. However, there are certain rules put in place in terms of IP address usage: if the MAC/IP Advertisement route contains an IP address, this particular IP address correlates directly with the advertised MAC address. Such advertisement allows us to build a proxy ARP / Neighbor Discovery (ND) table populated with the IP<->MAC bindings received from all the remote nodes.

Furthermore, based on these bindings, a local MAC-VRF can now provide proxy ARP/ND functionality for all ARP requests and ND solicitations directed to the IP address pool learned through BGP. Therefore, the amount of unnecessary L2 flooding (ARP/ND requests/solicitations in this case) can be further reduced by the introduction of proxy ARP/ND functionality across all EVI MAC-VRFs.

9.1.3. Unknown Unicast Flooding Suppression

Given that all locally learned MAC addresses are advertised through BGP to all remote PEs, suppressing flooding of any unknown unicast traffic towards the remote PEs is a feasible network optimization.

The assumption made in the use case is that any network device that appears on a remote MAC-VRF will somehow signal its presence to the network. This signaling can be done through, for example, gratuitous ARPs. Once the remote PE acknowledges the presence of the node in the MAC-VRF, it will do two things: install its MAC address in its local FIB and advertise this MAC address to all other BGP speakers via EVPN NLRI. Therefore, we can assume that any active MAC address is propagated and learned through the entire EVI.
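
The interaction between MAC learning and unknown unicast flooding suppression can be sketched as follows. This is a minimal model under stated assumptions: the MacVrf class, its method names, the string results, and the MAC address values are hypothetical, and BGP is reduced to a direct function call between two objects.

```python
from typing import Dict, Tuple


class MacVrf:
    """Hypothetical, simplified MAC-VRF used only for illustration."""

    def __init__(self, name: str, flood_unknown_unicast: bool = True) -> None:
        self.name = name
        self.flood_unknown_unicast = flood_unknown_unicast
        self.fib: Dict[str, str] = {}  # MAC -> local port or remote PE

    def learn_local(self, mac: str, port: str) -> Tuple[str, str]:
        """Install a locally learned MAC and return the (PE, MAC) pair
        that would be advertised in a MAC Advertisement route."""
        self.fib[mac] = port
        return (self.name, mac)

    def import_remote(self, advertisement: Tuple[str, str]) -> None:
        """Install a MAC learned through the control plane."""
        remote_pe, mac = advertisement
        self.fib[mac] = remote_pe

    def forward(self, dst_mac: str) -> str:
        next_hop = self.fib.get(dst_mac)
        if next_hop is not None:
            return f"unicast via {next_hop}"
        if self.flood_unknown_unicast:
            return "flooded to remote PEs"
        return "not flooded to remote PEs"


pe3 = MacVrf("PE3")
pe1 = MacVrf("PE1", flood_unknown_unicast=False)

# A device behind CE3 signals its presence; PE3 learns and advertises it.
pe1.import_remote(pe3.learn_local("00:00:5e:00:53:03", "port-to-CE3"))

print(pe1.forward("00:00:5e:00:53:03"))  # known MAC
print(pe1.forward("00:00:5e:00:53:99"))  # unknown MAC, flooding suppressed
```

The sketch relies on the same assumption as the text: a MAC that is missing from the FIB is treated as not present in the EVI, which only holds if every attached device signals its presence or is learned through the control or management plane.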
Given that MAC addresses become prepopulated once nodes are alive on the network, there is no need to flood any unknown unicast towards the remote PEs. If the owner of a given destination MAC is active, the BGP route will be present in the local RIB and FIB, assuming that the BGP import policies are successfully applied; otherwise, the owner of such destination MAC is not present on the network.

It is worth noting that unknown unicast flooding must not be suppressed unless at least one of the following two statements holds: a) control-plane or management-plane learning is performed throughout the entire EVI for all the MACs or b) all the EVI-attached devices signal their presence when they come up (Gratuitous ARP (GARP) packets or similar).

9.1.4. Optimization of Inter-Subnet Forwarding

In a scenario in which both L2 and L3 services are needed over the same physical topology, some interaction between EVPN and IP-VPN is required. A common way of stitching the two service planes is through the use of an Integrated Routing and Bridging (IRB) interface, which allows for traffic to be either routed or bridged depending on its destination MAC address. If the destination MAC address is that of the IRB interface, traffic needs to be passed through a routing module and potentially be either routed to a remote PE or forwarded to a local subnet. If the destination MAC address is not that of the IRB interface, the MAC-VRF follows standard bridging procedures.

A typical example of EVPN inter-subnet forwarding would be a scenario in which multiple IP subnets are part of a single or multiple EVIs, and they all belong to a single IP-VPN. In such topologies, it is desired that inter-subnet traffic can be efficiently routed without any tromboning effects in the network.
Due to the overlapping physical and service topology in such scenarios, all inter-subnet connectivity will be locally routed through the IRB interface.

In addition to optimizing the traffic patterns in the network, local inter-subnet forwarding also reduces the amount of processing needed to cross the subnets. Through EVPN MAC advertisements, the local PE learns the real destination MAC address associated with the remote IP address and the inter-subnet forwarding can happen locally. When the packet is received at the egress PE, it is directly mapped to an egress MAC-VRF and bypasses any egress IP-VPN processing.

Please refer to [EVPN-INTERSUBNET] for more information about the IP inter-subnet forwarding procedures in EVPN.

9.2. Packet Walk-Through Examples

Assuming that the services are set up according to Figure 1 in Section 3, the following flow optimization processes will take place in terms of creating, receiving, and forwarding packets across the network.

9.2.1. Proxy ARP Example for CE2-to-CE3 Traffic

Using Figure 1 in Section 3, consider EVI400 residing on PE1, PE2, and PE3 connecting CE2 and CE3 networks. Also, consider that PE1 and PE2 are part of the all-active multihoming ES for CE2, and that PE2 is elected designated forwarder for EVI400. We assume that all the PEs implement the proxy ARP functionality in the MAC-VRF-400 context.

In this scenario, PE3 will advertise not only the MAC addresses through the EVPN MAC Advertisement route but also the IP addresses of individual hosts (i.e., /32 prefixes) behind CE3. Upon receiving the EVPN routes, PE1 and PE2 will install the MAC addresses in the MAC-VRF-400 FIB and, based on the associated received IP addresses, PE1 and PE2 can now build a proxy ARP table within the context of MAC-VRF-400.
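
How PE1 and PE2 might derive both tables from the received routes can be sketched as follows. The MacIpRoute and MacVrfTables names, their fields, and the address values (taken from documentation ranges) are hypothetical and only model the IP<->MAC binding; they do not reproduce the actual route encoding defined in [RFC7432].

```python
from typing import Dict, NamedTuple, Optional


class MacIpRoute(NamedTuple):
    """Hypothetical, reduced view of a received MAC/IP Advertisement route."""
    mac: str
    ip: Optional[str]  # the IP address is optional in the route
    next_hop: str      # advertising PE


class MacVrfTables:
    """Hypothetical MAC-VRF state: a MAC FIB plus a proxy ARP table."""

    def __init__(self) -> None:
        self.fib: Dict[str, str] = {}        # MAC -> advertising PE
        self.proxy_arp: Dict[str, str] = {}  # IP  -> MAC

    def import_route(self, route: MacIpRoute) -> None:
        self.fib[route.mac] = route.next_hop
        if route.ip is not None:
            # The IP address correlates directly with the advertised MAC.
            self.proxy_arp[route.ip] = route.mac

    def answer_arp(self, target_ip: str) -> Optional[str]:
        """Return the MAC to reply with, or None if there is no binding
        and the request cannot be answered locally."""
        return self.proxy_arp.get(target_ip)


mac_vrf_400 = MacVrfTables()
mac_vrf_400.import_route(MacIpRoute("00:00:5e:00:53:31", "192.0.2.31", "PE3"))
mac_vrf_400.import_route(MacIpRoute("00:00:5e:00:53:32", None, "PE3"))

print(mac_vrf_400.answer_arp("192.0.2.31"))  # binding learned through BGP
print(mac_vrf_400.answer_arp("192.0.2.32"))  # no binding for this address
```

A route without an IP address still populates the MAC FIB but adds nothing to the proxy ARP table, so ARP requests for that host cannot be answered on its behalf.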
From the forwarding perspective, when a node behind CE2 sends a frame destined to a node behind CE3, it will first send an ARP request to, for example, PE2 (based on the result of the CE2 hashing). Assuming that PE2 has populated its proxy ARP table for all active nodes behind CE3, and that the IP address in the ARP message matches the entry in the table, PE2 will respond to the ARP request with the actual MAC address on behalf of the node behind CE3.

Once the nodes behind CE2 learn the actual MAC address of the nodes behind CE3, all the MAC-to-MAC communications between the two networks will be unicast.

9.2.2. Flood Suppression Example for CE1-to-CE3 Traffic

Using Figure 1 in Section 3, consider EVI500 residing on PE1 and PE3 connecting CE1 and CE3 networks. Consider that both PE1 and PE3 have disabled unknown unicast flooding for this specific EVI context. Once the network devices behind CE3 come online, PE3 will learn their MAC addresses and create local FIB entries for these devices. Note that local FIB entries could also be created through either a control or management plane between PE and CE. Consequently, PE3 will automatically create EVPN Type 2 MAC Advertisement routes and advertise all locally learned MAC addresses. The routes will also include the corresponding MPLS Label.

Given that PE1 automatically learns and installs all MAC addresses behind CE3, its MAC-VRF FIB will already be prepopulated with the respective next-hops and label assignments associated with the MAC addresses behind CE3. As such, as soon as the traffic sent by CE1 to nodes behind CE3 is received into the context of EVI500, PE1 will push the MPLS Label(s) onto the original Ethernet frame and send the packet to the MPLS network.
As usual, once PE3 receives this packet, and depending on the forwarding model, PE3 will either do a next-hop lookup in the EVI500 context or just forward the traffic directly to CE3. In the case that PE1 MAC-VRF-500 does not have a MAC entry for a specific destination that CE1 is trying to reach, PE1 will drop the frame since unknown unicast flooding is disabled.

Based on the assumption that all the MAC entries behind the CEs are prepopulated through gratuitous ARP and/or DHCP requests, if one specific MAC entry is not present in the MAC-VRF-500 FIB on PE1, the owner of that MAC is not alive on the network behind CE3; hence, the traffic can be dropped at PE1 instead of being flooded and consuming network bandwidth.

9.2.3. Optimization of Inter-subnet Forwarding Example for CE3 to CE2 Traffic

Using Figure 1 in Section 3, consider that there is an IP-VPN 666 context residing on PE1, PE2, and PE3, which connects CE1, CE2, and CE3 into a single IP-VPN domain. Also consider that there are two EVIs present on the PEs, EVI600 and EVI60. Each IP subnet is associated with a different MAC-VRF context. Thus, there is a single subnet (subnet 600) between CE1 and CE3 that is established through EVI600. Similarly, there is another subnet (subnet 60) between CE2 and CE3 that is established through EVI60. Since both subnets are part of the same IP-VPN, there is a mapping of each EVI (or individual subnet) to a local IRB interface on the three PEs.

If a node behind CE2 wants to communicate with a node on the same subnet sitting behind CE3, the communication flow will follow the standard EVPN procedures, i.e., FIB lookup within PE1 (or PE2) after adding the corresponding EVPN label to the MPLS Label stack (downstream label allocation from PE3 for EVI60).
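The EVI-to-subnet mapping in this scenario can be sketched as a simple lookup that tells an ingress PE whether a flow stays within one EVI or has to cross EVIs through the local IRB interfaces. The document does not give the actual prefixes of subnet 60 and subnet 600, so the documentation prefixes below, along with all function and type names, are assumptions for illustration.

```python
import ipaddress
from typing import NamedTuple, Optional

# Hypothetical prefixes standing in for subnet 60 and subnet 600,
# both attached to IP-VPN 666 through IRB interfaces.
EVIS = {
    "EVI60": ipaddress.ip_network("198.51.100.0/24"),
    "EVI600": ipaddress.ip_network("203.0.113.0/24"),
}


class Decision(NamedTuple):
    action: str       # "bridge" or "route-via-irb"
    ingress_evi: str
    egress_evi: str


def evi_of(address: str) -> Optional[str]:
    """Return the EVI whose subnet contains the address, if any."""
    ip = ipaddress.ip_address(address)
    for evi, subnet in EVIS.items():
        if ip in subnet:
            return evi
    return None


def classify(src: str, dst: str) -> Decision:
    """Decide whether a flow is bridged in one EVI or routed between EVIs."""
    ingress, egress = evi_of(src), evi_of(dst)
    if ingress is None or egress is None:
        raise ValueError("address outside the subnets attached to IP-VPN 666")
    action = "bridge" if ingress == egress else "route-via-irb"
    return Decision(action, ingress, egress)


same_subnet = classify("198.51.100.10", "198.51.100.20")
cross_subnet = classify("198.51.100.10", "203.0.113.5")
```

With these made-up prefixes, `same_subnet` stays within EVI60, whereas `cross_subnet` is handed from EVI60 to EVI600 on the ingress PE.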
When it comes to crossing the subnet boundaries, the ingress PE implements local inter-subnet forwarding. For example, when a node behind CE2 (EVI60) sends a packet to a node behind CE1 (EVI600), the destination IP address will be in the subnet 600, but the destination MAC address will be the address of the source node's default gateway, which in this case will be an IRB interface on PE1 (connecting EVI60 to IP-VPN 666). Once PE1 sees the traffic destined to its own MAC address, it will route the packet to EVI600, i.e., it will change the source MAC address to the one of the IRB interface in EVI600 and change the destination MAC address to the address belonging to the node behind CE1, which is already populated in the MAC-VRF-600 FIB, either through data- or control-plane learning.

An important optimization to be noted is the local inter-subnet forwarding in lieu of IP-VPN routing. If the node from subnet 60 (behind CE2) is sending a packet to the remote end-node on subnet 600 (behind CE3), the mechanism in place still honors the local inter-subnet (inter-EVI) forwarding. In our use case, therefore, when the node from subnet 60 behind CE2 sends traffic to the node on subnet 600 behind CE3, the destination MAC address is the PE1 MAC-VRF-60 IRB MAC address. However, once the traffic locally crosses EVIs to EVI600 (via the IRB interface on PE1), the source MAC address is changed to that of the IRB interface and the destination MAC address is changed to the one advertised by PE3 via EVPN and already installed in MAC-VRF-600. The rest of the forwarding through PE1 is using the MAC-VRF-600 forwarding context and label space.

Another very relevant optimization is due to the fact that traffic between PEs is forwarded through EVPN rather than through IP-VPN.
In the example described above for traffic from EVI60 on CE2 to EVI600 on CE3, there is no need for IP-VPN processing on the egress PE3. Traffic is forwarded either to the EVI600 context in PE3 for further MAC lookup and next-hop processing or directly to the node behind CE3, depending on the egress forwarding model being used.

10. Security Considerations

Please refer to the "Security Considerations" section in [RFC7432].

The standards produced by the SIDR Working Group address secure route origin authentication (e.g., RFCs 6480 through 6493) and route advertisement security (e.g., RFCs 8205 through 8211). They protect the integrity and authenticity of IP address advertisements and ASN/IP prefix bindings. This document and [RFC7432] use BGP to convey other information (e.g., MAC addresses); thus, the protections offered by the SIDR WG RFCs are not applicable in this context.

11. IANA Considerations

This document does not require any IANA actions.

12. References

12.1. Normative References

[RFC7209] Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N., Henderickx, W., and A. Isaac, "Requirements for Ethernet VPN (EVPN)", RFC 7209, DOI 10.17487/RFC7209, May 2014, <https://www.rfc-editor.org/info/rfc7209>.

[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <https://www.rfc-editor.org/info/rfc7432>.

12.2. Informative References

[RFC4761] Kompella, K., Ed. and Y.
Rekhter, Ed., "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007, <https://www.rfc-editor.org/info/rfc4761>.

[RFC4762] Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007, <https://www.rfc-editor.org/info/rfc4762>.

[RFC6074] Rosen, E., Davie, B., Radoaca, V., and W. Luo, "Provisioning, Auto-Discovery, and Signaling in Layer 2 Virtual Private Networks (L2VPNs)", RFC 6074, DOI 10.17487/RFC6074, January 2011, <https://www.rfc-editor.org/info/rfc6074>.

[EVPN-INTERSUBNET] Sajassi, A., Salam, S., Thoria, S., Drake, J., Rabadan, J., and L. Yong, "Integrated Routing and Bridging in EVPN", Work in Progress, draft-ietf-bess-evpn-inter-subnet-forwarding-03, February 2017.

Acknowledgments

The authors want to thank Giles Heron for his detailed review of the document. We also thank Stefan Plug and Eric Wunan for their comments.

Contributors

The following people contributed substantially to the content of this document and should be considered coauthors:

Florin Balus
Keyur Patel
Aldrin Isaac
Truman Boyes

Authors' Addresses

Jorge Rabadan (editor)
Nokia
777 E. Middlefield Road
Mountain View, CA 94043
United States of America
Email: jorge.rabadan@nokia.com

Senad Palislamovic
Nokia
Email: senad.palislamovic@nokia.com

Wim Henderickx
Nokia
Copernicuslaan 50
2018 Antwerp
Belgium
Email: wim.henderickx@nokia.com

Ali Sajassi
Cisco
Email: sajassi@cisco.com

James Uttaro
AT&T
Email: uttaro@att.com