<?xml version='1.0' encoding='UTF-8'?>
<!-- $Id: draft-ietf-bess-evpn-irb-mcast.xml,v 1.4 2020/12/23 19:26:29 zzhang Exp $ -->
<!DOCTYPE rfc [
<!ENTITY nbsp "&#160;">
<!ENTITY zwsp "&#8203;">
<!ENTITY nbhy "&#8209;">
<!ENTITY RFC2119 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY RFC3376 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3376.xml">
<!ENTITY RFC3810 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3810.xml">
<!ENTITY RFC4541 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4541.xml">
<!ENTITY RFC3032 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3032.xml">
<!ENTITY RFC4360 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4360.xml">
<!ENTITY RFC4364 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4364.xml">
<!ENTITY RFC6513 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6513.xml">
<!ENTITY RFC6514 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6514.xml">
<!ENTITY RFC6625 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6625.xml">
<!ENTITY RFC7153 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7153.xml">
<!ENTITY RFC7432 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7432.xml">
<!ENTITY RFC7606 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7606.xml">
<!ENTITY RFC7716 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7716.xml">
<!ENTITY RFC7761 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7761.xml">
<!ENTITY RFC8174 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml">
<!ENTITY RFC8296 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8296.xml">
<!ENTITY RFC8584 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8584.xml">
<!ENTITY RFC9135 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9135.xml">
<!ENTITY 
RFC9136 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9136.xml">
<!ENTITY RFC9251 SYSTEM "https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9251.xml">
<!ENTITY wj "&#8288;">
]>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="std" docName="draft-ietf-bess-evpn-irb-mcast-11" number="9625" consensus="true" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="6" symRefs="true" sortRefs="true" version="3">
<front>
<title abbrev="EVPN OISM Forwarding">EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding</title>
<seriesInfo name="RFC" value="9625"/>
<author fullname="Wen Lin" initials="W." surname="Lin">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>10 Technology Park Drive</street>
<city>Westford</city>
<region>MA</region>
<code>01886</code>
<country>United States of America</country>
</postal>
<email>wlin@juniper.net</email>
</address>
</author>
<author fullname="Zhaohui Zhang" initials="Z." surname="Zhang">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>10 Technology Park Drive</street>
<city>Westford</city>
<region>MA</region>
<code>01886</code>
<country>United States of America</country>
</postal>
<email>zzhang@juniper.net</email>
</address>
</author>
<author fullname="John Drake" initials="J." surname="Drake">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>1194 N. 
Mathilda Ave</street>
<city>Sunnyvale</city>
<region>CA</region>
<code>94089</code>
<country>United States of America</country>
</postal>
<email>jdrake@juniper.net</email>
</address>
</author>
<author fullname="Eric C. Rosen" initials="E." surname="Rosen" role="editor">
<organization>Juniper Networks, Inc.</organization>
<address>
<postal>
<street>10 Technology Park Drive</street>
<city>Westford</city>
<region>MA</region>
<code>01886</code>
<country>United States of America</country>
</postal>
<email>erosen52@gmail.com</email>
</address>
</author>
<author fullname="Jorge Rabadan" initials="J." surname="Rabadan">
<organization>Nokia</organization>
<address>
<postal>
<street>777 E. Middlefield Road</street>
<city>Mountain View</city>
<region>CA</region>
<code>94043</code>
<country>United States of America</country>
</postal>
<email>jorge.rabadan@nokia.com</email>
</address>
</author>
<author fullname="Ali Sajassi" initials="A." surname="Sajassi">
<organization>Cisco Systems</organization>
<address>
<postal>
<street>170 West Tasman Drive</street>
<city>San Jose</city>
<region>CA</region>
<code>95134</code>
<country>United States of America</country>
</postal>
<email>sajassi@cisco.com</email>
</address>
</author>
<date month="August" year="2024"/>
<area>RTG</area>
<workgroup>bess</workgroup>
<keyword>OISM</keyword>
<keyword>PEG</keyword>
<keyword>MEG</keyword>
<keyword>SBD</keyword>
<abstract>
<t>
Ethernet VPN (EVPN) provides a service that allows a single Local Area Network (LAN), comprising a single IP subnet, to be divided into multiple segments. Each segment may be located at a different site, and the segments are interconnected by an IP or MPLS backbone. 
Intra-subnet traffic (either unicast or multicast) always appears to the end users to be bridged, even when it is actually carried over the IP or MPLS backbone. When a single tenant owns multiple such LANs, EVPN also allows IP unicast traffic to be routed between those LANs. This document specifies new procedures that allow inter-subnet IP multicast traffic to be routed among the LANs of a given tenant while still making intra-subnet IP multicast traffic appear to be bridged. These procedures can provide optimal routing of the inter-subnet multicast traffic and do not require any such traffic to egress a given router and then ingress that same router. These procedures also accommodate IP multicast traffic that originates or is destined to be external to the EVPN domain.
</t>
</abstract>
</front>
<middle>
<section anchor="introduction" numbered="true" toc="default">
<name>Introduction</name>
<section anchor="terminology" numbered="true" toc="default">
<name>Terminology</name>
<t>
In this document, we make frequent use of the following terminology:
</t>
<dl newline="false" spacing="normal">
<dt>OISM:</dt>
<dd>Optimized Inter-Subnet Multicast. EVPN Provider Edges (PEs) that follow the procedures of this document will be known as "OISM PEs". EVPN PEs that do not follow the procedures of this document will be known as "non-OISM PEs".</dd>
<dt>IP Multicast Packet:</dt>
<dd>An IP packet whose IP Destination Address field is a multicast address that is not a link-local address. (Link-local addresses are IPv4 addresses in the 224.0.0.0/24 range and IPv6 addresses in the FF02::/16 range.)</dd>
<dt>IP Multicast Frame:</dt>
<dd>An Ethernet frame whose payload is an IP multicast packet (as defined above).</dd>
<dt>(S,G) Multicast Packet:</dt>
<dd>An IP multicast packet whose Source IP Address field contains S and whose IP Destination Address field contains G.</dd>
<dt>(S,G) Multicast Frame:</dt>
<dd>An IP multicast frame whose payload contains S in its Source IP Address field and G in its IP Destination Address field.</dd>
<dt>EVI:</dt>
<dd>EVPN Instance. An EVPN instance spanning the PE devices participating in that EVPN.</dd>
<dt>BD:</dt>
<dd><t>Broadcast Domain. An emulated Ethernet, such that two systems on the same BD will receive each other's link-local broadcasts.</t>
<t>
Note that EVPN supports service models in which a single EVI contains only one BD and service models in which a single EVI contains multiple BDs. Both types of service models are supported by this document. In all models, a given BD belongs to only one EVI.
</t></dd>
<dt>DF:</dt>
<dd><t>Designated Forwarder. As defined in <xref target="RFC7432" format="default"/>, an Ethernet segment may be multihomed (attached to more than one PE). An Ethernet segment may also contain multiple BDs of one or more EVIs. For each such EVI, one of the PEs attached to the segment becomes that EVI's DF for that segment. Since a BD may belong to only one EVI, we can speak unambiguously of the BD's DF for a given segment.</t>
<t>
When the text makes it clear that we are speaking in the context of a given BD, we will frequently use the term "a segment's DF" to mean the given BD's DF for that segment.
</t></dd>
<dt>AC:</dt>
<dd><t>Attachment Circuit. An AC connects the bridging function of an EVPN PE to an Ethernet segment of a particular BD. ACs are not visible at Layer 3.</t>
<t>
If a given Ethernet segment, attached to a given PE, contains n BDs, we say that the PE has n ACs to that segment.
</t></dd>
<dt>L3 Gateway:</dt>
<dd>An L3 Gateway is a PE that connects an EVPN Tenant Domain to an external multicast domain by performing both the OISM procedures and the Layer 3 multicast procedures of the external domain.</dd>
<dt>PEG:</dt>
<dd>PIM/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant Domain to an external multicast domain whose Layer 3 multicast procedures are those of PIM <xref target="RFC7761" format="default"/>.</dd>
<dt>MEG:</dt>
<dd>MVPN/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant Domain to an external multicast domain whose Layer 3 multicast procedures are those of Multicast VPN (MVPN) <xref target="RFC6513" format="default"/> <xref target="RFC6514" format="default"/>.</dd>
<dt>IPMG:</dt>
<dd>IP Multicast Gateway. A PE that is used for interworking OISM EVPN PEs with non-OISM EVPN PEs.</dd>
<dt>DR:</dt>
<dd>Designated Router. A PE that has special responsibilities for handling multicast on a given BD.</dd>
<dt>FHR:</dt>
<dd>First Hop Router. The FHR is a PIM router <xref target="RFC7761" format="default"/> with special responsibilities. It is the first multicast router to see (S,G) packets from source S, and if G is an Any-Source Multicast (ASM) group, the FHR is responsible for sending PIM Register messages to the PIM Rendezvous Point (RP) for group G.</dd>
<dt>LHR:</dt>
<dd>Last Hop Router. The LHR is a PIM router <xref target="RFC7761" format="default"/> with special responsibilities. Generally, it is attached to a LAN, and it determines whether there are any hosts on the LAN that need to receive a given multicast flow. If so, it creates and sends the PIM Join messages that are necessary to receive the flow.</dd>
<dt>EC:</dt>
<dd>Extended Community. A BGP Extended Communities attribute <xref target="RFC4360" format="default"/> <xref target="RFC7153" format="default"/> is a BGP path attribute that consists of one or more Extended Communities.</dd>
<dt>RT:</dt>
<dd>Route Target. A Route Target is a particular kind of BGP Extended Community. A BGP Extended Community consists of a type field, a sub-type field, and a value field. Certain type/sub-type combinations indicate that a particular Extended Community is an RT. RT1 and RT2 are considered to be the same RT if and only if they have the same type, sub-type, and value fields.</dd>
<dt>C- prefix:</dt>
<dd>In many documents on VPN multicast, the prefix C- appears before any address or wildcard that refers to an address or addresses in a tenant's address space rather than to an address or addresses in the address space of the backbone network. This document omits the C- prefix in many cases where it is clear from the context that the reference is to the tenant's address space.</dd>
</dl>
<t>
This document also assumes familiarity with the terminology of <xref target="RFC4364" format="default"/>, <xref target="RFC6514" format="default"/>, <xref target="RFC7432" format="default"/>, <xref target="RFC7761" format="default"/>, <xref target="RFC9136" format="default"/>, <xref target="RFC9251" format="default"/>, and <xref target="RFC9572" format="default"/>.
</t>
<section>
<name>Requirements Language</name>
<t>
The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.
</t>
</section>
</section>
<section anchor="background" numbered="true" toc="default">
<name>Background</name>
<t>
Ethernet VPN (EVPN) <xref target="RFC7432" format="default"/> provides a Layer 2 VPN (L2VPN) solution, which allows an IP or MPLS backbone provider to offer Ethernet service to a set of customers, known as "tenants".
</t>
<t>
In this section (as well as in <xref target="RFC9135" format="default"/>), we provide some essential background information on EVPN. 
</t>
<section anchor="intro_bd" numbered="true" toc="default">
<name>Segments, Broadcast Domains, and Tenants</name>
<t>
One of the key concepts of EVPN is the Broadcast Domain (BD). A BD is essentially an emulated Ethernet. Each BD belongs to a single tenant. A BD typically consists of multiple Ethernet segments, and each segment may be attached to a different EVPN Provider Edge (EVPN PE) router. EVPN PE routers are often referred to as "Network Virtualization Endpoints (NVEs)". However, this document will use the term "EVPN PE" or, when the context is clear, just "PE".
</t>
<t>
In this document, the term "segment" is used interchangeably with "Ethernet Segment" or "ES", as defined in <xref target="RFC7432" format="default"/>.
</t>
<t>
Attached to each segment are Tenant Systems (TSs). A TS may be any type of system, physical or virtual, host or router, etc., that can attach to an Ethernet.
</t>
<t>
When two TSs are on the same segment, traffic between them does not pass through an EVPN PE. When two TSs are on different segments of the same BD, traffic between them does pass through an EVPN PE.
</t>
<t>
When two TSs, say TS1 and TS2, are on the same BD, the following applies:
</t>
<ul spacing="normal">
<li>
<t>
If TS1 knows the Media Access Control (MAC) address of TS2, TS1 can send unicast Ethernet frames to TS2. TS2 will receive the frames unaltered.
</t>
</li>
<li>
<t>
If TS1 broadcasts an Ethernet frame, TS2 will receive the unaltered frame.
</t>
</li>
<li>
<t>
If TS1 multicasts an Ethernet frame, TS2 will receive the unaltered frame as long as TS2 has been provisioned to receive the Ethernet multicast destination MAC address.
</t>
</li>
</ul>
<t>
When we say that TS2 receives an unaltered frame from TS1, we mean that the frame still contains TS1's MAC address and that no alteration of the frame's payload (and consequently, no alteration of the payload's IP header) has been made.
</t>
<t>
EVPN allows a single segment to be attached to multiple PE routers. This is known as "EVPN multihoming". Suppose a given segment is attached to both PE1 and PE2, and suppose PE1 receives a frame from that segment. It may be necessary for PE1 to send the frame over the backbone to PE2. EVPN has procedures to ensure that such a frame cannot be sent back to its originating segment by PE2. This is particularly important for multicast, because a frame arriving at PE1 from a given segment will already have been seen by all the systems on that segment that need to see it. If the frame were sent back to the originating segment by PE2, receivers on that segment would receive the packet twice. Even worse, the frame might be sent back to PE1, which could cause an infinite loop.
</t>
</section>
<section anchor="inter_bd" numbered="true" toc="default">
<name>Inter-BD (Inter-Subnet) IP Traffic</name>
<t>
If a given tenant has multiple BDs, the tenant may wish to allow IP communication among these BDs. Such a set of BDs is known as an "EVPN Tenant Domain" or just a "Tenant Domain".
</t>
<t>
If tenant systems TS1 and TS2 are not in the same BD, then they do not receive unaltered Ethernet frames from each other. In order for TS1 to send traffic to TS2, TS1 encapsulates an IP datagram inside an Ethernet frame and uses Ethernet to send that frame to an IP router. The router decapsulates the IP datagram, does the IP processing, and re-encapsulates the datagram for Ethernet. The MAC Source Address field now has the MAC address of the router, not of TS1. 
The TTL field of the IP datagram should be decremented by exactly 1, even if the frame needs to be sent from one PE to another. The structure of the provider's backbone is thus hidden from the tenants.
</t>
<t>
EVPN accommodates the need for inter-BD communication within a Tenant Domain by providing an integrated L2/L3 service for unicast IP traffic. EVPN's Integrated Routing and Bridging (IRB) functionality is specified in <xref target="RFC9135" format="default"/>. Each BD in a Tenant Domain is assumed to be a single IP subnet, and each IP subnet within a given Tenant Domain is assumed to be a single BD. EVPN's IRB functionality allows IP traffic to travel from one BD to another and ensures that proper IP processing (e.g., TTL decrement) is done.
</t>
<t>
A brief overview of IRB, including the notion of an IRB interface, can be found in <xref target="irb" format="default"/>. As explained there, an IRB interface is a sort of virtual interface connecting an L3 routing instance to a BD. A BD may have multiple Attachment Circuits (ACs) to a given PE, where each AC connects to a different Ethernet segment of the BD. However, these ACs are not visible to the L3 routing function; from the perspective of an L3 routing instance, a PE has just one interface to each BD, viz., the IRB interface for that BD.
</t>
<t>
In this document, when traffic is routed out of an IRB interface, we say it is sent down the IRB interface to the BD that the IRB interface is for. In the other direction, traffic is sent up the IRB interface from the BD to the L3 routing instance.
</t>
<t>
The L3 routing instance depicted in <xref target="irb" format="default"/> is associated with a single Tenant Domain and may be thought of as an IP Virtual Routing and Forwarding (IP-VRF) instance for that Tenant Domain. 
</t>
</section>
<section anchor="evpn_ip_mcast" numbered="true" toc="default">
<name>EVPN and IP Multicast</name>
<t>
<xref target="RFC9135" format="default"/> and <xref target="RFC9136" format="default"/> cover inter-subnet (inter-BD) IP unicast forwarding, but they do not cover inter-subnet IP multicast forwarding.
</t>
<t>
<xref target="RFC7432" format="default"/> covers intra-subnet (intra-BD) Ethernet multicast. The intra-subnet Ethernet multicast procedures of <xref target="RFC7432" format="default"/> are used for Ethernet broadcast traffic, Ethernet unicast traffic whose Destination MAC Address field contains an unknown address, and Ethernet traffic whose Destination MAC Address field contains an Ethernet multicast MAC address. These three classes of traffic are known collectively as "BUM traffic" (Broadcast, Unknown Unicast, or Multicast traffic), and the procedures for handling BUM traffic are known as "BUM procedures".
</t>
<t>
<xref target="RFC9251" format="default"/> extends the intra-subnet Ethernet multicast procedures by adding procedures that are specific to, and optimized for, the use of IP multicast within a subnet. However, that document does not cover inter-subnet IP multicast.
</t>
<t>
The purpose of this document is to specify procedures for EVPN that provide optimized IP multicast functionality within an EVPN Tenant Domain. 
This document also specifies procedures that allow IP multicast packets to be sourced from or destined to systems outside the Tenant Domain. The entire set of procedures is referred to as "Optimized Inter-Subnet Multicast (OISM)" procedures.
</t>
<t>
In order to support the OISM procedures specified in this document, an EVPN PE <bcp14>MUST</bcp14> also support <xref target="RFC9135" format="default"/> and <xref target="RFC9251" format="default"/>. (However, certain procedures in <xref target="RFC9251" format="default"/> are modified when OISM is supported.)
</t>
</section>
<section anchor="evpn_stuff" numbered="true" toc="default">
<name>BDs, MAC-VRFs, and EVPN Service Models</name>
<t>
<xref target="RFC7432" format="default"/> defines the notion of MAC-VRF (MAC Virtual Routing and Forwarding). A MAC-VRF contains one or more bridge tables (see <xref target="RFC7432" format="default" sectionFormat="of" section="3"/>), each of which represents a single Broadcast Domain.
</t>
<t>
In the IRB model (outlined in <xref target="irb" format="default"/>), an L3 routing instance has one IRB interface per BD, not one per MAC-VRF. This document does not distinguish between a Broadcast Domain and a bridge table; instead, it uses the terms interchangeably (or uses the acronym "BD" to refer to either). The way the BDs are grouped into MAC-VRFs is not relevant to the procedures specified in this document. 
</t>
<t>
<xref target="RFC7432" format="default" sectionFormat="of" section="6"/> also defines several different EVPN service models:
</t>
<ul spacing="normal">
<li>
<t>
In the VLAN-based service, each MAC-VRF contains one bridge table, where the bridge table corresponds to a particular Virtual LAN (VLAN) (see <xref target="RFC7432" format="default" sectionFormat="of" section="3"/>). Thus, each VLAN is treated as a BD.
</t>
</li>
<li>
<t>
In the VLAN bundle service, each MAC-VRF contains one bridge table, where the bridge table corresponds to a set of VLANs. Thus, a set of VLANs is treated as constituting a single BD.
</t>
</li>
<li>
<t>
In the VLAN-aware bundle service, each MAC-VRF may contain multiple bridge tables, where each bridge table corresponds to one BD. If a MAC-VRF contains several bridge tables, then it corresponds to several BDs.
</t>
</li>
</ul>
<t>
The procedures in this document are intended to work for all these service models.
</t>
</section>
</section>
<section anchor="need" numbered="true" toc="default">
<name>Need for EVPN-Aware Multicast Procedures</name>
<t>
Inter-subnet IP multicast among a set of BDs can be achieved, in a non-optimal manner, without any specific EVPN procedures. For instance, if a particular tenant has n BDs among which it wants to send IP multicast traffic, it can simply attach a conventional multicast router to all n BDs. 
Or more generally, as long as each BD has at least one IP multicast router, and the IP multicast routers communicate multicast control information with each other, conventional IP multicast procedures will work normally, and no special EVPN functionality is needed.
</t>
<t>
However, that technique does not provide optimal routing for multicast. In conventional multicast routing, for a given multicast flow, there is only one multicast router on each BD that is permitted to send traffic of that flow to the BD. If that BD has receivers for a given flow, but the source of the flow is not on that BD, then the flow must pass through that multicast router. This leads to the hairpinning problem described (for unicast) in <xref target="irb" format="default"/>.
</t>
<t>
For example, consider an (S,G) flow that is sourced by a TS S and needs to be received by TSs R1 and R2. Suppose S is on a segment of BD1 and R1 is on a segment of BD2, and both segments are attached to PE1. Also suppose that the tenant has a multicast router attached to a segment of BD1 and to a segment of BD2. However, the segments to which that router is attached are both attached to PE2. Then, the flow from S to R1 would have to follow the path: S-->PE1-->PE2-->tenant multicast router-->PE2-->PE1-->R1. Obviously, the path S-->PE1-->R1 would be preferred.
</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
        +---+                      +---+
        |PE1+----------------------+PE2|
        +---+-+                  +-+---+
         |  \  \                /  /  |
        BD1 BD2 BD3          BD3 BD2 BD1
         |   |   |            \   |   |
         S   R1  R2             router
]]></artwork>
<t>
Now suppose that there is a second receiver, R2. R2 is attached to a third BD, BD3. However, it is attached to a segment of BD3 that is attached to PE1. Also suppose that the tenant multicast router is attached to a segment of BD3 that attaches to PE2. 
In this case, theTenant Multicast Routertenant multicast router will make two copies of the packet, one for BD2 and one for BD3. PE2 will send both copies back to PE1. Not only is the routing sub-optimal, butalsoPE2 also sends multiple copies of the same packet toPE1. ThisPE1, which is a further sub-optimality. </t> <t> This is only an example; many more examples of sub-optimal multicast routing can easily be given. To eliminate sub-optimal routing and extra copies, it is necessary to have a multicast solution that isEVPN-aware,EVPN-aware and that can use its knowledge of the internal structure of a Tenant Domain to ensure that multicast traffic gets routed optimally. The procedures in this document allow us to avoid all such sub-optimalities when routinginter&nbhy;subnetinter-subnet multicast traffic within a Tenant Domain. </t> </section> <sectiontitle="Additionalanchor="requirements" numbered="true" toc="default"> <name>Additional Requirements That MustbeBe Met by theSolution" anchor="requirements">Solution</name> <t> In addition to providing optimal routing of multicast flows within a Tenant Domain, the EVPN-aware multicast solution is intended to satisfy the following requirements:<list style="symbols"></t> <ul spacing="normal"> <li> <t> The solution must integrate well with the procedures specified in <xreftarget="RFC9251"/>.target="RFC9251" format="default"/>. That is, an integrated set of procedures must handle bothintra&nbhy;subnetintra-subnet multicast andinter&nbhy;subnetinter-subnet multicast. </t> </li> <li> <t> With regard tointra&nbhy;subnetintra-subnet multicast, the solutionMUST<bcp14>MUST</bcp14> maintain the integrity of the multicast Ethernet service. This means:<list style="symbols"></t> <ul spacing="normal"> <li> <t> If a source and a receiver are on the same subnet, the MACsource addressSource Address (SA) of the multicast frame sent by the source will not get rewritten. 
</t> </li> <li> <t> If a source and a receiver are on the same subnet, no IP processing of the Ethernet payload is done. The IP TTL is not decremented, the IPv4 header checksum is not changed, no fragmentation is done, etc. </t></list> </t></li> </ul> </li> <li> <t> On the other hand, if a source and a receiver are on different subnets, the frame received by the receiver will not have the MAC SourceaddressAddress of the source, as the frame will appear to have come from a multicast router. Also, proper processing of the IP header is done, e.g., TTLdecrementdecrements by 1, header checksum modification, possible fragmentation, etc. </t> </li> <li> <t> If a Tenant Domain contains several BDs, itMUST<bcp14>MUST</bcp14> be possible for a multicast flow (even when the multicast group address is an"any source multicast" (ASM) address),ASM address) to have sources in one of those BDs and receivers in one or more of the otherBDs,BDs without requiring the presence of any system performing PIMRendezvous Point (RP)RP functions <xreftarget="RFC7761"/>.target="RFC7761" format="default"/>. </t> </li> <li> <t> Sometimes a MAC address used by one TS on a particular BD is also used by another TS on a different BD.Inter&nbhy;subnetInter-subnet routing of multicast trafficMUST NOT<bcp14>MUST NOT</bcp14> make any assumptions about the uniqueness of a MAC address across several BDs. </t> </li> <li> <t> If twoEVPN&nbhy;PEsEVPN PEs attached to the same Tenant Domain both support the OISM procedures, each may receiveinter&nbhy;subnetinter-subnet multicasts from the other, even if the egress PE is not attached to any segment of the BD from which the multicast packets are being sourced. ItMUST NOT<bcp14>MUST NOT</bcp14> be necessary to provision the egress PE with knowledge of the ingress BD. 
</t> </li> <li> <t> There must be a procedure that allowsEVPN&nbhy;PEEVPN PE routers supporting OISM procedures to send/receive multicast traffic to/fromEVPN&nbhy;PEEVPN PE routers that support only <xreftarget="RFC7432"/>,target="RFC7432" format="default"/> but thatdodoes not support the OISM procedures or even the procedures of <xreftarget="RFC9135"/>.target="RFC9135" format="default"/>. However, when interworking with such routers (which we call"non&nbhy;OISM"non-OISM PE routers"), optimal routing may not be achievable. </t> </li> <li> <t> ItMUST<bcp14>MUST</bcp14> be possible to support scenarios in which multicast flows with sources inside a Tenant Domain have"external"external receivers, i.e., receivers that are outside the domain. It must also be possible to support scenarios where multicast flows with external sources (sources outside the Tenant Domain) have receivers inside the domain.<vspace/> <vspace/></t> <t> This presupposes that unicast routes to multicast sources outside the domain can be distributed toEVPN&nbhy;PEsEVPN PEs attached to thedomain,domain and that unicast routes to multicast sources within the domain can be distributed outside the domain.<vspace/> <vspace/></t> <t> Of particular importanceisare thescenarioscenarios in which the external sources and/or receivers are reachable viaL3VPN/MVPN, and the scenario in which external sources and/or receivers are reachableL3VPN/MVPN or via IP/PIM.<vspace/> <vspace/></t> <t> The solution for external interworkingMUST<bcp14>MUST</bcp14> allow for deployment scenarios in which EVPN does not need to export a host route for every multicast source. </t> </li> <li> <t> The solution for external interworking must not presuppose that the same tunneling technology is used within both the EVPN domain and the external domain. 
For example, MVPN interworking must be possible when MVPN is using MPLS Point-to-Multipoint (P2MP) tunneling and when EVPN is using Ingress Replication (IR) or Virtual eXtensible Local Area Network (VXLAN) tunneling. </t> </li> <li> <t> The solution must not be overly dependent on the details of a small set of use cases but must be adaptable to new use cases as they arise. (That is, the solution must be robust.) </t> </li> </ul> </section> <section anchor="model_overview" numbered="true" toc="default"> <name>Model of Operation: Overview</name> <section anchor="cp_overview" numbered="true" toc="default"> <name>Control Plane</name> <t> In this section, and in the remainder of this document, we assume the reader is familiar with the procedures of IGMP / Multicast Listener Discovery (MLD) (see <xref target="RFC3376" format="default"/> and <xref target="RFC3810" format="default"/>), by which hosts announce their interest in receiving particular multicast flows. </t> <t> Consider a Tenant Domain consisting of a set of k BDs: BD1, ..., BDk. To support the OISM procedures, each Tenant Domain must also be associated with a Supplementary Broadcast Domain (SBD). An SBD is treated in the control plane as a real BD, but it does not have any ACs. The SBD has several uses; these will be described later in this document (see Sections <xref target="sbd_intro" format="counter"/> and <xref target="solution" format="counter"/>). </t> <t> Each PE that attaches to one or more of the BDs in a given Tenant Domain will be provisioned to recognize that those BDs are part of the same Tenant Domain. Note that a given PE does not need to be configured with all the BDs of a given Tenant Domain. 
In general, a PE will only be attached to a subset of the BDs in a given Tenant Domain and will be configured only with that subset of BDs. However, each PE attached to a given Tenant Domain must be configured with the SBD for that Tenant Domain. </t> <t> Suppose a particular segment of a particular BD is attached to PE1. <xref target="RFC7432" format="default"/> specifies that PE1 must originate an Inclusive Multicast Ethernet Tag (IMET) route for that BD and that the IMET route must be propagated to all other PEs attached to the same BD. If the given segment contains a host that has interest in receiving a particular multicast flow, either an (S,G) flow or a (*,G) flow, PE1 will learn of that interest by participating in the IGMP/MLD snooping procedures, as specified in <xref target="RFC4541" format="default"/>. In this case: </t> <ul spacing="normal"> <li> <t> PE1 is interested in receiving the flow; </t> </li> <li> <t> the AC attaching the interested host to PE1 is also said to be interested in the flow; and </t> </li> <li> <t> the BD containing an AC that is interested in a particular flow is also said to be interested in that flow. </t> </li> </ul> <t> Once PE1 determines that it has an AC that is interested in receiving a particular flow or set of flows, it originates one or more Selective Multicast Ethernet Tag (SMET) routes <xref target="RFC9251" format="default"/> to advertise that interest. </t> <t> Note that each IMET or SMET route is for a particular BD. The notion of a route being for a particular BD is explained in <xref target="bd_route" format="default"/>. 
</t> <t> When OISM is being supported, the procedures of <xref target="RFC9251" format="default"/> are modified as follows: </t> <ul spacing="normal"> <li> <t> The IMET route originated by a particular PE for a particular BD is distributed to all other PEs attached to the Tenant Domain containing that BD, even to those PEs that are not attached to that particular BD. </t> </li> <li> <t> The SMET routes originated by a particular PE are originated on a per-Tenant-Domain basis rather than on a per-BD basis. That is, the SMET routes are considered to be for the Tenant Domain's SBD rather than for any of its ordinary BDs. These SMET routes are distributed to all the PEs attached to the Tenant Domain. </t> <t> In this way, each PE attached to a given Tenant Domain learns, from the other PEs attached to the same Tenant Domain, the set of flows that are of interest to each of those other PEs. </t> 
</li> </ul> <t> An OISM PE that is provisioned with several BDs in the same Tenant Domain <bcp14>MUST</bcp14> originate an IMET route for each such BD. To indicate its support of <xref target="RFC9251" format="default"/>, it <bcp14>SHOULD</bcp14> attach the EVPN Multicast Flags Extended Community to each such IMET route, but it <bcp14>MUST</bcp14> attach the EC to at least one such IMET route. </t> <t> Suppose PE1 is provisioned with both BD1 and BD2 and considers them to be part of the same Tenant Domain. It is possible that PE1 will receive both an IMET route for BD1 and an IMET route for BD2 from PE2. If either of these IMET routes has the EVPN Multicast Flags Extended Community, PE1 <bcp14>MUST</bcp14> assume that PE2 is supporting the procedures of <xref target="RFC9251" format="default"/> for ALL BDs in the Tenant Domain. </t> <t> If a PE supports OISM functionality, it indicates this by setting the OISM-supported flag in the Multicast Flags Extended Community that it attaches to some or all of its IMET routes. An OISM PE <bcp14>SHOULD</bcp14> attach this EC with the OISM-supported flag set to all the IMET routes it originates. However, if PE1 imports IMET routes from PE2, and at least one of PE2's IMET routes indicates that PE2 is an OISM PE, PE1 <bcp14>MUST</bcp14> assume that PE2 is following OISM procedures. </t> </section> <section anchor="dp_overview" numbered="true" toc="default"> <name>Data Plane</name> <t> Suppose PE1 has an AC to a segment in BD1 and PE1 receives an (S,G) multicast frame from that AC (as defined in <xref target="terminology" format="default"/>). 
</t> <t> There may be other ACs of PE1 on which TSs have indicated an interest (via IGMP/MLD) in receiving (S,G) multicast packets. PE1 is responsible for sending the received multicast packet on those ACs. There are two cases to consider: </t> <ul spacing="normal"> <li> <t> Intra-Subnet Forwarding: In this case, an AC with interest in (S,G) is connected to a segment that is part of the source BD, BD1. If the segment is not multihomed, or if PE1 is the Designated Forwarder (DF) (see <xref target="RFC7432" format="default"/>) for that segment, PE1 sends the multicast frame on that AC without changing the MAC SA. The IP header is not modified at all; in particular, the TTL is not decremented. </t> </li> <li> <t> Inter-Subnet Forwarding: An AC with interest in (S,G) is connected to a segment of BD2, where BD2 is different from BD1. If PE1 is the DF for that segment (or if the segment is not multihomed), PE1 decapsulates the IP multicast packet, performs any necessary IP processing (including TTL decrement), and then re-encapsulates the packet appropriately for BD2. PE1 then sends the packet on the AC. Note that after re-encapsulation, the MAC SA will be PE1's MAC address on BD2. The IP TTL will have been decremented by 1. </t> </li> </ul> <t> In addition, there may be other PEs that are interested in (S,G) traffic. Suppose PE2 is such a PE. Then, PE1 tunnels a copy of the IP multicast frame (with its original MAC SA and with no alteration of the payload's IP header) to PE2. The tunnel encapsulation contains information that PE2 can use to associate the frame with an apparent source BD. If the actual source BD of the frame is BD1, then: </t> <ul spacing="normal"> <li> <t> If PE2 is attached to BD1, the tunnel encapsulation used to send the frame to PE2 will cause PE2 to identify BD1 as the apparent source BD. 
</t> </li> <li> <t> If PE2 is not attached to BD1, the tunnel encapsulation used to send the frame to PE2 will cause PE2 to identify the SBD as the apparent source BD. </t> </li> </ul> <t> Note that the tunnel encapsulation used for a particular BD will have been advertised in an IMET route or a Selective Provider Multicast Service Interface (S-PMSI) route <xref target="RFC9572" format="default"/> for that BD. That route carries a PMSI Tunnel Attribute (PTA), which specifies how packets originating from that BD are encapsulated. This information enables the PE receiving a tunneled packet to identify the apparent source BD as stated above. See <xref target="adv_tunnels" format="default"/> for more details. </t> <t> When PE2 receives the tunneled frame, it will forward it on any of its ACs that have interest in (S,G). </t> <t> If PE2 determines from the tunnel encapsulation that the apparent source BD is BD1, then: </t> <ul spacing="normal"> <li> <t> For those ACs that connect PE2 to BD1, the intra-subnet forwarding procedure described above is used, except that it is now PE2, not PE1, carrying out that procedure. Unmodified EVPN procedures from <xref target="RFC7432" format="default"/> are used to ensure that a packet originating from a multihomed segment is never sent back to that segment. </t> </li> <li> <t> For those ACs that do not connect to BD1, the inter-subnet forwarding procedure described above is used, except that it is now PE2, not PE1, carrying out that procedure. </t> </li> </ul> <t> If the tunnel encapsulation identifies the apparent source BD as the SBD, PE2 applies the inter-subnet forwarding procedures described above to all of its ACs that have interest in the flow. 
</t> <t> These procedures ensure that an IP multicast frame travels from its ingress PE to all egress PEs that are interested in receiving it. While in transit, the frame retains its original MAC SA, and the payload of the frame retains its original IP header. Note that in all cases, when an IP multicast packet is sent from one BD to another, these procedures cause its TTL to be decremented by 1. </t> <t> So far, we have assumed that an IP multicast packet arrives at its ingress PE over an AC that belongs to one of the BDs in a given Tenant Domain. However, it is possible for a packet to arrive at its ingress PE in other ways. Since an EVPN PE supporting IRB has an IP-VRF, it is possible that the IP-VRF will have a VRF interface that is not an IRB interface. For example, there might be a VRF interface that is actually a physical link to an external Ethernet switch, a directly attached host, or a router. When an EVPN PE, say PE1, receives a packet through such means, we will say that the packet has an external source (i.e., a source outside the Tenant Domain). There are also other scenarios in which a multicast packet might have an external source, e.g., it might arrive over an MVPN tunnel from an L3VPN PE. In such cases, we will still refer to PE1 as the "ingress EVPN PE". </t> <t> When an EVPN PE, say PE1, receives an externally sourced multicast packet, and there are receivers for that packet inside the Tenant Domain, it does the following: </t> <ul spacing="normal"> <li> <t> Suppose PE1 has an AC in BD1 that has interest in (S,G). Then, PE1 encapsulates the packet for BD1, filling in the MAC SA field with PE1's own MAC address on BD1. It sends the resulting frame on the AC. </t> </li> <li> <t> Suppose some other EVPN PE, say PE2, has interest in (S,G). 
PE1 encapsulates the packet for Ethernet, filling in the MAC SA field with PE1's own MAC address on the SBD. PE1 then tunnels the packet to PE2. The tunnel encapsulation will identify the apparent source BD as the SBD. Since the apparent source BD is the SBD, PE2 will know to treat the frame as an inter-subnet multicast. </t> </li> </ul> <t> When IR is used to transmit IP multicast frames from an ingress EVPN PE to a set of egress PEs, the ingress PE has to send multiple copies of the frame. Each copy is the original Ethernet frame; decapsulation and IP processing take place only at the egress PE. </t> <t> If a P2MP tree or Bit Index Explicit Replication (BIER) <xref target="RFC9624" format="default"/> is used to transmit an IP multicast frame from an ingress PE to a set of egress PEs, the ingress PE only has to send one copy of the frame to each of its next hops. Again, each egress PE receives the original frame and does any necessary IP processing. </t> </section> </section> </section> <section anchor="model_detail" numbered="true" toc="default"> <name>Detailed Model of Operation</name> <t> The model described in <xref target="dp_overview" format="default"/> can be expressed more precisely using the notion of an IRB interface (see <xref target="irb" format="default"/>). For a given Tenant Domain: </t> <ul spacing="normal"> <li> <t> A given PE has one IRB interface for each BD to which it is attached. This IRB interface connects L3 routing to that BD. When IP multicast packets are sent or received on the IRB interfaces, the semantics of the interface are modified from the semantics described in <xref target="irb" format="default"/>. 
See <xref target="ingress_irb_use" format="default"/> for the details of the modification. </t> </li> <li> <t> Each PE also has an IRB interface that connects L3 routing to the SBD. The semantics of this interface are different from the semantics of the IRB interfaces to the real BDs. See <xref target="ingress_irb_use" format="default"/>. </t> </li> </ul> <t> In this section, we assume that PIM is not enabled on the IRB interfaces. In general, it is not necessary to enable PIM on the IRB interfaces unless there are PIM routers on one of the Tenant Domain's BDs or there is some other scenario requiring a Tenant Domain's L3 routing instance to become a PIM adjacency of some other system. These cases will be discussed in <xref target="pim" format="default"/>. </t> <section anchor="sbd_intro" numbered="true" toc="default"> <name>Supplementary Broadcast Domain</name> <t> Suppose a given Tenant Domain contains three BDs (BD1, BD2, and BD3) and two PEs (PE1 and PE2). PE1 attaches to BD1 and BD2, while PE2 attaches to BD2 and BD3. </t> <t> To carry out the procedures described above, all the PEs attached to the Tenant Domain must be provisioned with the SBD for that Tenant Domain. An RT must be associated with the SBD and provisioned on each of those PEs. We will refer to that RT as the "SBD-RT". </t> <t> A Tenant Domain is also configured with an IP-VRF <xref target="RFC9135" format="default"/>, and the IP-VRF is associated with an RT. This RT <bcp14>MAY</bcp14> be the same as the SBD-RT. </t> <t> Suppose an (S,G) multicast frame originating on BD1 has a receiver on BD3. PE1 will transmit the packet to PE2 as a frame, and the encapsulation will identify the frame's source BD as BD1. 
Since PE2 is not provisioned with BD1, it will treat the packet as if its source BD were the SBD. That is, a packet can be transmitted from BD1 to BD3 even though its ingress PE is not configured for BD3 and/or its egress PE is not configured for BD1. </t> <t> EVPN supports service models in which a given EVI can contain only one BD. It also supports service models in which a given EVI can contain multiple BDs. No matter which service model is being used for a particular tenant, it is highly <bcp14>RECOMMENDED</bcp14> that an EVI containing only the SBD be provisioned for that tenant. </t> <t> If, for some reason, it is not feasible to provision an EVI that contains only the SBD, it is possible to put the SBD in an EVI that contains other BDs. However, in that case, the SBD-RT <bcp14>MUST</bcp14> be different from the RT associated with any other BD. Otherwise, the procedures of this document (as detailed in Sections <xref target="bd_route" format="counter"/> and <xref target="sbd_rts" format="counter"/>) will not produce correct results. </t> </section> <section anchor="bd_route" numbered="true" toc="default"> <name>Detecting When a Route is for/from a Particular BD</name> <t> In this document, we frequently say that a particular multicast route is "from" or "for" a particular BD or is "related to" or "associated with" a particular BD. These terms are used interchangeably. Subsequent sections of this document explain when various routes must be originated for particular BDs. In this section, we explain how the PE originating a route marks the route to indicate which BD it is for. We also explain how a PE receiving the route determines which BD the route is for. </t> <t> In EVPN, each BD is assigned an RT. 
An RT is a BGP Extended Community that can be attached to the BGP routes used by the EVPN control plane. In some EVPN service models, each BD is assigned a unique RT. In other service models, a set of BDs (all in the same EVI) may be assigned the same RT. The RT that is assigned to the SBD is called the "SBD-RT". </t> <t> In those service models that allow a set of BDs to share a single RT, each BD is assigned a non-zero Tag ID. The Tag ID appears in the Network Layer Reachability Information (NLRI) of many of the BGP routes that are used by the EVPN control plane. </t> <t> A given route may be for the SBD or for an ordinary BD (a BD that is not the SBD). An RT that has been assigned to an ordinary BD will be known as an "ordinary BD-RT". </t> <t> When constructing an IMET, SMET, S-PMSI, or Leaf <xref target="RFC9572" format="default"/> route that is for a given BD, the following rules apply: </t> <ul spacing="normal"> <li> <t> If the route is for an ordinary BD, say BD1, then: </t> <ul spacing="normal"> <li> <t> the route <bcp14>MUST</bcp14> carry the ordinary BD-RT associated with BD1 and </t> </li> <li> <t> the route <bcp14>MUST NOT</bcp14> carry any RT that is associated with an ordinary BD other than BD1. </t> </li> </ul> </li> <li> <t> If the route is for the SBD, the route <bcp14>MUST</bcp14> carry the SBD-RT and <bcp14>MUST NOT</bcp14> carry any RT that is associated with any other BD. </t> </li> <li> <t> As detailed in subsequent sections, under certain circumstances, a route that is for BD1 may carry both the RT of BD1 and the SBD-RT. 
</t> </li> </ul> <t> The IMET route for the SBD <bcp14>MUST</bcp14> carry a Multicast Flags Extended Community in which the OISM SBD flag is set. </t> <t> The IMET route for a BD other than the SBD <bcp14>SHOULD</bcp14> carry an EVI-RT EC as defined in <xref target="RFC9251" format="default"/>. The EC is constructed from the SBD-RT to indicate the BD's corresponding SBD. This allows all PEs to check that they have consistent SBD provisioning and allows an Assisted Replication (AR) replicator to automatically determine a BD's corresponding SBD without any provisioning, as explained in <xref target="SBD-matching" format="default"/>. </t> <t> When receiving an IMET, SMET, S-PMSI, or Leaf route, it is necessary for the receiving PE to determine the BD to which the route belongs. This is done by examining the RTs carried by the route, as well as the Tag ID field of the route's NLRI. There are several cases to consider. Some of these cases are error cases that arise when the route has not been properly constructed. </t> <t> When one of the error cases is detected, the route <bcp14>MUST</bcp14> be regarded as a malformed route, and the treat-as-withdraw procedure of <xref target="RFC7606" format="default"/> <bcp14>MUST</bcp14> be applied. Note that these error cases are only detectable by EVPN procedures at the receiving PE; BGP procedures at intermediate nodes will generally not detect the existence of such error cases and in general <bcp14>SHOULD NOT</bcp14> attempt to do so. </t> <ol spacing="normal" type="Case %d:"> <li> <t> The receiving PE recognizes more than one of the route's RTs as being an SBD-RT (i.e., the route carries SBD-RTs of more than one Tenant Domain). </t> <t> This is an error case; the route has not been properly constructed. 
</t> </li> <li> <t> The receiving PE recognizes one of the route's RTs as being associated with an ordinary BD and recognizes one of the route's other RTs as being associated with a different ordinary BD. </t> <t> This is an error case; the route has not been properly constructed. </t> </li> <li> <t> The receiving PE recognizes one of the route's RTs as being associated with an ordinary BD in a particular Tenant Domain and recognizes another of the route's RTs as being associated with the SBD of a different Tenant Domain. </t> <t> This is an error case; the route has not been properly constructed. </t> </li> <li> <t> The receiving PE does not recognize any of the route's RTs as being associated with an ordinary BD in any of its Tenant Domains but does recognize one of the RTs as the SBD-RT of one of its Tenant Domains. </t> <t> In this case, the receiving PE associates the route with the SBD of that Tenant Domain. This association is made even if the Tag ID field of the route's NLRI is not the Tag ID of the SBD. </t> <t> This is a normal use case where either (a) the route is for a BD to which the receiving PE is not attached or (b) the route is for the SBD. In either case, the receiving PE associates the route with the SBD. </t> </li> <li> <t> The receiving PE recognizes exactly one of the RTs as an ordinary BD-RT that is associated with one of the PE's EVIs, say EVI-1. The receiving PE also recognizes one of the RTs as being the SBD-RT of the Tenant Domain containing EVI-1. </t> <t> In this case, the route is associated with the BD in EVI-1 that is identified (in the context of EVI-1) by the Tag ID field of the route's NLRI. 
(If EVI-1 contains only a single BD, the Tag ID is likely to be zero.) </t> <t> This is the case where the route is for a BD to which the receiving PE is attached, but the route also carries the SBD-RT. In this case, the receiving PE associates the route with the ordinary BD, not with the SBD. </t> </li> </ol> <t> Note that according to the above rules, the mapping from BD to RT is a many-to-one or one-to-one mapping. A route that an EVPN PE originates for a particular BD carries that BD's RT, and an EVPN PE that receives the route associates it with a BD as described above. However, RTs are not used only to help identify the BD to which a route belongs; they may also be used by BGP to determine the path along which the route is distributed and to determine which PEs receive the route. There may be cases where it is desirable to originate a route for a particular BD but have that route distributed to only some of the EVPN PEs attached to that BD. Or one might want the route distributed to some intermediate set of systems, where it might be modified or replaced before being propagated further. Such situations are outside the scope of this document. </t> <t> Additionally, there may be situations where it is desirable to exchange routes among two or more different Tenant Domains (EVPN Extranet). Such situations are outside the scope of this document. </t> 
</section> <section anchor="ingress_irb_use" numbered="true" toc="default"> <name>Use of IRB Interfaces at Ingress PE</name> <t> When an (S,G) multicast frame is received from an AC belonging to a particular BD, say BD1: </t> <ol spacing="normal" type="1"> <li anchor="inter-PE"> <t> The frame is sent unchanged to other EVPN PEs that are interested in (S,G) traffic. The encapsulation used to send the frame to the other EVPN PEs depends on the tunnel type being used for multicast transmission. (For our purposes, we consider IR, AR, and BIER to be tunnel types, even though IR, AR, and BIER do not actually use P2MP tunnels.) At the egress PE, the apparent source BD of the frame can be inferred from the tunnel encapsulation. If the egress PE is not attached to the actual source BD, it will infer that the apparent source BD is the SBD. </t> <t> Note that the inter-PE transmission of a multicast frame among EVPN PEs of the same Tenant Domain does NOT involve the IRB interfaces as long as the multicast frame was received over an AC attached to one of the Tenant Domain's BDs. </t> </li> <li anchor="up_IRB"> <t> The frame is also sent up the IRB interface that attaches BD1 to the Tenant Domain's L3 routing instance in this PE. That is, the L3 routing instance, behaving as if it were a multicast router, receives the IP multicast frames that arrive at the PE from its local ACs. The L3 routing instance decapsulates the frame's payload to extract the IP multicast packet, decrements the IP TTL, adjusts the header checksum, and does any other necessary IP processing (e.g., fragmentation). </t> </li> <li anchor="down_IRB"> <t> The L3 routing instance keeps track of which BDs have local receivers for (S,G) traffic. 
(A local receiver is a TS, reachable via a local AC, that has expressed interest in (S,G) traffic.) If the L3 routing instance has an IRB interface to BD2, and it knows that BD2 has a LOCAL receiver interested in (S,G) traffic, it encapsulates the packet in an Ethernet header for BD2, putting its own MAC address in the MAC SA field. Then, it sends the packet down the IRB interface to BD2. </t> </li> </ol> <t> If a packet is sent from the L3 routing instance to a particular BD via the IRB interface (step <xref target="down_IRB" format="counter"/> in the above list), and if the BD in question is NOT the SBD, the packet is sent ONLY to LOCAL ACs of that BD. If the packet needs to go to other PEs, it has already been sent to them in step <xref target="inter-PE" format="counter"/>. Note that this is a change in the IRB interface semantics from what is described in <xref target="RFC9135" format="default"/> and <xref target="IRB" format="default"/>. </t> <t> If a given locally attached segment is multihomed, existing EVPN procedures ensure that a packet is not sent by a given PE to that segment unless the PE is the DF for that segment. Those procedures also ensure that a packet is never sent by a PE to its segment of origin. Thus, EVPN segment multihoming is fully supported; duplicate delivery to a segment and looping on a segment are thereby prevented without the need for any new procedures to be defined in this document. </t> <t> What if an IP multicast packet is received from outside the Tenant Domain? For instance, perhaps PE1's IP-VRF for a particular Tenant Domain also has a physical interface leading to an external switch, host, or router and PE1 receives an IP multicast packet or frame on that interface, or perhaps the packet is from an L3VPN or a different EVPN Tenant Domain. 
</t> <t> Such a packet is first processed by the L3 routing instance, which decrements the TTL and does any other necessary IP processing. Then, the packet is sent into the Tenant Domain by sending it down the IRB interface to the SBD of that Tenant Domain. This requires encapsulating the packet in an Ethernet header. The MAC SA field will contain the PE's own MAC address on the SBD. </t> <t> An IP multicast packet sent by the L3 routing instance down the IRB interface to the SBD is treated as if it had arrived from a local AC, and steps <xref target="inter-PE" format="counter"/>-<xref target="down_IRB" format="counter"/> are applied. Note that the semantics of sending a packet down the IRB interface to the SBD are thus slightly different from the semantics of sending a packet down other IRB interfaces. IP multicast packets sent down the SBD's IRB interface may be distributed to other PEs, but IP multicast packets sent down other IRB interfaces are distributed only to local ACs. </t> <t> If a PE sends a link-local multicast packet down the SBD IRB interface, that packet will be distributed (as an Ethernet frame) to other PEs of the Tenant Domain but will not appear on any of the actual BDs. </t> </section> <section anchor="egress_irb_use" numbered="true" toc="default"> <name>Use of IRB Interfaces at an Egress PE</name> <t> Suppose an egress EVPN PE receives an (S,G) multicast frame from the frame's ingress EVPN PE. As described above, the packet will arrive as an Ethernet frame over a tunnel from the ingress PE, and the tunnel encapsulation will identify the source BD of the Ethernet frame. </t> <t> We define the notion of the frame's apparent source BD as follows. If the egress PE is attached to the actual source BD, the actual source BD is the apparent source BD. 
If the egress PE is not attached to the actual source BD, the SBD is the apparent source BD. </t>
<t> The egress PE now takes the following steps: </t>
<ol spacing="normal" type="1"><li>
<t> If the egress PE has ACs belonging to the apparent source BD of the frame, it sends the frame unchanged to any ACs of that BD that have interest in (S,G) packets. The MAC SA of the frame is not modified, and the IP header of the frame's payload is not modified in any way. </t>
</li>
<li>
<t> The frame is also sent to the L3 routing instance by being sent up the IRB interface that attaches the L3 routing instance to the apparent source BD. Steps <xref target="up_IRB" format="counter"/> and <xref target="down_IRB" format="counter"/> listed in <xref target="ingress_irb_use" format="default"/> are then applied. </t>
</li>
</ol>
</section>
<section anchor="interest" numbered="true" toc="default">
<name>Announcing Interest in (S,G)</name>
<t> <xref target="RFC9251" format="default"/> defines procedures used by an egress PE to announce its interest in a multicast flow or set of flows. If an egress PE determines it has LOCAL receivers in a particular BD, say BD1, that are interested in a particular set of flows, it originates one or more SMET routes for BD1. Each SMET route specifies a particular (S,G) or (*,G) flow. By originating a SMET route for BD1, a PE is announcing "I have receivers for (S,G) or (*,G) in BD1". Such a SMET route carries the RT for BD1, ensuring that it will be distributed to all PEs that are attached to BD1. </t>
<t> The OISM procedures for originating SMET routes differ slightly from those in <xref target="RFC9251" format="default"/>. In most cases, the SMET routes are considered to be for the SBD rather than the BD containing local receivers.
These SMET routes carry the SBD-RT and do not carry any ordinary BD-RT. Details on the processing of SMET routes can be found in <xref target="smet_adv" format="default"/>. </t>
<t> Since the SMET routes carry the SBD-RT, every ingress PE attached to a particular Tenant Domain will learn of all other PEs (attached to the same Tenant Domain) that have interest in a particular set of flows. Note that a PE that receives a given SMET route does not necessarily have any BDs (other than the SBD) in common with the PE that originates that SMET route. </t>
<t> If all the sources and receivers for a given (*,G) are in the Tenant Domain, inter-subnet ASM traffic will be properly routed without requiring any RPs, shared trees, or other complex aspects of multicast routing infrastructure. Suppose, for example, that: </t>
<ul spacing="normal">
<li>
<t> PE1 has a local receiver, on BD1, for (*,G) and </t>
</li>
<li>
<t> PE2 has a local source, on BD2, for (*,G). </t>
</li>
</ul>
<t> PE1 will originate a SMET (*,G) route for the SBD, and PE2 will receive that route, even if PE2 is not attached to BD1. PE2 will thus know to forward (S,G) traffic to PE1. PE1 does not need to do any source discovery. (This does assume that source S does not send the same (S,G) datagram on two different BDs and that the Tenant Domain does not contain two or more sources with the same IP address S. The use of multicast sources that have IP anycast addresses is outside the scope of this document.) </t>
<t> If some PE attached to the Tenant Domain does not support <xref target="RFC9251"/>, it will be assumed to be interested in all flows.
Whether a particular remote PE supports <xref target="RFC9251"/> or not is determined by the presence of the Multicast Flags Extended Community in its IMET route; this is specified in <xref target="RFC9251"/>. </t>
</section>
<section anchor="tunneling" numbered="true" toc="default">
<name>Tunneling Frames from Ingress PEs to Egress PEs</name>
<t> <xref target="RFC7432" format="default"/> specifies the procedures for setting up and using BUM tunnels. A BUM tunnel is a tunnel used to carry traffic on a particular BD if that traffic is (a) broadcast traffic, (b) unicast traffic with an unknown Destination MAC Address, or (c) Ethernet multicast traffic. </t>
<t> This document allows the BUM tunnels to be used as the default tunnels for transmitting IP multicast frames. It also allows a separate set of tunnels to be used, instead of the BUM tunnels, as the default tunnels for carrying IP multicast frames. Let's call these "IP multicast tunnels". </t>
<t> When the tunneling is done via IR or via BIER, this difference is of no significance. However, when P2MP tunnels are used, there is a significant advantage to having separate IP multicast tunnels. </t>
<t> It is desirable for an ingress PE to transmit a copy of a given (S,G) multicast frame on only one P2MP tunnel. All egress PEs interested in (S,G) packets then have to join that tunnel. If the source BD and PE for an (S,G) frame are BD1 and PE1, respectively, and if PE2 has receivers on BD2 for (S,G), then PE2 must join the P2MP Label Switched Path (LSP) on which PE1 transmits the (S,G) frame. PE2 must join this P2MP LSP even if PE2 is not attached to the source BD, BD1.
If PE1 was transmitting the multicast frame on its BD1 BUM tunnel, then PE2 would have to join the BD1 BUM tunnel, even though PE2 has no BD1 Attachment Circuits. This would cause PE2 to pull all the BUM traffic from BD1, most of which it would just have to discard. Thus, it is <bcp14>RECOMMENDED</bcp14> that the default IP multicast tunnels be distinct from the BUM tunnels. </t>
<t> Notwithstanding the above, link-local IP multicast traffic <bcp14>MUST</bcp14> always be carried on the BUM tunnels and ONLY on the BUM tunnels. Link-local IP multicast traffic consists of IPv4 traffic with a destination address prefix of 224.0.0.0/24 and IPv6 traffic with a destination address prefix of FF02::/16. In this document, the terms "IP multicast packet" and "IP multicast frame" are defined in <xref target="terminology" format="default"/> so as to exclude link-local traffic. </t>
<t> Note that it is also possible to use selective tunnels to carry particular multicast flows (see <xref target="adv_tunnels" format="default"/>). When an (S,G) frame is transmitted on a selective tunnel, it is not transmitted on the BUM tunnel or on the default IP multicast tunnel. </t>
</section>
<section anchor="adv_scen" numbered="true" toc="default">
<name>Advanced Scenarios</name>
<t> There are some deployment scenarios that require special procedures: </t>
<ol spacing="normal" type="1"><li>
<t> Some multicast sources or receivers are attached to PEs that support <xref target="RFC7432" format="default"/> but do not support this document or <xref target="RFC9135" format="default"/>.
To interoperate with these non-OISM PEs, it is necessary to have one or more gateway PEs that interface the tunnels discussed in this document with the BUM tunnels of the legacy PEs. This is discussed in <xref target="no-OISM" format="default"/>. </t>
</li>
<li>
<t> Sometimes multicast traffic originates from outside the EVPN domain or needs to be sent outside the EVPN domain. This is discussed in <xref target="external" format="default"/>. An important special case of this, integration with MVPN, is discussed in <xref target="mvpn" format="default"/>. </t>
</li>
<li>
<t> In some scenarios, one or more of the tenant systems is a PIM router, and the Tenant Domain is used as a transit network that is part of a larger multicast domain. This is discussed in <xref target="pim" format="default"/>. </t>
</li>
</ol>
</section>
</section>
<section anchor="solution" numbered="true" toc="default">
<name>EVPN-Aware Multicast Solution Control Plane</name>
<section anchor="sbd_rts" numbered="true" toc="default">
<name>Supplementary Broadcast Domain (SBD) and Route Targets</name>
<t> As discussed in <xref target="sbd_intro" format="default"/>, every Tenant Domain is associated with a single SBD. Recall that a Tenant Domain is defined to be a set of BDs that can freely send and receive IP multicast traffic to/from each other. If an EVPN PE has one or more ACs in a BD of a particular Tenant Domain, and if the EVPN PE supports the procedures of this document, that EVPN PE <bcp14>MUST</bcp14> be provisioned with the SBD of that Tenant Domain.
</t>
<t> At each EVPN PE attached to a given Tenant Domain, there is an IRB interface leading from the L3 routing instance of that Tenant Domain to the SBD. However, the SBD has no ACs. </t>
<t> Each SBD is provisioned with an RT. All the EVPN PEs supporting a given SBD are provisioned with that RT as an import RT. That RT <bcp14>MUST NOT</bcp14> be the same as the RT associated with any other BD. </t>
<t> We will use the term "SBD-RT" to denote the RT that has been assigned to the SBD. Routes carrying this RT will be propagated to all EVPN PEs in the same Tenant Domain as the originator. </t>
<t> <xref target="bd_route" format="default"/> specifies the rules by which an EVPN PE that receives a route determines whether a received route belongs to a particular ordinary BD or SBD. </t>
<t> <xref target="bd_route" format="default"/> also specifies additional rules that must be followed when constructing routes that belong to a particular BD, including the SBD. </t>
<t> The SBD <bcp14>SHOULD</bcp14> be in an EVI of its own. Even if the SBD is not in an EVI of its own, the SBD-RT <bcp14>MUST</bcp14> be different than the RT associated with any other BD. This restriction is necessary in order for the rules of Sections <xref target="bd_route" format="counter"/> and <xref target="sbd_rts" format="counter"/> to work correctly. </t>
<t> Note that an SBD, just like any other BD, is associated on each EVPN PE with a MAC-VRF. Per <xref target="RFC7432" format="default"/>, each MAC-VRF is associated with a Route Distinguisher (RD). When constructing a route that is for an SBD, an EVPN PE will place the RD of the associated MAC-VRF in the Route Distinguisher field of the NLRI.
(If the Tenant Domain has several MAC-VRFs on a given PE, the EVPN PE has a choice of which RD to use.) </t>
<t> If AR <xref target="RFC9574" format="default"/> is used, each AR-REPLICATOR for a given Tenant Domain must be provisioned with the SBD of that Tenant Domain, even if the AR-REPLICATOR does not have any L3 routing instances. </t>
</section>
<section anchor="adv_tunnels" numbered="true" toc="default">
<name>Advertising the Tunnels Used for IP Multicast</name>
<t> The procedures used for advertising the tunnels that carry IP multicast traffic depend upon the type of tunnel being used. If the tunnel type is neither IR, AR, nor BIER, there are procedures for advertising both inclusive tunnels and selective tunnels. </t>
<t> When IR, AR, or BIER is used to transmit IP multicast packets across the core, there are no P2MP tunnels. Once an ingress EVPN PE determines the set of egress EVPN PEs for a given flow, the IMET routes contain all the information needed to transport packets of that flow to the egress PEs. </t>
<t> If AR is used, the ingress EVPN PE is also an AR-LEAF, and the IMET route coming from the selected AR-REPLICATOR contains the information needed. The AR-REPLICATOR will behave as an ingress EVPN PE when sending a flow to the egress EVPN PEs. </t>
<t> If the tunneling technique requires P2MP tunnels to be set up (e.g., RSVP-TE P2MP, Multipoint LDP (mLDP), or PIM), some of the tunnels may be selective tunnels and some may be inclusive tunnels.
</t>
<t> Selective P2MP tunnels are always advertised by the ingress PE using S-PMSI Auto-Discovery (A-D) routes <xref target="RFC9572" format="default"/>. </t>
<t> For inclusive tunnels, there is a choice between using a BD's ordinary BUM tunnel <xref target="RFC7432" format="default"/> as the default inclusive tunnel for carrying IP multicast traffic or using a separate IP multicast tunnel as the default inclusive tunnel for carrying IP multicast. In the former case, the inclusive tunnel is advertised in an IMET route. In the latter case, the inclusive tunnel is advertised in a (C-*,C-*) S-PMSI A-D route <xref target="RFC9572" format="default"/>. Details may be found in subsequent sections. </t>
<section anchor="sbd-routes" numbered="true" toc="default">
<name>Constructing Routes for the SBD</name>
<t> There are situations in which an EVPN PE needs to originate IMET, SMET, and/or S-PMSI routes for the SBD. Throughout this document, we will refer to such routes respectively as "SBD-IMET routes", "SBD-SMET routes", and "SBD-SPMSI routes". Subsequent sections detail the conditions under which these routes need to be originated. </t>
<t> When an EVPN PE needs to originate an SBD-IMET, SBD-SMET, or SBD-SPMSI route, it constructs the route as follows: </t>
<ul spacing="normal">
<li>
<t> The RD field of the route's NLRI is set to the RD of the MAC-VRF that is associated with the SBD. </t>
</li>
<li>
<t> The SBD-RT is attached to the route. </t>
</li>
<li>
<t> The Tag ID field of the route's NLRI is set to the Tag ID that has been assigned to the SBD.
This is most likely 0 if a VLAN-based or VLAN-bundle service is being used but non-zero if a VLAN-aware bundle service is being used. </t>
</li>
</ul>
</section>
<section anchor="imet-ir" numbered="true" toc="default">
<name>Ingress Replication</name>
<t> When IR is used to transport IP multicast frames of a given Tenant Domain, each EVPN PE attached to that Tenant Domain <bcp14>MUST</bcp14> originate an SBD-IMET route (see <xref target="sbd-routes" format="default"/>). </t>
<t> The SBD-IMET route <bcp14>MUST</bcp14> carry a PTA, and the MPLS Label field of the PTA <bcp14>MUST</bcp14> specify a downstream-assigned MPLS label that maps uniquely (in the context of the originating EVPN PE) to the SBD. </t>
<t> Following the procedures of <xref target="RFC7432" format="default"/>, an EVPN PE <bcp14>MUST</bcp14> also originate an IMET route for each BD to which it is attached. Each of these IMET routes carries a PTA specifying a downstream-assigned label that maps uniquely, in the context of the originating EVPN PE, to the BD in question. These IMET routes need not carry the SBD-RT. </t>
<t> When an ingress EVPN PE needs to use IR to send an IP multicast frame from a particular source BD to an egress EVPN PE, the ingress PE determines whether or not the egress PE has originated an IMET route for that BD. If so, that IMET route contains the MPLS label that the egress PE has assigned to the source BD. The ingress PE uses that label when transmitting the packet to the egress PE. Otherwise, the ingress PE uses the label that the egress PE has assigned to the SBD (in the SBD-IMET route originated by the egress PE).
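</t>
<t> As an informal illustration only, the label choice described above can be sketched in Python. The names <tt>ImetRoute</tt>, <tt>RouteTable</tt>, <tt>ir_label</tt>, and the <tt>SBD</tt> constant are hypothetical and are not defined by this document; the sketch assumes that the ingress PE keeps its installed IMET routes in a table keyed by originating PE and BD. </t>
<sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

SBD = "SBD"  # hypothetical identifier for the Tenant Domain's SBD


@dataclass(frozen=True)
class ImetRoute:
    originator: str  # PE that originated the route
    bd: str          # BD the route is for (SBD for an SBD-IMET route)
    label: int       # downstream-assigned MPLS label from the PTA


# Installed IMET routes, keyed by (originating PE, BD). Hypothetical layout.
RouteTable = Dict[Tuple[str, str], ImetRoute]


def ir_label(routes: RouteTable, egress_pe: str,
             source_bd: str) -> Optional[int]:
    """Return the label to use toward egress_pe for a frame from source_bd."""
    route = routes.get((egress_pe, source_bd))
    if route is None:
        # The egress PE has no IMET route for the source BD:
        # fall back to the label from its SBD-IMET route.
        route = routes.get((egress_pe, SBD))
    return route.label if route is not None else None


routes: RouteTable = {
    ("PE2", "BD1"): ImetRoute("PE2", "BD1", 1001),
    ("PE2", SBD): ImetRoute("PE2", SBD, 1999),
    ("PE3", SBD): ImetRoute("PE3", SBD, 2999),
}
```
]]></sourcecode>
<t> In this sketch, a frame from BD1 is sent to PE2 with PE2's BD1 label and to PE3 with PE3's SBD label, because PE3 has originated no IMET route for BD1. Performing the lookup against the currently installed routes, rather than caching the result, is one simple way to follow later changes to the route set.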
</t>
<t> Note that the set of IMET routes originated by a given egress PE, and installed by a given ingress PE, may change over time. If the egress PE withdraws its IMET route for the source BD, the ingress PE <bcp14>MUST</bcp14> stop using the label carried in that IMET route and instead <bcp14>MUST</bcp14> use the label carried in the SBD-IMET route from that egress PE. Implementors must also take into account that an IMET route from a particular PE for a particular BD may arrive after that PE's SBD-IMET route. </t>
</section>
<section anchor="imet-ar" numbered="true" toc="default">
<name>Assisted Replication</name>
<t> When AR is used to transport IP multicast frames of a given Tenant Domain, each EVPN PE (including the AR-REPLICATOR) attached to the Tenant Domain <bcp14>MUST</bcp14> originate an SBD-IMET route (see <xref target="sbd-routes" format="default"/>). </t>
<t> An AR-REPLICATOR attached to a given Tenant Domain is considered to be an EVPN PE of that Tenant Domain. It is attached to all the BDs in the Tenant Domain, but it does not necessarily have L3 routing instances. </t>
<t> As with IR, the SBD-IMET route carries a PTA where the MPLS Label field specifies the downstream-assigned MPLS label that identifies the SBD. However, the AR-REPLICATOR and AR-LEAF EVPN PEs will set the PTA's flags differently, as per <xref target="RFC9574" format="default"/>. </t>
<t> In addition, each EVPN PE originates an IMET route for each BD to which it is attached. As in the case of IR, these routes carry the downstream-assigned MPLS labels that identify the BDs and do not carry the SBD-RT.
</t>
<t> When an ingress EVPN PE, acting as AR-LEAF, needs to send an IP multicast frame from a particular source BD to an egress EVPN PE, the ingress PE determines whether or not there is any AR-REPLICATOR that originated an IMET route for that BD. After the AR-REPLICATOR selection (if there is more than one), the AR-LEAF uses the label contained in the IMET route of the AR-REPLICATOR when transmitting packets to it. The AR-REPLICATOR receives the packet and, based on the procedures specified in <xref target="RFC9574" format="default"/> and in <xref target="imet-ir" format="default"/> of this document, transmits the packet to the egress EVPN PEs using the labels contained in the received IMET routes for either the source BD or the SBD. </t>
<t> If an ingress AR-LEAF for a given BD has not received any IMET route for that BD from an AR-REPLICATOR, the ingress AR-LEAF follows the procedures in <xref target="imet-ir" format="default"/>. </t>
<section anchor="SBD-matching" numbered="true" toc="default">
<name>Automatic SBD Matching</name>
<t>Each PE needs to know a BD's corresponding SBD. Configuring that information in each BD is one way, but it requires repetitive configuration and consistency checking (to make sure that all the BDs of the same tenant are configured with the same SBD). A better way is to configure the SBD information in the L3 routing instance so that all related BDs will derive the SBD information. </t>
<t>An AR-REPLICATOR also needs to know the same information, though it does not necessarily have an L3 routing instance.
However, from the EVI-RT EC in a BD's IMET route, an AR-REPLICATOR can derive the corresponding SBD of that BD without any configuration. </t>
</section>
</section>
<section anchor="imet-bier" numbered="true" toc="default">
<name>BIER</name>
<t> When BIER is used to transport multicast packets of a given Tenant Domain, and a given EVPN PE attached to that Tenant Domain is a possible ingress EVPN PE for traffic originating outside that Tenant Domain, the given EVPN PE <bcp14>MUST</bcp14> originate an SBD-IMET route (see <xref target="sbd-routes" format="default"/>). </t>
<t> In addition, IMET routes that are originated for other BDs in the Tenant Domain <bcp14>MUST</bcp14> carry the SBD-RT. </t>
<t> Each IMET route (including but not limited to the SBD-IMET route) <bcp14>MUST</bcp14> carry a PTA. The MPLS Label field of the PTA <bcp14>MUST</bcp14> specify an upstream-assigned MPLS label that maps uniquely (in the context of the originating EVPN PE) to the BD for which the route is originated. </t>
<t> Suppose an ingress EVPN PE, say PE1, needs to use BIER to tunnel an IP multicast frame to a set of egress EVPN PEs. And suppose the frame's source BD is BD1. The frame is encapsulated as follows: </t>
<ul spacing="normal">
<li>
<t> A four-octet MPLS label stack entry <xref target="RFC3032" format="default"/> is prepended to the frame. The Label field is set to the upstream-assigned label that PE1 has assigned to BD1. </t>
</li>
<li>
<t> The resulting MPLS packet is then encapsulated in a BIER encapsulation <xref target="RFC8296" format="default"/> <xref target="RFC9624" format="default"/>.
The BIER BitString is set to identify the egress EVPN PEs. The BIER Proto field is set to the value for "MPLS packet with an upstream-assigned label at top of the stack". </t>
</li>
</ul>
<t> Note: It is possible that the packet being tunneled from PE1 originated outside the Tenant Domain. In this case, the actual source BD, BD1, is considered to be the SBD, and the upstream-assigned label it carries will be the label that PE1 assigned to the SBD and advertised in its SBD-IMET route. </t>
<t> Suppose an egress PE, say PE2, receives such a BIER packet. The BFIR-id field of the BIER header allows PE2 to determine that the ingress PE is PE1. There are then two cases to consider: </t>
<ol spacing="normal" type="1"><li>
<t> PE2 has received and installed an IMET route for BD1 from PE1. </t>
<t> In this case, the BIER packet will be carrying the upstream-assigned label that is specified in the PTA of that IMET route. This enables PE2 to determine the apparent source BD (as defined in <xref target="egress_irb_use" format="default"/>). </t>
</li>
<li>
<t> PE2 has not received and installed an IMET route for BD1 from PE1. </t>
<t> In this case, PE2 will not recognize the upstream-assigned label carried in the BIER packet. PE2 <bcp14>MUST</bcp14> discard the packet. </t>
</li>
</ol>
<t> Further details on the use of BIER to support EVPN can be found in <xref target="RFC9624" format="default"/>.
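</t>
<t> As an informal illustration only, the two egress cases above can be sketched in Python. The names <tt>LabelMap</tt> and <tt>apparent_source_bd</tt> are hypothetical and are not defined by this document; the sketch assumes that PE2 has already mapped the BFIR-id to the ingress PE and that it keeps a table, built from installed IMET routes, from (ingress PE, upstream-assigned label) to BD. </t>
<sourcecode type="python"><![CDATA[
```python
from typing import Dict, Optional, Set, Tuple

# (ingress PE, upstream-assigned label) -> BD, built from installed
# IMET routes. Hypothetical layout.
LabelMap = Dict[Tuple[str, int], str]


def apparent_source_bd(labels: LabelMap, attached_bds: Set[str], sbd: str,
                       ingress_pe: str, label: int) -> Optional[str]:
    """Return the apparent source BD, or None if the packet is discarded."""
    actual_bd = labels.get((ingress_pe, label))
    if actual_bd is None:
        # Case 2: no installed IMET route supplies this label; discard.
        return None
    # Case 1: the label identifies the actual source BD. The apparent
    # source BD is that BD if this PE is attached to it, else the SBD.
    return actual_bd if actual_bd in attached_bds else sbd


labels: LabelMap = {("PE1", 500): "BD1", ("PE1", 599): "SBD"}
```
]]></sourcecode>
<t> Keying the table on the ingress PE as well as the label reflects the fact that upstream-assigned labels are unique only in the context of the PE that assigned them.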
</t>
</section>
<section anchor="adv_incl" numbered="true" toc="default">
<name>Inclusive P2MP Tunnels</name>
<section anchor="ip_bum" numbered="true" toc="default">
<name>Using the BUM Tunnels as IP Multicast Inclusive Tunnels</name>
<t> The procedures in this section apply only when: </t>
<ol spacing="normal" type="%c)"><li>
<t> it is desired to use the BUM tunnels to carry IP multicast traffic across the backbone and </t>
</li>
<li>
<t> the BUM tunnels are P2MP tunnels (i.e., neither IR, AR, nor BIER are being used to transport the BUM traffic). </t>
</li>
</ol>
<t> In this case, an IP multicast frame (whether inter-subnet or intra-subnet) will be carried across the backbone in the BUM tunnel belonging to its source BD. Each EVPN PE attached to a given Tenant Domain needs to join the BUM tunnels for every BD in the Tenant Domain, even those BDs to which the EVPN PE is not locally attached. This ensures that an IP multicast packet from any source BD can reach all PEs attached to the Tenant Domain. </t>
<t> Note that this will cause all the BUM traffic from a given BD in a Tenant Domain to be sent to all PEs that attach to that Tenant Domain, even the PEs that don't attach to the given BD. To avoid this, it is <bcp14>RECOMMENDED</bcp14> that the BUM tunnels not be used as IP multicast inclusive tunnels and that the procedures of <xref target="wc_spmsi" format="default"/> be used instead. </t>
<t> If a PE is a possible ingress EVPN PE for traffic originating outside the Tenant Domain, the PE <bcp14>MUST</bcp14> originate an SBD-IMET route (see <xref target="sbd-routes" format="default"/>).
This route <bcp14>MUST</bcp14> carry a PTA specifying the P2MP tunnel used for transmitting IP multicast packets that originate outside the Tenant Domain. All EVPN PEs of the Tenant Domain <bcp14>MUST</bcp14> join the tunnel specified in the PTA of an SBD-IMET route: </t>
<ul spacing="normal">
<li>
<t> If the tunnel is an RSVP-TE P2MP tunnel, the originator of the route <bcp14>MUST</bcp14> use RSVP-TE P2MP procedures to add each PE of the Tenant Domain to the tunnel, even PEs that have not originated an SBD-IMET route. </t>
</li>
<li>
<t> If the tunnel is an mLDP or PIM tunnel, each PE importing the SBD-IMET route <bcp14>MUST</bcp14> add itself to the tunnel, using mLDP or PIM procedures, respectively. </t>
</li>
</ul>
<t> Whether or not a PE originates an SBD-IMET route, it will of course originate an IMET route for each BD to which it is attached. Each of these IMET routes <bcp14>MUST</bcp14> carry the SBD-RT, as well as the RT for the BD to which it belongs. </t>
<t> If a received IMET route is not the SBD-IMET route, it will also be carrying the RT for its source BD. The route's NLRI will carry the Tag ID for the source BD. From the RT and the Tag ID, any PE receiving the route can determine the route's source BD. </t>
<t> If the MPLS Label field of the PTA contains zero, the specified P2MP tunnel is used only to carry frames of a single source BD. </t>
<t> If the MPLS Label field of the PTA does not contain zero, it <bcp14>MUST</bcp14> contain an upstream-assigned MPLS label that maps uniquely (in the context of the originating EVPN PE) to the source BD (or, in the case of an SBD-IMET route, to the SBD). The tunnel may then be used to carry frames of multiple source BDs. The apparent source BD of a particular packet is inferred from the label carried by the packet.
</t>
<t> IP multicast traffic originating outside the Tenant Domain is transmitted with the label corresponding to the SBD, as specified in the ingress EVPN PE's SBD-IMET route. </t>
</section>
<section anchor="wc_spmsi" numbered="true" toc="default">
<name>Using Wildcard S-PMSI A-D Routes to Advertise Inclusive Tunnels Specific to IP Multicast</name>
<t> The procedures of this section apply when (and only when) it is desired to transmit IP multicast traffic on an inclusive tunnel but not on the same tunnel used to transmit BUM traffic. </t>
<t> However, these procedures do NOT apply when the tunnel type is IR or BIER, EXCEPT in the case where it is necessary to interwork between non-OISM PEs and OISM PEs, as specified in <xref target="no-OISM" format="default"/>. </t>
<t> Each EVPN PE attached to the given Tenant Domain <bcp14>MUST</bcp14> originate an SBD-SPMSI A-D route. The NLRI of that route <bcp14>MUST</bcp14> contain (C-*,C-*) (see <xref target="RFC6625" format="default"/>). Additional rules for constructing that route are given in <xref target="sbd-routes" format="default"/>. </t>
<t> In addition, an EVPN PE <bcp14>MUST</bcp14> originate an S-PMSI A-D route containing (C-*,C-*) in its NLRI for each of the other BDs, in the given Tenant Domain, to which it is attached. All such routes <bcp14>MUST</bcp14> carry the SBD-RT. This ensures that those routes are imported by all EVPN PEs attached to the Tenant Domain.
</t>
<t> A PE receiving these routes follows the procedures of <xref target="bd_route" format="default"/> to determine which BD the route is for. </t>
<t> If the MPLS Label field of the PTA contains zero, the specified tunnel is used only to carry frames of a single source BD. </t>
<t> If the MPLS Label field of the PTA does not contain zero, it <bcp14>MUST</bcp14> specify an upstream-assigned MPLS label that maps uniquely (in the context of the originating EVPN PE) to the source BD. The tunnel may be used to carry frames of multiple source BDs, and the apparent source BD for a particular packet is inferred from the label carried by the packet. </t>
<t> The EVPN PE advertising these S-PMSI A-D routes is specifying the default tunnel that it will use (as ingress PE) for transmitting IP multicast packets. The upstream-assigned label allows an egress EVPN PE to determine the apparent source BD of a given packet. </t>
</section> <!-- wc-spmsi -->
</section><!-- wc_spmsi-->
<section anchor="adv_secl" numbered="true" toc="default">
<name>Selective Tunnels</name>
<t> An ingress EVPN PE for a given multicast flow or set of flows can always assign the flow to a particular P2MP tunnel by originating an S-PMSI A-D route whose NLRI identifies the flow or set of flows. The NLRI of the route could be (C-*,C-G) or (C-S,C-G). The S-PMSI A-D route <bcp14>MUST</bcp14> carry the SBD-RT so that it is imported by all EVPN PEs attached to the Tenant Domain. </t>
<t> An S-PMSI A-D route is for a particular source BD. It <bcp14>MUST</bcp14> carry the RT associated with that BD, and it <bcp14>MUST</bcp14> have the Tag ID for that BD in its NLRI.
</t>
<t> When an EVPN PE imports an S-PMSI A-D route, it applies the rules of <xref target="bd_route" format="default"/> to associate the route with a particular BD. </t>
<t> Each such route <bcp14>MUST</bcp14> contain a PTA, as specified in <xref target="wc_spmsi" format="default"/>. </t>
<t> An egress EVPN PE interested in the specified flow or flows <bcp14>MUST</bcp14> join the specified tunnel. Procedures for joining the specified tunnel are specific to the tunnel type. (Note that if the tunnel type is an RSVP-TE P2MP LSP, the Leaf Information Required (LIR) flag of the PTA <bcp14>SHOULD NOT</bcp14> be set. An ingress OISM PE knows which OISM EVPN PEs are interested in any given flow and hence can add them to the RSVP-TE P2MP tunnel that carries such flows.) </t>
<t> If the PTA does not specify a non-zero MPLS label, the apparent source BD of any packets that arrive on that tunnel is considered to be the BD associated with the route that carries the PTA. If the PTA does specify a non-zero MPLS label, the apparent source BD of any packets that arrive on that tunnel carrying the specified label is considered to be the BD associated with the route that carries the PTA. </t>
<t> It should be noted that, when either IR or BIER is used, there is no need for an ingress PE to use S-PMSI A-D routes to assign specific flows to selective tunnels.
The procedures of <xref target="smet_adv" format="default"/>, along with the procedures of Sections <xref target="imet-ir" format="counter"/>, <xref target="imet-ar" format="counter"/>, and <xref target="imet-bier" format="counter"/>, provide the functionality of selective tunnels without the need to use S-PMSI A-D routes. </t>
</section><!-- adv_secl -->
</section><!-- adv_incl -->
<section anchor="smet_adv" numbered="true" toc="default">
<name>Advertising SMET Routes</name>
<t> <xref target="RFC9251" format="default"/> allows an egress EVPN PE to express its interest in a particular multicast flow or set of flows by originating a SMET route. The NLRI of the SMET route identifies the flow or set of flows as (C-*,C-*), (C-*,C-G), or (C-S,C-G). </t>
<t> Each SMET route belongs to a particular BD. The Tag ID for the BD appears in the NLRI of the route, and the route carries the RT associated with that BD. From this &lt;RT, tag&gt; pair, other EVPN PEs can identify the BD to which a received SMET route belongs. (Remember though that the route may be carrying multiple RTs.) </t>
<t> There are three cases to consider: </t>
<ol spacing="normal" type="Case %d:">
<li>
<t> It is known that no BD of a Tenant Domain contains a multicast router. </t>
<t> In this case, an egress PE advertises its interest in a flow or set of flows by originating a SMET route that belongs to the SBD. We refer to this as an SBD-SMET route. The SBD-SMET route carries the SBD-RT and has the Tag ID for the SBD in its NLRI.
SMET routes for the individual BDs are not needed, because there is no need for a PE that receives a SMET route to send a corresponding IGMP/MLD Join message on any of its ACs. </t>
</li>
<li>
<t> It is known that more than one BD of a Tenant Domain may contain a multicast router. </t>
<t> This is much like Case 1. An egress PE advertises its interest in a flow or set of flows by originating an SBD-SMET route. The SBD-SMET route carries the SBD-RT and has the Tag ID for the SBD in its NLRI. </t>
<t> In this case, it is important to be sure that SMET routes for the individual BDs are not originated. For example, suppose that PE1 had local receivers for a given flow on both BD1 and BD2 and that it originated SMET routes for both those BDs. Then, PEs receiving those SMET routes might send IGMP/MLD Joins on both those BDs. This could cause externally sourced multicast traffic to enter the Tenant Domain at both BDs, which could result in duplication of data. </t>
<t> Note that if it is possible that more than one BD contains a tenant multicast router, then in order to receive multicast data originating from outside EVPN, the PEs <bcp14>MUST</bcp14> follow the procedures of <xref target="external" format="default"/>. </t>
</li>
<li>
<t> It is known that only a single BD of a Tenant Domain contains a multicast router. </t>
<t> Suppose that an egress PE is attached to a BD on which there might be a tenant multicast router. (The tenant router is not necessarily on a segment that is attached to that PE.) And suppose that the PE has one or more ACs attached to that BD that are interested in a given multicast flow. In this case, in addition to the SMET route for the SBD, the egress PE <bcp14>MAY</bcp14> originate a SMET route for that BD.
This will enable the ingress PE(s) to send IGMP/MLD messages on ACs for the BD, as specified in <xref target="RFC9251" format="default"/>. As long as that is the only BD on which there is a tenant multicast router, there is no possibility of duplication of data. </t>
</li>
</ol>
<t> This document does not specify procedures for dynamically determining which of the three cases applies to a given deployment; the PEs of a given Tenant Domain <bcp14>MUST</bcp14> be provisioned to know which case applies. </t>
<t> As detailed in <xref target="RFC9251"/>, a SMET route carries flags indicating whether IGMP (v1, v2, or v3) or MLD (v1 or v2) messages should be triggered on the ACs of the BD to which the SMET route belongs. For IGMP v3 and MLD v2, the Include/Exclude (IE) flag also indicates whether the source information in the SMET route is of an Include Group type or Exclude Group type. If an SBD PE needs to generate IGMP/MLD reports (as is the case in <xref target="external_pim_router"/>) or the route is for an (S, G) state, the value of the flags <bcp14>MUST</bcp14> be set according to the rules in <xref target="RFC9251"/>. Otherwise, the flags <bcp14>SHOULD</bcp14> be set to 0. </t>
<t> Note that a PE only needs to originate the set of SBD-SMET routes that are needed in order to receive multicast traffic that the PE is interested in.
Suppose PE1 has ACs attached to BD1 that are interested in (C-*,C-G) traffic and ACs attached to BD2 that are interested in (C-S,C-G) traffic. A single SBD-SMET route specifying (C-*,C-G) will attract all the necessary flows. </t>
<t> As another example, suppose the ACs attached to BD1 are interested in (C-*,C-G) but not in (C-S,C-G), while the ACs attached to BD2 are interested in (C-S,C-G). A single SBD-SMET route specifying (C-*,C-G) will pull in all the necessary flows. </t>
<t> In other words, to determine the set of SBD-SMET routes that have to be sent for a given C-G, the PE has to merge the IGMP/MLD state for all the BDs (of the given Tenant Domain) to which it is attached. </t>
<t> Per <xref target="RFC9251" format="default"/>, importing a SMET route for a particular BD will cause the IGMP/MLD state to be instantiated for the IRB interface to that BD. This also applies when the BD is the SBD. </t>
<t> However, traffic that originates in one of the actual BDs of a particular Tenant Domain <bcp14>MUST NOT</bcp14> be sent down the IRB interface that connects the L3 routing instance of that Tenant Domain to the SBD. That would cause duplicate delivery of traffic, since such traffic will have already been distributed throughout the Tenant Domain. Therefore, when setting up the IGMP/MLD state based on SBD-SMET routes, care must be taken to ensure that the IRB interface to the SBD is not added to the Outgoing Interface (OIF) list if the traffic originates within the Tenant Domain. </t>
<t> There are some multicast scenarios that make use of anycast sources. For example, two different sources may share the same anycast IP address, say S1, and each may transmit an (S1,G) multicast flow.
In such a scenario, the two (S1,G) flows are typically identical. Ordinary PIM procedures will cause only one of the flows to be delivered to each receiver that has expressed interest in either (*,G) or (S1,G). However, the OISM procedures described in this document will result in both of the (S1,G) flows being distributed in the Tenant Domain, and duplicate delivery will result. Therefore, if there are receivers for (*,G) in a given Tenant Domain, there <bcp14>MUST NOT</bcp14> be anycast sources for G within that Tenant Domain. (This restriction could be lifted by defining additional procedures; however, that is outside the scope of this document.) </t>
</section><!-- smet_adv -->
</section><!-- adv_tunnels -->
<section anchor="mcast_state" numbered="true" toc="default">
<name>Constructing Multicast Forwarding State</name>
<section anchor="l2_state" numbered="true" toc="default">
<name>Layer 2 Multicast State</name>
<t> An EVPN PE maintains Layer 2 multicast state for each BD to which it is attached. Note that this is used for forwarding IP multicast frames based on the inner IP header. The state is learned through IGMP/MLD snooping <xref target="RFC4541" format="default"/> and procedures in this document. </t>
<t> Let PE1 be an EVPN PE and BD1 be a BD to which it is attached. At PE1, BD1's Layer 2 multicast state for a given (C-S,C-G) or (C-*,C-G) governs the disposition of an IP multicast packet that is received by BD1's Layer 2 multicast function on an EVPN PE. </t>
<t> An IP multicast (S,G) packet is considered to have been received by BD1's Layer 2 multicast function in PE1 in the following cases: </t>
<ul spacing="normal">
<li>
<t> The packet is the payload of an Ethernet frame received by PE1 from an AC that attaches to BD1.
</t>
</li>
<li>
<t> The packet is the payload of an Ethernet frame whose apparent source BD is BD1 and that is received by PE1 over a tunnel from another EVPN PE. </t>
</li>
<li>
<t> The packet is received from BD1's IRB interface (i.e., has been transmitted by PE1's L3 routing instance down BD1's IRB interface). </t>
</li>
</ul>
<t> According to the procedures of this document, all transmissions of IP multicast packets from one EVPN PE to another are done at Layer 2. That is, the packets are transmitted as Ethernet frames, according to the Layer 2 multicast state. </t>
<t> Each Layer 2 multicast state (S,G) or (*,G) contains a set of outgoing interfaces (an OIF list). The disposition of an (S,G) multicast frame received by BD1's Layer 2 multicast function is determined as follows: </t>
<ul spacing="normal">
<li>
<t> The OIF list is taken from BD1's Layer 2 (S,G) state, or if there is no such (S,G) state, then it is taken from BD1's (*,G) state. (If neither state exists, the OIF list is considered to be null.) </t>
</li>
<li>
<t> The rules of <xref target="iif_oif" format="default"/> are applied to the OIF list. This will generally result in the frame being transmitted to some, but not all, elements of the OIF list. </t>
</li>
</ul>
<t> Note that there is no Reverse Path Forwarding (RPF) check at Layer 2. </t>
<section anchor="oif_construct" numbered="true" toc="default">
<name>Constructing the OIF List</name>
<t> In this document, we have extended the procedures of <xref target="RFC9251" format="default"/> so that IMET and SMET routes for a particular BD are distributed not just to PEs that attach to that BD but to PEs that attach to any BD in the Tenant Domain. In this way, each PE attached to a given Tenant Domain learns, from every other PE attached to the same Tenant Domain, the set of flows that are of interest to that other PE. (If some PE attached to the Tenant Domain does not support <xref target="RFC9251" format="default"/>, it will be assumed to be interested in all flows. Whether or not a particular remote PE supports <xref target="RFC9251" format="default"/> is determined by the presence of an Extended Community in its IMET route; this is specified in <xref target="RFC9251" format="default"/>.) If a set of remote PEs are interested in a particular flow, the tunnels used to reach those PEs are added to the OIF list of the multicast states corresponding to that flow. </t>
<t> An EVPN PE may run IGMP/MLD snooping procedures <xref target="RFC4541" format="default"/> on each of its ACs in order to determine the set of flows of interest to each AC. (An AC is said to be interested in a given flow if it connects to a segment that has tenant systems interested in that flow.) If IGMP/MLD procedures are not being run on a given AC, that AC is considered to be interested in all flows. For each BD, the set of ACs interested in a given flow is determined, and the ACs of that set are added to the OIF list of that BD's multicast state for that flow.
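</t>
<t> The two paragraphs above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative algorithm: the RemotePE and AttachmentCircuit records, the covers() and build_oif_list() helpers, and the encoding of a flow as a (source, group) tuple with "*" as the wildcard are all hypothetical. The sketch handles a single (S,G) flow of one BD and models only the tunnel and AC elements of the OIF list. </t>
<sourcecode type="python">
```python
from dataclasses import dataclass, field


@dataclass
class RemotePE:
    """Hypothetical view of a remote PE in the same Tenant Domain."""
    name: str
    supports_rfc9251: bool  # inferred from the EC on its IMET route
    smet_flows: set = field(default_factory=set)  # NLRIs of its SMET routes


@dataclass
class AttachmentCircuit:
    """Hypothetical view of a local AC of the BD."""
    name: str
    snooping: bool  # True if IGMP/MLD snooping runs on this AC
    joined_flows: set = field(default_factory=set)


def covers(interests, flow):
    """True if a set of (source, group) interests includes the (S,G) flow."""
    source, group = flow
    candidates = [(source, group), ("*", group), ("*", "*")]
    return any(candidate in interests for candidate in candidates)


def build_oif_list(flow, remote_pes, acs):
    """Tunnel and AC elements of the OIF list for one (S,G) flow of a BD."""
    oif = []
    for pe in remote_pes:
        # A PE without RFC 9251 support is assumed to want every flow.
        if not pe.supports_rfc9251 or covers(pe.smet_flows, flow):
            oif.append(("tunnel", pe.name))
    for ac in acs:
        # An AC without snooping is considered interested in all flows.
        if not ac.snooping or covers(ac.joined_flows, flow):
            oif.append(("ac", ac.name))
    return oif


remote_pes = [
    RemotePE("PE2", True, {("*", "G1")}),
    RemotePE("PE3", True, {("S2", "G1")}),
    RemotePE("PE4", False),
]
acs = [
    AttachmentCircuit("AC1", True, {("S1", "G1")}),
    AttachmentCircuit("AC2", True),
    AttachmentCircuit("AC3", False),
]
oif = build_oif_list(("S1", "G1"), remote_pes, acs)
print(oif)
```
</sourcecode>
<t> In this example, PE3 and AC2 are left out of the list because they run the relevant procedures and have expressed interest only in other flows (or in none), whereas PE4 and AC3 are included because nothing is known about their interest.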
</t>
<t> The OIF list for each multicast state must also contain the IRB interface for the BD to which the state belongs. </t>
<t> Implementors should note that the OIF list of a multicast state will change from time to time as ACs and/or remote PEs either become interested in or lose interest in particular multicast flows. </t>
</section><!-- oif_construct -->
<section anchor="iif_oif" numbered="true" toc="default">
<name>Data Plane: Applying the OIF List to an (S,G) Frame</name>
<t> When an (S,G) multicast frame is received by the Layer 2 multicast function of a given EVPN PE, say PE1, its disposition depends upon (a) the way it was received, (b) the OIF list of the corresponding multicast state (see <xref target="oif_construct" format="default"/>), (c) the eligibility of an AC to receive a given frame (see <xref target="ac_eligibility" format="default"/>), and (d) its apparent source BD (see <xref target="adv_tunnels" format="default"/> for information about determining the apparent source BD of a frame received over a tunnel from another PE).
</t>
<section anchor="ac_eligibility" numbered="true" toc="default">
<name>Eligibility of an AC to Receive a Frame</name>
<t> A given (S,G) multicast frame is eligible to be transmitted by a given PE, say PE1, on a given AC, say AC1, only if one of the following conditions holds: </t>
<ol spacing="normal" type="1">
<li>
<t> Ethernet Segment Identifier (ESI) labels are being used, PE1 is the DF for the segment to which AC1 is connected, and the frame did not originate from that same segment (as determined by the ESI label). </t>
</li>
<li>
<t> The ingress PE for the frame is a remote PE, say PE2, local bias is being used, and PE2 is not connected to the same segment as AC1. </t>
</li>
</ol>
</section><!-- ac_eligibility -->
<section anchor="oif_apply" numbered="true" toc="default">
<name>Applying the OIF List</name>
<t> Assume a given (S,G) multicast frame has been received by a given PE, say PE1. PE1 determines the apparent source BD of the frame, finds the Layer 2 (S,G) state for that BD (or the (*,G) state if there is no (S,G) state), and uses the OIF list from that state. (Note that if PE1 is not attached to the actual source BD, the apparent source BD will be the SBD.) </t>
<t> If PE1 has determined the frame's apparent source BD to be BD1 (which may or may not be the SBD), then the following cases should be considered: </t>
<ol spacing="normal" type="1">
<li>
<t> The frame was received by PE1 from a local AC, say AC1, that attaches to BD1. </t>
<ol spacing="normal" type="a">
<li>
<t> The frame <bcp14>MUST</bcp14> be sent on all local ACs of BD1 that appear in the OIF list, except for AC1 itself. </t>
</li>
<li>
<t> The frame <bcp14>MUST</bcp14> also be delivered to any other EVPN PEs that have interest in it. This is achieved as follows: </t>
<ol spacing="normal" type="i">
<li>
<t> If (a) AR is being used, (b) PE1 is an AR-LEAF, and (c) the OIF list is non-null, PE1 <bcp14>MUST</bcp14> send the frame to the AR-REPLICATOR. </t>
</li>
<li>
<t> Otherwise, the frame <bcp14>MUST</bcp14> be sent on all tunnels in the OIF list. </t>
</li>
</ol>
</li>
<li>
<t> The frame <bcp14>MUST</bcp14> be sent to the local L3 routing instance by being sent up the IRB interface of BD1. It <bcp14>MUST NOT</bcp14> be sent up any other IRB interfaces. </t>
</li>
</ol>
</li>
<li>
<t> The frame was received by PE1 over a tunnel from another PE. (See <xref target="adv_tunnels" format="default"/> for the rules to determine the apparent source BD of a packet received from another PE. Note that if PE1 is not attached to the source BD, it will regard the SBD as the apparent source BD.) </t>
<ol spacing="normal" type="a">
<li>
<t> The frame <bcp14>MUST</bcp14> be sent on all local ACs in the OIF list that connect to BD1 and that are eligible (per <xref target="ac_eligibility" format="default"/>) to receive the frame.
</t>
</li>
<li>
<t> The frame <bcp14>MUST</bcp14> be sent up the IRB interface of the apparent source BD. (Note that this may be the SBD.) The frame <bcp14>MUST NOT</bcp14> be sent up any other IRB interfaces. </t>
</li>
<li>
<t> If PE1 is not an AR-REPLICATOR, it <bcp14>MUST NOT</bcp14> send the frame to any other EVPN PEs. However, if PE1 is an AR-REPLICATOR, it <bcp14>MUST</bcp14> send the frame to all tunnels in the OIF list, except for the tunnel over which the frame was received. </t>
</li>
</ol>
</li>
<li>
<t> The frame was received by PE1 from the BD1 IRB interface (i.e., the frame has been transmitted by PE1's L3 routing instance down the BD1 IRB interface), and BD1 is NOT the SBD. </t>
<ol spacing="normal" type="a">
<li>
<t> The frame <bcp14>MUST</bcp14> be sent on all local ACs in the OIF list that are eligible, as per <xref target="ac_eligibility" format="default"/>, to receive the frame. </t>
</li>
<li>
<t> The frame <bcp14>MUST NOT</bcp14> be sent to any other EVPN PEs. </t>
</li>
<li>
<t> The frame <bcp14>MUST NOT</bcp14> be sent up any IRB interfaces. </t>
</li>
</ol>
</li>
<li>
<t> The frame was received from the SBD IRB interface (i.e., has been transmitted by PE1's L3 routing instance down the SBD IRB interface). </t>
<ol spacing="normal" type="a">
<li>
<t> The frame <bcp14>MUST</bcp14> be sent on all tunnels in the OIF list. This causes the frame to be delivered to any other EVPN PEs that have interest in it. </t>
</li>
<li>
<t> The frame <bcp14>MUST NOT</bcp14> be sent on any local ACs. </t>
</li>
<li>
<t> The frame <bcp14>MUST NOT</bcp14> be sent up any IRB interfaces.
</t>
</li>
</ol>
</li>
</ol>
</section><!-- oif_apply -->
</section><!-- iif_oif -->
</section> <!-- l2_mcast_state -->
<section anchor="rpf" numbered="true" toc="default">
<name>Layer 3 Forwarding State</name>
<t> If an EVPN PE is performing IGMP/MLD procedures on the ACs of a given BD, it processes those messages at Layer 2 to help form the Layer 2 multicast state. It also sends those messages up that BD's IRB interface to the L3 routing instance of a particular Tenant Domain. This causes the (C-S,C-G) or (C-*,C-G) L3 state to be created/updated. </t>
<t> A Layer 3 multicast state has both an Input Interface (IIF) and an OIF list. </t>
<t> For a (C-S,C-G) state, if the source BD is present on the PE, the IIF is set to the IRB interface that attaches to that BD. Otherwise, the IIF is set to the SBD IRB interface. </t>
<t> For (C-*,C-G) states, traffic can arrive from any BD, so the IIF needs to be set to a wildcard value meaning "any IRB interface". </t>
<t> The OIF list of these states includes one or more of the IRB interfaces of the Tenant Domain. In general, maintenance of the OIF list does not require any EVPN-specific procedures. However, there is one EVPN-specific rule: </t>
<t indent="3"> If the IIF is one of the IRB interfaces (or the wildcard meaning "any IRB interface"), then the SBD IRB interface <bcp14>MUST NOT</bcp14> be added to the OIF list.
Traffic originating from within a particular EVPN Tenant Domain must not be sent down the SBD IRB interface, as such traffic has already been distributed to all EVPN PEs attached to that Tenant Domain. </t> <t> Please also see <xref target="gen_prin" format="default"/>, which states a modification of this rule for the case where OISM is interworking with external Layer 3 multicast routing. </t> </section> <!-- rpf --> </section> <!-- mcast_state --> <section anchor="no-OISM" numbered="true" toc="default"> <name>Interworking with Non-OISM EVPN PEs</name> <t> It is possible that a given Tenant Domain will be attached to both OISM PEs and non-OISM PEs. Inter-subnet IP multicast should be possible and fully functional even if not all PEs attaching to a Tenant Domain can be upgraded to support OISM functionality. </t> <t> Note that the non-OISM PEs are not required to have IRB support or support for <xref target="RFC9251" format="default"/>. However, it is advantageous for the non-OISM PEs to support <xref target="RFC9251" format="default"/>. </t> <t> In this section, we will use the following terminology: </t> <dl spacing="normal" newline="false"> <dt>PE-S:</dt> <dd>The ingress PE for an (S,G) flow.</dd> <dt>PE-R:</dt> <dd>An egress PE for an (S,G) flow.</dd> <dt>BD-S:</dt> <dd>The source BD for an (S,G) flow. PE-S must have one or more ACs attached to BD-S, at least one of which attaches to host S.</dd> <dt>BD-R:</dt> <dd>A BD that contains a host interested in the flow. The host is attached to PE-R via an AC that belongs to
BD-R.</dd> </dl> <t> To allow OISM PEs to interwork with non-OISM PEs, a given Tenant Domain needs to contain one or more IP Multicast Gateways (IPMGs). An IPMG is an OISM PE with special responsibilities regarding the interworking between OISM and non-OISM PEs. </t> <t> If a PE is functioning as an IPMG, it <bcp14>MUST</bcp14> signal this fact by setting the IPMG flag in the Multicast Flags EC that it attaches to its IMET routes. An IPMG <bcp14>SHOULD</bcp14> attach this EC, with the IPMG flag set, to all IMET routes it originates. Furthermore, if PE1 imports any IMET route from PE2 that has the EC present with the IPMG flag set, then PE1 will assume that PE2 is an IPMG. </t> <t> An IPMG Designated Forwarder (IPMG-DF) selection procedure is used to ensure that there is exactly one active IPMG-DF for any given BD at any given time. Details of the IPMG-DF selection procedure are in <xref target="ipmg-df" format="default"/>. The IPMG-DF for a given BD, say BD-S, has special functions to perform when it receives (S,G) frames on that BD: </t> <ul spacing="normal"> <li> <t> If the frames are from a non-OISM PE-S: </t> <ul spacing="normal"> <li> <t> The IPMG-DF forwards them to OISM PEs that do not attach to BD-S but have interest in (S,G). </t> <t> Note that OISM PEs that do attach to BD-S will have received the frames on the BUM tunnel from the non-OISM PE-S. </t> </li> <li> <t> The IPMG-DF forwards them to non-OISM PEs that have interest in (S,G) on ACs that do not belong to BD-S.
</t> <t> Note that if a non-OISM PE has multiple BDs (other than BD-S) with interest in (S,G), it will receive one copy of the frame for each such BD. This is necessary because the non-OISM PEs cannot move IP multicast traffic from one BD to another. </t> </li> </ul> </li> <li> <t> If the frames are from an OISM PE, the IPMG-DF forwards them to non-OISM PEs that have interest in (S,G) on ACs that do not belong to BD-S. </t> <t> If a non-OISM PE has interest in (S,G) on an AC belonging to BD-S, it will have received a copy of the (S,G) frame, encapsulated for BD-S, from the OISM PE-S (see <xref target="imet-ir" format="default"/>). If the non-OISM PE has interest in (S,G) on one or more ACs belonging to BD-R1,...,BD-Rk where the BD-Ri are distinct from BD-S, the IPMG-DF needs to send it a copy of the frame for each BD-Ri. </t> </li> </ul> <t> If an IPMG receives a frame on a BD for which it is not the IPMG-DF, it just follows normal OISM procedures. </t> <t> This section specifies several sets of procedures: </t> <ul spacing="normal"> <li> <t> the procedures that the IPMG-DF for a given BD needs to follow when receiving, on that BD, an IP multicast frame from a non-OISM PE; </t> </li> <li> <t> the procedures that the IPMG-DF for a given BD needs to follow when receiving, on that BD, an IP multicast frame from an OISM PE; and </t> </li> <li> <t> the procedures that an OISM PE needs to follow when receiving, on a given BD, an IP multicast frame from a non-OISM PE, when the OISM PE is not the IPMG-DF for that BD.
</t> </li> </ul> <t> To enable OISM/non-OISM interworking in a given Tenant Domain, the Tenant Domain <bcp14>MUST</bcp14> have some EVPN PEs that can function as IPMGs. An IPMG must be configured with the SBD. It must also be configured with every BD of the Tenant Domain that exists on any of the non-OISM PEs of that domain. (Operationally, it may be simpler to configure the IPMG with all the BDs of the Tenant Domain.) </t> <t> Of course, a non-OISM PE only needs to be configured with BDs for which it has ACs. An OISM PE that is not an IPMG only needs to be configured with the SBD and with the BDs for which it has ACs. </t> <t> An IPMG <bcp14>MUST</bcp14> originate a wildcard SMET route (with (C-*,C-*) in the NLRI) for each BD in the Tenant Domain. This will cause it to receive all the IP multicast traffic that is sourced in the Tenant Domain. Note that non-OISM nodes that do not support <xref target="RFC9251" format="default"/> will send all the multicast traffic from a given BD to all PEs attached to that BD, even if those PEs do not originate a SMET route. </t> <t> The interworking procedures vary somewhat depending upon whether packets are transmitted from PE to PE via IR or via P2MP tunnels. In this section, we do not consider the use of BIER due to the low likelihood of there being a non-OISM PE that supports BIER. </t> <section anchor="ipmg-df" numbered="true" toc="default"> <name>IPMG Designated Forwarder</name> <t> Every PE that is eligible for selection as an IPMG-DF for a particular BD originates both an IMET route for that BD and an SBD-IMET route. As stated in <xref target="no-OISM" format="default"/>, these SBD-IMET routes carry a Multicast Flags EC with the IPMG flag set. </t> <t> These SBD-IMET routes <bcp14>SHOULD</bcp14> also carry a DF Election EC. The DF Election EC and its use are specified in <xref target="RFC8584" format="default"/>.
When the route is originated, the AC-DF bit in the DF Election EC <bcp14>SHOULD NOT</bcp14> be set. This bit is not used when selecting an IPMG-DF, i.e., it <bcp14>MUST</bcp14> be ignored by the receiver of an SBD-IMET route. </t> <t> In the context of a given Tenant Domain, to select the IPMG-DF for a particular BD, say BD1, the IPMGs of the Tenant Domain perform the following procedures: </t> <ul spacing="normal"> <li> <t> From the set of received SBD-IMET routes for the given Tenant Domain, determine the candidate set of PEs that support IPMG functionality for that domain. </t> </li> <li> <t> From that candidate set, eliminate any PEs from which an IMET route for BD1 has not been received. </t> </li> <li> <t> Select a DF election algorithm as specified in <xref target="RFC8584" format="default"/>. Some of the possible algorithms can be found, e.g., in <xref target="RFC8584" format="default"/>, <xref target="RFC7432" format="default"/>, and <xref target="I-D.ietf-bess-evpn-pref-df" format="default"/>. </t> </li> <li> <t> Apply the DF election algorithm (see <xref target="RFC8584" format="default"/>) to the candidate set of PEs. The winner becomes the IPMG-DF for BD1. </t> </li> </ul> <t> Note that even if a given PE supports MEG (<xref target="mvpn" format="default"/>) and/or PEG (<xref target="pim_iwork" format="default"/>) functionality, as well as IPMG functionality, its SBD-IMET routes carry only one DF Election EC.
</t> </section> <!-- ipmg_df --> <section anchor="non-OISM-IR" numbered="true" toc="default"> <name>Ingress Replication</name> <t> The procedures of this section are used when IR is used to transmit packets from one PE to another. </t> <t> When a non-OISM PE-S transmits a multicast frame from BD-S to another PE, say PE-R, PE-S will use the encapsulation specified in the BD-S IMET route that was originated by PE-R. This encapsulation will include the label that appears in the MPLS Label field of the PTA of the IMET route. If the tunnel type is VXLAN, the label is actually a Virtual Network Identifier (VNI); for other tunnel types, the label is an MPLS label. In either case, the frames are transmitted with a label that was assigned to a particular BD by the PE-R to which the frame is being transmitted. </t> <t> To support OISM/non-OISM interworking, an OISM PE-R <bcp14>MUST</bcp14> originate, for each of its BDs, both an IMET route and a (C-*,C-*) S-PMSI A-D route. Note that even when IR is being used, interworking between OISM and non-OISM PEs requires the OISM PEs to follow the rules of <xref target="wc_spmsi" format="default"/>, as modified below. </t> <t> Non-OISM PEs will not understand S-PMSI A-D routes. So when a non-OISM PE-S transmits an IP multicast frame with a particular source BD to an IPMG, it encapsulates the frame using the label specified in that IPMG's BD-S IMET route. (This is just the procedure of <xref target="RFC7432" format="default"/>.)
</t> <t> The (C-*,C-*) S-PMSI A-D route originated by a given OISM PE will have a PTA that specifies IR. </t> <ul spacing="normal"> <li> <t> If MPLS tunneling is being used, the MPLS Label field <bcp14>SHOULD</bcp14> contain a non-zero value, and the LIR flag <bcp14>SHOULD</bcp14> be zero. (The case where the MPLS Label field is zero or the LIR flag is set is outside the scope of this document.) </t> </li> <li> <t> If the tunnel encapsulation is VXLAN, the MPLS Label field <bcp14>MUST</bcp14> contain a non-zero value, and the LIR flag <bcp14>MUST</bcp14> be zero. </t> </li> </ul> <t> When an OISM PE-S transmits an IP multicast frame to an IPMG, it will use the label specified in that IPMG's (C-*,C-*) S-PMSI A-D route. </t> <t> When a PE originates both an IMET route and a (C-*,C-*) S-PMSI A-D route, the values of the MPLS Label field in the respective PTAs must be distinct. Further, each <bcp14>MUST</bcp14> map uniquely (in the context of the originating PE) to the route's BD. </t> <t> As a result, an IPMG receiving an MPLS-encapsulated IP multicast frame can always tell by the label whether the frame's ingress PE is an OISM PE or a non-OISM PE.
When an IPMG receives a VXLAN-encapsulated IP multicast frame, it may need to determine the identity of the ingress PE from the outer IP encapsulation; it can then determine whether the ingress PE is an OISM PE or a non-OISM PE by looking at the IMET route from that PE. </t> <t> Suppose an IPMG receives an IP multicast frame from another EVPN PE in the Tenant Domain and the IPMG is not the IPMG-DF for the frame's source BD. Then, the IPMG performs only the ordinary OISM functions; it does not perform the IPMG-specific functions for that frame. In the remainder of this section, when we discuss the procedures applied by an IPMG when it receives an IP multicast frame, we are presuming that the source BD of the frame is a BD for which the IPMG is the IPMG-DF. </t> <t> We have two basic cases to consider: (1) a frame's ingress PE is a non-OISM node and (2) a frame's ingress PE is an OISM node. </t> <section anchor="pe-s-non-oism" numbered="true" toc="default"> <name>Ingress PE is Non-OISM</name> <t> In this case, a non-OISM PE, say PE-S, has received an (S,G) multicast frame over an AC that is attached to a particular BD, say BD-S. By virtue of normal EVPN procedures, PE-S has sent a copy of the frame to every PE-R (both OISM and non-OISM) in the Tenant Domain that is attached to BD-S. If the non-OISM node supports <xref target="RFC9251" format="default"/>, only PEs that have expressed interest in (S,G) receive the frame. The IPMG will have expressed interest via a (C-*,C-*) SMET route and thus receives the frame. </t> <t> Any OISM PE (including an IPMG) receiving the frame will apply normal OISM procedures.
As a result, it will deliver the frame to any of its local ACs (in BD-S or in any other BD) that have interest in (S,G). </t> <t> An OISM PE that is also the IPMG-DF for a particular BD, say BD-S, has additional procedures that it applies to frames received on BD-S from non-OISM PEs: </t> <ol spacing="normal" type="1"><li anchor="non-oism-to-oism"> <t> When the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM node, it <bcp14>MUST</bcp14> forward a copy of the frame to every OISM PE that is NOT attached to BD-S but has interest in (S,G). The copy sent to a given OISM PE-R must carry the label that PE-R has assigned to the SBD in an S-PMSI A-D route. The IPMG <bcp14>MUST NOT</bcp14> do any IP processing of the frame's IP payload. TTL decrement and other IP processing will be done by PE-R, per the normal OISM procedures. There is no need for the IPMG to include an ESI label in the frame's tunnel encapsulation, because it is already known that the frame's source BD has no presence on PE-R. There is also no need for the IPMG to modify the frame's MAC SA. </t> </li> <li anchor="non-oism-to-non-oism"> <t> In addition, when the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM node, it may need to forward copies of the frame to other non-OISM nodes. Before it does so, it <bcp14>MUST</bcp14> decapsulate the (S,G) packet and do the IP processing (e.g., TTL decrement). Suppose PE-R is a non-OISM node that has an AC to BD-R, where BD-R is not the same as BD-S, and that AC has interest in (S,G). The IPMG must then encapsulate the (S,G) packet (after the IP processing has been done) in an Ethernet header.
The MAC SA field will have the MAC address of the IPMG's IRB interface for BD-R. The IPMG then sends the frame to PE-R. The tunnel encapsulation will carry the label that PE-R advertised in its IMET route for BD-R. There is no need to include an ESI label, as the source and destination BDs are known to be different. </t> <t> Note that if a non-OISM PE-R has several BDs (other than BD-S) with local ACs that have interest in (S,G), the IPMG will send it one copy for each such BD. This is necessary because the non-OISM PE cannot move packets from one BD to another. </t> </li> </ol> <t> There may be deployment scenarios in which every OISM PE is configured with every BD that is present on any non-OISM PE. In such scenarios, the procedures of item <xref target="non-oism-to-oism" format="counter"/> above will not actually result in the transmission of any packets. Hence, if it is known a priori that this deployment scenario exists for a given Tenant Domain, the procedures of item <xref target="non-oism-to-oism" format="counter"/> above can be disabled. </t> </section> <!-- pe-s-non-oism --> <section anchor="pe-s-oism" numbered="true" toc="default"> <name>Ingress PE is OISM</name> <t> In this case, an OISM PE, say PE-S, has received an (S,G) multicast frame over an AC that attaches to a particular BD, say BD-S. </t> <t> By virtue of receiving all the IMET routes for BD-S, PE-S will know all the PEs attached to BD-S. By virtue of normal OISM procedures: </t> <ul spacing="normal"> <li> <t> PE-S will send a copy of the frame to every OISM PE-R (including the IPMG) in the Tenant Domain that is attached to BD-S and has interest in (S,G).
The copy sent to a given PE-R carries the label that the PE-R has assigned to BD-S in its (C-*,C-*) S-PMSI A-D route. </t> </li> <li> <t> PE-S will also transmit a copy of the (S,G) frame to every OISM PE-R that has interest in (S,G) but is not attached to BD-S. The copy will contain the label that the PE-R has assigned to the SBD. (As specified in <xref target="pe-s-non-oism" format="default"/>, an IPMG is assumed to have indicated interest in all multicast flows.) </t> </li> <li> <t> PE-S will also transmit a copy of the (S,G) frame to every non-OISM PE-R that is attached to BD-S. It does this using the label advertised by that PE-R in its IMET route for BD-S. </t> </li> </ul> <t> The PE-Rs follow their normal procedures. An OISM PE that receives the (S,G) frame on BD-S applies the OISM procedures to deliver the frame to its local ACs as necessary. A non-OISM PE that receives the (S,G) frame on BD-S delivers the frame only to its local BD-S ACs as necessary. </t> <t> Suppose that a non-OISM PE-R has interest in (S,G) on a BD that is different than BD-S, say BD-R. If the non-OISM PE-R is attached to BD-S, the OISM PE-S will send it the original (S,G) multicast frame, but the non-OISM PE-R will not be able to send the frame to ACs that are not in BD-S. If PE-R is not even attached to BD-S, the OISM PE-S will not send it a copy of the frame at all, because PE-R is not attached to the SBD. In these cases, the IPMG needs to relay the (S,G) multicast traffic from OISM PE-S to non-OISM PE-R.
</t> <t> When the IPMG-DF for BD-S receives an (S,G) frame from an OISM PE-S, it has to forward it to every non-OISM PE-R that has interest in (S,G) on a BD-R that is different than BD-S. The IPMG <bcp14>MUST</bcp14> decapsulate the IP multicast packet, do the IP processing, re-encapsulate it for BD-R (changing the MAC SA to the IPMG's own MAC address for BD-R), and send a copy of the frame to PE-R. Note that a given non-OISM PE-R will receive multiple copies of the frame if it has multiple BDs on which there is interest in the frame. </t> </section> <!-- pe-s-oism --> </section> <!-- non-OISM-IR --> <section anchor="non-oism-p2mp" numbered="true" toc="default"> <name>P2MP Tunnels</name> <t> When IR is used to distribute the multicast traffic among the EVPN PEs, the procedures described in <xref target="non-OISM-IR" format="default"/> ensure that there will be no duplicate delivery of multicast traffic. That is, no egress PE will ever send a frame twice on any given AC. If P2MP tunnels are being used to distribute the multicast traffic, it is necessary to have additional procedures to prevent duplicate delivery. </t> <t> At the present time, it is not clear that there will be a use case in which OISM nodes need to interwork with non-OISM nodes that use P2MP tunnels. If it is determined that there is such a use case, procedures for P2MP may be specified in a separate document. </t> </section> <!-- non-oism-p2mp
--> <!-- </t> --> <!-- </section> --></section><!-- no-OISM --><section anchor="external" numbered="true" toc="default"> <name>Traffic to/from Outside the EVPN Tenant Domain</name> <t> In this section, we discuss scenarios where a multicast source outside a given EVPN Tenant Domain sends traffic to receivers inside the domain (as well as, possibly, to receivers outside the domain). This requires the OISM procedures to interwork with various Layer 3 multicast routing procedures. </t> <t> In this section, we assume that the Tenant Domain is not being used as an intermediate transit network for multicast traffic; that is, we do not consider the case where the Tenant Domain contains multicast routers that will receive traffic from sources outside the domain and forward the traffic to receivers outside the domain. The transit scenario is considered in <xref target="pim" format="default"/>. </t> <t> We can divide the non-transit scenarios into two classes: </t> <ol spacing="normal" type="1"><li> <t> One or more of the EVPN PE routers provide the functionality needed to interwork with Layer 3 multicast routing procedures. </t> </li> <li> <t> A single BD in the Tenant Domain contains external multicast routers (tenant multicast routers), and those tenant multicast routers are used to interwork, on behalf of the entire Tenant Domain, with Layer 3 multicast routing procedures.
</t> </li> </ol> <section anchor="evpn-pe-l3-iwork" numbered="true" toc="default"> <name>Layer 3 Interworking via EVPN OISM PEs</name> <section anchor="gen_prin" numbered="true" toc="default"> <name>General Principles</name> <t> Sometimes it is necessary to interwork an EVPN Tenant Domain with an external Layer 3 multicast domain (the external domain), e.g., a PIM or MVPN domain. This is needed to allow EVPN tenant systems to receive multicast traffic from sources (external sources) outside the EVPN Tenant Domain. It is also needed to allow receivers (external receivers) outside the EVPN Tenant Domain to receive traffic from sources inside the Tenant Domain. </t> <t> In order to allow interworking between an EVPN Tenant Domain and an external domain, one or more OISM PEs must be L3 Gateways. An L3 Gateway participates both in the OISM procedures and in the L3 multicast routing procedures of the external domain, as shown in the following figure. </t> <figure> <name>Interworking via OISM PEs</name> <artwork name="" type="" align="left" alt=""><![CDATA[
        src1                       rcvr1
          |                          |
         R1            RP            R2
                PIM/MVPN Domain
        +---+                      +---+
   -----|GW1|----------------------|GW2|----
        +---+                      +---+
        |  \  \                  /  /  |
        |   \  \                /  /   |
       BD1  BD2 SBD          SBD BD2  BD1
                  EVPN Domain
                SBD          SBD
                /              \
               /                \
            +---+              +---+
            |PE1|              |PE2|
            +---+              +---+
            |   \              /   |
           BD1  BD2          BD2  BD1
            |    |            |    |
          src2 rcvr2        src3 rcvr3
]]></artwork></figure> <t> An L3 Gateway that has interest in receiving (S,G) traffic must be able to determine the best route to S. If an L3 Gateway has interest in (*,G), it must be able to determine the best route to G's RP. In these interworking scenarios, the L3 Gateway must be running a Layer 3 unicast routing protocol.
Via this protocol, it imports unicast routes (either IP routes or VPN-IP routes) from routers other than EVPN PEs. And since there may be multicast sources inside the EVPN Tenant Domain, the EVPN PEs also need to export, either as IP routes or as VPN-IP routes (depending upon the external domain), unicast routes to those sources. </t> <t> When selecting the best route to a multicast source or RP, an L3 Gateway might have a choice between an EVPN route and an IP/VPN-IP route. When such a choice exists, the L3 Gateway <bcp14>SHOULD</bcp14> always prefer the EVPN route. This will ensure that when traffic originates in the Tenant Domain and has a receiver in the Tenant Domain, the path to that receiver will remain within the EVPN Tenant Domain, even if the source is also reachable via a routed path. This also provides protection against sub-optimal routing that might occur if two EVPN PEs export IP/VPN-IP routes and each imports the other's IP/VPN-IP routes. </t> <t> <xref target="rpf" format="default"/> discusses the way Layer 3 multicast states are constructed by OISM PEs. These Layer 3 multicast states have IRB interfaces as their IIF and OIF list entries and are the basis for interworking OISM with other Layer 3 multicast procedures such as MVPN or PIM. From the perspective of the Layer 3 multicast procedures running in a given L3 Gateway, an EVPN Tenant Domain is a set of IRB interfaces. </t> <t> When interworking an EVPN Tenant Domain with an external domain, the L3 Gateway's Layer 3 multicast states will not only have IRB interfaces as IIF and OIF list entries but also other interfaces that lead outside the Tenant Domain. For example, when interworking with MVPN, the multicast states may have MVPN tunnels as well as IRB interfaces as IIF or OIF list members.
When interworking with PIM, the multicast states may have PIM-enabled non-IRB interfaces as IIF or OIF list members. </t> <t> As long as a Tenant Domain is not being used as an intermediate transit network for IP multicast traffic, it is not necessary to enable PIM on its IRB interfaces. </t> <t> In general, an L3 Gateway has the following responsibilities: </t> <ul spacing="normal"> <li> <t> It exports, to the external domain, unicast routes to those multicast sources in the EVPN Tenant Domain that are locally attached to the L3 Gateway. </t> </li> <li> <t> It imports, from the external domain, unicast routes to multicast sources that are in the external domain. </t> </li> <li> <t> It executes the procedures necessary to draw externally sourced multicast traffic that is of interest to locally attached receivers in the EVPN Tenant Domain. When such traffic is received, the traffic is sent down the IRB interfaces of the BDs on which the locally attached receivers reside. </t> </li> </ul> <t> One of the L3 Gateways in a given Tenant Domain becomes the DR for the SBD (see <xref target="dr_selection" format="default"/>). This L3 Gateway has the following additional responsibilities: </t> <ul spacing="normal"> <li> <t> It exports, to the external domain, unicast routes to multicast sources in the EVPN Tenant Domain that are not locally attached to any L3 Gateway. </t> </li> <li> <t> It imports, from the external domain, unicast routes to multicast sources that are in the external domain. </t> </li> <li> <t> It executes the procedures necessary to draw externally sourced multicast traffic that is of interest to receivers in the EVPN Tenant Domain that are not locally attached to an L3 Gateway. When such traffic is received, the traffic is sent down the SBD IRB interface.
OISM procedures already described in this document will then ensure that the IP multicast traffic gets distributed throughout the Tenant Domain to any EVPN PEs that have interest in it. Thus, to an OISM PE that is not an L3 Gateway, the externally sourced traffic will appear to have been sourced on the SBD. </t> </li> </ul> <t> In order for this to work, some special care is needed when an L3 Gateway creates or modifies a Layer 3 (*,G) multicast state. Suppose group G has both external sources (sources outside the EVPN Tenant Domain) and internal sources (sources inside the EVPN Tenant Domain). <xref target="rpf" format="default"/> states that when there are internal sources, the SBD IRB interface must not be added to the OIF list of the (*,G) state. Traffic from internal sources will already have been delivered to all the EVPN PEs that have interest in it. However, if the OIF list of the (*,G) state does not contain its SBD IRB interface, then traffic from external sources will not get delivered to other EVPN PEs. </t> <t> One way of handling this is the following. When an L3 Gateway receives (S,G) traffic that is from an interface other than an IRB interface, and the traffic corresponds to a Layer 3 (*,G) state, the L3 Gateway can create (S,G) state. The IIF will be set to the external interface over which the traffic is expected. The OIF list will contain the SBD IRB interface, as well as the IRB interfaces of any other BDs attached to the PEG DR that have locally attached receivers with interest in the (S,G) traffic. The (S,G) state will ensure that the external traffic is sent down the SBD IRB interface. The following text will assume this procedure; however, other implementation techniques may also be possible. </t> <t> If a particular BD is attached to several L3 Gateways, one of the L3 Gateways becomes the DR for that BD (see <xref target="dr_selection" format="default"/>). If the interworking scenario requires FHR functionality, it is generally the DR for a particular BD that is responsible for performing that functionality on behalf of the source hosts on that BD (e.g., if the interworking scenario requires that PIM Register messages be sent by an FHR, the DR for a given BD would send the PIM Register messages for sources on that BD). Note, however, that the DR for the SBD does not perform FHR functionality on behalf of external sources. </t> <t> An optional alternative is to have each L3 Gateway perform FHR functionality for locally attached sources. Then, the DR would only have to perform FHR functionality on behalf of sources that are locally attached to itself and sources that are not attached to any L3 Gateway. </t> <t> Note that if it is possible that more than one BD contains a tenant multicast router, then a PE receiving a SMET route for that BD <bcp14>MUST NOT</bcp14> reconstruct IGMP/MLD Join Reports from the SMET route and <bcp14>MUST NOT</bcp14> transmit any such IGMP/MLD Join Reports on its local ACs attaching to that BD. Otherwise, multicast traffic may be duplicated. </t> </section><!-- gen_prin --><section anchor="mvpn" numbered="true" toc="default"> <name>Interworking with MVPN</name> <t> In this section, we specify the procedures necessary to allow EVPN PEs running OISM procedures to interwork with L3VPN PEs that run BGP-based MVPN <xref target="RFC6514" format="default"/> procedures.
More specifically, the procedures herein allow a given EVPN Tenant Domain to become part of an L3VPN/MVPN and support multicast flows where either of the following occurs: </t> <ul spacing="normal"> <li> <t> The source of a given multicast flow is attached to an Ethernet segment whose BD is part of an EVPN Tenant Domain, and one or more receivers of the flow are attached to the network via L3VPN/MVPN. (Other receivers may be attached to the network via EVPN.) </t> </li> <li> <t> The source of a given multicast flow is attached to the network via L3VPN/MVPN, and one or more receivers of the flow are attached to an Ethernet segment that is part of an EVPN Tenant Domain. (Other receivers may be attached via L3VPN/MVPN.) </t> </li> </ul> <t> In this interworking model, existing L3VPN/MVPN PEs are unaware that certain sources or receivers are part of an EVPN Tenant Domain. The existing L3VPN/MVPN nodes run only their standard procedures and are entirely unaware of EVPN. Interworking is achieved by having some or all of the EVPN PEs function as L3 Gateways running L3VPN/MVPN procedures, as detailed in the following subsections. </t> <t> In this section, we assume that there are no tenant multicast routers on any of the EVPN-attached Ethernet segments. (Of course, there may be multicast routers in the L3VPN.) The case where there are tenant multicast routers is addressed in <xref target="pim" format="default"/>. </t><!-- <section title="Model of Operation" anchor="iwork_model"> --><t> To support MVPN/EVPN interworking, we introduce the notion of an MVPN/EVPN Gateway (MEG). </t> <t> A MEG is an L3 Gateway (see <xref target="gen_prin" format="default"/>); hence, it is both an OISM PE and an L3VPN/MVPN PE. For a given EVPN Tenant Domain, it will have an IP-VRF.
If the Tenant Domain is part of an L3VPN/MVPN, the IP-VRF also serves as an L3VPN VRF <xref target="RFC4364" format="default"/>. The IRB interfaces of the IP-VRF are considered to be VRF interfaces of the L3VPN VRF. The L3VPN VRF may also have other local VRF interfaces that are not EVPN IRB interfaces. </t> <t> The VRF on the MEG will import VPN-IP routes <xref target="RFC4364" format="default"/> from other L3VPN PE routers. It will also export VPN-IP routes to other L3VPN PE routers. In order to do so, it must be appropriately configured with the RTs used in the L3VPN to control the distribution of the VPN-IP routes. In general, these RTs will be different from the RTs used for controlling the distribution of EVPN routes, as there is no need to distribute EVPN routes to L3VPN-only PEs and no reason to distribute L3VPN/MVPN routes to EVPN-only PEs. </t> <t> Note that the RDs in the imported VPN-IP routes will not necessarily conform to the EVPN rules (as specified in <xref target="RFC7432" format="default"/>) for creating RDs. Therefore, a MEG <bcp14>MUST NOT</bcp14> expect the RDs of the VPN-IP routes to be of any particular format other than what is required by the L3VPN/MVPN specifications. </t> <t> The VPN-IP routes that a MEG exports to L3VPN are subnet routes and/or host routes for the multicast sources that are part of the EVPN Tenant Domain. The exact set of routes that need to be exported is discussed in <xref target="e2m" format="default"/>. </t> <t> Each IMET route originated by a MEG <bcp14>SHOULD</bcp14> carry a Multicast Flags Extended Community with the MEG flag set, indicating that the originator of the IMET route is a MEG.
However, PE1 will consider PE2 to be a MEG if PE1 imports at least one IMET route from PE2 that carries the Multicast Flags EC with the MEG flag set. </t> <t> All the MEGs of a given Tenant Domain attach to the SBD of that domain, and one of them is selected to be the SBD's Designated Router (the MEG SBD-DR) for the domain. The selection procedure is discussed in <xref target="dr_selection" format="default"/>. </t> <t> In this model of operation, MVPN procedures and EVPN procedures are largely independent. In particular, there is no assumption that MVPN and EVPN use the same kind of tunnels. Thus, no special procedures are needed to handle the common scenarios where, e.g., EVPN uses VXLAN tunnels but MVPN uses MPLS P2MP tunnels, or where EVPN uses IR but MVPN uses MPLS P2MP tunnels. </t> <t> Similarly, no special procedures are needed to prevent duplicate data delivery on Ethernet segments that are multihomed. </t> <t> The MEG does have some special procedures (described below) for interworking between EVPN and MVPN; these have to do with selection of the Upstream PE for a given multicast source, with the exporting of VPN-IP routes, and with the generation of MVPN C-multicast routes triggered by the installation of SMET routes. </t> <section anchor="m2e" numbered="true" toc="default"> <name>MVPN Sources with EVPN Receivers</name> <section anchor="mvpn_source" numbered="true" toc="default"> <name>Identifying MVPN Sources</name> <t> Consider a multicast source S. It is possible that a MEG will import both an EVPN unicast route to S and a VPN-IP route (or an ordinary IP route), where the prefix length of each route is the same.
In order to draw (S,G) multicast traffic for any group G, the MEG <bcp14>SHOULD</bcp14> use the EVPN route rather than the VPN-IP or IP route to determine the Upstream PE (see <xref target="RFC6513" format="default" sectionFormat="of" section="5"/>). </t> <t> Doing so ensures that when an EVPN tenant system desires to receive a multicast flow from another EVPN tenant system, the traffic from the source to that receiver stays within the EVPN domain. This prevents problems that might arise if there is a unicast route via L3VPN to S but no multicast routers along the routed path. This also prevents problems that might arise because the MEGs will import each other's VPN-IP routes. </t> <t> In <xref target="mvpn_join" format="default"/>, we describe the procedures to be used when the selected route to S is a VPN-IP route. </t> </section><!-- mvpn_source --><section anchor="mvpn_join" numbered="true" toc="default"> <name>Joining a Flow from an MVPN Source</name> <t> Consider a tenant system, say R, on a particular BD, say BD-R. Suppose R wants to receive (S,G) multicast traffic, where source S is not attached to any PE in the EVPN Tenant Domain but is attached to an MVPN PE. </t> <ul spacing="normal"> <li> <t> Suppose R is on a singly homed Ethernet segment of BD-R and that segment is attached to PE1, where PE1 is a MEG. PE1 learns via IGMP/MLD listening that R is interested in (S,G). PE1 determines from its VRF that there is no route to S within the Tenant Domain (i.e., no EVPN RT-2 route matching on S's IP address) but that there is a route to S via L3VPN (i.e., the VRF contains a subnet or host route to S that was received as a VPN-IP route).
Thus, PE1 originates (if it hasn't already) an MVPN C-multicast Source Tree Join (S,G) route. The route is constructed according to normal MVPN procedures. </t> <t> The Layer 2 multicast state is constructed as specified in <xref target="l2_state" format="default"/>. </t> <t> In the Layer 3 multicast state, the IIF is the appropriate MVPN tunnel, and the IRB interface to BD-R is added to the OIF list. </t> <t> When PE1 receives (S,G) traffic from the appropriate MVPN tunnel, it performs IP processing of the traffic and then sends the traffic down its IRB interface to BD-R. Following normal OISM procedures, the (S,G) traffic will be encapsulated for Ethernet and sent on the AC to which R is attached. </t> </li> <li> <t> Suppose R is on a singly homed Ethernet segment of BD-R and that segment is attached to PE1, where PE1 is an OISM PE but is NOT a MEG. PE1 learns via IGMP/MLD listening that R is interested in (S,G). PE1 follows normal OISM procedures, originating an SBD-SMET route for (S,G); this route will be received by all the MEGs of the Tenant Domain, including the MEG SBD-DR. From PE1's IMET routes, the MEG SBD-DR can determine whether or not PE1 is itself a MEG. If PE1 is not a MEG, the MEG SBD-DR will originate (if it hasn't already) an MVPN C-multicast Source Tree Join (S,G) route. This will cause the MEG SBD-DR to receive (S,G) traffic on an MVPN tunnel. </t> <t> The Layer 2 multicast state is constructed as specified in <xref target="l2_state" format="default"/>.
</t> <t> In the Layer 3 multicast state, the IIF is the appropriate MVPN tunnel, and the IRB interface to the SBD is added to the OIF list. </t> <t> When the MEG SBD-DR receives (S,G) traffic on an MVPN tunnel, it performs IP processing of the traffic and then sends the traffic down its IRB interface to the SBD. Following normal OISM procedures, the traffic will be encapsulated for Ethernet and delivered to all PEs in the Tenant Domain that have interest in (S,G), including PE1. </t> </li> <li> <t> If R is on a multihomed Ethernet segment of BD-R, one of the PEs attached to the segment will be its DF (following normal EVPN procedures), and the DF will know (via IGMP/MLD listening or the procedures of <xref target="RFC9251" format="default"/>) that a tenant system reachable via one of its local ACs to BD-R is interested in (S,G) traffic. The DF is responsible for originating an SBD-SMET route for (S,G), following normal OISM procedures. If the DF is a MEG, it <bcp14>MUST</bcp14> originate the corresponding MVPN C-multicast Source Tree Join (S,G) route; if the DF is not a MEG, the MEG SBD-DR <bcp14>MUST</bcp14> originate the C-multicast route when it receives the SMET route. </t> <t> Optionally, if the non-DF is a MEG, it <bcp14>MAY</bcp14> originate the corresponding MVPN C-multicast Source Tree Join (S,G) route. This will cause the traffic to flow to both the DF and the non-DF, but only the DF will forward the traffic out an AC. This allows for quicker recovery if the DF's local AC to R fails. </t> </li> <li> <t> If R is attached to a non-OISM PE, it will receive the traffic via an IPMG, as specified in <xref target="no-OISM" format="default"/>.
</t> </li> </ul> <t> If an EVPN-attached receiver is interested in (*,G) traffic, and if it is possible for there to be sources of (*,G) traffic that are attached only to L3VPN nodes, the MEGs will have to know the group-to-RP mappings. That will enable them to originate MVPN C-multicast Shared Tree Join (*,G) routes and to send them toward the RP. (Since we are assuming in this section that there are no tenant multicast routers attached to the EVPN Tenant Domain, the RP must be attached via L3VPN. Alternatively, the MEG itself could be configured to function as an RP for group G.) </t> <t> The Layer 2 multicast states are constructed as specified in <xref target="l2_state" format="default"/>. </t> <t> In the Layer 3 (*,G) multicast state, the IIF is the appropriate MVPN tunnel. A MEG will add its IRB interfaces to the (*,G) OIF list for any BDs containing locally attached receivers. If there are receivers attached to other EVPN PEs, then whenever (S,G) traffic from an external source matches a (*,G) state, the MEG will create (S,G) state, with the MVPN tunnel as the IIF, the OIF list copied from the (*,G) state, and the SBD IRB interface added to the OIF list. (Please see the discussion in <xref target="gen_prin" format="default"/> regarding the inclusion of the SBD IRB interface in a (*,G) state; the SBD IRB interface is used in the OIF list only for traffic from external sources.) </t><!-- <t> --> <!-- Some special care is needed in the creation of the layer 3 (*,G) --> <!-- multicast state. Suppose group G has both external sources --> <!-- (sources outside the EVPN Tenant Domain) and internal sources --> <!-- (sources inside the EVPN Tenant Domain). <xref target="rpf"/> --> <!-- states that when there are internal sources, the SBD IRB interface --> <!-- must not be added to the oiflist of the (*,G) state.
Traffic from --> <!-- internal sources will have been sent down the SBD IRB interface of --> <!-- its ingress PE, and thus will already have been delivered to all --> <!-- the EVPN PEs that have interest in it. However, if the oiflist of --> <!-- the PEG DR's (*,G) state does not contain its SBD IRB interface, --> <!-- then traffic from external sources will not get delivered to other --> <!-- EVPN PEs. Therefore, when a MEG receives (S,G) traffic from an --> <!-- MVPN tunnel, and the traffic corresponds to a layer 3 (*,G) state, --> <!-- the MEG MUST create (S,G) state. The iif will be set to the MVPN --> <!-- tunnel the (S,G) traffic is expected to arrive on, and the oiflist --> <!-- will contain the SBD IRB interface, as well as the IRB interfaces --> <!-- of any other BDs attached to the PEG DR that have locally attached --> <!-- receivers with interest in the (S,G) traffic. --> <!-- </t> --><t> Normal MVPN procedures will then result in the MEG getting the (*,G) traffic from all the multicast sources for G that are attached via L3VPN. This traffic arrives on MVPN tunnels. When the MEG removes the traffic from these tunnels, it does the IP processing. If there are any receivers on a given BD, say BD-R, that are attached via local EVPN ACs, the MEG sends the traffic down its BD-R IRB interface. If there are any other EVPN PEs that are interested in the (*,G) traffic, the MEG sends the traffic down the SBD IRB interface. Normal OISM procedures then distribute the traffic as needed to other EVPN PEs.
</t> </section><!-- mvpn_join --></section><!-- m2e --><section anchor="e2m" numbered="true" toc="default"> <name>EVPN Sources with MVPN Receivers</name> <section anchor="e2m_general" numbered="true" toc="default"> <name>General Procedures</name> <t> Consider the case where an EVPN tenant system S is sending IP multicast traffic to group G and there is a receiver R for the (S,G) traffic that is attached to the L3VPN but not attached to the EVPN Tenant Domain. (In this document, we assume that the L3VPN-/MVPN-only nodes will not have any special procedures to deal with the case where a source is inside an EVPN domain.) </t> <t> In this case, an L3VPN PE through which R can be reached has to send an MVPN C-multicast Join (S,G) route to one of the MEGs that is attached to the EVPN Tenant Domain. For this to happen, the L3VPN PE must have imported a VPN-IP route for S (either a host route or a subnet route) from a MEG. </t> <t> If a MEG determines that there is a multicast source transmitting on one of its ACs, the MEG <bcp14>SHOULD</bcp14> originate a VPN-IP host route for that source. This determination <bcp14>SHOULD</bcp14> be made by examining the IP multicast traffic that arrives on the ACs. (It <bcp14>MAY</bcp14> be made by provisioning.) A MEG <bcp14>SHOULD NOT</bcp14> export a VPN-IP host route for any IP address that is not known to be a multicast source (unless it has some other reason for exporting such a route). The VPN-IP host route for a given multicast source <bcp14>MUST</bcp14> be withdrawn if the source goes silent for a configurable period of time or if it can be determined that the source is no longer reachable via a local AC.
</t> <t> A MEG <bcp14>SHOULD</bcp14> also originate a VPN-IP subnet route for each of the BDs in the Tenant Domain. </t> <t> VPN-IP routes exported by a MEG must carry any attributes or Extended Communities that are required by L3VPN and MVPN. In particular, a VPN-IP route exported by a MEG must carry a VRF Route Import Extended Community corresponding to the IP-VRF from which it is imported and a Source AS Extended Community. </t> <t> As a result, if S is attached to a MEG, the L3VPN nodes will direct their MVPN C-multicast Join routes to that MEG. Normal MVPN procedures will cause the traffic to be delivered to the L3VPN nodes. The Layer 3 multicast state for (S,G) will have the MVPN tunnel on its OIF list. The IIF will be the IRB interface leading to the BD containing S. </t> <t> If S is not attached to a MEG, the L3VPN nodes will direct their C-multicast Join routes to whichever MEG appears to be on the best route to S's subnet. Upon receiving the C-multicast Join, that MEG will originate an EVPN SMET route for (S,G). As a result, the MEG will receive the (S,G) traffic at Layer 2 via the OISM procedures. The (S,G) traffic will be sent up the appropriate IRB interface, and the Layer 3 MVPN procedures will ensure that the traffic is delivered to the L3VPN nodes that have requested it. The Layer 3 multicast state for (S,G) will have the MVPN tunnel in the OIF list, and the IIF will be one of the following: </t> <ul spacing="normal"> <li> <t> If S belongs to a BD that is attached to the MEG, the IIF will be the IRB interface to that BD. </t> </li> <li> <t> Otherwise, the IIF will be the SBD IRB interface. </t> </li> </ul> <t> Note that this works even if S is attached to a non-OISM PE, per the procedures of <xref target="no-OISM" format="default"/>.
</t> </section><!-- e2m_general --><section anchor="e2m_asm" numbered="true" toc="default"> <name>Any-Source Multicast (ASM) Groups</name> <t> Suppose the MEG SBD-DR learns that one of the PEs in its Tenant Domain is interested in (*,G) traffic, where G is an ASM group. If there are no tenant multicast routers, the MEG SBD-DR <bcp14>SHOULD</bcp14> perform the First Hop Router (FHR) functionality for group G on behalf of the Tenant Domain, as described in <xref target="RFC7761" format="default"/>. This means that the MEG SBD-DR must know the identity of the RP for each group, must send Register messages to the RP, etc. </t> <t> If the MEG SBD-DR is to be the FHR for the Tenant Domain, it must see all the multicast traffic that is sourced from within the domain and destined to an ASM group address. The MEG can ensure this by originating an SBD-SMET route for (*,*). </t> <t> (As a possible optimization, an SBD-SMET route for (*,any ASM group) may be defined in a separate document.) </t> <t> In some deployment scenarios, it may be preferred that the MEG that receives the (S,G) traffic over an AC be the one providing the FHR functionality. This behavior is <bcp14>OPTIONAL</bcp14>. If this option is used, it <bcp14>MUST</bcp14> be ensured that the MEG DR does not provide the FHR functionality for (S,G) traffic whose source is attached to another MEG; FHR functionality for (S,G) traffic from a particular source S <bcp14>MUST</bcp14> be provided by only a single router. </t> <t> Other deployment scenarios are also possible. For example, one might want to configure the MEGs themselves to be RPs. In this case, the RPs would have to exchange information with each other about which sources are active.
The method for exchanging such information is outside the scope of this document. </t> </section> <section anchor="e2m_multi" numbered="true" toc="default"> <name>Source on Multihomed Segment</name> <t> Suppose S is attached to a segment that is all-active multihomed to PE1 and PE2. If S is transmitting to two groups, say G1 and G2, it is possible that PE1 will receive the (S,G1) traffic from S, whereas PE2 will receive the (S,G2) traffic from S. </t> <t> This creates an issue for MVPN/EVPN interworking, because there is no way to cause L3VPN/MVPN nodes to select PE1 as the ingress PE for (S,G1) traffic while selecting PE2 as the ingress PE for (S,G2) traffic. </t> <t> However, the following procedure ensures that the IP multicast traffic will still flow, even if the L3VPN/MVPN nodes pick the wrong EVPN PE as the Upstream PE for, e.g., the (S,G1) traffic. </t> <t> Suppose S is on an Ethernet segment, belonging to BD1, that is multihomed to both PE1 and PE2, where PE1 is a MEG. And suppose that IP multicast traffic from S to G travels over the AC that attaches the segment to PE2. If PE1 receives a C-multicast Source Tree Join (S,G) route, it <bcp14>MUST</bcp14> originate a SMET route for (S,G). Normal OISM procedures will then cause PE2 to send the (S,G) traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM procedures will also cause PE1 to send the (S,G) traffic up its BD1 IRB interface. Normal MVPN procedures will then cause PE1 to forward the traffic on an MVPN tunnel. In this case, the routing is not optimal, but the traffic does flow correctly. 
</t> </section> </section> <section anchor="optimal" numbered="true" toc="default"> <name>Obtaining Optimal Routing of Traffic between MVPN and EVPN</name> <t> The routing of IP multicast traffic between MVPN nodes and EVPN nodes will be optimal as long as there is a MEG along the optimal route. There are various deployment strategies that can be used to obtain optimal routing between MVPN and EVPN. </t> <t> In one such scenario, a Tenant Domain will have a small number of strategically placed MEGs. For example, a data center may have a small number of MEGs that connect it to a wide-area network. Then, the optimal route into or out of the data center would be through the MEGs. </t> <t> In this scenario, the MEGs do not need to originate VPN-IP host routes for the multicast sources; they only need to originate VPN-IP subnet routes. The internal structure of the EVPN is completely hidden from the MVPN node. EVPN actions, such as MAC Mobility and Mass Withdrawal <xref target="RFC7432" format="default"/>, have zero impact on the MVPN control plane. </t> <t> While this deployment scenario provides optimal routing and has the least impact on the installed base of MVPN nodes, it does complicate network planning considerations. </t> <t> Another way of providing routing that is close to optimal is to turn each EVPN PE into a MEG. Then, routing of MVPN-to-EVPN traffic is optimal. However, routing of EVPN-to-MVPN traffic is not guaranteed to be optimal when a source host is on a multihomed Ethernet segment (as discussed in <xref target="e2m" format="default"/>). </t> <t> The obvious disadvantage of this method is that it requires every EVPN PE to be a MEG. 
</t> <t> The procedures specified in this document allow an operator to add MEG functionality to any subset of its EVPN OISM PEs. This allows an operator to make whatever trade-offs are deemed appropriate between optimal routing and MEG deployment. </t> </section> <section anchor="dr_selection" numbered="true" toc="default"> <name>Selecting the MEG SBD-DR</name> <t> Every PE that is eligible for selection as the MEG SBD-DR originates an SBD-IMET route. As stated in <xref target="no-OISM" format="default"/>, these SBD-IMET routes carry a Multicast Flags EC with the MEG flag set. </t> <t> These SBD-IMET routes <bcp14>SHOULD</bcp14> also carry a DF Election EC. The DF Election EC and its use are specified in <xref target="RFC8584" format="default"/>. When the route is originated, the AC-DF bit in the DF Election EC <bcp14>SHOULD</bcp14> be set to zero. This bit is not used when selecting a MEG SBD-DR, i.e., it <bcp14>MUST</bcp14> be ignored by the receiver of an SBD-IMET route. </t> <t> In the context of a given Tenant Domain, to select the MEG SBD-DR, the MEGs of the Tenant Domain perform the following procedure: </t> <ul spacing="normal"> <li> <t> From the set of received SBD-IMET routes for the given Tenant Domain, determine the candidate set of PEs that support MEG functionality for that domain. </t> </li> <li> <t> Select a DF election algorithm as specified in <xref target="RFC8584" format="default"/>. 
Some of the possible algorithms can be found, e.g., in <xref target="RFC7432" format="default"/>, <xref target="RFC8584" format="default"/>, and <xref target="I-D.ietf-bess-evpn-pref-df" format="default"/>. </t> </li> <li> <t> Apply the DF election algorithm (see <xref target="RFC8584" format="default"/>) to the candidate set of PEs. The winner becomes the MEG SBD-DR. </t> </li> </ul> <t> Note that if a given PE supports IPMG (<xref target="mvpn" format="default"/>) or PEG (<xref target="pim_iwork" format="default"/>) functionality as well as MEG functionality, its SBD-IMET routes carry only one DF Election EC. </t> </section> </section> <section anchor="gtm" numbered="true" toc="default"> <name>Interworking with Global Table Multicast</name> <t> If multicast service to the outside sources and/or receivers is provided via the BGP-based Global Table Multicast (GTM) procedures of <xref target="RFC7716" format="default"/>, the procedures of <xref target="mvpn" format="default"/> can easily be adapted for EVPN/GTM interworking. The way to adapt the MVPN procedures to GTM is explained in <xref target="RFC7716" format="default"/>. </t> </section> <section anchor="pim_iwork" numbered="true" toc="default"> <name>Interworking with PIM</name> <t> As discussed, there may be receivers in an EVPN Tenant Domain that are interested in multicast flows whose sources are outside the EVPN Tenant Domain. Or there may be receivers outside an EVPN Tenant Domain that are interested in multicast flows whose sources are inside the Tenant Domain. 
</t> <t> If the outside sources and/or receivers are part of an MVPN, see the procedures for interworking that are covered in <xref target="mvpn" format="default"/>. </t> <t> There are also cases where an external source or receiver is attached via IP and the Layer 3 multicast routing is done via PIM. In this case, the interworking between the PIM domain and the EVPN Tenant Domain is done at L3 Gateways that perform PIM/EVPN Gateway (PEG) functionality. A PEG is very similar to a MEG, except that its Layer 3 multicast routing is done via PIM rather than via BGP. </t> <t> If external sources or receivers for a given group are attached to a PEG via a Layer 3 interface, that interface should be treated as a VRF interface attached to the Tenant Domain's L3VPN VRF. The Layer 3 multicast routing instance for that Tenant Domain will either run PIM on the VRF interface or listen for IGMP/MLD messages on that interface. If the external receiver is attached elsewhere on an IP network, the PE has to enable PIM on its interfaces to the backbone network. In both cases, the PE needs to perform PEG functionality, and its IMET routes must carry the Multicast Flags EC with the PEG flag set. </t> <t> For each BD on which there is a multicast source or receiver, one of the PEGs will become the PEG DR. DR selection can be done using the same procedures specified in <xref target="dr_selection" format="default"/>, except with PEG substituted for MEG. </t> <t> As long as there are no tenant multicast routers within the EVPN Tenant Domain, the PEGs do not need to run PIM on their IRB interfaces. 
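</t> <t> The DR selection steps referenced above (build the candidate set from the received routes, then apply a DF election algorithm) might be sketched as follows. This is a hedged illustration only. The names SbdImetRoute, select_gateway_dr, and lowest_address are assumptions made for this example, and the lowest-address election is merely a deterministic placeholder; it is not one of the DF election algorithms of RFC 8584. </t>
<sourcecode type="python"><![CDATA[
```python
import ipaddress
from typing import Callable, FrozenSet, Iterable, List, NamedTuple, Optional


class SbdImetRoute(NamedTuple):
    """Illustrative view of a received SBD-IMET route (hypothetical type)."""
    originator: str          # IP address of the originating PE
    flags: FrozenSet[str]    # names of the flags set in the Multicast Flags EC


def lowest_address(candidates: List[str]) -> str:
    """Placeholder election for this sketch; NOT an RFC 8584 algorithm."""
    return min(candidates, key=ipaddress.ip_address)


def select_gateway_dr(routes: Iterable[SbdImetRoute],
                      gateway_flag: str = "PEG",
                      elect: Callable[[List[str]], str] = lowest_address
                      ) -> Optional[str]:
    """Return the PE selected as DR, or None if there is no candidate."""
    # Step 1: the candidate set is every PE whose route carries the gateway flag.
    candidates = sorted(
        {r.originator for r in routes if gateway_flag in r.flags},
        key=ipaddress.ip_address,
    )
    if not candidates:
        return None
    # Steps 2 and 3: apply the chosen election algorithm; the winner is the DR.
    return elect(candidates)
```
]]></sourcecode>
<t> In this sketch, passing "MEG" instead of "PEG" as the flag name gives the corresponding MEG SBD-DR selection. Every gateway is assumed to run the same election over the same set of received routes, so that all of them arrive at the same winner. 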
</t> <section anchor="evpn_source" numbered="true" toc="default"> <name>Source Inside EVPN Domain</name> <t> If a PEG receives a PIM Join (S,G) from outside the EVPN Tenant Domain, it may find it necessary to create (S,G) state. The PE needs to determine whether S is within the Tenant Domain. If S is not within the EVPN Tenant Domain, the PE carries out normal Layer 3 multicast routing procedures. If S is within the EVPN Tenant Domain, the IIF of the (S,G) state is set as follows: </t> <ul spacing="normal"> <li> <t> If S is on a BD that is attached to the PE, the IIF is the PE's IRB interface to that BD. </t> </li> <li> <t> If S is not on a BD that is attached to the PE, the IIF is the PE's IRB interface to the SBD. </t> </li> </ul> <t> When the PE creates such an (S,G) state, it <bcp14>MUST</bcp14> originate (if it hasn't already) an SBD-SMET route for (S,G). This will cause it to pull the (S,G) traffic via Layer 2. When the traffic arrives over an EVPN tunnel, it gets sent up an IRB interface where the Layer 3 multicast routing determines the packet's disposition. The SBD-SMET route is withdrawn when the (S,G) state no longer exists (unless there is some other reason for not withdrawing it). </t> <t> If there are no tenant multicast routers within the EVPN Tenant Domain, there cannot be an RP in the Tenant Domain, so a PEG does not have to handle externally arriving PIM Join (*,G) messages. </t> <t> The PEG DR for a particular BD <bcp14>MUST</bcp14> act as a First Hop Router for that BD. It will examine all (S,G) traffic on the BD, and whenever G is an ASM group, the PEG DR will send Register messages to the RP for G. This means that the PEG DR will need to pull all the (S,G) traffic originating on a given BD by originating a SMET (*,*) route for that BD. If a PEG DR is the DR for all the BDs, it <bcp14>SHOULD</bcp14> originate just an SBD-SMET (*,*) route rather than a SMET (*,*) route for each BD. </t> <t> The rules for exporting IP routes to multicast sources are the same as those specified for MEGs in <xref target="e2m" format="default"/>, except that the exported routes will be IP routes rather than VPN-IP routes, and it is not necessary to attach the VRF Route Import EC or the Source AS EC. </t> <t> When a source is on a multihomed segment, the same issue discussed in <xref target="e2m_multi" format="default"/> exists. Suppose S is on an Ethernet segment, belonging to BD1, that is multihomed to both PE1 and PE2, where PE1 is a PEG. And suppose that IP multicast traffic from S to G travels over the AC that attaches the segment to PE2. If PE1 receives an external PIM Join (S,G) message, it <bcp14>MUST</bcp14> originate a SMET route for (S,G). 
Normal OISM procedures will cause PE2 to send the (S,G) traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM procedures will also cause PE1 to send the (S,G) traffic up its BD1 IRB interface. Normal PIM procedures will then cause PE1 to forward the traffic along a PIM tree. In this case, the routing is not optimal, but the traffic does flow correctly. </t> </section> <section anchor="external_source" numbered="true" toc="default"> <name>Source Outside EVPN Domain</name> <t> By means of normal OISM procedures, a PEG learns whether there are receivers in the Tenant Domain that are interested in receiving (*,G) or (S,G) traffic. The PEG must determine whether or not S (or the RP for G) is outside the EVPN Tenant Domain. If so, and if there is a receiver on BD1 interested in receiving such traffic, the PEG DR for BD1 is responsible for originating a PIM Join (S,G) or Join (*,G) control message. </t> <t> An alternative would be to allow any PEG that is directly attached to a receiver to originate the PIM Joins. Then, the PEG DR would only have to originate PIM Joins on behalf of receivers that are not attached to a PEG. However, if this is done, it is necessary for the PEGs to run PIM on all their IRB interfaces so that the PIM Assert procedures can be used to prevent duplicate delivery to a given BD. </t> <t> The IIF for the Layer 3 (S,G) or (*,G) state is determined by normal PIM procedures. If a receiver is on BD1, and the PEG DR is attached to BD1, its IRB interface to BD1 is added to the OIF list. This ensures that any receivers locally attached to the PEG DR will receive the traffic. If there are receivers attached to other EVPN PEs, then whenever (S,G) traffic from an external source matches a (*,G) state, the PEG will create (S,G) state. 
The IIF will be set to whatever external interface the traffic is expected to arrive on (copied from the (*,G) state), the OIF list is copied from the (*,G) state, and the SBD IRB interface is added to the OIF list. </t> </section> </section> </section> <section anchor="external_pim_router" numbered="true" toc="default"> <name>Interworking with PIM via an External PIM Router</name> <t> <xref target="evpn-pe-l3-iwork" format="default"/> describes how to use an OISM PE router as the gateway to a non-EVPN multicast domain when the EVPN Tenant Domain is not being used as an intermediate transit network for multicast. An alternative approach is to have one or more external PIM routers (perhaps operated by a tenant) on one of the BDs of the Tenant Domain. We will refer to this BD as the "gateway BD". </t> <t> In this model: </t> <ul spacing="normal"> <li> <t> The EVPN Tenant Domain is treated as a stub network attached to the external PIM routers. </t> </li> <li> <t> The external PIM routers follow normal PIM procedures and provide the FHR and LHR functionality for the entire Tenant Domain. </t> </li> <li> <t> The OISM PEs do not run PIM. </t> </li> <li> <t> There <bcp14>MUST NOT</bcp14> be more than one gateway BD. </t> </li> <li> <t> If an OISM PE not attached to the gateway BD has interest in a given multicast flow, it conveys that interest, following normal OISM procedures, by originating an SBD-SMET route for that flow. </t> </li> <li> <t> If a PE attached to the gateway BD receives an SBD-SMET route, it may need to generate and transmit a corresponding IGMP/MLD Join on one or more of its ACs. (Procedures for generating an IGMP/MLD Join as a result of receiving a SMET route are given in <xref target="RFC9251" format="default"/>.) 
The PE <bcp14>MUST</bcp14> know which BD is the gateway BD and <bcp14>MUST NOT</bcp14> transmit an IGMP/MLD Join to any other BDs. Furthermore, even if a particular AC is part of that BD, the PE <bcp14>SHOULD NOT</bcp14> transmit an IGMP/MLD Join on that AC unless there is an external PIM router attached via that AC. </t> <t> As a result, IGMP/MLD messages will be received by the external PIM routers on the gateway BD, and those external PIM routers will send PIM Join messages externally as required. Traffic for the given multicast flow will then be received by one of the external PIM routers, and that traffic will be forwarded by that router to the gateway BD. </t> <t> The normal OISM procedures will then cause the given multicast flow to be tunneled to any PEs of the EVPN Tenant Domain that have interest in the flow. PEs attached to the gateway BD will see the flow as originating from the gateway BD, and other PEs will see the flow as originating from the SBD. </t> </li> <li> <t> An OISM PE attached to the gateway BD <bcp14>MUST</bcp14> set its Layer 2 multicast state to indicate that each AC to the gateway BD has interest in all multicast flows. It <bcp14>MUST</bcp14> also originate a SMET route for (*,*). The procedures for originating SMET routes are discussed in <xref target="interest" format="default"/>. </t> <t> This will cause the OISM PEs attached to the gateway BD to receive all the IP multicast traffic that is sourced within the EVPN Tenant Domain and to transmit that traffic to the gateway BD, where the external PIM routers will receive it. This enables the external PIM routers to perform FHR functions on behalf of the entire Tenant Domain. (Of course, if the gateway BD has a multihomed segment, only the PE that is the DF for that segment will transmit the multicast traffic to the segment.) 
</t> </li> </ul> </section> </section> <section anchor="pim" numbered="true" toc="default"> <name>Using an EVPN Tenant Domain as an Intermediate (Transit) Network for Multicast Traffic</name> <t> In this section, we consider the scenario where one or more BDs of an EVPN Tenant Domain are being used to carry IP multicast traffic for which the source and at least one receiver are not part of the Tenant Domain. That is, one or more BDs of the Tenant Domain are intermediate links of a larger multicast tree created by PIM. </t> <t> We define a "tenant multicast router" as a multicast router, running PIM, that: </t> <ol spacing="normal" type="1"><li> <t> is attached to one or more BDs of the Tenant Domain but </t> </li> <li> <t> is not an EVPN PE router. </t> </li> </ol> <t> In order for an EVPN Tenant Domain to be used as a transit network for IP multicast, one or more of its BDs must have tenant multicast routers, and an OISM PE attached to such a BD <bcp14>MUST</bcp14> be provisioned to enable PIM on its IRB interface to that BD. (This is true even if none of the tenant routers is on a segment attached to the PE.) Further, all the OISM PEs (even ones not attached to a BD with tenant multicast routers) <bcp14>MUST</bcp14> be provisioned to enable PIM on their SBD IRB interfaces. 
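</t> <t> The provisioning rule above can be illustrated with a small sketch that computes, for one OISM PE, the IRB interfaces on which PIM has to be enabled when the Tenant Domain is used as a transit network. The function name irbs_requiring_pim, its parameters, and the label "SBD" are hypothetical names chosen for this example. </t>
<sourcecode type="python"><![CDATA[
```python
from typing import Iterable, Set

# Label used in this sketch for the SBD IRB interface (an assumption).
SBD = "SBD"


def irbs_requiring_pim(attached_bds: Iterable[str],
                       bds_with_tenant_routers: Iterable[str]) -> Set[str]:
    """Return the BDs whose IRB interface must run PIM on one OISM PE.

    attached_bds: the BDs attached to this PE.
    bds_with_tenant_routers: the BDs of the Tenant Domain that have
    tenant multicast routers anywhere, not only on this PE's segments.
    """
    tenant_bds = set(bds_with_tenant_routers)
    # PIM is needed on the IRB of every attached BD that has tenant multicast
    # routers, even if none of those routers is on a segment attached to this PE.
    required = {bd for bd in attached_bds if bd in tenant_bds}
    # Every OISM PE enables PIM on its SBD IRB interface.
    required.add(SBD)
    return required
```
]]></sourcecode>
<t> Under these assumptions, a PE attached only to BDs without tenant multicast routers still ends up with PIM enabled on its SBD IRB interface. 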
</t> <t> If PIM is enabled on a particular BD, the DR selection procedure of <xref target="dr_selection" format="default"/> <bcp14>MUST</bcp14> be replaced by the normal PIM DR Election procedure of <xref target="RFC7761" format="default"/>. Note that this may result in one of the tenant routers being selected as the DR rather than one of the OISM PE routers. In this case, First Hop Router and Last Hop Router functionality will not be performed by any of the EVPN PEs. </t> <t> A PIM control message on a particular BD is considered to be a link-local multicast message and, as such, is sent transparently from PE to PE via the BUM tunnel for that BD. This is true whether the control message was received from an AC or from the local Layer 3 routing instance via an IRB interface. </t> <t> A PIM Join/Prune message contains three fields that are relevant to the present discussion: </t> <ul spacing="normal"> <li> <t> Upstream Neighbor </t> </li> <li> <t> Group Address (G) </t> </li> <li> <t> Source Address (S), omitted in the case of (*,G) Join/Prune messages </t> </li> </ul> <t> We will generally speak of a PIM Join as a Join (S,G) or a Join (*,G) message and will use the term "Join (X,G)" to mean either "Join (S,G)" or "Join (*,G)". In the context of a Join (X,G), we will use the term "X" to mean "S" in the case of (S,G) or "G's RP" in the case of (*,G). </t> <t> Suppose BD1 contains two tenant multicast routers, say C1 and C2. Suppose C1 is on a segment attached to PE1 and C2 is on a segment attached to PE2. When C1 sends a PIM Join (X,G) to BD1, the Upstream Neighbor field might be set to PE1, PE2, or C2. 
C1 chooses the Upstream Neighbor based on its unicast routing. Typically, it will choose the PIM router on BD1 that is closest (according to the unicast routing) to X as the Upstream Neighbor. Note that this will not necessarily be PE1. PE1 may not even be visible to the unicast routing algorithm used by the tenant routers. Even if it is, it is unlikely to be the PIM router that is closest to X. So we need to consider the following two cases: </t> <ol spacing="normal" type="1"><li> <t> C1 sends a PIM Join (X,G) to BD1, with PE1 as the Upstream Neighbor. </t> <t> PE1's PIM routing instance will receive the Join on the BD1 IRB interface. If X is not within the Tenant Domain, PE1 handles the Join according to normal PIM procedures. This will generally result in PE1 selecting an Upstream Neighbor and sending it a Join (X,G). </t> <t> If X is within the Tenant Domain but is attached to some other PE, PE1 sends (if it hasn't already) an SBD-SMET route for (X,G). The IIF of the Layer 3 (X,G) state will be the SBD IRB interface, and the OIF list will include the IRB interface to BD1. </t> <t> The SBD-SMET route will pull the (X,G) traffic to PE1, and the (X,G) state will result in the (X,G) traffic being forwarded to C1. </t> <t> If X is within the Tenant Domain but is attached to PE1 itself, no SBD-SMET route is sent. The IIF of the Layer 3 (X,G) state will be the IRB interface to X's BD, and the OIF list will include the IRB interface to BD1. </t> </li> <li> <t> C1 sends a PIM Join (X,G) to BD1, with either PE2 or C2 as the Upstream Neighbor. </t> <t> PE1's PIM routing instance will receive the Join on the BD1 IRB interface. 
If neither X nor Upstream Neighbor is within thetenant domain,Tenant Domain, PE1 handles the Join according to normal PIM procedures. This will NOT result in PE1 sending aJoin(X,G). <vspace/> <vspace/>Join (X,G). </t> <t> If either X or Upstream Neighbor is within the Tenant Domain, PE1 sends (if it hasn't already) anSBD&nbhy;SMETSBD-SMET route for (X,G). The IIF of thelayerLayer 3 (X,G) state will be the SBD IRB interface, and the OIF list will include the IRB interface to BD1.<vspace/> <vspace/></t> <t> TheSBD&nbhy;SMETSBD-SMET route will pull the (X,G) traffic to PE1, and the (X,G) state will result in the (X,G) traffic being forwarded to C1. </t></list> </t> <!--t> <cref source=" ECR"> Above (case 1) is where we might consider using the SMET route to as a proxy for a PIM message. </cref> </t> <t> <cref source=" ECR"> Here (case 2) if we wanted to use the SMET route to proxy for the PIM Join we'd have to block PIM J/P messages from being sent transparently from site to site. </cref> </t--> <!-- <t>It is possible that an EVPN broadcast domain is providing --> <!-- transit service for a tenant's larger network and there are tenant --> <!-- routers attached to the subnet, running routing protocols like PIM. --> <!-- In that case, traffic routed by an upstream NVE to the subnet via IRB --> <!-- interface may be expected on a downstream tenant router. However, --> <!-- since multicast data traffic sent down the IRB interfaces --> <!-- is forwarded to local ACs only and not to other EVPN sites according --> <!-- to rule XXX --> <!-- <!-\- <xref target="blockrule" format="counter"/> in <xref target="solution"/>, additional -\-> --> <!-- procedures are needed to handle this situation with tenant routers. --> <!-- In particular, NVEs connecting to tenant routers or traffic sources --> <!-- need to run PIM on the IRB interface for the transit subnet and --> <!-- the SBD. 
--> <!-- </t> --> <!-- <t>Consider the following situation: --> <!-- <figure> --> <!-- <artwork> --> <!-- S1 S2 --> <!-- \ N1 / N2 --> <!-- CE1a CE2b --> <!-- \ vlan1 / vlan1 --> <!-- NVE1 -\-\-\-\-\-\-\-\-\-\-\- NVE2 -\-\-\- CE2a -\- receiver --> <!-- / N3 N4 --> <!-- S3 --> <!-- </artwork> --> <!-- </figure> --> <!-- </t> --> <!-- <t>CE1a, CE2a/b are three CE routers on vlan1 that is implemented by EVPN. --> <!-- The CEs and NVE1/2 run PIM protocol and are PIM neighbors on vlan1. --> <!-- CE2a has a receiver on network N4 for multicast traffic from S1/2/3 --> <!-- on network N1/2/3 respectively. --> <!-- </t> --> <!-- <t>CE2a sends PIM joins to CE1a/CE2b/NVE1 on vlan1 for the three sources --> <!-- respectively and they all route traffic accordingly onto vlan1. --> <!-- Traffic from S1/2 will reach CE2a because NVE1/2 receive the L2 traffic --> <!-- on their ACs and forward across the core following EVPN procedures. --> <!-- Traffic from S3 is routed into vlan1 by NVE1 via the IRB interface, --> <!-- and per rule XXX --> <!-- <!-\- <xref target="blockrule" format="counter"/> in <xref target="solution"/> the traffic will not be -\-> --> <!-- sent across the core. Thus, according to the procedures specified --> <!-- so far, the traffic from S3 will never be received by NVE2 or CE2a. --> <!-- </t> --> <!-- <t>To solve this problem, NVE2 needs to know that CE2a sent a PIM join --> <!-- to another NVE in vlan1 and needs to pull traffic via the SBD, --> <!-- where the traffic via IRB is not blocked on the core side. Because PIM --> <!-- protocol already requires a router to process join/prune messages that --> <!-- it receives on an interface even if it is not the intended RPF neighbor --> <!-- (for the purpose of join suppression and prune overriding), NVE2 can --> <!-- realize that the upstream router in the join message is another NVE --> <!-- vs. a CE router (this only requires the NVEs to keep track if a neighbor --> <!-- is an NVE for the subnet). 
In that case, it treats that join/prune --> <!-- as for itself. Correspondingly, its PIM upstream state machine will --> <!-- choose one of the NVEs as the RPF neighbor. Between this local NVE and --> <!-- the chosen RPF neighbor there could be multiple subnets including the --> <!-- SBD but the SBD IRB interface is explicitly chosen as the RPF --> <!-- interface. Corresponding join/prune is sent over the SBD IRB --> <!-- interface (optionally the the join/prune could be replaced with --> <!-- SMET routes) and the upstream NVE will route traffic through the SBD. --> <!-- This NVE then route traffic further downstream to CE routers. --> <!-- </t> --> <!-- <t>Similarly, if an NVE needs to send PIM join/prune messages due to its --> <!-- local IGMP/MLD state changes, the RPF interface is always explicitly --> <!-- set to the SBD IRB. --> <!-- </t> --> <!-- <t>Note that, if CE2a chooses NVE1 or NVE2 instead of CE1a --> <!-- as its RPF neighbor for S1, then both CE1a and NVE2 will send --> <!-- traffic to vlan1 (NVE1 receives join from NVE2 on the SBD and sends --> <!-- join to CE1a on vlan1. NVE1 receives traffic from CE1a on vlan1 --> <!-- and route to SBD. NVE2 receives traffic on SBD and route to --> <!-- local receivers on vlan1). PIM assert procedure kicks in but only --> <!-- on NVE2, as CE1a does not receive traffic from NVE2. To address this, --> <!-- an NVE must track all the RPF neighbors and not add an IRB interface --> <!-- to the OIF list if it received a corresponding PIM join on the IRB, --> <!-- in which a tenant router is listed as the upstream neighbor. --> <!-- That tenant router will deliver traffic to the subnet, and the traffic --> <!-- will be forwarded through the core as it is not routed down the IRB --> <!-- but received on an AC. --> <!-- </t> --> <!-- <t>With PIM-ASM, if the DR on a source subnet is a tenant router, --> <!-- it will handle the registering procedures for PIM-ASM. 
As a result, --> <!-- the NVE at same site as the tenant router/DR MUST not handle registering --> <!-- procedures as described in <xref target="solution"/>. --> <!-- </t> --></li> </ol> </section><!-- Tenant Routers --> <!-- </section> <!-\- Advanced Topics -\-> --><section anchor="IANA" numbered="true" toc="default"> <name>IANA Considerations</name> <t> IANA has assigned new flags in the "Multicast Flags Extended Community" registry under the "Border Gateway Protocol (BGP) Extended Communities" registry as shown below. </t> <table align="center"> <name>Multicast Flags Extended Community Registry</name> <thead> <tr> <th align="left" colspan="1" rowspan="1">Bit</th> <th align="left" colspan="1" rowspan="1">Name</th> <th align="left" colspan="1" rowspan="1">Reference</th> <th align="left" colspan="1" rowspan="1">Change Controller</th> </tr> </thead> <tbody> <tr> <td align="left" colspan="1" rowspan="1">7</td> <td align="left" colspan="1" rowspan="1">OISM SBD</td> <td align="left" colspan="1" rowspan="1">RFC 9625</td> <td align="left" colspan="1" rowspan="1">IETF</td> </tr> <tr> <td align="left" colspan="1" rowspan="1">9</td> <td align="left" colspan="1" rowspan="1">IPMG</td> <td align="left" colspan="1" rowspan="1">RFC 9625</td> <td align="left" colspan="1" rowspan="1">IETF</td> </tr> <tr> <td align="left" colspan="1" rowspan="1">10</td> <td align="left" 
colspan="1" rowspan="1">MEG</td> <td align="left" colspan="1" rowspan="1">RFC 9625</td> <td align="left" colspan="1" rowspan="1">IETF</td> </tr> <tr> <td align="left" colspan="1" rowspan="1">11</td> <td align="left" colspan="1" rowspan="1">PEG</td> <td align="left" colspan="1" rowspan="1">RFC 9625</td> <td align="left" colspan="1" rowspan="1">IETF</td> </tr> <tr> <td align="left" colspan="1" rowspan="1">12</td> <td align="left" colspan="1" rowspan="1">OISM-supported</td> <td align="left" colspan="1" rowspan="1">RFC 9625</td> <td align="left" colspan="1" rowspan="1">IETF</td> </tr> </tbody> </table> </section><!-- iana --><section anchor="Security" numbered="true" toc="default"> <name>Security Considerations</name> <t> This document uses protocols and procedures defined in the normative references and inherits the security considerations of those references. </t> <t> This document adds flags or Extended Communities (ECs) to a number of BGP routes in order to signal that particular nodes support the OISM, IPMG, MEG, and/or PEG functionalities that are defined in this document. Incorrect addition, removal, or modification of those flags and/or ECs will cause the procedures defined herein to malfunction, in which case loss or diversion of data traffic is possible. Implementations should provide tools to easily debug configuration mistakes that cause the signaling of incorrect information. </t> <t> The interworking with non-OISM networks described in Sections <xref target="no-OISM" format="counter"/> and <xref target="external" format="counter"/> requires gateway functions in multiple redundant PEs, one of which is elected as the Designated Forwarder for a given BD (or SBD). The election of the MEG or PEG DR, as well as the IPMG Designated Forwarder, makes use of the Designated Forwarder election procedures <xref target="RFC8584"/>. 
An attacker with access to one of these gateways may influence such an election and therefore modify the forwarding of multicast traffic between the OISM network and the external domain. The operator should be especially careful with the protection of these gateways by making sure the management interfaces to access the gateways are only allowed to authorized operators. </t> <t> The document also introduces the concept of per-Tenant-Domain dissemination for the SMET routes, as opposed to per-BD distribution in <xref target="RFC9251"/>. That is, a SMET route triggered by the reception of an IGMP/MLD Join in BD-1 on PE1 needs to be distributed and imported by all PEs of the Tenant Domain, even to those PEs that are not attached to BD-1. This means that an attacker with access to only one BD in a PE of the Tenant Domain might force the advertisement of SMET routes and impact the resources of all the PEs of the Tenant Domain, as opposed to only the PEs of that particular BD (as in <xref target="RFC9251"/>). The implementation should provide ways to filter/control the client IGMP/MLD reports that are received by the attached hosts. </t> </section><!-- security --></middle> <back> <displayreference target="I-D.ietf-bess-evpn-pref-df" to="EVPN-DF"/> <references> <name>References</name> <references> <name>Normative References</name> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3376.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3810.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3032.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4360.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6625.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7153.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7432.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8584.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9135.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9136.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9251.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9572.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9574.xml"/> </references> <references> <name>Informative References</name> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4364.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6513.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6514.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4541.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7606.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7716.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7761.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8296.xml"/> <reference anchor="RFC9624" target="https://www.rfc-editor.org/info/rfc9624"> <front> <title>EVPN Broadcast, Unknown Unicast, or Multicast (BUM) Using Bit Index Explicit Replication (BIER)</title> <author initials="Z." surname="Zhang" fullname="Zhaohui (Jeffrey) Zhang"> <organization>Juniper Networks</organization> </author> <author initials="T." surname="Przygienda" fullname="Tony Przygienda"> <organization>Juniper Networks</organization> </author> <author initials="A." surname="Sajassi" fullname="Ali Sajassi"> <organization>Cisco Systems</organization> </author> <author initials="J." surname="Rabadan" fullname="Jorge Rabadan"> <organization>Nokia</organization> </author> <date month="August" year="2024"/> </front> <seriesInfo name='RFC' value='9624'/> <seriesInfo name='DOI' value='10.17487/RFC9624'/> </reference> <xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-bess-evpn-pref-df.xml"/> </references> </references> <section anchor="irb" numbered="true" toc="default"> <name>Integrated Routing and Bridging</name> <t> This appendix provides a short tutorial on the interaction of routing and bridging. First, it shows a model in which bridging and routing are performed in separate devices. Then, it shows the model specified in <xref target="RFC9135" format="default"/>, where a single device contains both routing and bridging functions. The latter model is presupposed in the body of this document. </t> <t> <xref target="conventional_router" format="default"/> shows the model where a router only does routing and has no L2 bridging capabilities. There are two LANs: LAN1 and LAN2. LAN1 is realized by Switch1, and LAN2 is realized by Switch2. The router has an interface, lan1, that attaches to LAN1 (via Switch1) and an interface, lan2, that attaches to LAN2 (via Switch2). Each interface is configured, as an IP interface, with an IP address and a subnet mask. 
</t> <figure anchor="conventional_router"> <name>Conventional Router with LAN Interfaces</name> <artwork align="center" name="" type="" alt=""><![CDATA[
        +-------+        +--------+        +-------+
        |       |    lan1|        |lan2    |       |
H1 -----+Switch1+--------+ Router1+--------+Switch2+------H3
        |       |        |        |        |       |
H2 -----|       |        |        |        |       |
        +-------+        +--------+        +-------+
    |_________________|          |__________________|
           LAN1                          LAN2
]]></artwork> </figure> <t> IP traffic (unicast or multicast) that remains within a single subnet never reaches the router. For instance, if H1 emits an Ethernet frame with H2's MAC address in the Ethernet Destination Address field, the frame will go from H1 to Switch1 to H2 without ever reaching the router. Since the frame is never seen by a router, the IP datagram within the frame remains entirely unchanged, e.g., its TTL is not decremented. The Ethernet Source and Destination MAC addresses are not changed either. </t> <t> If H1 wants to send a unicast IP datagram to H3, which is on a different subnet, H1 has to be configured with the IP address of a default router. Let's assume that H1 is configured with an IP address of Router1 as its default router address. H1 compares H3's IP address with its own IP address and IP subnet mask and determines that H3 is on a different subnet. So the packet has to be routed. H1 uses ARP to map Router1's IP address to a MAC address on LAN1. H1 then encapsulates the datagram in an Ethernet frame, using Router1's MAC address as the destination MAC address, and sends the frame to Router1. </t> <t> Router1 then receives the frame over its lan1 interface. Router1 sees that the frame is addressed to it, so it removes the Ethernet encapsulation and processes the IP datagram. The datagram is not addressed to Router1, so it must be forwarded further. 
Router1 does a lookup of the datagram's IP Destination Address field and determines that the destination (H3) can be reached via Router1's lan2 interface. Router1 now performs the IP processing of the datagram: it decrements the IP TTL, adjusts the IP header checksum (if present), may fragment the packet as necessary, etc. Then, the datagram (or each of its fragments) is encapsulated in an Ethernet header, with Router1's MAC address on LAN2 as the MAC Source Address and H3's MAC address on LAN2 (which Router1 determines via ARP) as the Destination MAC Address. Finally, the packet is sent on the lan2 interface. </t> <t> If H1 has an IP multicast datagram to send (i.e., an IP datagram whose Destination Address field is an IP Multicast Address), it encapsulates it in an Ethernet frame whose Destination MAC Address is computed from the IP Destination Address. </t> <t> If H2 is a receiver for that multicast address, H2 will receive a copy of the frame, unchanged, from H1. The MAC Source Address in the Ethernet encapsulation does not change, the IP TTL field does not get decremented, etc. </t> <t> If H3 is a receiver for that multicast address, the datagram must be routed to H3. In order for this to happen, Router1 must be configured as a multicast router, and it must accept traffic sent to Ethernet multicast addresses. Router1 will receive H1's multicast frame on its lan1 interface, remove the Ethernet encapsulation, and determine how to dispatch the IP datagram based on Router1's multicast forwarding states. If Router1 knows that there is a receiver for the multicast datagram on LAN2, it makes a copy of the datagram, decrements the TTL (and performs any other necessary IP processing), and then encapsulates the datagram in the Ethernet frame for LAN2. The MAC Source Address for this frame will be Router1's MAC address on LAN2. The Destination MAC Address is computed from the IP Destination Address. 
Finally, the frame is sent on Router1's lan2 interface. </t> <t> <xref target="IRB" format="default"/> shows an integrated router/bridge that supports the routing/bridging integration model of <xref target="RFC9135" format="default"/>. </t> <figure anchor="IRB"> <name>Integrated Router/Bridge</name> <artwork align="center" name="" type="" alt=""><![CDATA[
        +------------------------------------------+
        |         Integrated Router/Bridge         |
        +-------+        +--------+        +-------+
        |       |    IRB1|   L3   |IRB2    |       |
H1 -----+  BD1  +--------+Routing +--------+  BD2  +------H3
        |       |        |Instance|        |       |
H2 -----|       |        |        |        |       |
        +-------+        +--------+        +-------+
    |___________________|        |____________________|
            LAN1                          LAN2
]]></artwork> </figure> <t> In <xref target="IRB" format="default"/>, a single device consists of one or more L3 Routing Instances. The routing/forwarding tables of a given routing instance are known as an IP-VRF <xref target="RFC9135" format="default"/>. In the context of EVPN, it is convenient to think of each routing instance as representing the routing of a particular tenant. Each IP-VRF is attached to one or more interfaces. </t> <t> When several EVPN PEs have a routing instance of the same Tenant Domain, those PEs advertise IP routes to the attached hosts. This is done as specified in <xref target="RFC9135" format="default"/>. </t> <t> The integrated router/bridge shown in <xref target="IRB" format="default"/> also attaches to a number of Broadcast Domains (BDs). Each BD performs the functions that are performed by the bridges in <xref target="conventional_router" format="default"/>. To the L3 routing instance, each BD appears to be a LAN. 
The interface attaching a particular BD to a particular IP-VRF is known as an "IRB interface". From the perspective of L3 routing, each BD is a subnet. Thus, each IRB interface is configured with a MAC address (which is the router's MAC address on the corresponding LAN), as well as an IP address and subnet mask. </t> <t> The integrated router/bridge shown in <xref target="IRB" format="default"/> may have multiple ACs to each BD. These ACs are visible only to the bridging function, not to the routing instance. To the L3 routing instance, there is just one interface to each BD. </t> <t> If the L3 routing instance represents the IP routing of a particular tenant, the BDs attached to that routing instance are BDs belonging to that same tenant. </t> <t> Bridging and routing now proceed exactly as in the case of <xref target="conventional_router" format="default"/>, except that BD1 replaces Switch1, BD2 replaces Switch2, interface IRB1 replaces interface lan1, and interface IRB2 replaces interface lan2. </t> <t> It is important to understand that an IRB interface connects an L3 routing instance to a BD, NOT to a MAC-VRF (see <xref target="RFC7432" format="default"/> for the definition of MAC-VRF). A MAC-VRF may contain several BDs, as long as no MAC address appears in more than one BD. From the perspective of the L3 routing instance, each individual BD is an individual IP subnet; whether or not each BD has its own MAC-VRF is irrelevant to the L3 routing instance. </t> <t> <xref target="two_router_irb" format="default"/> illustrates IRB when a pair of BDs (subnets) are attached to two different PE routers. In this example, each BD has two segments, and one segment of each BD is attached to each PE router. 
</t> <figure anchor="two_router_irb"> <name>Integrated Router/Bridges with Distributed Subnet</name> <artwork align="center" name="" type="" alt=""><![CDATA[
        +------------------------------------------+
        |        Integrated Router/Bridges         |
        +-------+        +--------+        +-------+
        |       |    IRB1|        |IRB2    |       |
H1 -----+  BD1  +--------+  PE1   +--------+  BD2  +------H3
        |(Seg-1)|        |(L3 Rtg)|        |(Seg-1)|
H2 -----|       |        |        |        |       |
        +-------+        +--------+        +-------+
    |___________________|    |    |____________________|
            LAN1             |             LAN2
                             |
                             |
        +-------+        +--------+        +-------+
        |       |    IRB1|        |IRB2    |       |
H4 -----+  BD1  +--------+  PE2   +--------+  BD2  +------H5
        |(Seg-2)|        |(L3 Rtg)|        |(Seg-2)|
        |       |        |        |        |       |
        +-------+        +--------+        +-------+
]]></artwork> </figure> <t> If H1 needs to send an IP packet to H4, it determines from its IP address and subnet mask that H4 is on the same subnet as H1. Although H1 and H4 are not attached to the same PE router, EVPN provides Ethernet communication among all hosts that are on the same BD. Thus, H1 uses ARP to find H4's MAC address and sends an Ethernet frame with H4's MAC address in the Destination MAC Address field. The frame is received at PE1, but since the Destination MAC address is not PE1's MAC address, PE1 assumes that the frame is to remain on BD1. Therefore, the packet inside the frame is NOT decapsulated and is NOT sent up the IRB interface to PE1's routing instance. Rather, standard EVPN intra-subnet procedures (as detailed in <xref target="RFC7432" format="default"/>) are used to deliver the frame to PE2, which then sends it to H4. </t> <t> If H1 needs to send an IP packet to H5, it determines from its IP address and subnet mask that H5 is NOT on the same subnet as H1. 
Assuming that H1 has been configured with the IP address of PE1 as its default router, H1 sends the packet in an Ethernet frame with PE1's MAC address in its Destination MAC Address field. PE1 receives the frame and sees that the frame is addressed to it. Thus, PE1 sends the frame up its IRB1 interface to the L3 routing instance. Appropriate IP processing is done, e.g., TTL decrement. The L3 routing instance determines that the next hop for H5 is PE2, so the packet is encapsulated (e.g., in MPLS) and sent across the backbone to PE2's routing instance. PE2 will see that the packet's destination, H5, is on BD2 segment-2 and will send the packet down its IRB2 interface. This causes the IP packet to be encapsulated in an Ethernet frame with PE2's MAC address (on BD2) in the Source Address field and H5's MAC address in the Destination Address field. </t> <t> Note that if H1 has an IP packet to send to H3, the forwarding of the packet is handled entirely within PE1. PE1's routing instance sees the packet arrive on its IRB1 interface and then transmits the packet by sending it down its IRB2 interface. </t> <t> Often, all the hosts in a particular Tenant Domain will be provisioned with the same value of the default router IP address. This IP address can be provisioned as an anycast address in all the EVPN PEs attached to that Tenant Domain. Thus, although all hosts are provisioned with the same default router address, the actual default router for a given host will be one of the PEs attached to the same Ethernet segment as the host. This provisioning method ensures that IP packets from a given host are handled by the closest EVPN PE that supports IRB. </t> <t> In the topology of <xref target="two_router_irb" format="default"/>, one could imagine that H1 is configured with a default router address that belongs to PE2 but not to PE1. 
Inter-subnet routing would still work, but IP packets from H1 to H3 would then follow the non-optimal path H1-->PE1-->PE2-->PE1-->H3. Sending traffic on this sort of path, where it leaves a router and then comes back to the same router, is sometimes known as "hairpinning". Similarly, if PE2 supports IRB but PE1 does not, the same non-optimal path from H1 to H3 would have to be followed. To avoid hairpinning, each EVPN PE needs to support IRB. </t> <t> It is worth pointing out the way IRB interfaces interact with multicast traffic. Referring again to <xref target="two_router_irb" format="default"/>, suppose PE1 and PE2 are functioning as IP multicast routers. Also, suppose that H3 transmits a multicast packet and both H1 and H4 are interested in receiving that packet. PE1 will receive the packet from H3 via its IRB2 interface. The Ethernet encapsulation from BD2 is removed, the IP header processing is done, and the packet is then re-encapsulated for BD1, with PE1's MAC address in the MAC Source Address field. Then, the packet is sent down the IRB1 interface. Layer 2 procedures (as defined in <xref target="RFC7432" format="default"/>) would then be used to deliver a copy of the packet locally to H1 and remotely to H4. </t> <t> Please be aware that this document modifies the semantics, described in the previous paragraph, of sending/receiving multicast traffic on an IRB interface. This is explained in <xref target="cp_overview" format="default"/> and subsequent sections. </t> </section> <section anchor="Acknowledgements" numbered="false" toc="default"> <name>Acknowledgements</name> <t> The authors thank <contact fullname="Vikram Nagarajan"/> and <contact fullname="Princy Elizabeth"/> for their work on Sections <xref target="external_pim_router" format="counter"/> and <xref target="SBD-matching" format="counter"/>. 
The authors also benefited tremendously from discussions with <contact fullname="Aldrin Isaac"/> on EVPN multicast optimizations. </t> </section> </back> </rfc>