rfc9625.original | rfc9625.txt | |||
---|---|---|---|---|
BESS W. Lin | Internet Engineering Task Force (IETF) W. Lin | |||
Internet-Draft Z. Zhang | Request for Comments: 9625 Z. Zhang | |||
Intended status: Standards Track J. Drake | Category: Standards Track J. Drake | |||
Expires: 5 September 2024 E. Rosen, Ed. | ISSN: 2070-1721 E. Rosen, Ed. | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
J. Rabadan | J. Rabadan | |||
Nokia | Nokia | |||
A. Sajassi | A. Sajassi | |||
Cisco Systems | Cisco Systems | |||
4 March 2024 | August 2024 | |||
EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding | EVPN Optimized Inter-Subnet Multicast (OISM) Forwarding | |||
draft-ietf-bess-evpn-irb-mcast-11 | ||||
Abstract | Abstract | |||
Ethernet VPN (EVPN) provides a service that allows a single Local | Ethernet VPN (EVPN) provides a service that allows a single Local | |||
Area Network (LAN), comprising a single IP subnet, to be divided into | Area Network (LAN), comprising a single IP subnet, to be divided into | |||
multiple "segments". Each segment may be located at a different | multiple segments. Each segment may be located at a different site, | |||
site, and the segments are interconnected by an IP or MPLS backbone. | and the segments are interconnected by an IP or MPLS backbone. | |||
Intra-subnet traffic (either unicast or multicast) always appears to | Intra-subnet traffic (either unicast or multicast) always appears to | |||
the end users to be bridged, even when it is actually carried over | the end users to be bridged, even when it is actually carried over | |||
the IP or MPLS backbone. When a single "tenant" owns multiple such | the IP or MPLS backbone. When a single tenant owns multiple such | |||
LANs, EVPN also allows IP unicast traffic to be routed between those | LANs, EVPN also allows IP unicast traffic to be routed between those | |||
LANs. This document specifies new procedures that allow inter-subnet | LANs. This document specifies new procedures that allow inter-subnet | |||
IP multicast traffic to be routed among the LANs of a given tenant, | IP multicast traffic to be routed among the LANs of a given tenant | |||
while still making intra-subnet IP multicast traffic appear to be | while still making intra-subnet IP multicast traffic appear to be | |||
bridged. These procedures can provide optimal routing of the inter- | bridged. These procedures can provide optimal routing of the inter- | |||
subnet multicast traffic, and do not require any such traffic to | subnet multicast traffic and do not require any such traffic to | |||
egress a given router and then ingress that same router. These | egress a given router and then ingress that same router. These | |||
procedures also accommodate IP multicast traffic that originates or | procedures also accommodate IP multicast traffic that originates or | |||
is destined external to the EVPN domain. | is destined to be external to the EVPN domain. | |||
Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in BCP | ||||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 5 September 2024. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9625. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Terminology | 1.1. Terminology | |||
1.2. Background | 1.1.1. Requirements Language | |||
1.2.1. Segments, Broadcast Domains, and Tenants | 1.2. Background | |||
1.2.2. Inter-BD (Inter-Subnet) IP Traffic | 1.2.1. Segments, Broadcast Domains, and Tenants | |||
1.2.3. EVPN and IP Multicast | 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | |||
1.2.4. BDs, MAC-VRFS, and EVPN Service Models | 1.2.3. EVPN and IP Multicast | |||
1.3. Need for EVPN-aware Multicast Procedures | 1.2.4. BDs, MAC-VRFs, and EVPN Service Models | |||
1.4. Additional Requirements That Must be Met by the | 1.3. Need for EVPN-Aware Multicast Procedures | |||
Solution | 1.4. Additional Requirements That Must Be Met by the Solution | |||
1.5. Model of Operation: Overview | 1.5. Model of Operation: Overview | |||
1.5.1. Control Plane | 1.5.1. Control Plane | |||
1.5.2. Data Plane | 1.5.2. Data Plane | |||
2. Detailed Model of Operation | 2. Detailed Model of Operation | |||
2.1. Supplementary Broadcast Domain | 2.1. Supplementary Broadcast Domain | |||
2.2. Detecting When a Route is For/From a Particular BD | 2.2. Detecting When a Route is for/from a Particular BD | |||
2.3. Use of IRB Interfaces at Ingress PE | 2.3. Use of IRB Interfaces at Ingress PE | |||
2.4. Use of IRB Interfaces at an Egress PE | 2.4. Use of IRB Interfaces at an Egress PE | |||
2.5. Announcing Interest in (S,G) | 2.5. Announcing Interest in (S,G) | |||
2.6. Tunneling Frames from Ingress PE to Egress PEs | 2.6. Tunneling Frames from Ingress PEs to Egress PEs | |||
2.7. Advanced Scenarios | 2.7. Advanced Scenarios | |||
3. EVPN-aware Multicast Solution Control Plane | 3. EVPN-Aware Multicast Solution Control Plane | |||
3.1. Supplementary Broadcast Domain (SBD) and Route Targets | 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | |||
3.2. Advertising the Tunnels Used for IP Multicast | 3.2. Advertising the Tunnels Used for IP Multicast | |||
3.2.1. Constructing Routes for the SBD | 3.2.1. Constructing Routes for the SBD | |||
3.2.2. Ingress Replication | 3.2.2. Ingress Replication | |||
3.2.3. Assisted Replication | 3.2.3. Assisted Replication | |||
3.2.3.1. Automatic SBD Matching | 3.2.3.1. Automatic SBD Matching | |||
3.2.4. BIER | 3.2.4. BIER | |||
3.2.5. Inclusive P2MP Tunnels | 3.2.5. Inclusive P2MP Tunnels | |||
3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive | 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive | |||
Tunnels | Tunnels | |||
3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise | 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise | |||
Inclusive Tunnels Specific to IP Multicast | Inclusive Tunnels Specific to IP Multicast | |||
3.2.6. Selective Tunnels | 3.2.6. Selective Tunnels | |||
3.3. Advertising SMET Routes | 3.3. Advertising SMET Routes | |||
4. Constructing Multicast Forwarding State | 4. Constructing Multicast Forwarding State | |||
4.1. Layer 2 Multicast State | 4.1. Layer 2 Multicast State | |||
4.1.1. Constructing the OIF List | 4.1.1. Constructing the OIF List | |||
4.1.2. Data Plane: Applying the OIF List to an (S,G) | 4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | |||
Frame | 4.1.2.1. Eligibility of an AC to Receive a Frame | |||
4.1.2.1. Eligibility of an AC to Receive a Frame | 4.1.2.2. Applying the OIF List | |||
4.1.2.2. Applying the OIF List | 4.2. Layer 3 Forwarding State | |||
4.2. Layer 3 Forwarding State | 5. Interworking with Non-OISM EVPN PEs | |||
5. Interworking with non-OISM EVPN-PEs | 5.1. IPMG Designated Forwarder | |||
5.1. IPMG Designated Forwarder | 5.2. Ingress Replication | |||
5.2. Ingress Replication | 5.2.1. Ingress PE is Non-OISM | |||
5.2.1. Ingress PE is non-OISM | 5.2.2. Ingress PE is OISM | |||
5.2.2. Ingress PE is OISM | 5.3. P2MP Tunnels | |||
5.3. P2MP Tunnels | 6. Traffic to/from Outside the EVPN Tenant Domain | |||
6. Traffic to/from Outside the EVPN Tenant Domain | 6.1. Layer 3 Interworking via EVPN OISM PEs | |||
6.1. Layer 3 Interworking via EVPN OISM PEs | 6.1.1. General Principles | |||
6.1.1. General Principles | 6.1.2. Interworking with MVPN | |||
6.1.2. Interworking with MVPN | 6.1.2.1. MVPN Sources with EVPN Receivers | |||
6.1.2.1. MVPN Sources with EVPN Receivers | 6.1.2.1.1. Identifying MVPN Sources | |||
6.1.2.1.1. Identifying MVPN Sources | 6.1.2.1.2. Joining a Flow from an MVPN Source | |||
6.1.2.1.2. Joining a Flow from an MVPN Source | 6.1.2.2. EVPN Sources with MVPN Receivers | |||
6.1.2.2. EVPN Sources with MVPN Receivers | 6.1.2.2.1. General Procedures | |||
6.1.2.2.1. General procedures | 6.1.2.2.2. Any-Source Multicast (ASM) Groups | |||
6.1.2.2.2. Any-Source Multicast (ASM) Groups | 6.1.2.2.3. Source on Multihomed Segment | |||
6.1.2.2.3. Source on Multihomed Segment | 6.1.2.3. Obtaining Optimal Routing of Traffic between MVPN | |||
6.1.2.3. Obtaining Optimal Routing of Traffic Between MVPN | and EVPN | |||
and EVPN | 6.1.2.4. Selecting the MEG SBD-DR | |||
6.1.2.4. Selecting the MEG SBD-DR | 6.1.3. Interworking with Global Table Multicast | |||
6.1.3. Interworking with 'Global Table Multicast' | 6.1.4. Interworking with PIM | |||
6.1.4. Interworking with PIM | 6.1.4.1. Source Inside EVPN Domain | |||
6.1.4.1. Source Inside EVPN Domain | 6.1.4.2. Source Outside EVPN Domain | |||
6.1.4.2. Source Outside EVPN Domain | 6.2. Interworking with PIM via an External PIM Router | |||
6.2. Interworking with PIM via an External PIM Router | ||||
7. Using an EVPN Tenant Domain as an Intermediate (Transit) | 7. Using an EVPN Tenant Domain as an Intermediate (Transit) | |||
Network for Multicast traffic | Network for Multicast Traffic | |||
8. IANA Considerations | 8. IANA Considerations | |||
9. Security Considerations | 9. Security Considerations | |||
10. Acknowledgements | 10. References | |||
11. References | 10.1. Normative References | |||
11.1. Normative References | 10.2. Informative References | |||
11.2. Informative References | Appendix A. Integrated Routing and Bridging | |||
Appendix A. Integrated Routing and Bridging | Acknowledgements | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
1.1. Terminology | 1.1. Terminology | |||
In this document we make frequent use of the following terminology: | In this document, we make frequent use of the following terminology: | |||
* OISM: Optimized Inter-Subnet Multicast. EVPN-PEs that follow the | OISM: Optimized Inter-Subnet Multicast. EVPN PEs that follow the | |||
procedures of this document will be known as "OISM" PEs. EVPN-PEs | procedures of this document will be known as "OISM" Provider Edges | |||
that do not follow the procedures of this document will be known | (PEs). EVPN PEs that do not follow the procedures of this | |||
as "non-OISM" PEs. | document will be known as "non-OISM" PEs. | |||
* IP Multicast Packet: An IP packet whose IP Destination Address | IP Multicast Packet: An IP packet whose IP Destination Address field | |||
field is a multicast address that is not a link-local address. | is a multicast address that is not a link-local address. (Link- | |||
(Link-local addresses are IPv4 addresses in the 224.0.0.0/24 range and | local addresses are IPv4 addresses in the 224.0.0.0/24 range and IPv6 | |||
IPv6 address in the FF02::/16 range.) | addresses in the FF02::/16 range.) | |||
* IP Multicast Frame: An Ethernet frame whose payload is an IP | IP Multicast Frame: An Ethernet frame whose payload is an IP | |||
multicast packet (as defined above). | multicast packet (as defined above). | |||
* (S,G) Multicast Packet: An IP multicast packet whose IP Source | (S,G) Multicast Packet: An IP multicast packet whose Source IP | |||
Address field contains S and whose IP Destination Address field | Address field contains S and whose IP Destination Address field | |||
contains G. | contains G. | |||
* (S,G) Multicast Frame: An IP multicast frame whose payload | (S,G) Multicast Frame: An IP multicast frame whose payload contains | |||
contains S in its IP Source Address field and G in its IP | S in its Source IP Address field and G in its IP Destination | |||
Destination Address field. | Address field. | |||
* EVPN Instance (EVI): An EVPN instance spanning the Provider Edge | EVI: EVPN Instance. An EVPN instance spanning the PE devices | |||
(PE) devices participating in that EVPN. | participating in that EVPN. | |||
* Broadcast Domain (BD): an emulated Ethernet, such that two systems | BD: Broadcast Domain. An emulated Ethernet, such that two systems | |||
on the same BD will receive each other's link-local broadcasts. | on the same BD will receive each other's link-local broadcasts. | |||
Note that EVPN supports service models in which a single EVPN | Note that EVPN supports service models in which a single EVI | |||
Instance contains only one BD, and service models in which a | contains only one BD and service models in which a single EVI | |||
single EVI contains multiple BDs. Both types of service model are | contains multiple BDs. Both types of service models are supported | |||
supported by this draft. In all models, a given BD belongs to | by this document. In all models, a given BD belongs to only one | |||
only one EVI. | EVI. | |||
* Designated Forwarder (DF). As defined in [RFC7432], an Ethernet | DF: Designated Forwarder. As defined in [RFC7432], an Ethernet | |||
segment may be multi-homed (attached to more than one PE). An | segment may be multihomed (attached to more than one PE). An | |||
Ethernet segment may also contain multiple BDs, of one or more | Ethernet segment may also contain multiple BDs of one or more | |||
EVIs. For each such EVI, one of the PEs attached to the segment | EVIs. For each such EVI, one of the PEs attached to the segment | |||
becomes that EVI's DF for that segment. Since a BD may belong to | becomes that EVI's DF for that segment. Since a BD may belong to | |||
only one EVI, we can speak unambiguously of the BD's DF for a | only one EVI, we can speak unambiguously of the BD's DF for a | |||
given segment. | given segment. | |||
When the text makes it clear that we are speaking in the context | AC: Attachment Circuit. An AC connects the bridging function of an | |||
of a given BD, we will frequently use the term "a segment's DF" to | EVPN PE to an Ethernet segment of a particular BD. ACs are not | |||
mean the given BD's DF for that segment. | visible at Layer 3. | |||
* AC: Attachment Circuit. An AC connects the bridging function of | ||||
an EVPN-PE to an Ethernet segment of a particular BD. ACs are not | ||||
visible at the router (L3) layer. | ||||
If a given Ethernet segment, attached to a given PE, contains n | If a given Ethernet segment, attached to a given PE, contains n | |||
BDs, we will say that the PE has n ACs to that segment. | BDs, we say that the PE has n ACs to that segment. | |||
* L3 Gateway: An L3 Gateway is a PE that connects an EVPN tenant | L3 Gateway: An L3 Gateway is a PE that connects an EVPN Tenant | |||
domain to an external multicast domain by performing both the OISM | Domain to an external multicast domain by performing both the OISM | |||
procedures and the Layer 3 multicast procedures of the external | procedures and the Layer 3 multicast procedures of the external | |||
domain. | domain. | |||
* PEG (PIM/EVPN Gateway): A L3 Gateway that connects an EVPN Tenant | PEG: PIM/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant | |||
Domain to an external multicast domain whose Layer 3 multicast | Domain to an external multicast domain whose Layer 3 multicast | |||
procedures are those of PIM [RFC7761]. | procedures are those of PIM [RFC7761]. | |||
* MEG (MVPN/EVPN Gateway): A L3 Gateway that connects an EVPN Tenant | MEG: MVPN/EVPN Gateway. An L3 Gateway that connects an EVPN Tenant | |||
Domain to an external multicast domain whose Layer 3 multicast | Domain to an external multicast domain whose Layer 3 multicast | |||
procedures are those of MVPN ([RFC6513], [RFC6514]). | procedures are those of Multicast VPN (MVPN) [RFC6513] [RFC6514]. | |||
* IPMG (IP Multicast Gateway): A PE that is used for interworking | IPMG: IP Multicast Gateway. A PE that is used for interworking OISM | |||
OISM EVPN-PEs with non-OISM EVPN-PEs. | EVPN PEs with non-OISM EVPN PEs. | |||
* DR (Designated Router): A PE that has special responsibilities for | DR: Designated Router. A PE that has special responsibilities for | |||
handling multicast on a given BD. | handling multicast on a given BD. | |||
* FHR (First Hop Router): The FHR is a PIM router [RFC7761] with | FHR: First Hop Router. The FHR is a PIM router [RFC7761] with | |||
special responsibilities. It is the first multicast router to see | special responsibilities. It is the first multicast router to see | |||
(S,G) packets from source S, and if G is an "Any Source Multicast | (S,G) packets from source S, and if G is an Any-Source Multicast | |||
(ASM)" group, the FHR is responsible for sending PIM Register | (ASM) group, the FHR is responsible for sending PIM Register | |||
messages to the PIM Rendezvous Point for group G. | messages to the PIM Rendezvous Point (RP) for group G. | |||
* LHR (Last Hop Router): The LHR is a PIM router [RFC7761] with | LHR: Last Hop Router. The LHR is a PIM router [RFC7761] with | |||
special responsibilities. Generally, it is attached to a LAN, and | special responsibilities. Generally, it is attached to a LAN, and | |||
it determines whether there are any hosts on the LAN that need to | it determines whether there are any hosts on the LAN that need to | |||
receive a given multicast flow. If so, it creates and sends the | receive a given multicast flow. If so, it creates and sends the | |||
PIM Join messages that are necessary to receive the flow. | PIM Join messages that are necessary to receive the flow. | |||
* EC (Extended Community). A BGP Extended Communities attribute | EC: Extended Community. A BGP Extended Communities attribute | |||
([RFC4360], [RFC7153]) is a BGP path attribute that consists of | [RFC4360] [RFC7153] is a BGP path attribute that consists of one | |||
one or more extended communities. | or more Extended Communities. | |||
* RT (Route Target): A Route Target is a particular kind of BGP | RT: Route Target. A Route Target is a particular kind of BGP | |||
Extended Community. A BGP Extended Community consists of a type | Extended Community. A BGP Extended Community consists of a type | |||
field, a sub-type field, and a value field. Certain type/sub-type | field, a sub-type field, and a value field. Certain type/sub-type | |||
combinations indicate that a particular Extended Community is an | combinations indicate that a particular Extended Community is an | |||
RT. RT1 and RT2 are considered to be the same RT if and only if | RT. RT1 and RT2 are considered to be the same RT if and only if | |||
they have the same type, same sub-type, and same value fields. | they have the same type, sub-type, and value fields. | |||
* Use of the "C-" prefix. In many documents on VPN multicast, the | C- prefix: In many documents on VPN multicast, the prefix C- appears | |||
prefix "C-" appears before any address or wildcard that refers to | before any address or wildcard that refers to an address or | |||
an address or addresses in a tenant's address space, rather than | addresses in a tenant's address space rather than to an address or | |||
to an address or addresses in the address space of the backbone | addresses in the address space of the backbone network. This | |||
network. This document omits the "C-" prefix in many cases where | document omits the C- prefix in many cases where it is clear from | |||
it is clear from the context that the reference is to the tenant's | the context that the reference is to the tenant's address space. | |||
address space. | ||||
This document also assumes familiarity with the terminology of | This document also assumes familiarity with the terminology of | |||
[RFC4364], [RFC6514], [RFC7432], [RFC7761], [RFC9251], [RFC9136] and | [RFC4364], [RFC6514], [RFC7432], [RFC7761], [RFC9136], [RFC9251], and | |||
[I-D.ietf-bess-evpn-bum-procedure-updates]. | [RFC9572]. | |||
1.1.1. Requirements Language | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | ||||
"OPTIONAL" in this document are to be interpreted as described in | ||||
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
capitals, as shown here. | ||||
1.2. Background | 1.2. Background | |||
Ethernet VPN (EVPN) [RFC7432] provides a Layer 2 VPN (L2VPN) | Ethernet VPN (EVPN) [RFC7432] provides a Layer 2 VPN (L2VPN) | |||
solution, which allows an IP or MPLS backbone provider to offer | solution, which allows an IP or MPLS backbone provider to offer | |||
Ethernet service to a set of customers, known as "tenants". | Ethernet service to a set of customers, known as "tenants". | |||
In this section (as well as in [RFC9135]), we provide some essential | In this section (as well as in [RFC9135]), we provide some essential | |||
background information on EVPN. | background information on EVPN. | |||
1.2.1. Segments, Broadcast Domains, and Tenants | 1.2.1. Segments, Broadcast Domains, and Tenants | |||
One of the key concepts of EVPN is the Broadcast Domain (BD). A BD | One of the key concepts of EVPN is the Broadcast Domain (BD). A BD | |||
is essentially an emulated Ethernet. Each BD belongs to a single | is essentially an emulated Ethernet. Each BD belongs to a single | |||
tenant. A BD typically consists of multiple Ethernet "segments", and | tenant. A BD typically consists of multiple Ethernet segments, and | |||
each segment may be attached to a different EVPN Provider Edge | each segment may be attached to a different EVPN Provider Edge (EVPN | |||
(EVPN-PE) router. EVPN-PE routers are often referred to as "Network | PE) router. EVPN PE routers are often referred to as "Network | |||
Virtualization Endpoints" or NVEs. However, this document will use | Virtualization Endpoints (NVEs)". However, this document will use | |||
the term "EVPN-PE", or, when the context is clear, just "PE". | the term "EVPN PE" or, when the context is clear, just "PE". | |||
In this document, the term "segment" is used interchangeable with the | In this document, the term "segment" is used interchangeably with | |||
"Ethernet Segment" or "ES" in [RFC7432]. | "Ethernet Segment" or "ES", as defined in [RFC7432]. | |||
Attached to each segment are "Tenant Systems" (TSes). A TS may be | Attached to each segment are Tenant Systems (TSs). A TS may be any | |||
any type of system, physical or virtual, host or router, etc., that | type of system, physical or virtual, host or router, etc., that can | |||
can attach to an Ethernet. | attach to an Ethernet. | |||
When two TSes are on the same segment, traffic between them does not | When two TSs are on the same segment, traffic between them does not | |||
pass through an EVPN-PE. When two TSes are on different segments of | pass through an EVPN PE. When two TSs are on different segments of | |||
the same BD, traffic between them does pass through an EVPN-PE. | the same BD, traffic between them does pass through an EVPN PE. | |||
When two TSes, say TS1 and TS2 are on the same BD, then: | When two TSs, say TS1 and TS2, are on the same BD, then the following | |||
occurs: | ||||
* If TS1 knows the MAC address of TS2, TS1 can send unicast Ethernet | * If TS1 knows the Media Access Control (MAC) address of TS2, TS1 | |||
frames to TS2. TS2 will receive the frames unaltered. | can send unicast Ethernet frames to TS2. TS2 will receive the | |||
frames unaltered. | ||||
* If TS1 broadcasts an Ethernet frame, TS2 will receive the | * If TS1 broadcasts an Ethernet frame, TS2 will receive the | |||
unaltered frame. | unaltered frame. | |||
* If TS1 multicasts an Ethernet frame, TS2 will receive the | * If TS1 multicasts an Ethernet frame, TS2 will receive the | |||
unaltered frame, as long as TS2 has been provisioned to receive | unaltered frame as long as TS2 has been provisioned to receive the | |||
the Ethernet multicast destination MAC address. | Ethernet multicast destination MAC address. | |||
When we say that TS2 receives an unaltered frame from TS1, we mean | When we say that TS2 receives an unaltered frame from TS1, we mean | |||
that the frame still contains TS1's MAC address, and that no | that the frame still contains TS1's MAC address and that no | |||
alteration of the frame's payload (and consequently, no alteration of | alteration of the frame's payload (and consequently, no alteration of | |||
the payload's IP header) has been made. | the payload's IP header) has been made. | |||
EVPN allows a single segment to be attached to multiple PE routers. | EVPN allows a single segment to be attached to multiple PE routers. | |||
This is known as "EVPN multi-homing". Suppose a given segment is | This is known as "EVPN multihoming". Suppose a given segment is | |||
attached to both PE1 and PE2, and suppose PE1 receives a frame from | attached to both PE1 and PE2, and suppose PE1 receives a frame from | |||
that segment. It may be necessary for PE1 to send the frame over the | that segment. It may be necessary for PE1 to send the frame over the | |||
backbone to PE2. EVPN has procedures to ensure that such a frame | backbone to PE2. EVPN has procedures to ensure that such a frame | |||
cannot be sent by PE2 back to its originating segment. This is | cannot be sent back to its originating segment by PE2. This is | |||
particularly important for multicast, because a frame arriving at PE1 | particularly important for multicast, because a frame arriving at PE1 | |||
from a given segment will already have been seen by all the systems | from a given segment will already have been seen by all the systems | |||
on that segment that need to see it. If the frame were sent back to | on that segment that need to see it. If the frame was sent back to | |||
the originating segment by PE2, receivers on that segment would | the originating segment by PE2, receivers on that segment would | |||
receive the packet twice. Even worse, the frame might be sent back | receive the packet twice. Even worse, the frame might be sent back | |||
to PE1, which could cause an infinite loop. | to PE1, which could cause an infinite loop. | |||
1.2.2. Inter-BD (Inter-Subnet) IP Traffic | 1.2.2. Inter-BD (Inter-Subnet) IP Traffic | |||
If a given tenant has multiple BDs, the tenant may wish to allow IP | If a given tenant has multiple BDs, the tenant may wish to allow IP | |||
communication among these BDs. Such a set of BDs is known as an | communication among these BDs. Such a set of BDs is known as an | |||
"EVPN Tenant Domain" or just a "Tenant Domain". | "EVPN Tenant Domain" or just a "Tenant Domain". | |||
If tenant systems TS1 and TS2 are not in the same BD, then they do | If tenant systems TS1 and TS2 are not in the same BD, then they do | |||
not receive unaltered Ethernet frames from each other. In order for | not receive unaltered Ethernet frames from each other. In order for | |||
TS1 to send traffic to TS2, TS1 encapsulates an IP datagram inside an | TS1 to send traffic to TS2, TS1 encapsulates an IP datagram inside an | |||
Ethernet frame, and uses Ethernet to send these frames to an IP | Ethernet frame and uses Ethernet to send these frames to an IP | |||
router. The router decapsulates the IP datagram, does the IP | router. The router decapsulates the IP datagram, does the IP | |||
processing and re-encapsulates the datagram for Ethernet. The MAC | processing, and re-encapsulates the datagram for Ethernet. The MAC | |||
source address field now has the MAC address of the router, not of | Source Address field now has the MAC address of the router, not of | |||
TS1. The TTL field of the IP datagram should be decremented by | TS1. The TTL field of the IP datagram should be decremented by | |||
exactly 1, even if the frame needs to be sent from one PE to another. | exactly 1, even if the frame needs to be sent from one PE to another. | |||
The structure of the provider's backbone is thus hidden from the | The structure of the provider's backbone is thus hidden from the | |||
tenants. | tenants. | |||
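The per-hop processing described above can be sketched as follows. This is a minimal illustration, not part of either document version: the `Frame` type and the `route_between_bds` helper are hypothetical names, and dropping a datagram whose TTL is 1 or less is an assumption taken from conventional IP router behavior rather than from this text.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified Ethernet frame carrying an IP datagram."""
    src_mac: str  # MAC Source Address field
    dst_mac: str  # MAC Destination Address field
    ttl: int      # TTL of the encapsulated IP datagram


def route_between_bds(frame: Frame, router_mac: str,
                      next_hop_mac: str) -> Optional[Frame]:
    """Route a datagram from one BD to another.

    The datagram is re-encapsulated with the router's MAC as the source
    address, and the TTL is decremented by exactly 1, no matter how many
    PEs the frame crosses on the backbone.
    """
    if frame.ttl <= 1:
        # Assumption: conventional routers discard instead of forwarding.
        return None
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac,
                   ttl=frame.ttl - 1)
```

The receiver on the destination BD therefore sees the router's MAC address, not that of TS1, and a TTL exactly one lower than the value TS1 sent.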
EVPN accommodates the need for inter-BD communication within a Tenant | EVPN accommodates the need for inter-BD communication within a Tenant | |||
Domain by providing an integrated L2/L3 service for unicast IP | Domain by providing an integrated L2/L3 service for unicast IP | |||
traffic. EVPN's Integrated Routing and Bridging (IRB) functionality | traffic. EVPN's Integrated Routing and Bridging (IRB) functionality | |||
is specified in [RFC9135]. Each BD in a Tenant Domain is assumed to | is specified in [RFC9135]. Each BD in a Tenant Domain is assumed to | |||
be a single IP subnet, and each IP subnet within a given Tenant | be a single IP subnet, and each IP subnet within a given Tenant | |||
Domain is assumed to be a single BD. EVPN's IRB functionality allows | Domain is assumed to be a single BD. EVPN's IRB functionality allows | |||
IP traffic to travel from one BD to another, and ensures that proper | IP traffic to travel from one BD to another and ensures that proper | |||
IP processing (e.g., TTL decrement) is done. | IP processing (e.g., TTL decrement) is done. | |||
A brief overview of IRB, including the notion of an "IRB interface", | A brief overview of IRB, including the notion of an IRB interface, | |||
can be found in Appendix A. As explained there, an IRB interface is | can be found in Appendix A. As explained there, an IRB interface is | |||
a sort of virtual interface connecting an L3 routing instance to a | a sort of virtual interface connecting an L3 routing instance to a | |||
BD. A BD may have multiple attachment circuits (ACs) to a given PE, | BD. A BD may have multiple Attachment Circuits (ACs) to a given PE, | |||
where each AC connects to a different Ethernet segment of the BD. | where each AC connects to a different Ethernet segment of the BD. | |||
However, these ACs are not visible to the L3 routing function; from | However, these ACs are not visible to the L3 routing function; from | |||
the perspective of an L3 routing instance, a PE has just one | the perspective of an L3 routing instance, a PE has just one | |||
interface to each BD, viz., the IRB interface for that BD. | interface to each BD, viz., the IRB interface for that BD. | |||
In this document, when traffic is routed out of an IRB interface, we | In this document, when traffic is routed out of an IRB interface, we | |||
say it is sent down the IRB interface to the BD that the IRB is for. | say it is sent down the IRB interface to the BD that the IRB is for. | |||
In the other direction, traffic is sent up the IRB interface from the | In the other direction, traffic is sent up the IRB interface from the | |||
BD to the L3 routing instance. | BD to the L3 routing instance. | |||
The "L3 routing instance" depicted in Appendix A is associated with a | The L3 routing instance depicted in Appendix A is associated with a | |||
single Tenant Domain, and may be thought of as an IP-VRF for that | single Tenant Domain and may be thought of as IP Virtual Routing and | |||
Tenant Domain. | Forwarding (IP-VRF) for that Tenant Domain. | |||
1.2.3. EVPN and IP Multicast | 1.2.3. EVPN and IP Multicast | |||
[RFC9135] and [RFC9136] cover inter-subnet (inter-BD) IP unicast | [RFC9135] and [RFC9136] cover inter-subnet (inter-BD) IP unicast | |||
forwarding, but they do not cover inter-subnet IP multicast | forwarding, but they do not cover inter-subnet IP multicast | |||
forwarding. | forwarding. | |||
[RFC7432] covers intra-subnet (intra-BD) Ethernet multicast. The | [RFC7432] covers intra-subnet (intra-BD) Ethernet multicast. The | |||
intra-subnet Ethernet multicast procedures of [RFC7432] are used for | intra-subnet Ethernet multicast procedures of [RFC7432] are used for | |||
Ethernet Broadcast traffic, for Ethernet unicast traffic whose MAC | Ethernet broadcast traffic, Ethernet unicast traffic whose | |||
Destination Address field contains an Unknown address, and for | Destination MAC Address field contains an unknown address, and | |||
Ethernet traffic whose MAC Destination Address field contains an | Ethernet traffic whose Destination MAC Address field contains an | |||
Ethernet Multicast MAC address. These three classes of traffic are | Ethernet multicast MAC address. These three classes of traffic are | |||
known collectively as "BUM traffic" (Broadcast/Unknown-Unicast/ | known collectively as "BUM traffic" (Broadcast, Unknown Unicast, or | |||
Multicast), and the procedures for handling BUM traffic are known as | Multicast traffic), and the procedures for handling BUM traffic are | |||
"BUM procedures". | known as "BUM procedures". | |||
[RFC9251] extends the intra-subnet Ethernet multicast procedures by | [RFC9251] extends the intra-subnet Ethernet multicast procedures by | |||
adding procedures that are specific to, and optimized for, the use of | adding procedures that are specific to, and optimized for, the use of | |||
IP multicast within a subnet. However, that document does not cover | IP multicast within a subnet. However, that document does not cover | |||
inter-subnet IP multicast. | inter-subnet IP multicast. | |||
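As a rough illustration of the three BUM traffic classes summarized earlier in this section, the sketch below classifies a frame by its destination MAC address. The function names and the `known_unicast_macs` set (standing in for the MAC addresses a PE has learned for the BD) are hypothetical; the test on the low-order bit of the first octet is the standard IEEE 802 group-address check.

```python
from typing import Set

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def classify_frame(dst_mac: str, known_unicast_macs: Set[str]) -> str:
    """Return the traffic class of a frame based on its destination MAC."""
    dst = dst_mac.lower()
    if dst == BROADCAST_MAC:
        return "broadcast"
    first_octet = int(dst.split(":")[0], 16)
    if first_octet & 0x01:
        # Group (I/G) bit set: an Ethernet multicast MAC address.
        return "multicast"
    if dst not in known_unicast_macs:
        return "unknown-unicast"
    return "known-unicast"


def is_bum(dst_mac: str, known_unicast_macs: Set[str]) -> bool:
    """True if the frame would be handled by the BUM procedures."""
    return classify_frame(dst_mac, known_unicast_macs) != "known-unicast"
```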
The purpose of this document is to specify procedures for EVPN that | The purpose of this document is to specify procedures for EVPN that | |||
provide optimized IP multicast functionality within an EVPN tenant | provide optimized IP multicast functionality within an EVPN Tenant | |||
domain. This document also specifies procedures that allow IP | Domain. This document also specifies procedures that allow IP | |||
multicast packets to be sourced from or destined to systems outside | multicast packets to be sourced from or destined to systems outside | |||
the Tenant Domain. We refer to the entire set of these procedures as | the Tenant Domain. The entire set of procedures is referred to as | |||
"OISM" (Optimized Inter-Subnet Multicast) procedures. | "Optimized Inter-Subnet Multicast (OISM)" procedures. | |||
In order to support the OISM procedures specified in this document, | In order to support the OISM procedures specified in this document, | |||
an EVPN-PE MUST also support [RFC9135] and [RFC9251]. (However, | an EVPN PE MUST also support [RFC9135] and [RFC9251]. (However, | |||
certain procedures in [RFC9251] are modified when OISM is supported.) | certain procedures in [RFC9251] are modified when OISM is supported.) | |||
1.2.4. BDs, MAC-VRFS, and EVPN Service Models | 1.2.4. BDs, MAC-VRFs, and EVPN Service Models | |||
[RFC7432] defines the notion of "MAC-VRF". A MAC-VRF contains one or | [RFC7432] defines the notion of MAC-VRF (MAC Virtual Routing and | |||
more "Bridge Tables" (see section 3 of [RFC7432] for a discussion of | Forwarding). A MAC-VRF contains one or more bridge tables (see | |||
this terminology), each of which represents a single Broadcast | Section 3 of [RFC7432]), each of which represents a single Broadcast | |||
Domain. | Domain. | |||
In the IRB model (outlined in Appendix A), an L3 routing instance has | In the IRB model (outlined in Appendix A), an L3 routing instance has | |||
one IRB interface per BD, NOT one per MAC-VRF. This document does | one IRB interface per BD, NOT one per MAC-VRF. This document does | |||
not distinguish between a "Broadcast Domain" and a "Bridge Table", | not distinguish between a Broadcast Domain and a bridge table; | |||
and will use the terms interchangeably (or will use the acronym "BD" | instead, it uses the terms interchangeably (or will use the acronym | |||
to refer to either). The way the BDs are grouped into MAC-VRFs is | "BD" to refer to either). The way the BDs are grouped into MAC-VRFs | |||
not relevant to the procedures specified in this document. | is not relevant to the procedures specified in this document. | |||
Section 6 of [RFC7432] also defines several different EVPN service | Section 6 of [RFC7432] also defines several different EVPN service | |||
models: | models: | |||
* In the "vlan-based service", each MAC-VRF contains one "bridge | * In the vlan-based service, each MAC-VRF contains one bridge table, | |||
table", where the bridge table corresponds to a particular Virtual | where the bridge table corresponds to a particular Virtual LAN | |||
LAN (VLAN). (See section 3 of [RFC7432] for a discussion of this | (VLAN) (see Section 3 of [RFC7432]). Thus, each VLAN is treated | |||
terminology.) Thus, each VLAN is treated as a BD. | as a BD. | |||
* In the "vlan bundle service", each MAC-VRF contains one bridge | * In the vlan bundle service, each MAC-VRF contains one bridge | |||
table, where the bridge table corresponds to a set of VLANs. Thus | table, where the bridge table corresponds to a set of VLANs. | |||
a set of VLANs are treated as constituting a single BD. | Thus, a set of VLANs are treated as constituting a single BD. | |||
* In the "vlan-aware bundle service", each MAC-VRF may contain | * In the vlan-aware bundle service, each MAC-VRF may contain | |||
multiple bridge tables, where each bridge table corresponds to one | multiple bridge tables, where each bridge table corresponds to one | |||
BD. If a MAC-VRF contains several bridge tables, then it | BD. If a MAC-VRF contains several bridge tables, then it | |||
corresponds to several BDs. | corresponds to several BDs. | |||
The procedures in this document are intended to work for all these | The procedures in this document are intended to work for all these | |||
service models. | service models. | |||
1.3. Need for EVPN-aware Multicast Procedures | 1.3. Need for EVPN-Aware Multicast Procedures | |||
Inter-subnet IP multicast among a set of BDs can be achieved, in a | Inter-subnet IP multicast among a set of BDs can be achieved, in a | |||
non-optimal manner, without any specific EVPN procedures. For | non-optimal manner, without any specific EVPN procedures. For | |||
instance, if a particular tenant has n BDs among which he wants to | instance, if a particular tenant has n BDs among which it wants to | |||
send IP multicast traffic, he can simply attach a conventional | send IP multicast traffic, it can simply attach a conventional | |||
multicast router to all n BDs. Or more generally, as long as each BD | multicast router to all n BDs. Or more generally, as long as each BD | |||
has at least one IP multicast router, and the IP multicast routers | has at least one IP multicast router, and the IP multicast routers | |||
communicate multicast control information with each other, | communicate multicast control information with each other, | |||
conventional IP multicast procedures will work normally, and no | conventional IP multicast procedures will work normally, and no | |||
special EVPN functionality is needed. | special EVPN functionality is needed. | |||
However, that technique does not provide optimal routing for | However, that technique does not provide optimal routing for | |||
multicast. In conventional multicast routing, for a given multicast | multicast. In conventional multicast routing, for a given multicast | |||
flow, there is only one multicast router on each BD that is permitted | flow, there is only one multicast router on each BD that is permitted | |||
to send traffic of that flow to the BD. If that BD has receivers for | to send traffic of that flow to the BD. If that BD has receivers for | |||
a given flow, but the source of the flow is not on that BD, then the | a given flow, but the source of the flow is not on that BD, then the | |||
flow must pass through that multicast router. This leads to the | flow must pass through that multicast router. This leads to the | |||
"hair-pinning" problem described (for unicast) in Appendix A. | hairpinning problem described (for unicast) in Appendix A. | |||
For example, consider an (S,G) flow that is sourced by a TS S and | For example, consider an (S,G) flow that is sourced by a TS S and | |||
needs to be received by TSes R1 and R2. Suppose S is on a segment of | needs to be received by TSs R1 and R2. Suppose S is on a segment of | |||
BD1, R1 is on a segment of BD2, but both are attached to PE1. | BD1, R1 is on a segment of BD2, but both are attached to PE1. Also | |||
Suppose also that the tenant has a multicast router, attached to a | suppose that the tenant has a multicast router attached to a segment | |||
segment of BD1 and to a segment of BD2. However, the segments to | of BD1 and to a segment of BD2. However, the segments to which that | |||
which that router is attached are both attached to PE2. Then the | router is attached are both attached to PE2. Then, the flow from S | |||
flow from S to R1 would have to follow the path: | to R1 would have to follow the path: S-->PE1-->PE2-->tenant multicast | |||
S-->PE1-->PE2-->Tenant Multicast Router-->PE2-->PE1-->R1. Obviously, | router-->PE2-->PE1-->R1. Obviously, the path S-->PE1-->R1 would be | |||
the path S-->PE1-->R1 would be preferred. | preferred. | |||
+---+                      +---+ | +---+                      +---+ | |||
|PE1+----------------------+PE2| | |PE1+----------------------+PE2| | |||
+---+-+                  +-+---+ | +---+-+                  +-+---+ | |||
 |  \  \                /  /  |  |  |  \  \                /  /  |  | |||
BD1 BD2 BD3          BD3 BD2 BD1 | BD1 BD2 BD3          BD3 BD2 BD1 | |||
 |   |   |             \  |   |  |  |   |   |             \  |   |  | |||
 S   R1  R2              router  |  S   R1  R2              router  | |||
Now suppose that there is a second receiver, R2. R2 is attached to a | Now suppose that there is a second receiver, R2. R2 is attached to a | |||
third BD, BD3. However, it is attached to a segment of BD3 that is | third BD, BD3. However, it is attached to a segment of BD3 that is | |||
attached to PE1. And suppose also that the Tenant Multicast Router | attached to PE1. And suppose that the tenant multicast router is | |||
is attached to a segment of BD3 that attaches to PE2. In this case, | attached to a segment of BD3 that attaches to PE2. In this case, the | |||
the Tenant Multicast Router will make two copies of the packet, one | tenant multicast router will make two copies of the packet, one for | |||
for BD2 and one for BD3. PE2 will send both copies back to PE1. Not | BD2 and one for BD3. PE2 will send both copies back to PE1. Not | |||
only is the routing sub-optimal, but also PE2 sends multiple copies | only is the routing sub-optimal, but PE2 also sends multiple copies | |||
of the same packet to PE1. This is a further sub-optimality. | of the same packet to PE1, which is a further sub-optimality. | |||
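The cost of the hairpinning in this example can be made concrete by counting how many copies of a packet cross the backbone. The sketch below is a simplified model under stated assumptions: both function names are hypothetical, a single tenant multicast router does all inter-BD routing in the conventional case, and the EVPN-aware case is idealized as at most one copy per remote PE that has receivers.

```python
from typing import Dict, Set


def crossings_via_tenant_router(source_pe: str, router_pe: str,
                                receiver_pes_by_bd: Dict[str, Set[str]]) -> int:
    """Backbone crossings when one tenant router routes between BDs."""
    crossings = 0
    if source_pe != router_pe:
        crossings += 1  # the flow must first reach the router's PE
    for pes in receiver_pes_by_bd.values():
        # The router emits one copy per receiver BD; each copy is carried
        # to every other PE that has receivers on that BD.
        crossings += sum(1 for pe in pes if pe != router_pe)
    return crossings


def crossings_evpn_aware(source_pe: str,
                         receiver_pes_by_bd: Dict[str, Set[str]]) -> int:
    """Idealized count: one copy per remote PE with interested receivers."""
    remote_pes = {pe for pes in receiver_pes_by_bd.values()
                  for pe in pes if pe != source_pe}
    return len(remote_pes)


# The scenario in the text: S, R1, and R2 on PE1; the router on PE2.
receivers = {"BD2": {"PE1"}, "BD3": {"PE1"}}
print(crossings_via_tenant_router("PE1", "PE2", receivers))  # 3
print(crossings_evpn_aware("PE1", receivers))                # 0
```

In the scenario of the text, the conventional approach crosses the backbone three times (once toward PE2 and twice back), whereas local routing at PE1 needs no backbone crossing at all.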
This is only an example; many more examples of sub-optimal multicast | This is only an example; many more examples of sub-optimal multicast | |||
routing can easily be given. To eliminate sub-optimal routing and | routing can easily be given. To eliminate sub-optimal routing and | |||
extra copies, it is necessary to have a multicast solution that is | extra copies, it is necessary to have a multicast solution that is | |||
EVPN-aware, and that can use its knowledge of the internal structure | EVPN-aware and that can use its knowledge of the internal structure | |||
of a Tenant Domain to ensure that multicast traffic gets routed | of a Tenant Domain to ensure that multicast traffic gets routed | |||
optimally. The procedures in this document allow us to avoid all | optimally. The procedures in this document allow us to avoid all | |||
such sub-optimalities when routing inter-subnet multicast traffic | such sub-optimalities when routing inter-subnet multicast traffic | |||
within a Tenant Domain. | within a Tenant Domain. | |||
1.4. Additional Requirements That Must be Met by the Solution | 1.4. Additional Requirements That Must Be Met by the Solution | |||
In addition to providing optimal routing of multicast flows within a | In addition to providing optimal routing of multicast flows within a | |||
Tenant Domain, the EVPN-aware multicast solution is intended to | Tenant Domain, the EVPN-aware multicast solution is intended to | |||
satisfy the following requirements: | satisfy the following requirements: | |||
* The solution must integrate well with the procedures specified in | * The solution must integrate well with the procedures specified in | |||
[RFC9251]. That is, an integrated set of procedures must handle | [RFC9251]. That is, an integrated set of procedures must handle | |||
both intra-subnet multicast and inter-subnet multicast. | both intra-subnet multicast and inter-subnet multicast. | |||
* With regard to intra-subnet multicast, the solution MUST maintain | * With regard to intra-subnet multicast, the solution MUST maintain | |||
the integrity of multicast Ethernet service. This means: | the integrity of the multicast Ethernet service. This means: | |||
- If a source and a receiver are on the same subnet, the MAC | - If a source and a receiver are on the same subnet, the MAC | |||
source address (SA) of the multicast frame sent by the source | Source Address (SA) of the multicast frame sent by the source | |||
will not get rewritten. | will not get rewritten. | |||
- If a source and a receiver are on the same subnet, no IP | - If a source and a receiver are on the same subnet, no IP | |||
processing of the Ethernet payload is done. The IP TTL is not | processing of the Ethernet payload is done. The IP TTL is not | |||
decremented, the IPv4 header checksum is not changed, no | decremented, the IPv4 header checksum is not changed, no | |||
fragmentation is done, etc. | fragmentation is done, etc. | |||
* On the other hand, if a source and a receiver are on different | * On the other hand, if a source and a receiver are on different | |||
subnets, the frame received by the receiver will not have the MAC | subnets, the frame received by the receiver will not have the MAC | |||
Source address of the source, as the frame will appear to have | Source Address of the source, as the frame will appear to have | |||
come from a multicast router. Also, proper processing of the IP | come from a multicast router. Also, proper processing of the IP | |||
header is done, e.g., TTL decrement by 1, header checksum | header is done, e.g., TTL decrement by 1, header checksum | |||
modification, possible fragmentation, etc. | modification, possible fragmentation, etc. | |||
* If a Tenant Domain contains several BDs, it MUST be possible for a | * If a Tenant Domain contains several BDs, it MUST be possible for a | |||
multicast flow (even when the multicast group address is an "any | multicast flow (even when the multicast group address is an ASM | |||
source multicast" (ASM) address), to have sources in one of those | address) to have sources in one of those BDs and receivers in one | |||
BDs and receivers in one or more of the other BDs, without | or more of the other BDs without requiring the presence of any | |||
requiring the presence of any system performing PIM Rendezvous | system performing PIM RP functions [RFC7761]. | |||
Point (RP) functions [RFC7761]. | ||||
* Sometimes a MAC address used by one TS on a particular BD is also | * Sometimes a MAC address used by one TS on a particular BD is also | |||
used by another TS on a different BD. Inter-subnet routing of | used by another TS on a different BD. Inter-subnet routing of | |||
multicast traffic MUST NOT make any assumptions about the | multicast traffic MUST NOT make any assumptions about the | |||
uniqueness of a MAC address across several BDs. | uniqueness of a MAC address across several BDs. | |||
* If two EVPN-PEs attached to the same Tenant Domain both support | * If two EVPN PEs attached to the same Tenant Domain both support | |||
the OISM procedures, each may receive inter-subnet multicasts from | the OISM procedures, each may receive inter-subnet multicasts from | |||
the other, even if the egress PE is not attached to any segment of | the other, even if the egress PE is not attached to any segment of | |||
the BD from which the multicast packets are being sourced. It | the BD from which the multicast packets are being sourced. It | |||
MUST NOT be necessary to provision the egress PE with knowledge of | MUST NOT be necessary to provision the egress PE with knowledge of | |||
the ingress BD. | the ingress BD. | |||
* There must be a procedure that allows EVPN-PE routers supporting | * There must be a procedure that allows EVPN PE routers supporting | |||
OISM procedures to send/receive multicast traffic to/from EVPN-PE | OISM procedures to send/receive multicast traffic to/from EVPN PE | |||
routers that support only [RFC7432], but that do not support the | routers that support only [RFC7432] but that do not support the | |||
OISM procedures or even the procedures of [RFC9135]. However, | OISM procedures or even the procedures of [RFC9135]. However, | |||
when interworking with such routers (which we call "non-OISM PE | when interworking with such routers (which we call "non-OISM PE | |||
routers"), optimal routing may not be achievable. | routers"), optimal routing may not be achievable. | |||
* It MUST be possible to support scenarios in which multicast flows | * It MUST be possible to support scenarios in which multicast flows | |||
with sources inside a Tenant Domain have "external" receivers, | with sources inside a Tenant Domain have external receivers, i.e., | |||
i.e., receivers that are outside the domain. It must also be | receivers that are outside the domain. It must also be possible | |||
possible to support scenarios where multicast flows with external | to support scenarios where multicast flows with external sources | |||
sources (sources outside the Tenant Domain) have receivers inside | (sources outside the Tenant Domain) have receivers inside the | |||
the domain. | domain. | |||
This presupposes that unicast routes to multicast sources outside | This presupposes that unicast routes to multicast sources outside | |||
the domain can be distributed to EVPN-PEs attached to the domain, | the domain can be distributed to EVPN PEs attached to the domain | |||
and that unicast routes to multicast sources within the domain can | and that unicast routes to multicast sources within the domain can | |||
be distributed outside the domain. | be distributed outside the domain. | |||
Of particular importance is the scenario in which the external | Of particular importance are the scenarios in which the external | |||
sources and/or receivers are reachable via L3VPN/MVPN, and the | sources and/or receivers are reachable via L3VPN/MVPN or via IP/ | |||
scenario in which external sources and/or receivers are reachable | PIM. | |||
via IP/PIM. | ||||
The solution for external interworking MUST allow for deployment | The solution for external interworking MUST allow for deployment | |||
scenarios in which EVPN does not need to export a host route for | scenarios in which EVPN does not need to export a host route for | |||
every multicast source. | every multicast source. | |||
* The solution for external interworking must not presuppose that | * The solution for external interworking must not presuppose that | |||
the same tunneling technology is used within both the EVPN domain | the same tunneling technology is used within both the EVPN domain | |||
and the external domain. For example, MVPN interworking must be | and the external domain. For example, MVPN interworking must be | |||
possible when MVPN is using MPLS P2MP tunneling, and EVPN is using | possible when MVPN is using MPLS Point-to-Multipoint (P2MP) | |||
Ingress Replication or VXLAN tunneling. | tunneling and when EVPN is using Ingress Replication (IR) or | |||
Virtual eXtensible Local Area Network (VXLAN) tunneling. | ||||
* The solution must not be overly dependent on the details of a | * The solution must not be overly dependent on the details of a | |||
small set of use cases, but must be adaptable to new use cases as | small set of use cases but must be adaptable to new use cases as | |||
they arise. (That is, the solution must be robust.) | they arise. (That is, the solution must be robust.) | |||
1.5. Model of Operation: Overview | 1.5. Model of Operation: Overview | |||
1.5.1. Control Plane | 1.5.1. Control Plane | |||
In this section, and in the remainder of this document, we assume the | In this section, and in the remainder of this document, we assume the | |||
reader is familiar with the procedures of IGMP/MLD (see [RFC3376] and | reader is familiar with the procedures of IGMP / Multicast Listener | |||
[RFC3810]), by which hosts announce their interest in receiving | Discovery (MLD) (see [RFC3376] and [RFC3810]), by which hosts | |||
particular multicast flows. | announce their interest in receiving particular multicast flows. | |||
Consider a Tenant Domain consisting of a set of k BDs: BD1, ..., BDk. | Consider a Tenant Domain consisting of a set of k BDs: BD1, ..., BDk. | |||
To support the OISM procedures, each Tenant Domain must also be | To support the OISM procedures, each Tenant Domain must also be | |||
associated with a "Supplementary Broadcast Domain" (SBD). An SBD is | associated with a Supplementary Broadcast Domain (SBD). An SBD is | |||
treated in the control plane as a real BD, but it does not have any | treated in the control plane as a real BD, but it does not have any | |||
ACs. The SBD has several uses; these will be described later in this | ACs. The SBD has several uses; these will be described later in this | |||
document (see Section 2.1 and Section 3). | document (see Sections 2.1 and 3). | |||
Each PE that attaches to one or more of the BDs in a given tenant | Each PE that attaches to one or more of the BDs in a given Tenant | |||
domain will be provisioned to recognize that those BDs are part of | Domain will be provisioned to recognize that those BDs are part of | |||
the same Tenant Domain. Note that a given PE does not need to be | the same Tenant Domain. Note that a given PE does not need to be | |||
configured with all the BDs of a given Tenant Domain. In general, a | configured with all the BDs of a given Tenant Domain. In general, a | |||
PE will only be attached to a subset of the BDs in a given Tenant | PE will only be attached to a subset of the BDs in a given Tenant | |||
Domain, and will be configured only with that subset of BDs. | Domain and will be configured only with that subset of BDs. However, | |||
However, each PE attached to a given Tenant Domain must be configured | each PE attached to a given Tenant Domain must be configured with the | |||
with the SBD for that Tenant Domain. | SBD for that Tenant Domain. | |||
Suppose a particular segment of a particular BD is attached to PE1. | Suppose a particular segment of a particular BD is attached to PE1. | |||
[RFC7432] specifies that PE1 must originate an Inclusive Multicast | [RFC7432] specifies that PE1 must originate an Inclusive Multicast | |||
Ethernet Tag (IMET) route for that BD, and that the IMET route must | Ethernet Tag (IMET) route for that BD and that the IMET route must be | |||
be propagated to all other PEs attached to the same BD. If the given | propagated to all other PEs attached to the same BD. If the given | |||
segment contains a host that has interest in receiving a particular | segment contains a host that has interest in receiving a particular | |||
multicast flow, either an (S,G) flow or a (*,G) flow, PE1 will learn | multicast flow, either an (S,G) flow or a (*,G) flow, PE1 will learn | |||
of that interest by participating in the IGMP/MLD snooping | of that interest by participating in the IGMP/MLD snooping | |||
procedures, as specified in [RFC4541]. In this case: | procedures, as specified in [RFC4541]. In this case: | |||
* PE1 is interested in receiving the flow; | * PE1 is interested in receiving the flow; | |||
* The AC attaching the interested host to PE1 is also said to be | * the AC attaching the interested host to PE1 is also said to be | |||
interested in the flow; | interested in the flow; and | |||
* The BD containing an AC that is interested in a particular flow is | * the BD containing an AC that is interested in a particular flow is | |||
also said to be interested in that flow. | also said to be interested in that flow. | |||
Once PE1 determines that it has an AC that is interested in receiving | Once PE1 determines that it has an AC that is interested in receiving | |||
a particular flow or set of flows, it originates one or more | a particular flow or set of flows, it originates one or more | |||
Selective Multicast Ethernet Tag (SMET) route(s) [RFC9251] to | Selective Multicast Ethernet Tag (SMET) routes [RFC9251] to advertise | |||
advertise that interest. | that interest. | |||
Note that each IMET or SMET route is "for" a particular BD. The | Note that each IMET or SMET route is for a particular BD. The notion | |||
notion of a route being "for" a particular BD is explained in | of a route being for a particular BD is explained in Section 2.2. | |||
Section 2.2. | ||||
When OISM is being supported, the procedures of [RFC9251], are | When OISM is being supported, the procedures of [RFC9251] are | |||
modified as follows: | modified as follows: | |||
* The IMET route originated by a particular PE for a particular BD | * The IMET route originated by a particular PE for a particular BD | |||
is distributed to all other PEs attached to the Tenant Domain | is distributed to all other PEs attached to the Tenant Domain | |||
containing that BD, even to those PEs that are not attached to | containing that BD, even to those PEs that are not attached to | |||
that particular BD. | that particular BD. | |||
* The SMET routes originated by a particular PE are originated on a | * The SMET routes originated by a particular PE are originated on a | |||
per-Tenant-Domain basis, rather than on a per-BD basis. That is, | per-Tenant-Domain basis rather than a per-BD basis. That is, the | |||
the SMET routes are considered to be for the Tenant Domain's SBD, | SMET routes are considered to be for the Tenant Domain's SBD | |||
rather than for any of its ordinary BDs. These SMET routes are | rather than any of its ordinary BDs. These SMET routes are | |||
distributed to all the PEs attached to the Tenant Domain. | distributed to all the PEs attached to the Tenant Domain. | |||
In this way, each PE attached to a given Tenant Domain learns, | In this way, each PE attached to a given Tenant Domain learns, | |||
from each other PE attached to the same Tenant Domain, the set of | from the other PEs attached to the same Tenant Domain, the set of | |||
flows that are of interest to each of those other PEs. | flows that are of interest to each of those other PEs. | |||
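The per-Tenant-Domain origination of SMET routes can be sketched as below. This is only an illustration: `smet_routes_for_tenant_domain` is a hypothetical helper, flows are modeled as simple (source, group) tuples, and a route is modeled as a (BD, flow) pair rather than as a real BGP route.

```python
from typing import Dict, Set, Tuple

Flow = Tuple[str, str]  # (source, group); "*" as source for a (*,G) flow


def smet_routes_for_tenant_domain(interest_by_bd: Dict[str, Set[Flow]],
                                  sbd: str) -> Set[Tuple[str, Flow]]:
    """Return the SMET routes a PE would originate under OISM.

    interest_by_bd maps each locally attached BD to the flows for which
    some AC of that BD has shown interest. Interest is merged across BDs,
    and every resulting route is for the SBD, not for an ordinary BD.
    """
    flows: Set[Flow] = set().union(*interest_by_bd.values())
    return {(sbd, flow) for flow in flows}
```

With interest in (S1,G1) on both BD1 and BD2, the PE in this model originates a single route for the SBD instead of one route per BD.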
An OISM PE that is provisioned with several BDs in the same Tenant | An OISM PE that is provisioned with several BDs in the same Tenant | |||
Domain MUST originate an IMET route for each such BD. To indicate | Domain MUST originate an IMET route for each such BD. To indicate | |||
its support of [RFC9251], it SHOULD attach the EVPN Multicast Flags | its support of [RFC9251], it SHOULD attach the EVPN Multicast Flags | |||
Extended Community to each such IMET route, but it MUST attach the EC | Extended Community to each such IMET route, but it MUST attach the EC | |||
to at least one such IMET route. | to at least one such IMET route. | |||
Suppose PE1 is provisioned with both BD1 and BD2, and is provisioned | Suppose PE1 is provisioned with both BD1 and BD2 and considers them | |||
to consider them to be part of the same Tenant Domain. It is | to be part of the same Tenant Domain. It is possible that PE1 will | |||
possible that PE1 will receive from PE2 both an IMET route for BD1 | receive both an IMET route for BD1 and an IMET route for BD2 from | |||
and an IMET route for BD2. If either of these IMET routes has the | PE2. If either of these IMET routes has the EVPN Multicast Flags | |||
EVPN Multicast Flags Extended Community, PE1 MUST assume that PE2 is | Extended Community, PE1 MUST assume that PE2 is supporting the | |||
supporting the procedures of [RFC9251] for ALL BDs in the Tenant | procedures of [RFC9251] for ALL BDs in the Tenant Domain. | |||
Domain. | ||||
If a PE supports OISM functionality, it indicates that by setting the | If a PE supports OISM functionality, it indicates that, by setting | |||
"OISM-supported" flag in the Multicast Flags Extended Community that | the OISM-supported flag in the Multicast Flags Extended Community, it | |||
it attaches to some or all of its IMET routes. An OISM PE SHOULD | attaches to some or all of its IMET routes. An OISM PE SHOULD attach | |||
attach this EC with the OISM-supported flag set to all the IMET | this EC with the OISM-supported flag set to all the IMET routes it | |||
routes it originates. However, if PE1 imports IMET routes from PE2, | originates. However, if PE1 imports IMET routes from PE2, and at | |||
and at least one of PE2's IMET routes indicates that PE2 is an OISM | least one of PE2's IMET routes indicates that PE2 is an OISM PE, PE1 | |||
PE, PE1 MUST assume that PE2 is following OISM procedures. | MUST assume that PE2 is following OISM procedures. | |||
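The two "at least one IMET route" rules above can be sketched as follows. This is an illustrative model only: the `ImetRoute` record, its field names, and `peer_capabilities` are hypothetical and are not defined by this document, and the route list is assumed to contain only IMET routes imported for a single Tenant Domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImetRoute:
    """Hypothetical, simplified view of an imported IMET route."""
    originator: str                   # originating PE, e.g. "PE2"
    bd: str                           # BD the route is for
    has_mcast_flags_ec: bool = False  # EVPN Multicast Flags EC is attached
    oism_supported: bool = False      # OISM-supported flag is set in that EC


def peer_capabilities(routes, peer):
    """Return (supports_rfc9251, supports_oism) for `peer`.

    `routes` is assumed to hold the IMET routes of one Tenant Domain.
    One route carrying the EC (or the flag) is enough: the result then
    applies to ALL of the peer's BDs in that Tenant Domain.
    """
    from_peer = [r for r in routes if r.originator == peer]
    supports_rfc9251 = any(r.has_mcast_flags_ec for r in from_peer)
    supports_oism = any(
        r.has_mcast_flags_ec and r.oism_supported for r in from_peer
    )
    return supports_rfc9251, supports_oism
```

In this sketch, a peer that attaches the EC to only one of its IMET routes is still reported as supporting the procedures for every BD, which mirrors the MUST-assume wording above.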
1.5.2. Data Plane | 1.5.2. Data Plane | |||
Suppose PE1 has an AC to a segment in BD1, and PE1 receives from that | Suppose PE1 has an AC to a segment in BD1 and PE1 receives an (S,G) | |||
AC an (S,G) multicast frame (as defined in Section 1.1). | multicast frame from that AC (as defined in Section 1.1). | |||
There may be other ACs of PE1 on which TSes have indicated an | There may be other ACs of PE1 on which TSs have indicated an interest | |||
interest (via IGMP/MLD) in receiving (S,G) multicast packets. PE1 is | (via IGMP/MLD) in receiving (S,G) multicast packets. PE1 is | |||
responsible for sending the received multicast packet on those ACs. | responsible for sending the received multicast packet on those ACs. | |||
There are two cases to consider: | There are two cases to consider: | |||
* Intra-Subnet Forwarding: In this case, an attachment AC with | * Intra-Subnet Forwarding: In this case, an AC with interest in | |||
interest in (S,G) is connected to a segment that is part of the | (S,G) is connected to a segment that is part of the source BD, | |||
source BD, BD1. If the segment is not multi-homed, or if PE1 is | BD1. If the segment is not multihomed, or if PE1 is the | |||
the Designated Forwarder (DF) (see [RFC7432]) for that segment, | Designated Forwarder (DF) (see [RFC7432]) for that segment, PE1 | |||
PE1 sends the multicast frame on that AC without changing the MAC | sends the multicast frame on that AC without changing the MAC SA. | |||
SA. The IP header is not modified at all; in particular, the TTL | The IP header is not modified at all; in particular, the TTL is | |||
is not decremented. | not decremented. | |||
* Inter-Subnet Forwarding: An AC with interest in (S,G) is connected | * Inter-Subnet Forwarding: An AC with interest in (S,G) is connected | |||
to a segment of BD2, where BD2 is different than BD1. If PE1 is | to a segment of BD2, where BD2 is different than BD1. If PE1 is | |||
the DF for that segment (or if the segment is not multi-homed), | the DF for that segment (or if the segment is not multihomed), PE1 | |||
PE1 decapsulates the IP multicast packet, performs any necessary | decapsulates the IP multicast packet, performs any necessary IP | |||
IP processing (including TTL decrement), then re-encapsulates the | processing (including TTL decrement), and then re-encapsulates the | |||
packet appropriately for BD2. PE1 then sends the packet on the | packet appropriately for BD2. PE1 then sends the packet on the | |||
AC. Note that after re-encapsulation, the MAC SA will be PE1's | AC. Note that after re-encapsulation, the MAC SA will be PE1's | |||
MAC address on BD2. The IP TTL will have been decremented by 1. | MAC address on BD2. The IP TTL will have been decremented by 1. | |||
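The two local forwarding cases above can be sketched as a single decision function. This is a minimal sketch, not an implementation: the `Frame` record, the `pe_mac_on` mapping (PE1's MAC address per BD), and the function name are assumptions made for illustration, and IP processing other than the TTL decrement is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    mac_sa: str  # Ethernet source MAC address
    ttl: int     # TTL of the IP packet carried in the frame
    bd: str      # BD whose encapsulation the frame currently uses


def forward_on_local_ac(frame, source_bd, ac_bd, pe_mac_on,
                        multihomed=False, is_df=True):
    """Return the frame to send on an interested AC, or None.

    `pe_mac_on` maps a BD name to this PE's MAC address on that BD
    (hypothetical structure).
    """
    # On a multihomed segment, only the Designated Forwarder sends.
    if multihomed and not is_df:
        return None
    if ac_bd == source_bd:
        # Intra-subnet: MAC SA and IP header (including TTL) unchanged.
        return frame
    # Inter-subnet: decrement the TTL and re-encapsulate for the AC's BD,
    # using the PE's own MAC address on that BD as the MAC SA.
    return replace(frame, mac_sa=pe_mac_on[ac_bd], ttl=frame.ttl - 1, bd=ac_bd)
```

For a frame received on BD1 with TTL 64, an interested AC on BD1 gets the frame unchanged, while an interested AC on BD2 gets a copy with TTL 63 and PE1's BD2 MAC address as the MAC SA.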
In addition, there may be other PEs that are interested in (S,G) | In addition, there may be other PEs that are interested in (S,G) | |||
traffic. Suppose PE2 is such a PE. Then PE1 tunnels a copy of the | traffic. Suppose PE2 is such a PE. Then, PE1 tunnels a copy of the | |||
IP multicast frame (with its original MAC SA, and with no alteration | IP multicast frame (with its original MAC SA and with no alteration | |||
of the payload's IP header) to PE2. The tunnel encapsulation | of the payload's IP header) to PE2. The tunnel encapsulation | |||
contains information that PE2 can use to associate the frame with an | contains information that PE2 can use to associate the frame with an | |||
"apparent source BD". If the actual source BD of the frame is BD1, | apparent source BD. If the actual source BD of the frame is BD1, | |||
then: | then: | |||
* If PE2 is attached to BD1, the tunnel encapsulation used to send | * If PE2 is attached to BD1, the tunnel encapsulation used to send | |||
the frame to PE2 will cause PE2 to identify BD1 as the apparent | the frame to PE2 will cause PE2 to identify BD1 as the apparent | |||
source BD. | source BD. | |||
* If PE2 is not attached to BD1, the tunnel encapsulation used to | * If PE2 is not attached to BD1, the tunnel encapsulation used to | |||
send the frame to PE2 will cause PE2 to identify the SBD as the | send the frame to PE2 will cause PE2 to identify the SBD as the | |||
apparent source BD. | apparent source BD. | |||
Note that the tunnel encapsulation used for a particular BD will have | Note that the tunnel encapsulation used for a particular BD will have | |||
been advertised in an IMET route or S-PMSI route | been advertised in an IMET route or a Selective Provider Multicast | |||
[I-D.ietf-bess-evpn-bum-procedure-updates] for that BD. That route | Service Interface (S-PMSI) route [RFC9572] for that BD. That route | |||
carries a PMSI Tunnel attribute, which specifies how packets | carries a PMSI Tunnel Attribute (PTA), which specifies how packets | |||
originating from that BD are encapsulated. This information enables | originating from that BD are encapsulated. This information enables | |||
the PE receiving a tunneled packet to identify the apparent source BD | the PE receiving a tunneled packet to identify the apparent source BD | |||
as stated above. See Section 3.2 for more details. | as stated above. See Section 3.2 for more details. | |||
When PE2 receives the tunneled frame, it will forward it on any of | When PE2 receives the tunneled frame, it will forward it on any of | |||
its ACs that have interest in (S,G). | its ACs that have interest in (S,G). | |||
If PE2 determines from the tunnel encapsulation that the apparent | If PE2 determines from the tunnel encapsulation that the apparent | |||
source BD is BD1, then | source BD is BD1, then: | |||
* For those ACs that connect PE2 to BD1, the intra-subnet forwarding | * For those ACs that connect PE2 to BD1, the intra-subnet forwarding | |||
procedure described above is used, except that it is now PE2, not | procedure described above is used, except that it is now PE2, not | |||
PE1, carrying out that procedure. Unmodified EVPN procedures from | PE1, carrying out that procedure. Unmodified EVPN procedures from | |||
[RFC7432] are used to ensure that a packet originating from a | [RFC7432] are used to ensure that a packet originating from a | |||
multi-homed segment is never sent back to that segment. | multihomed segment is never sent back to that segment. | |||
* For those ACs that do not connect to BD1, the inter-subnet | * For those ACs that do not connect to BD1, the inter-subnet | |||
forwarding procedure described above is used, except that it is | forwarding procedure described above is used, except that it is | |||
now PE2, not PE1, carrying out that procedure. | now PE2, not PE1, carrying out that procedure. | |||
If the tunnel encapsulation identifies the apparent source BD as the | If the tunnel encapsulation identifies the apparent source BD as the | |||
SBD, PE2 applies the inter-subnet forwarding procedures described | SBD, PE2 applies the inter-subnet forwarding procedures described | |||
above to all of its ACs that have interest in the flow. | above to all of its ACs that have interest in the flow. | |||
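The egress behavior described above can be summarized in a short sketch. The function names and the use of the string "SBD" as a BD identifier are illustrative assumptions. In practice, the apparent source BD is derived from the tunnel encapsulation rather than computed from the actual source BD as it is here.

```python
SBD = "SBD"  # hypothetical identifier for the Supplementary Broadcast Domain


def apparent_source_bd(actual_source_bd, egress_pe_bds):
    """BD that the egress PE associates with a tunneled frame.

    The egress PE sees the actual source BD only if it is attached to
    that BD; otherwise, it sees the SBD.
    """
    return actual_source_bd if actual_source_bd in egress_pe_bds else SBD


def egress_actions(apparent_bd, interested_ac_bds):
    """Map the BD of each interested AC to the procedure applied to it.

    No AC belongs to the SBD, so when the apparent source BD is the SBD,
    every interested AC gets inter-subnet forwarding.
    """
    return {
        ac_bd: "intra-subnet" if ac_bd == apparent_bd else "inter-subnet"
        for ac_bd in interested_ac_bds
    }
```

For example, if PE2 is attached to BD2 and BD3 only, a frame whose actual source BD is BD1 has the SBD as its apparent source BD, and both of PE2's interested ACs receive it through the inter-subnet procedure.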
These procedures ensure that an IP multicast frame travels from its | These procedures ensure that an IP multicast frame travels from its | |||
ingress PE to all egress PEs that are interested in receiving it. | ingress PE to all egress PEs that are interested in receiving it. | |||
While in transit, the frame retains its original MAC SA, and the | While in transit, the frame retains its original MAC SA, and the | |||
payload of the frame retains its original IP header. Note that in | payload of the frame retains its original IP header. Note that in | |||
all cases, when an IP multicast packet is sent from one BD to | all cases, when an IP multicast packet is sent from one BD to | |||
another, these procedures cause its TTL to be decremented by 1. | another, these procedures cause its TTL to be decremented by 1. | |||
So far we have assumed that an IP multicast packet arrives at its | So far, we have assumed that an IP multicast packet arrives at its | |||
ingress PE over an AC that belongs to one of the BDs in a given | ingress PE over an AC that belongs to one of the BDs in a given | |||
Tenant Domain. However, it is possible for a packet to arrive at its | Tenant Domain. However, it is possible for a packet to arrive at its | |||
ingress PE in other ways. Since an EVPN-PE supporting IRB has an | ingress PE in other ways. Since an EVPN PE supporting IRB has an IP- | |||
IP-VRF, it is possible that the IP-VRF will have a "VRF interface" | VRF, it is possible that the IP-VRF will have a VRF interface that is | |||
that is not an IRB interface. For example, there might be a VRF | not an IRB interface. For example, there might be a VRF interface | |||
interface that is actually a physical link to an external Ethernet | that is actually a physical link to an external Ethernet switch, a | |||
switch, or to a directly attached host, or to a router. When an | directly attached host, or a router. When an EVPN PE, say PE1, | |||
EVPN-PE, say PE1, receives a packet through such means, we will say | receives a packet through such means, we will say that the packet has | |||
that the packet has an "external" source (i.e., a source "outside the | an external source (i.e., a source outside the Tenant Domain). There | |||
Tenant Domain"). There are also other scenarios in which a multicast | are also other scenarios in which a multicast packet might have an | |||
packet might have an external source, e.g., it might arrive over an | external source, e.g., it might arrive over an MVPN tunnel from an | |||
MVPN tunnel from an L3VPN PE. In such cases, we will still refer to | L3VPN PE. In such cases, we will still refer to PE1 as the "ingress | |||
PE1 as the "ingress EVPN-PE". | EVPN PE". | |||
When an EVPN-PE, say PE1, receives an externally sourced multicast | When an EVPN PE, say PE1, receives an externally sourced multicast | |||
packet, and there are receivers for that packet inside the Tenant | packet, and there are receivers for that packet inside the Tenant | |||
Domain, it does the following: | Domain, it does the following: | |||
* Suppose PE1 has an AC in BD1 that has interest in (S,G). Then PE1 | * Suppose PE1 has an AC in BD1 that has interest in (S,G). Then, | |||
encapsulates the packet for BD1, filling in the MAC SA field with | PE1 encapsulates the packet for BD1, filling in the MAC SA field | |||
PE1's own MAC address on BD1. It sends the resulting frame on the | with PE1's own MAC address on BD1. It sends the resulting frame | |||
AC. | on the AC. | |||
* Suppose some other EVPN-PE, say PE2, has interest in (S,G). PE1 | * Suppose some other EVPN PE, say PE2, has interest in (S,G). PE1 | |||
encapsulates the packet for Ethernet, filling in the MAC SA field | encapsulates the packet for Ethernet, filling in the MAC SA field | |||
with PE1's own MAC address on the SBD. PE1 then tunnels the | with PE1's own MAC address on the SBD. PE1 then tunnels the | |||
packet to PE2. The tunnel encapsulation will identify the | packet to PE2. The tunnel encapsulation will identify the | |||
apparent source BD as the SBD. Since the apparent source BD is | apparent source BD as the SBD. Since the apparent source BD is | |||
the SBD, PE2 will know to treat the frame as an inter-subnet | the SBD, PE2 will know to treat the frame as an inter-subnet | |||
multicast. | multicast. | |||
When ingress replication is used to transmit IP multicast frames from | When IR is used to transmit IP multicast frames from an ingress EVPN | |||
an ingress EVPN-PE to a set of egress PEs, then the ingress PE has to | PE to a set of egress PEs, then the ingress PE has to send multiple | |||
send multiple copies of the frame. Each copy is the original | copies of the frame. Each copy is the original Ethernet frame; | |||
Ethernet frame; decapsulation and IP processing take place only at | decapsulation and IP processing take place only at the egress PE. | |||
the egress PE. | ||||
If a Point-to-Multipoint (P2MP) tree or BIER [I-D.ietf-bier-evpn] is | If a P2MP tree or Bit Index Explicit Replication (BIER) [RFC9624] is | |||
used to transmit an IP multicast frame from an ingress PE to a set of | used to transmit an IP multicast frame from an ingress PE to a set of | |||
egress PEs, then the ingress PE only has to send one copy of the | egress PEs, then the ingress PE only has to send one copy of the | |||
frame to each of its next hops. Again, each egress PE receives the | frame to each of its next hops. Again, each egress PE receives the | |||
original frame and does any necessary IP processing. | original frame and does any necessary IP processing. | |||
2. Detailed Model of Operation | 2. Detailed Model of Operation | |||
The model described in Section 1.5.2 can be expressed more precisely | The model described in Section 1.5.2 can be expressed more precisely | |||
using the notion of "IRB interface" (see Appendix A). For a given | using the notion of IRB interface (see Appendix A). For a given | |||
Tenant Domain: | Tenant Domain: | |||
* A given PE has one IRB interface for each BD to which it is | * A given PE has one IRB interface for each BD to which it is | |||
attached. This IRB interface connects L3 routing to that BD. | attached. This IRB interface connects L3 routing to that BD. | |||
When IP multicast packets are sent or received on the IRB | When IP multicast packets are sent or received on the IRB | |||
interfaces, the semantics of the interface is modified from the | interfaces, the semantics of the interface are modified from the | |||
semantics described in Appendix A. See Section 2.3 for the | semantics described in Appendix A. See Section 2.3 for the | |||
details of the modification. | details of the modification. | |||
* Each PE also has an IRB interface that connects L3 routing to the | * Each PE also has an IRB interface that connects L3 routing to the | |||
SBD. The semantics of this interface is different than the | SBD. The semantics of this interface is different than the | |||
semantics of the IRB interface to the real BDs. See Section 2.3. | semantics of the IRB interface to the real BDs. See Section 2.3. | |||
In this section we assume that PIM is not enabled on the IRB | In this section, we assume that PIM is not enabled on the IRB | |||
interfaces. In general, it is not necessary to enable PIM on the IRB | interfaces. In general, it is not necessary to enable PIM on the IRB | |||
interfaces unless there are PIM routers on one of the Tenant Domain's | interfaces unless there are PIM routers on one of the Tenant Domain's | |||
BDs, or unless there is some other scenario requiring a Tenant | BDs or there is some other scenario requiring a Tenant Domain's L3 | |||
Domain's L3 routing instance to become a PIM adjacency of some other | routing instance to become a PIM adjacency of some other system. | |||
system. These cases will be discussed in Section 7. | These cases will be discussed in Section 7. | |||
2.1. Supplementary Broadcast Domain | 2.1. Supplementary Broadcast Domain | |||
Suppose a given Tenant Domain contains three BDs (BD1, BD2, BD3) and | Suppose a given Tenant Domain contains three BDs (BD1, BD2, and BD3) | |||
two PEs (PE1, PE2). PE1 attaches to BD1 and BD2, while PE2 attaches | and two PEs (PE1 and PE2). PE1 attaches to BD1 and BD2, while PE2 | |||
to BD2 and BD3. | attaches to BD2 and BD3. | |||
To carry out the procedures described above, all the PEs attached to | To carry out the procedures described above, all the PEs attached to | |||
the Tenant Domain must be provisioned with the SBD for that tenant | the Tenant Domain must be provisioned with the SBD for that Tenant | |||
domain. A Route Target (RT) must be associated with the SBD, and | Domain. An RT must be associated with the SBD and provisioned on | |||
provisioned on each of those PEs. We will refer to that RT as the | each of those PEs. We will refer to that RT as the "SBD-RT". | |||
"SBD-RT". | ||||
A Tenant Domain is also configured with an IP-VRF [RFC9135], and the | A Tenant Domain is also configured with an IP-VRF [RFC9135], and the | |||
IP-VRF is associated with an RT. This RT MAY be the same as the | IP-VRF is associated with an RT. This RT MAY be the same as the SBD- | |||
SBD-RT. | RT. | |||
Suppose an (S,G) multicast frame originating on BD1 has a receiver on | Suppose an (S,G) multicast frame originating on BD1 has a receiver on | |||
BD3. PE1 will transmit the packet to PE2 as a frame, and the | BD3. PE1 will transmit the packet to PE2 as a frame, and the | |||
encapsulation will identify the frame's source BD as BD1. Since PE2 | encapsulation will identify the frame's source BD as BD1. Since PE2 | |||
is not provisioned with BD1, it will treat the packet as if its | is not provisioned with BD1, it will treat the packet as if its | |||
source BD were the SBD. That is, a packet can be transmitted from | source BD were the SBD. That is, a packet can be transmitted from | |||
BD1 to BD3 even though its ingress PE is not configured for BD3, and/ | BD1 to BD3 even though its ingress PE is not configured for BD3 and/ | |||
or its egress PE is not configured for BD1. | or its egress PE is not configured for BD1. | |||
EVPN supports service models in which a given EVPN Instance (EVI) can | EVPN supports service models in which a given EVI can contain only | |||
contain only one BD. It also supports service models in which a | one BD. It also supports service models in which a given EVI can | |||
given EVI can contain multiple BDs. No matter which service model is | contain multiple BDs. No matter which service model is being used | |||
being used for a particular tenant, it is highly RECOMMENDED that an | for a particular tenant, it is highly RECOMMENDED that an EVI | |||
EVI containing only the SBD be provisioned for that tenant. | containing only the SBD be provisioned for that tenant. | |||
If, for some reason, it is not feasible to provision an EVI that | If, for some reason, it is not feasible to provision an EVI that | |||
contains only the SBD, it is possible to put the SBD in an EVI that | contains only the SBD, it is possible to put the SBD in an EVI that | |||
contains other BDs. However, in that case, the SBD-RT MUST be | contains other BDs. However, in that case, the SBD-RT MUST be | |||
different than the RT associated with any other BD. Otherwise the | different than the RT associated with any other BD. Otherwise, the | |||
procedures of this document (as detailed in Sections 2.2 and 3.1) | procedures of this document (as detailed in Sections 2.2 and 3.1) | |||
will not produce correct results. | will not produce correct results. | |||
2.2. Detecting When a Route is For/From a Particular BD | 2.2. Detecting When a Route is for/from a Particular BD | |||
In this document, we frequently say that a particular multicast route | In this document, we frequently say that a particular multicast route | |||
is "from" a particular BD, or is "for" a particular BD, or is | is "from" or "for" a particular BD or is "related to" or "associated | |||
"related to" a particular BD, or "is associated with" a particular | with" a particular BD. These terms are used interchangeably. | |||
BD. These terms are used interchangeably. Subsequent sections of | Subsequent sections of this document explain when various routes must | |||
this document explain when various routes must be originated for | be originated for particular BDs. In this section, we explain how | |||
particular BDs. In this section, we explain how the PE originating a | the PE originating a route marks the route to indicate which BD it is | |||
route marks the route to indicate which BD it is for. We also | for. We also explain how a PE receiving the route determines which | |||
explain how a PE receiving the route determines which BD the route is | BD the route is for. | |||
for. | ||||
In EVPN, each BD is assigned a Route Target (RT). An RT is a BGP | In EVPN, each BD is assigned an RT. An RT is a BGP Extended | |||
extended community that can be attached to the BGP routes used by the | Community that can be attached to the BGP routes used by the EVPN | |||
EVPN control plane. In some EVPN service models, each BD is assigned | control plane. In some EVPN service models, each BD is assigned a | |||
a unique RT. In other service models, a set of BDs (all in the same | unique RT. In other service models, a set of BDs (all in the same | |||
EVI) may be assigned the same RT. The RT that is assigned to the SBD | EVI) may be assigned the same RT. The RT that is assigned to the SBD | |||
is called the "SBD-RT". | is called the "SBD-RT". | |||
In those service models that allow a set of BDs to share a single RT, | In those service models that allow a set of BDs to share a single RT, | |||
each BD is assigned a non-zero Tag ID. The Tag ID appears in the | each BD is assigned a non-zero Tag ID. The Tag ID appears in the | |||
Network Layer Reachability Information (NLRI) of many of the BGP | Network Layer Reachability Information (NLRI) of many of the BGP | |||
routes that are used by the EVPN control plane. | routes that are used by the EVPN control plane. | |||
A given route may be for the SBD, or for an "ordinary BD" (a BD that | A given route may be for the SBD or an ordinary BD (a BD that is not | |||
is not the SBD). An RT that has been assigned to an ordinary BD will | the SBD). An RT that has been assigned to an ordinary BD will be | |||
be known as an "ordinary BD-RT". | known as an "ordinary BD-RT". | |||
When constructing an IMET, SMET, S-PMSI, or Leaf | When constructing an IMET, SMET, S-PMSI, or Leaf [RFC9572] route that | |||
[I-D.ietf-bess-evpn-bum-procedure-updates] route that is for a given | is for a given BD, the following rules apply: | |||
BD, the following rules apply: | ||||
* If the route is for an ordinary BD, say BD1, then | * If the route is for an ordinary BD, say BD1, then: | |||
- the route MUST carry the ordinary BD-RT associated with BD1, | - the route MUST carry the ordinary BD-RT associated with BD1 and | |||
and | ||||
- the route MUST NOT carry any RT that is associated with an | - the route MUST NOT carry any RT that is associated with an | |||
ordinary BD other than BD1. | ordinary BD other than BD1. | |||
* If the route is for the SBD, the route MUST carry the SBD-RT, and | * If the route is for the SBD, the route MUST carry the SBD-RT and | |||
MUST NOT carry any RT that is associated with any other BD. | MUST NOT carry any RT that is associated with any other BD. | |||
* As detailed in subsequent sections, under certain circumstances a | * As detailed in subsequent sections, under certain circumstances, a | |||
route that is for BD1 may carry both the RT of BD1 and also the | route that is for BD1 may carry both the RT of BD1 and also the | |||
SBD-RT. | SBD-RT. | |||
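The construction rules above can be sketched as a function that picks the RTs for a route. The function name, its parameters, and the `bd_rt` mapping are hypothetical. The `also_carry_sbd_rt` argument stands in for the "certain circumstances" that later sections define, which this sketch does not attempt to decide.

```python
def route_targets(bd, bd_rt, sbd_rt, also_carry_sbd_rt=False, sbd_name="SBD"):
    """Return the set of RTs to attach to a route that is for `bd`.

    `bd_rt` maps each ordinary BD name to its ordinary BD-RT
    (hypothetical structure); `sbd_rt` is the Tenant Domain's SBD-RT.
    """
    if bd == sbd_name:
        # A route for the SBD carries the SBD-RT and no other BD's RT.
        return {sbd_rt}
    # A route for an ordinary BD carries that BD's RT and no other
    # ordinary BD's RT.
    rts = {bd_rt[bd]}
    if also_carry_sbd_rt:
        # Under certain circumstances, the SBD-RT is carried as well.
        rts.add(sbd_rt)
    return rts
```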
The IMET route for the SBD MUST carry a Multicast Flags Extended | The IMET route for the SBD MUST carry a Multicast Flags Extended | |||
Community, in which an "OISM SBD" flag is set. | Community in which an OISM SBD flag is set. | |||
The IMET route for a BD other than the SBD SHOULD carry an EVI-RT EC | The IMET route for a BD other than the SBD SHOULD carry an EVI-RT EC | |||
as defined in [RFC9251]. The EC is constructed from the SBD-RT, to | as defined in [RFC9251]. The EC is constructed from the SBD-RT to | |||
indicate the BD's corresponding SBD. This allows all PEs to check | indicate the BD's corresponding SBD. This allows all PEs to check | |||
that they have consistent SBD provisioning and allow an Assisted | that they have consistent SBD provisioning and allows an Assisted | |||
Replication (AR) replicator to automatically determine a BD's | Replication (AR) replicator to automatically determine a BD's | |||
corresponding SBD without any provisioning, as explained in | corresponding SBD without any provisioning, as explained in | |||
Section 3.2.3.1. | Section 3.2.3.1. | |||
When receiving an IMET, SMET, S-PMSI, or Leaf route, it is necessary | When receiving an IMET, SMET, S-PMSI, or Leaf route, it is necessary | |||
for the receiving PE to determine the BD to which the route belongs. | for the receiving PE to determine the BD to which the route belongs. | |||
This is done by examining the RTs carried by the route, as well as | This is done by examining the RTs carried by the route, as well as | |||
the Tag ID field of the route's NLRI. There are several cases to | the Tag ID field of the route's NLRI. There are several cases to | |||
consider. Some of these cases are error cases that arise when the | consider. Some of these cases are error cases that arise when the | |||
route has not been properly constructed. | route has not been properly constructed. | |||
When one of the error cases is detected, the route MUST be regarded | When one of the error cases is detected, the route MUST be regarded | |||
as a malformed route, and the "treat-as-withdraw" procedure of | as a malformed route, and the treat-as-withdraw procedure of | |||
[RFC7606] MUST be applied. Note that these error cases are only | [RFC7606] MUST be applied. Note that these error cases are only | |||
detectable by EVPN procedures at the receiving PE; BGP procedures at | detectable by EVPN procedures at the receiving PE; BGP procedures at | |||
intermediate nodes will generally not detect the existence of such | intermediate nodes will generally not detect the existence of such | |||
error cases, and in general SHOULD NOT attempt to do so. | error cases and in general SHOULD NOT attempt to do so. | |||
Case 1: The receiving PE recognizes more than one of the route's RTs | Case 1: The receiving PE recognizes more than one of the route's RTs | |||
as being an SBD-RT (i.e., the route carries SBD-RTs of more | as being an SBD-RT (i.e., the route carries SBD-RTs of more | |||
than one Tenant Domain). | than one Tenant Domain). | |||
This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
constructed. | constructed. | |||
Case 2: The receiving PE recognizes one of the route's RTs as being | Case 2: The receiving PE recognizes one of the route's RTs as being | |||
associated with an ordinary BD, and recognizes one of the | associated with an ordinary BD and recognizes one of the | |||
route's other RTs as being associated with a different | route's other RTs as being associated with a different | |||
ordinary BD. | ordinary BD. | |||
This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
constructed. | constructed. | |||
Case 3: The receiving PE recognizes one of the route's RTs as being | Case 3: The receiving PE recognizes one of the route's RTs as being | |||
associated with an ordinary BD in a particular Tenant | associated with an ordinary BD in a particular Tenant Domain | |||
Domain, and recognizes another of the route's RTs as being | and recognizes another of the route's RTs as being | |||
associated with the SBD of a different Tenant Domain. | associated with the SBD of a different Tenant Domain. | |||
This is an error case; the route has not been properly | This is an error case; the route has not been properly | |||
constructed. | constructed. | |||
Case 4: The receiving PE does not recognize any of the route's RTs | Case 4: The receiving PE does not recognize any of the route's RTs | |||
as being associated with an ordinary BD in any of its tenant | as being associated with an ordinary BD in any of its Tenant | |||
domains, but does recognize one of the RTs as the SBD-RT of | Domains but does recognize one of the RTs as the SBD-RT of | |||
one of its Tenant Domains. | one of its Tenant Domains. | |||
In this case, the receiving PE associates the route with the | In this case, the receiving PE associates the route with the | |||
SBD of that Tenant Domain. This association is made even if | SBD of that Tenant Domain. This association is made even if | |||
the Tag ID field of the route's NLRI is not the Tag ID of | the Tag ID field of the route's NLRI is not the Tag ID of | |||
the SBD. | the SBD. | |||
This is a normal use case where either (a) the route is for | This is a normal use case where either (a) the route is for | |||
a BD to which the receiving PE is not attached, or (b) the | a BD to which the receiving PE is not attached or (b) the | |||
route is for the SBD. In either case, the receiving PE | route is for the SBD. In either case, the receiving PE | |||
associates the route with the SBD. | associates the route with the SBD. | |||
Case 5: The receiving PE recognizes exactly one of the RTs as an | Case 5: The receiving PE recognizes exactly one of the RTs as an | |||
ordinary BD-RT that is associated with one of the PE's EVIs, | ordinary BD-RT that is associated with one of the PE's EVIs, | |||
say EVI-1. The receiving PE also recognizes one of the RTs | say EVI-1. The receiving PE also recognizes one of the RTs | |||
as being the SBD-RT of the Tenant Domain containing EVI-1. | as being the SBD-RT of the Tenant Domain containing EVI-1. | |||
In this case, the route is associated with the BD in EVI-1 | In this case, the route is associated with the BD in EVI-1 | |||
that is identified (in the context of EVI-1) by the Tag ID | that is identified (in the context of EVI-1) by the Tag ID | |||
field of the route's NLRI. (If EVI-1 contains only a single | field of the route's NLRI. (If EVI-1 contains only a single | |||
BD, the Tag ID is likely to be zero.) | BD, the Tag ID is likely to be zero.) | |||
This is the case where the route is for a BD to which the | This is the case where the route is for a BD to which the | |||
receiving PE is attached, but the route also carries the | receiving PE is attached, but the route also carries the | |||
SBD-RT. In this case, the receiving PE associates the route | SBD-RT. In this case, the receiving PE associates the route | |||
with the ordinary BD, not with the SBD. | with the ordinary BD, not with the SBD. | |||
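The five cases above can be sketched as a classification routine run by the receiving PE. This is a simplified model under stated assumptions: `TenantDomain`, `MalformedRoute`, and `classify` are hypothetical names; any two distinct recognized ordinary BD-RTs are treated as Case 2; and a route carrying only a recognized ordinary BD-RT (a situation the cases above do not enumerate) is assumed to map to the BD selected by the Tag ID. Unknown Tag IDs are not modeled.

```python
from dataclasses import dataclass, field


class MalformedRoute(Exception):
    """Error case; the caller is expected to apply treat-as-withdraw."""


@dataclass
class TenantDomain:
    name: str
    sbd_rt: str
    # ordinary BD-RT -> {Tag ID -> BD name} for the EVI using that RT
    evis: dict = field(default_factory=dict)


def classify(route_rts, tag_id, domains):
    """Return (tenant_domain_name, bd_name), or None if no RT is recognized."""
    sbd_hits = [td for td in domains if td.sbd_rt in route_rts]
    bd_hits = [(td, rt) for td in domains for rt in td.evis if rt in route_rts]

    if len(sbd_hits) > 1:
        raise MalformedRoute("Case 1: SBD-RTs of more than one Tenant Domain")
    if len(bd_hits) > 1:
        raise MalformedRoute("Case 2: RTs of more than one ordinary BD")
    if bd_hits and sbd_hits and bd_hits[0][0] is not sbd_hits[0]:
        raise MalformedRoute("Case 3: BD-RT and SBD-RT of different domains")

    if not bd_hits:
        if sbd_hits:
            # Case 4: associate with the SBD, whatever the Tag ID is.
            return (sbd_hits[0].name, "SBD")
        return None  # no recognized RT: outside the cases above

    # Case 5 (ordinary BD-RT plus the matching SBD-RT): the Tag ID selects
    # the BD within the EVI. Assumption: the same lookup is used when only
    # the ordinary BD-RT is present.
    td, rt = bd_hits[0]
    return (td.name, td.evis[rt][tag_id])
```

Raising an exception for Cases 1 through 3 keeps the error handling in one place, so the caller can apply the treat-as-withdraw procedure uniformly.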
N.B.: According to the above rules, the mapping from BD to RT is a | Note that according to the above rules, the mapping from BD to RT is | |||
many-to-one or one-to-one mapping. A route that an EVPN-PE | a many-to-one or one-to-one mapping. A route that an EVPN PE | |||
originates for a particular BD carries that BD's RT, and an EVPN-PE | originates for a particular BD carries that BD's RT, and an EVPN PE | |||
that receives the route associates it with a BD as described above. | that receives the route associates it with a BD as described above. | |||
However, RTs are not used only to help identify the BD to which a | However, RTs are not used only to help identify the BD to which a | |||
route belongs; they may also used by BGP to determine the path along | route belongs; they may also be used by BGP to determine the path | |||
which the route is distributed, and to determine which PEs receive | along which the route is distributed and to determine which PEs | |||
the route. There may be cases where it is desirable to originate a | receive the route. There may be cases where it is desirable to | |||
route for a particular BD, but have that route distributed to only | originate a route for a particular BD but have that route distributed | |||
some of the EVPN-PEs attached to that BD. Or one might want the | to only some of the EVPN PEs attached to that BD. Or one might want | |||
route distributed to some intermediate set of systems, where it might | the route distributed to some intermediate set of systems, where it | |||
be modified or replaced before being propagated further. Such | might be modified or replaced before being propagated further. Such | |||
situations are outside the scope of this document. | situations are outside the scope of this document. | |||
Additionally, there may be situations where it is desirable to | Additionally, there may be situations where it is desirable to | |||
exchange routes among two or more different Tenant Domains ("EVPN | exchange routes among two or more different Tenant Domains (EVPN | |||
Extranet"). Such situations are outside the scope of this document. | Extranet). Such situations are outside the scope of this document. | |||
2.3. Use of IRB Interfaces at Ingress PE | 2.3. Use of IRB Interfaces at Ingress PE | |||
When an (S,G) multicast frame is received from an AC belonging to a | When an (S,G) multicast frame is received from an AC belonging to a | |||
particular BD, say BD1: | particular BD, say BD1: | |||
1. The frame is sent unchanged to other EVPN-PEs that are interested | 1. The frame is sent unchanged to other EVPN PEs that are interested | |||
in (S,G) traffic. The encapsulation used to send the frame to | in (S,G) traffic. The encapsulation used to send the frame to | |||
the other EVPN-PEs depends on the tunnel type being used for | the other EVPN PEs depends on the tunnel type being used for | |||
multicast transmission. (For our purposes, we consider Ingress | multicast transmission. (For our purposes, we consider IR, AR, | |||
Replication (IR), Assisted Replication (AR) and BIER to be | and BIER to be tunnel types, even though IR, AR, and BIER do not | |||
"tunnel types", even though IR, AR and BIER do not actually use | actually use P2MP tunnels.) At the egress PE, the apparent | |||
P2MP tunnels.) At the egress PE, the apparent source BD of the | source BD of the frame can be inferred from the tunnel | |||
frame can be inferred from the tunnel encapsulation. If the | encapsulation. If the egress PE is not attached to the actual | |||
egress PE is not attached to the actual source BD, it will infer | source BD, it will infer that the apparent source BD is the SBD. | |||
that the apparent source BD is the SBD. | ||||
Note that the the inter-PE transmission of a multicast frame | Note that the inter-PE transmission of a multicast frame among | |||
among EVPN-PEs of the same Tenant Domain does NOT involve the IRB | EVPN PEs of the same Tenant Domain does NOT involve the IRB | |||
interfaces, as long as the multicast frame was received over an | interfaces as long as the multicast frame was received over an AC | |||
AC attached to one of the Tenant Domain's BDs. | attached to one of the Tenant Domain's BDs. | |||
2. The frame is also sent up the IRB interface that attaches BD1 to | 2. The frame is also sent up the IRB interface that attaches BD1 to | |||
the Tenant Domain's L3 routing instance in this PE. That is, the | the Tenant Domain's L3 routing instance in this PE. That is, the | |||
L3 routing instance, behaving as if it were a multicast router, | L3 routing instance, behaving as if it were a multicast router, | |||
receives the IP multicast frames that arrive at the PE from its | receives the IP multicast frames that arrive at the PE from its | |||
local ACs. The L3 routing instance decapsulates the frame's | local ACs. The L3 routing instance decapsulates the frame's | |||
payload to extract the IP multicast packet, decrements the IP | payload to extract the IP multicast packet, decrements the IP | |||
TTL, adjusts the header checksum, and does any other necessary IP | TTL, adjusts the header checksum, and does any other necessary IP | |||
processing (e.g., fragmentation). | processing (e.g., fragmentation). | |||
3. The L3 routing instance keeps track of which BDs have local | 3. The L3 routing instance keeps track of which BDs have local | |||
receivers for (S,G) traffic. (A "local receiver" is a TS, | receivers for (S,G) traffic. (A local receiver is a TS, | |||
reachable via a local AC, that has expressed interest in (S,G) | reachable via a local AC, that has expressed interest in (S,G) | |||
traffic.) If the L3 routing instance has an IRB interface to | traffic.) If the L3 routing instance has an IRB interface to | |||
BD2, and it knows that BD2 has a LOCAL receiver interested in | BD2, and it knows that BD2 has a LOCAL receiver interested in | |||
(S,G) traffic, it encapsulates the packet in an Ethernet header | (S,G) traffic, it encapsulates the packet in an Ethernet header | |||
for BD2, putting its own MAC address in the MAC SA field. Then | for BD2, putting its own MAC address in the MAC SA field. Then, | |||
it sends the packet down the IRB interface to BD2. | it sends the packet down the IRB interface to BD2. | |||
If a packet is sent from the L3 routing instance to a particular BD | If a packet is sent from the L3 routing instance to a particular BD | |||
via the IRB interface (step 3 in the above list), and if the BD in | via the IRB interface (step 3 in the above list), and if the BD in | |||
question is NOT the SBD, the packet is sent ONLY to LOCAL ACs of that | question is NOT the SBD, the packet is sent ONLY to LOCAL ACs of that | |||
BD. If the packet needs to go to other PEs, it has already been sent | BD. If the packet needs to go to other PEs, it has already been sent | |||
to them in step 1. Note that this is a change in the IRB interface | to them in step 1. Note that this is a change in the IRB interface | |||
semantics from what is described in [RFC9135] and Figure 2. | semantics from what is described in [RFC9135] and Figure 3. | |||
If a given locally attached segment is multi-homed, existing EVPN | If a given locally attached segment is multihomed, existing EVPN | |||
procedures ensure that a packet is not sent by a given PE to that | procedures ensure that a packet is not sent by a given PE to that | |||
segment unless the PE is the DF for that segment. Those procedures | segment unless the PE is the DF for that segment. Those procedures | |||
also ensure that a packet is never sent by a PE to its segment of | also ensure that a packet is never sent by a PE to its segment of | |||
origin. Thus EVPN segment multi-homing is fully supported; duplicate | origin. Thus, EVPN segment multihoming is fully supported; duplicate | |||
delivery to a segment or looping on a segment are thereby prevented, | delivery to a segment or looping on a segment are thereby prevented | |||
without the need for any new procedures to be defined in this | without the need for any new procedures to be defined in this | |||
document. | document. | |||
What if an IP multicast packet is received from outside the tenant | What if an IP multicast packet is received from outside the Tenant | |||
domain? For instance, perhaps PE1's IP-VRF for a particular tenant | Domain? For instance, perhaps PE1's IP-VRF for a particular Tenant | |||
domain also has a physical interface leading to an external switch, | Domain also has a physical interface leading to an external switch, | |||
host, or router, and PE1 receives an IP multicast packet or frame on | host, or router and PE1 receives an IP multicast packet or frame on | |||
that interface. Or perhaps the packet is from an L3VPN, or a | that interface, or perhaps the packet is from an L3VPN or a different | |||
different EVPN Tenant Domain. | EVPN Tenant Domain. | |||
Such a packet is first processed by the L3 routing instance, which | Such a packet is first processed by the L3 routing instance, which | |||
decrements TTL and does any other necessary IP processing. Then the | decrements TTL and does any other necessary IP processing. Then, the | |||
packet is sent into the Tenant Domain by sending it down the IRB | packet is sent into the Tenant Domain by sending it down the IRB | |||
interface to the SBD of that Tenant Domain. This requires | interface to the SBD of that Tenant Domain. This requires | |||
encapsulating the packet in an Ethernet header. The MAC SA field | encapsulating the packet in an Ethernet header. The MAC SA field | |||
will contain the PE's own MAC on the SBD. | will contain the PE's own MAC on the SBD. | |||
An IP multicast packet sent by the L3 routing instance down the IRB | An IP multicast packet sent by the L3 routing instance down the IRB | |||
interface to the SBD is treated as if it had arrived from a local AC, | interface to the SBD is treated as if it had arrived from a local AC, | |||
and steps 1-3 are applied. Note that the semantics of sending a | and steps 1-3 are applied. Note that the semantics of sending a | |||
packet down the IRB interface to the SBD are thus slightly different | packet down the IRB interface to the SBD are thus slightly different | |||
than the semantics of sending a packet down other IRB interfaces. IP | than the semantics of sending a packet down other IRB interfaces. IP | |||
multicast packets sent down the SBD's IRB interface may be | multicast packets sent down the SBD's IRB interface may be | |||
distributed to other PEs, but IP multicast packets sent down other | distributed to other PEs, but IP multicast packets sent down other | |||
IRB interfaces are distributed only to local ACs. | IRB interfaces are distributed only to local ACs. | |||
If a PE sends a link-local multicast packet down the SBD IRB | If a PE sends a link-local multicast packet down the SBD IRB | |||
interface, that packet will be distributed (as an Ethernet frame) to | interface, that packet will be distributed (as an Ethernet frame) to | |||
other PEs of the Tenant Domain, but will not appear on any of the | other PEs of the Tenant Domain but will not appear on any of the | |||
actual BDs. | actual BDs. | |||
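
As a non-normative illustration, the ingress PE steps of Section 2.3 can be sketched as follows. The function name, the argument shapes, and the action tuples are assumptions made for this sketch and are not defined by this document. The sketch records only which copies of a frame are produced; it does not model encapsulation, DF election, or ordinary bridging within the source BD.

```python
def ingress_actions(source_bd, sbd, interested_remote_pes, local_receiver_bds):
    """Return the actions taken for one (S,G) frame at the ingress PE.

    source_bd: BD of the AC on which the frame arrived, or the SBD when the
               L3 routing instance sent the packet down the SBD IRB interface.
    sbd: the Tenant Domain's SBD (it has no ACs).
    interested_remote_pes: PEs known to be interested in (S,G).
    local_receiver_bds: BDs on this PE with a local receiver for (S,G).
    All names and shapes here are hypothetical.
    """
    actions = []
    # Step 1: the frame is sent unchanged to the interested remote PEs.
    for pe in sorted(interested_remote_pes):
        actions.append(("tunnel", pe, source_bd))
    # Step 2: the frame is also sent up the IRB interface of the source BD.
    actions.append(("irb_up", source_bd))
    # Step 3: the L3 routing instance sends the packet down the IRB interface
    # of each other BD that has local receivers. For a BD that is not the
    # SBD, the packet goes ONLY to local ACs of that BD.
    for bd in sorted(local_receiver_bds):
        if bd != source_bd and bd != sbd:
            actions.append(("irb_down_local_acs_only", bd))
    return actions
```

For a packet that arrives from outside the Tenant Domain, the same function would be called with source_bd set to the SBD, which reflects the statement that such a packet is treated as if it had arrived from a local AC.
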
2.4. Use of IRB Interfaces at an Egress PE | 2.4. Use of IRB Interfaces at an Egress PE | |||
Suppose an egress EVPN-PE receives an (S,G) multicast frame from the | Suppose an egress EVPN PE receives an (S,G) multicast frame from the | |||
frame's ingress EVPN-PE. As described above, the packet will arrive | frame's ingress EVPN PE. As described above, the packet will arrive | |||
as an Ethernet frame over a tunnel from the ingress PE, and the | as an Ethernet frame over a tunnel from the ingress PE, and the | |||
tunnel encapsulation will identify the source BD of the Ethernet | tunnel encapsulation will identify the source BD of the Ethernet | |||
frame. | frame. | |||
We define the notion of the frame's "apparent source BD" as follows. | We define the notion of the frame's apparent source BD as follows. | |||
If the egress PE is attached to the actual source BD, the actual | If the egress PE is attached to the actual source BD, the actual | |||
source BD is the apparent source BD. If the egress PE is not | source BD is the apparent source BD. If the egress PE is not | |||
attached to the actual source BD, the SBD is the apparent source BD. | attached to the actual source BD, the SBD is the apparent source BD. | |||
The egress PE now takes the following steps: | The egress PE now takes the following steps: | |||
1. If the egress PE has ACs belonging to the apparent source BD of | 1. If the egress PE has ACs belonging to the apparent source BD of | |||
the frame, it sends the frame unchanged to any ACs of that BD | the frame, it sends the frame unchanged to any ACs of that BD | |||
that have interest in (S,G) packets. The MAC SA of the frame is | that have interest in (S,G) packets. The MAC SA of the frame is | |||
not modified, and the IP header of the frame's payload is not | not modified, and the IP header of the frame's payload is not | |||
modified in any way. | modified in any way. | |||
2. The frame is also sent to the L3 routing instance by being sent | 2. The frame is also sent to the L3 routing instance by being sent | |||
up the IRB interface that attaches the L3 routing instance to the | up the IRB interface that attaches the L3 routing instance to the | |||
apparent source BD. Steps 2 and 3 of Section 2.3 are then | apparent source BD. Steps 2 and 3 listed in Section 2.3 are then | |||
applied. | applied. | |||
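
The definition of the apparent source BD and the two egress steps of Section 2.4 can be sketched in the same non-normative way. The names apparent_source_bd and egress_actions, and the dictionary that maps a BD to its interested ACs, are assumptions made for illustration only.

```python
def apparent_source_bd(actual_source_bd, attached_bds, sbd):
    """The actual source BD if this PE is attached to it, otherwise the SBD."""
    return actual_source_bd if actual_source_bd in attached_bds else sbd


def egress_actions(actual_source_bd, attached_bds, sbd, interested_acs_by_bd):
    """Return the actions an egress PE takes for a received (S,G) frame.

    interested_acs_by_bd: hypothetical map from BD to the set of local ACs
    that have expressed interest in (S,G).
    """
    bd = apparent_source_bd(actual_source_bd, attached_bds, sbd)
    actions = []
    # Step 1: bridge the frame unchanged to interested ACs of the apparent
    # source BD. The SBD has no ACs, so nothing is bridged in that case.
    for ac in sorted(interested_acs_by_bd.get(bd, ())):
        actions.append(("bridge_unchanged", ac))
    # Step 2: send the frame up the IRB interface of the apparent source BD;
    # steps 2 and 3 of Section 2.3 then apply.
    actions.append(("irb_up", bd))
    return actions
```
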
2.5. Announcing Interest in (S,G) | 2.5. Announcing Interest in (S,G) | |||
[RFC9251] defines procedures used by an egress PE to announce its | [RFC9251] defines procedures used by an egress PE to announce its | |||
interest in a multicast flow or set of flows. If an egress PE | interest in a multicast flow or set of flows. If an egress PE | |||
determines it has LOCAL receivers in a particular BD, say BD1, that | determines it has LOCAL receivers in a particular BD, say BD1, that | |||
are interested in a particular set of flows, it originates one or | are interested in a particular set of flows, it originates one or | |||
more SMET routes for BD1. Each SMET route specifies a particular | more SMET routes for BD1. Each SMET route specifies a particular | |||
(S,G) or (*,G) flow. By originating an SMET route for BD1, a PE is | (S,G) or (*,G) flow. By originating a SMET route for BD1, a PE is | |||
announcing "I have receivers for (S,G) or (*,G) in BD1". Such an | announcing "I have receivers for (S,G) or (*,G) in BD1". Such a SMET | |||
SMET route carries the Route Target (RT) for BD1, ensuring that it | route carries the RT for BD1, ensuring that it will be distributed to | |||
will be distributed to all PEs that are attached to BD1. | all PEs that are attached to BD1. | |||
The OISM procedures for originating SMET routes differ slightly from | The OISM procedures for originating SMET routes differ slightly from | |||
those in [RFC9251]. In most cases, the SMET routes are considered to | those in [RFC9251]. In most cases, the SMET routes are considered to | |||
be for the SBD, rather than for the BD containing local receivers. | be for the SBD rather than the BD containing local receivers. These | |||
These SMET routes carry the SBD-RT, and do not carry any ordinary BD- | SMET routes carry the SBD-RT and do not carry any ordinary BD-RT. | |||
RT. Details on the processing of SMET routes can be found in | Details on the processing of SMET routes can be found in Section 3.3. | |||
Section 3.3. | ||||
Since the SMET routes carry the SBD-RT, every ingress PE attached to | Since the SMET routes carry the SBD-RT, every ingress PE attached to | |||
a particular Tenant Domain will learn of all other PEs (attached to | a particular Tenant Domain will learn of all other PEs (attached to | |||
the same Tenant Domain) that have interest in a particular set of | the same Tenant Domain) that have interest in a particular set of | |||
flows. Note that a PE that receives a given SMET route does not | flows. Note that a PE that receives a given SMET route does not | |||
necessarily have any BDs (other than the SBD) in common with the PE | necessarily have any BDs (other than the SBD) in common with the PE | |||
that originates that SMET route. | that originates that SMET route. | |||
If all the sources and receivers for a given (*,G) are in the Tenant | If all the sources and receivers for a given (*,G) are in the Tenant | |||
Domain, inter-subnet "Any Source Multicast" traffic will be properly | Domain, inter-subnet ASM traffic will be properly routed without | |||
routed without requiring any Rendezvous Points, shared trees, or | requiring any RPs, shared trees, or other complex aspects of | |||
other complex aspects of multicast routing infrastructure. Suppose, | multicast routing infrastructure. Suppose, for example, that: | |||
for example, that: | ||||
* PE1 has a local receiver, on BD1, for (*,G) | * PE1 has a local receiver, on BD1, for (*,G) and | |||
* PE2 has a local source, on BD2, for (*,G). | * PE2 has a local source, on BD2, for (*,G). | |||
PE1 will originate an SMET(*,G) route for the SBD, and PE2 will | PE1 will originate a SMET(*,G) route for the SBD, and PE2 will | |||
receive that route, even if PE2 is not attached to BD1. PE2 will | receive that route, even if PE2 is not attached to BD1. PE2 will | |||
thus know to forward (S,G) traffic to PE1. PE1 does not need to do | thus know to forward (S,G) traffic to PE1. PE1 does not need to do | |||
any "source discovery". (This does assume that source S does not | any source discovery. (This does assume that source S does not send | |||
send the same (S,G) datagram on two different BDs, and that the | the same (S,G) datagram on two different BDs and that the Tenant | |||
Tenant Domain does not contain two or more sources with the same IP | Domain does not contain two or more sources with the same IP address | |||
address S. The use of multicast sources that have IP "anycast" | S. The use of multicast sources that have IP anycast addresses is | |||
addresses is outside the scope of this document.) | outside the scope of this document.) | |||
If some PE attached to the Tenant Domain does not support [RFC9251], | If some PE attached to the Tenant Domain does not support [RFC9251], | |||
it will be assumed to be interested in all flows. Whether a | it will be assumed to be interested in all flows. Whether a | |||
particular remote PE supports [RFC9251] is determined by the presence | particular remote PE supports [RFC9251] or not is determined by the | |||
of the Multicast Flags Extended Community in its IMET route; this is | presence of the Multicast Flags Extended Community in its IMET route; | |||
specified in [RFC9251]. | this is specified in [RFC9251]. | |||
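
As a non-normative illustration of Section 2.5, the following sketch shows how an ingress PE might compute the set of PEs to which a given (S,G) flow is sent. The function name and the two dictionaries are assumptions of this sketch; in particular, supports_rfc9251 stands in for the check on whether a PE's IMET route carries the Multicast Flags Extended Community.

```python
def interested_pes(flow, smet_routes, supports_rfc9251):
    """Return the PEs of the Tenant Domain that should receive the flow.

    flow: (source, group) pair.
    smet_routes: hypothetical map from PE to a set of (source, group) pairs
                 learned from its SMET routes, with "*" as the wildcard source.
    supports_rfc9251: hypothetical map from PE to True if its IMET route
                      carries the Multicast Flags Extended Community.
    """
    source, group = flow
    result = set()
    for pe, supported in supports_rfc9251.items():
        if not supported:
            # A PE that does not support [RFC9251] is assumed to be
            # interested in all flows.
            result.add(pe)
            continue
        routes = smet_routes.get(pe, set())
        if (source, group) in routes or ("*", group) in routes:
            result.add(pe)
    return result
```

Because the SMET routes carry the SBD-RT, the lookup does not depend on which BDs the originating PE and the receiving PE have in common.
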
2.6. Tunneling Frames from Ingress PE to Egress PEs | 2.6. Tunneling Frames from Ingress PEs to Egress PEs | |||
[RFC7432] specifies the procedures for setting up and using "BUM | [RFC7432] specifies the procedures for setting up and using BUM | |||
tunnels". A BUM tunnel is a tunnel used to carry traffic on a | tunnels. A BUM tunnel is a tunnel used to carry traffic on a | |||
particular BD if that traffic is (a) broadcast traffic, or (b) | particular BD if that traffic is (a) broadcast traffic, (b) unicast | |||
unicast traffic with an unknown MAC DA, or (c) Ethernet multicast | traffic with an unknown Destination MAC Address, or (c) Ethernet | |||
traffic. | multicast traffic. | |||
This document allows the BUM tunnels to be used as the default | This document allows the BUM tunnels to be used as the default | |||
tunnels for transmitting IP multicast frames. It also allows a | tunnels for transmitting IP multicast frames. It also allows a | |||
separate set of tunnels to be used, instead of the BUM tunnels, as | separate set of tunnels to be used, instead of the BUM tunnels, as | |||
the default tunnels for carrying IP multicast frames. Let's call | the default tunnels for carrying IP multicast frames. Let's call | |||
these "IP Multicast Tunnels". | these "IP multicast tunnels". | |||
When the tunneling is done via Ingress Replication or via BIER, this | When the tunneling is done via IR or via BIER, this difference is of | |||
difference is of no significance. However, when P2MP tunnels are | no significance. However, when P2MP tunnels are used, there is a | |||
used, there is a significant advantage to having separate IP | significant advantage to having separate IP multicast tunnels. | |||
multicast tunnels. | ||||
Other things being equal, it is desirable for an ingress PE to | It is desirable for an ingress PE to transmit a copy of a given (S,G) | |||
transmit a copy of a given (S,G) multicast frame on only one P2MP | multicast frame on only one P2MP tunnel. All egress PEs interested | |||
tunnel. All egress PEs interested in (S,G) packets then have to join | in (S,G) packets then have to join that tunnel. If the source BD and | |||
that tunnel. If the source BD and PE for an (S,G) frame are BD1 and | PE for an (S,G) frame are BD1 and PE1, respectively, and if PE2 has | |||
PE1 respectively, and if PE2 has receivers on BD2 for (S,G), then PE2 | receivers on BD2 for (S,G), then PE2 must join the P2MP Label | |||
must join the P2MP LSP on which PE1 transmits the (S,G) frame. PE2 | Switched Path (LSP) on which PE1 transmits the (S,G) frame. PE2 must | |||
must join this P2MP LSP even if PE2 is not attached to the source BD | join this P2MP LSP even if PE2 is not attached to the source BD, BD1. | |||
(BD1). If PE1 were transmitting the multicast frame on its BD1 BUM | If PE1 was transmitting the multicast frame on its BD1 BUM tunnel, | |||
tunnel, then PE2 would have to join the BD1 BUM tunnel, even though | then PE2 would have to join the BD1 BUM tunnel, even though PE2 has | |||
PE2 has no BD1 attachment circuits. This would cause PE2 to pull all | no BD1 Attachment Circuits. This would cause PE2 to pull all the BUM | |||
the BUM traffic from BD1, most of which it would just have to | traffic from BD1, most of which it would just have to discard. Thus, | |||
discard. Thus it is RECOMMENDED that the default IP multicast | it is RECOMMENDED that the default IP multicast tunnels be distinct | |||
tunnels be distinct from the BUM tunnels. | from the BUM tunnels. | |||
Notwithstanding the above, link-local IP multicast traffic MUST | Notwithstanding the above, link-local IP multicast traffic MUST | |||
always be carried on the BUM tunnels, and ONLY on the BUM tunnels. | always be carried on the BUM tunnels and ONLY on the BUM tunnels. | |||
link-local IP multicast traffic consists of IPv4 traffic with a | Link-local IP multicast traffic consists of IPv4 traffic with a | |||
destination address prefix of 224/24 and IPv6 traffic with a | destination address prefix of 224/24 and IPv6 traffic with a | |||
destination address prefix of FF02/16. In this document, the terms | destination address prefix of FF02/16. In this document, the terms | |||
"IP multicast packet" and "IP multicast frame" are defined in | "IP multicast packet" and "IP multicast frame" are defined in | |||
Section 1.1 so as to exclude link-local traffic. | Section 1.1 so as to exclude link-local traffic. | |||
Note that it is also possible to use "selective tunnels" to carry | Note that it is also possible to use selective tunnels to carry | |||
particular multicast flows (see Section 3.2). When an (S,G) frame is | particular multicast flows (see Section 3.2). When an (S,G) frame is | |||
transmitted on a selective tunnel, it is not transmitted on the BUM | transmitted on a selective tunnel, it is not transmitted on the BUM | |||
tunnel or on the default IP Multicast tunnel. | tunnel or on the default IP multicast tunnel. | |||
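
The tunnel selection described in Section 2.6 can be sketched as follows. This is a non-normative illustration: the function names and the returned strings are assumptions, and the prefixes 224/24 and FF02/16 are interpreted here as 224.0.0.0/24 and ff02::/16, respectively.

```python
import ipaddress

# Link-local multicast ranges, as interpreted for this sketch.
IPV4_LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")
IPV6_LINK_LOCAL = ipaddress.ip_network("ff02::/16")


def is_link_local_multicast(dst):
    """True if dst falls in the link-local multicast range for its family."""
    addr = ipaddress.ip_address(dst)
    net = IPV4_LINK_LOCAL if addr.version == 4 else IPV6_LINK_LOCAL
    return addr in net


def choose_tunnel(dst, has_selective_tunnel, separate_ip_mcast_tunnels=True):
    """Pick the tunnel kind for a multicast frame destined to dst.

    has_selective_tunnel: whether a selective tunnel exists for this flow.
    separate_ip_mcast_tunnels: whether default IP multicast tunnels distinct
    from the BUM tunnels are in use (the RECOMMENDED arrangement).
    """
    # Link-local traffic is carried on the BUM tunnels and only on them.
    if is_link_local_multicast(dst):
        return "bum"
    # A frame sent on a selective tunnel is not sent on any default tunnel.
    if has_selective_tunnel:
        return "selective"
    return "ip-multicast" if separate_ip_mcast_tunnels else "bum"
```
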
2.7. Advanced Scenarios | 2.7. Advanced Scenarios | |||
There are some deployment scenarios that require special procedures: | There are some deployment scenarios that require special procedures: | |||
1. Some multicast sources or receivers are attached to PEs that | 1. Some multicast sources or receivers are attached to PEs that | |||
support [RFC7432], but do not support this document or [RFC9135]. | support [RFC7432] but do not support this document or [RFC9135]. | |||
To interoperate with these "non-OISM PEs", it is necessary to | To interoperate with these non-OISM PEs, it is necessary to have | |||
have one or more gateway PEs that interface the tunnels discussed | one or more gateway PEs that interface the tunnels discussed in | |||
in this document with the BUM tunnels of the legacy PEs. This is | this document with the BUM tunnels of the legacy PEs. This is | |||
discussed in Section 5. | discussed in Section 5. | |||
2. Sometimes multicast traffic originates from outside the EVPN | 2. Sometimes multicast traffic originates from outside the EVPN | |||
domain, or needs to be sent outside the EVPN domain. This is | domain or needs to be sent outside the EVPN domain. This is | |||
discussed in Section 6. An important special case of this, | discussed in Section 6. An important special case of this, | |||
integration with MVPN, is discussed in Section 6.1.2. | integration with MVPN, is discussed in Section 6.1.2. | |||
3. In some scenarios, one or more of the tenant systems is a PIM | 3. In some scenarios, one or more of the tenant systems is a PIM | |||
router, and the Tenant Domain is used as a transit network that | router, and the Tenant Domain is used as a transit network that | |||
is part of a larger multicast domain. This is discussed in | is part of a larger multicast domain. This is discussed in | |||
Section 7. | Section 7. | |||
3. EVPN-aware Multicast Solution Control Plane | 3. EVPN-Aware Multicast Solution Control Plane | |||
3.1. Supplementary Broadcast Domain (SBD) and Route Targets | 3.1. Supplementary Broadcast Domain (SBD) and Route Targets | |||
As discussed in Section 2.1, every Tenant Domain is associated with a | As discussed in Section 2.1, every Tenant Domain is associated with a | |||
single Supplementary Broadcast Domain (SBD). Recall that a Tenant | single SBD. Recall that a Tenant Domain is defined to be a set of | |||
Domain is defined to be a set of BDs that can freely send and receive | BDs that can freely send and receive IP multicast traffic to/from | |||
IP multicast traffic to/from each other. If an EVPN-PE has one or | each other. If an EVPN PE has one or more ACs in a BD of a | |||
more ACs in a BD of a particular Tenant Domain, and if the EVPN-PE | particular Tenant Domain, and if the EVPN PE supports the procedures | |||
supports the procedures of this document, that EVPN-PE MUST be | of this document, that EVPN PE MUST be provisioned with the SBD of | |||
provisioned with the SBD of that Tenant Domain. | that Tenant Domain. | |||
At each EVPN-PE attached to a given Tenant Domain, there is an IRB | At each EVPN PE attached to a given Tenant Domain, there is an IRB | |||
interface leading from the L3 routing instance of that Tenant Domain | interface leading from the L3 routing instance of that Tenant Domain | |||
to the SBD. However, the SBD has no ACs. | to the SBD. However, the SBD has no ACs. | |||
Each SBD is provisioned with a Route Target (RT). All the EVPN-PEs | Each SBD is provisioned with an RT. All the EVPN PEs supporting a | |||
supporting a given SBD are provisioned with that RT as an import RT. | given SBD are provisioned with that RT as an import RT. That RT MUST | |||
That RT MUST NOT be the same as the RT associated with any other BD. | NOT be the same as the RT associated with any other BD. | |||
We will use the term "SBD-RT" to denote the RT that has been assigned | We will use the term "SBD-RT" to denote the RT that has been assigned | |||
to the SBD. Routes carrying this RT will be propagated to all | to the SBD. Routes carrying this RT will be propagated to all EVPN | |||
EVPN-PEs in the same Tenant Domain as the originator. | PEs in the same Tenant Domain as the originator. | |||
Section 2.2 specifies the rules by which an EVPN-PE that receives a | Section 2.2 specifies the rules by which an EVPN PE that receives a | |||
route determines whether a received route "belongs to" a particular | route determines whether a received route belongs to a particular | |||
ordinary BD or SBD. | ordinary BD or SBD. | |||
Section 2.2 also specifies additional rules that must be followed | Section 2.2 also specifies additional rules that must be followed | |||
when constructing routes that belong to a particular BD, including | when constructing routes that belong to a particular BD, including | |||
the SBD. | the SBD. | |||
The SBD SHOULD be in an EVPN Instance (EVI) of its own. Even if the | The SBD SHOULD be in an EVI of its own. Even if the SBD is not in an | |||
SBD is not in an EVI of its own, the SBD-RT MUST be different than | EVI of its own, the SBD-RT MUST be different than the RT associated | |||
the RT associated with any other BD. This restriction is necessary | with any other BD. This restriction is necessary in order for the | |||
in order for the rules of Sections 2.2 and 3.1 to work correctly. | rules of Sections 2.2 and 3.1 to work correctly. | |||
Note that an SBD, just like any other BD, is associated on each | Note that an SBD, just like any other BD, is associated on each EVPN | |||
EVPN-PE with a MAC-VRF. Per [RFC7432], each MAC-VRF is associated | PE with a MAC-VRF. Per [RFC7432], each MAC-VRF is associated with a | |||
with a Route Distinguisher (RD). When constructing a route that is | Route Distinguisher (RD). When constructing a route that is for an | |||
"for" an SBD, an EVPN-PE will place the RD of the associated MAC-VRF | SBD, an EVPN PE will place the RD of the associated MAC-VRF in the | |||
in the "Route Distinguisher" field of the NLRI. (If the Tenant | Route Distinguisher field of the NLRI. (If the Tenant Domain has | |||
Domain has several MAC-VRFs on a given PE, the EVPN-PE has a choice | several MAC-VRFs on a given PE, the EVPN PE has a choice of which RD | |||
of which RD to use.) | to use.) | |||
If Assisted Replication (AR, see [I-D.ietf-bess-evpn-optimized-ir]) | If AR [RFC9574] is used, each AR-REPLICATOR for a given Tenant Domain | |||
is used, each AR-REPLICATOR for a given Tenant Domain must be | must be provisioned with the SBD of that Tenant Domain, even if the | |||
provisioned with the SBD of that Tenant Domain, even if the | AR-REPLICATOR does not have any L3 routing instances. | |||
AR-REPLICATOR does not have any L3 routing instance. | ||||
3.2. Advertising the Tunnels Used for IP Multicast | 3.2. Advertising the Tunnels Used for IP Multicast | |||
The procedures used for advertising the tunnels that carry IP | The procedures used for advertising the tunnels that carry IP | |||
multicast traffic depend upon the type of tunnel being used. If the | multicast traffic depend upon the type of tunnel being used. If the | |||
tunnel type is neither Ingress Replication, Assisted Replication, nor | tunnel type is neither IR, AR, nor BIER, there are procedures for | |||
BIER, there are procedures for advertising both "inclusive tunnels" | advertising both inclusive tunnels and selective tunnels. | |||
and "selective tunnels". | ||||
When IR, AR or BIER are used to transmit IP multicast packets across | When IR, AR, or BIER are used to transmit IP multicast packets across | |||
the core, there are no P2MP tunnels. Once an ingress EVPN-PE | the core, there are no P2MP tunnels. Once an ingress EVPN PE | |||
determines the set of egress EVPN-PEs for a given flow, the IMET | determines the set of egress EVPN PEs for a given flow, the IMET | |||
routes contain all the information needed to transport packets of | routes contain all the information needed to transport packets of | |||
that flow to the egress PEs. | that flow to the egress PEs. | |||
If AR is used, the ingress EVPN-PE is also an AR-LEAF and the IMET | If AR is used, the ingress EVPN PE is also an AR-LEAF, and the IMET | |||
route coming from the selected AR-REPLICATOR contains the information | route coming from the selected AR-REPLICATOR contains the information | |||
needed. The AR-REPLICATOR will behave as an ingress EVPN-PE when | needed. The AR-REPLICATOR will behave as an ingress EVPN PE when | |||
sending a flow to the egress EVPN-PEs. | sending a flow to the egress EVPN PEs. | |||
If the tunneling technique requires P2MP tunnels to be set up (e.g., | If the tunneling technique requires P2MP tunnels to be set up (e.g., | |||
RSVP-TE P2MP, mLDP, PIM), some of the tunnels may be selective | RSVP-TE P2MP, Multipoint LDP (mLDP), or PIM), some of the tunnels may | |||
tunnels and some may be inclusive tunnels. | be selective tunnels and some may be inclusive tunnels. | |||
Selective P2MP tunnels are always advertised by the ingress PE using | Selective P2MP tunnels are always advertised by the ingress PE using | |||
S-PMSI A-D routes [I-D.ietf-bess-evpn-bum-procedure-updates]. | S-PMSI Auto-Discovery (A-D) routes [RFC9572]. | |||
For inclusive tunnels, there is a choice between using a BD's | For inclusive tunnels, there is a choice between using a BD's | |||
ordinary "BUM tunnel" [RFC7432] as the default inclusive tunnel for | ordinary BUM tunnel as the default inclusive tunnel for carrying IP | |||
carrying IP multicast traffic, or using a separate IP multicast | multicast traffic or using a separate IP multicast tunnel as the | |||
tunnel as the default inclusive tunnel for carrying IP multicast. In | default inclusive tunnel for carrying IP multicast. In the former | |||
the former case, the inclusive tunnel is advertised in an IMET route. | case, the inclusive tunnel is advertised in an IMET route. In the | |||
In the latter case, the inclusive tunnel is advertised in a (C-*,C-*) | latter case, the inclusive tunnel is advertised in a (C-*,C-*) S-PMSI | |||
S-PMSI A-D route [I-D.ietf-bess-evpn-bum-procedure-updates]. Details | A-D route [RFC9572]. Details may be found in subsequent sections. | |||
may be found in subsequent sections. | ||||
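As a non-normative illustration, the choice of advertising route described in this section can be sketched in Python. The enum members, the function name, and the returned strings are assumptions made for this sketch and are not defined by this document.

```python
from enum import Enum


class TunnelType(Enum):
    IR = "ingress-replication"
    AR = "assisted-replication"
    BIER = "bier"
    RSVP_TE_P2MP = "rsvp-te-p2mp"
    MLDP = "mldp"
    PIM = "pim"


# Tunnel types for which no P2MP tunnel is set up across the core.
NON_P2MP = {TunnelType.IR, TunnelType.AR, TunnelType.BIER}


def advertising_route(tunnel: TunnelType, selective: bool,
                      reuse_bum_tunnel: bool) -> str:
    """Return the kind of route that carries the tunnel information."""
    if tunnel in NON_P2MP:
        # IMET routes already contain everything needed to reach the
        # egress PEs, so the two flags are ignored here.
        return "IMET"
    if selective:
        # Selective P2MP tunnels are always advertised by the ingress PE.
        return "S-PMSI A-D"
    # Inclusive P2MP tunnel: either the BD's ordinary BUM tunnel or a
    # separate tunnel dedicated to IP multicast.
    return "IMET" if reuse_bum_tunnel else "(C-*,C-*) S-PMSI A-D"
```

For example, advertising_route(TunnelType.MLDP, selective=False, reuse_bum_tunnel=False) selects the wildcard S-PMSI A-D route.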
3.2.1. Constructing Routes for the SBD | 3.2.1. Constructing Routes for the SBD | |||
There are situations in which an EVPN-PE needs to originate IMET, | There are situations in which an EVPN PE needs to originate IMET, | |||
SMET, and/or SPMSI routes for the SBD. Throughout this document, we | SMET, and/or S-PMSI routes for the SBD. Throughout this document, we | |||
will refer to such routes respectively as "SBD-IMET routes", | will refer to such routes respectively as "SBD-IMET routes", "SBD- | |||
"SBD-SMET routes", and "SBD-SPMSI routes". Subsequent sections | SMET routes", and "SBD-SPMSI routes". Subsequent sections detail the | |||
detail the conditions under which these routes need to be originated. | conditions under which these routes need to be originated. | |||
When an EVPN-PE needs to originate an SBD-IMET, SBD-SMET, or | When an EVPN PE needs to originate an SBD-IMET, SBD-SMET, or SBD- | |||
SBD-SPMSI route, it constructs the route as follows: | SPMSI route, it constructs the route as follows: | |||
* the RD field of the route's NLRI is set to the RD of the MAC-VRF | * The RD field of the route's NLRI is set to the RD of the MAC-VRF | |||
that is associated with the SBD; | that is associated with the SBD. | |||
* the SBD-RT is attached to the route; | * The SBD-RT is attached to the route. | |||
* the "Tag ID" field of the route's NLRI is set to the Tag ID that | * The Tag ID field of the route's NLRI is set to the Tag ID that has | |||
has been assigned to the SBD. This is most likely 0 if a | been assigned to the SBD. This is most likely 0 if a VLAN-based | |||
VLAN-based or VLAN-bundle service is being used, but non-zero if a | or VLAN-bundle service is being used but non-zero if a VLAN-aware | |||
VLAN-aware bundle service is being used. | bundle service is being used. | |||
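The three construction rules above can be sketched as follows. This is a minimal, non-normative Python illustration; the class names, field names, and the string forms used for the RD and RT are assumptions of the sketch, not encodings defined here.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class SbdConfig:
    mac_vrf_rd: str  # RD of the MAC-VRF associated with the SBD
    sbd_rt: str      # the SBD-RT
    tag_id: int      # Tag ID assigned to the SBD


@dataclass
class EvpnRoute:
    route_type: str  # "IMET", "SMET", or "S-PMSI"
    rd: str          # RD field of the NLRI
    tag_id: int      # Tag ID field of the NLRI
    route_targets: Set[str] = field(default_factory=set)


def build_sbd_route(route_type: str, sbd: SbdConfig) -> EvpnRoute:
    """Build an SBD-IMET, SBD-SMET, or SBD-SPMSI route skeleton."""
    if route_type not in {"IMET", "SMET", "S-PMSI"}:
        raise ValueError(f"unsupported route type: {route_type}")
    return EvpnRoute(
        route_type=route_type,
        rd=sbd.mac_vrf_rd,           # rule 1: RD of the SBD's MAC-VRF
        tag_id=sbd.tag_id,           # rule 3: Tag ID assigned to the SBD
        route_targets={sbd.sbd_rt},  # rule 2: attach the SBD-RT
    )
```

With a VLAN-based service, the hypothetical SbdConfig would typically be created with tag_id=0.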
3.2.2. Ingress Replication | 3.2.2. Ingress Replication | |||
When Ingress Replication (IR) is used to transport IP multicast | When IR is used to transport IP multicast frames of a given Tenant | |||
frames of a given Tenant Domain, each EVPN-PE attached to that Tenant | Domain, each EVPN PE attached to that Tenant Domain MUST originate an | |||
Domain MUST originate an SBD-IMET route (see Section 3.2.1). | SBD-IMET route (see Section 3.2.1). | |||
The SBD-IMET route MUST carry a PMSI Tunnel attribute (PTA), and the | The SBD-IMET route MUST carry a PTA, and the MPLS Label field of the | |||
MPLS label field of the PTA MUST specify a downstream-assigned MPLS | PTA MUST specify a downstream-assigned MPLS label that maps uniquely | |||
label that maps uniquely (in the context of the originating EVPN-PE) | (in the context of the originating EVPN PE) to the SBD. | |||
to the SBD. | ||||
Following the procedures of [RFC7432], an EVPN-PE MUST also originate | Following the procedures of [RFC7432], an EVPN PE MUST also originate | |||
an IMET route for each BD to which it is attached. Each of these | an IMET route for each BD to which it is attached. Each of these | |||
IMET routes carries a PTA specifying a downstream-assigned label that | IMET routes carries a PTA specifying a downstream-assigned label that | |||
maps uniquely, in the context of the originating EVPN-PE, to the BD | maps uniquely, in the context of the originating EVPN PE, to the BD | |||
in question. These IMET routes need not carry the SBD-RT. | in question. These IMET routes need not carry the SBD-RT. | |||
When an ingress EVPN-PE needs to use IR to send an IP multicast frame | When an ingress EVPN PE needs to use IR to send an IP multicast frame | |||
from a particular source BD to an egress EVPN-PE, the ingress PE | from a particular source BD to an egress EVPN PE, the ingress PE | |||
determines whether the egress PE has originated an IMET route for | determines whether or not the egress PE has originated an IMET route | |||
that BD. If so, that IMET route contains the MPLS label that the | for that BD. If so, that IMET route contains the MPLS label that the | |||
egress PE has assigned to the source BD. The ingress PE uses that | egress PE has assigned to the source BD. The ingress PE uses that | |||
label when transmitting the packet to the egress PE. Otherwise, the | label when transmitting the packet to the egress PE. Otherwise, the | |||
ingress PE uses the label that the egress PE has assigned to the SBD | ingress PE uses the label that the egress PE has assigned to the SBD | |||
(in the SBD-IMET route originated by the egress). | (in the SBD-IMET route originated by the egress). | |||
Note that the set of IMET routes originated by a given egress PE, and | Note that the set of IMET routes originated by a given egress PE, and | |||
installed by a given ingress PE, may change over time. If the egress | installed by a given ingress PE, may change over time. If the egress | |||
PE withdraws its IMET route for the source BD, the ingress PE MUST | PE withdraws its IMET route for the source BD, the ingress PE MUST | |||
stop using the label carried in that IMET route, and instead MUST use | stop using the label carried in that IMET route and instead MUST use | |||
the label carried in the SBD-IMET route from that egress PE. | the label carried in the SBD-IMET route from that egress PE. | |||
Implementors must also take into account that an IMET route from a | Implementors must also take into account that an IMET route from a | |||
particular PE for a particular BD may arrive after that PE's SBD-IMET | particular PE for a particular BD may arrive after that PE's SBD-IMET | |||
route. | route. | |||
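The label selection described in this section can be sketched as a lookup that prefers the source BD's label and falls back to the SBD label. This is a non-normative Python illustration; the table layout and the function name are assumptions. Because the lookup is evaluated against the current set of installed routes, it also covers the withdrawal and late-arrival cases noted above.

```python
from typing import Dict, Optional, Tuple

# (egress PE, BD name) -> label from the PTA of the installed IMET route.
ImetTable = Dict[Tuple[str, str], int]


def select_ir_label(imet: ImetTable, egress_pe: str,
                    source_bd: str, sbd: str) -> Optional[int]:
    """Pick the label used to send a frame to one egress PE with IR."""
    label = imet.get((egress_pe, source_bd))
    if label is not None:
        # The egress PE has originated an IMET route for the source BD.
        return label
    # Otherwise use the label from the egress PE's SBD-IMET route.
    # None means that no usable route from this PE is installed.
    return imet.get((egress_pe, sbd))
```

An implementation that caches the result would need to invalidate the cache whenever an IMET route from the egress PE is installed or withdrawn.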
3.2.3. Assisted Replication | 3.2.3. Assisted Replication | |||
When Assisted Replication is used to transport IP multicast frames of | When AR is used to transport IP multicast frames of a given Tenant | |||
a given Tenant Domain, each EVPN-PE (including the AR-REPLICATOR) | Domain, each EVPN PE (including the AR-REPLICATOR) attached to the | |||
attached to the Tenant Domain MUST originate an SBD-IMET route (see | Tenant Domain MUST originate an SBD-IMET route (see Section 3.2.1). | |||
Section 3.2.1). | ||||
An AR-REPLICATOR attached to a given Tenant Domain is considered to | An AR-REPLICATOR attached to a given Tenant Domain is considered to | |||
be an EVPN-PE of that Tenant Domain. It is attached to all the BDs | be an EVPN PE of that Tenant Domain. It is attached to all the BDs | |||
in the Tenant Domain, but it does not necessarily have L3 routing | in the Tenant Domain, but it does not necessarily have L3 routing | |||
instances. | instances. | |||
As with Ingress Replication, the SBD-IMET route carries a PTA where | As with IR, the SBD-IMET route carries a PTA where the MPLS Label | |||
the MPLS label field specifies the downstream-assigned MPLS label | field specifies the downstream-assigned MPLS label that identifies | |||
that identifies the SBD. However, the AR-REPLICATOR and AR-LEAF | the SBD. However, the AR-REPLICATOR and AR-LEAF EVPN PEs will set | |||
EVPN-PEs will set the PTA's flags differently, as per | the PTA's flags differently, as per [RFC9574]. | |||
[I-D.ietf-bess-evpn-optimized-ir]. | ||||
In addition, each EVPN-PE originates an IMET route for each BD to | In addition, each EVPN PE originates an IMET route for each BD to | |||
which it is attached. As in the case of Ingress Replication, these | which it is attached. As in the case of IR, these routes carry the | |||
routes carry the downstream-assigned MPLS labels that identify the | downstream-assigned MPLS labels that identify the BDs and do not | |||
BDs and do not carry the SBD-RT. | carry the SBD-RT. | |||
When an ingress EVPN-PE, acting as AR-LEAF, needs to send an IP | When an ingress EVPN PE, acting as AR-LEAF, needs to send an IP | |||
multicast frame from a particular source BD to an egress EVPN-PE, the | multicast frame from a particular source BD to an egress EVPN PE, the | |||
ingress PE determines whether there is any AR-REPLICATOR that | ingress PE determines whether or not there is any AR-REPLICATOR that | |||
originated an IMET route for that BD. After the AR-REPLICATOR | originated an IMET route for that BD. After the AR-REPLICATOR | |||
selection (if there are more than one), the AR-LEAF uses the label | selection (if there are more than one), the AR-LEAF uses the label | |||
contained in the IMET route of the AR-REPLICATOR when transmitting | contained in the IMET route of the AR-REPLICATOR when transmitting | |||
packets to it. The AR-REPLICATOR receives the packet and, based on | packets to it. The AR-REPLICATOR receives the packet and, based on | |||
the procedures specified in [I-D.ietf-bess-evpn-optimized-ir] and in | the procedures specified in [RFC9574] and in Section 3.2.2 of this | |||
Section 3.2.2 of this document, transmits the packets to the egress | document, transmits the packets to the egress EVPN PEs using the | |||
EVPN-PEs using the labels contained in the received IMET routes for | labels contained in the received IMET routes for either the source BD | |||
either the source BD or the SBD. | or the SBD. | |||
If an ingress AR-LEAF for a given BD has not received any IMET route | If an ingress AR-LEAF for a given BD has not received any IMET route | |||
for that BD from an AR-REPLICATOR, the ingress AR-LEAF follows the | for that BD from an AR-REPLICATOR, the ingress AR-LEAF follows the | |||
procedures in Section 3.2.2. | procedures in Section 3.2.2. | |||
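The AR-LEAF behavior, including the fallback to Section 3.2.2, can be sketched as follows. This is a non-normative Python illustration. The function name is an assumption, and the selection rule shown (lowest name) is only a placeholder: the actual AR-REPLICATOR selection is governed by the Assisted Replication specification cited above.

```python
from typing import Dict, Optional, Tuple


def ar_leaf_next_hop(
        replicator_labels: Dict[str, int]) -> Optional[Tuple[str, int]]:
    """Choose an AR-REPLICATOR and its label for the source BD.

    replicator_labels maps each AR-REPLICATOR that originated an IMET
    route for the source BD to the label carried in that route.
    """
    if not replicator_labels:
        # No AR-REPLICATOR IMET route for this BD: the caller follows
        # the Ingress Replication procedures instead.
        return None
    # Placeholder selection only (assumption): pick the lowest name.
    chosen = min(replicator_labels)
    return chosen, replicator_labels[chosen]
```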
3.2.3.1. Automatic SBD Matching | 3.2.3.1. Automatic SBD Matching | |||
Each PE needs to know a BD's corresponding SBD. Configuring that | Each PE needs to know a BD's corresponding SBD. Configuring that | |||
information in each BD is one way but it requires repetitive | information in each BD is one way, but it requires repetitive | |||
configuration and consistency checking (to make sure that all the BDs | configuration and consistency checking (to make sure that all the BDs | |||
of the same tenant are configured with the same SBD). A better way | of the same tenant are configured with the same SBD). A better way | |||
is to configure the SBD info in the L3 routing instance so that all | is to configure the SBD info in the L3 routing instance so that all | |||
related BDs will derive the SBD information. | related BDs will derive the SBD information. | |||
An AR-replicator also needs to know same information, though it does | An AR-REPLICATOR also needs to know the same information, though it | |||
not necessarily have an L3 routing instance. However, from the EVI- | does not necessarily have an L3 routing instance. However, from the | |||
RT EC in a BD's IMET route, an AR-replicator can derive the | EVI-RT EC in a BD's IMET route, an AR-REPLICATOR can derive the | |||
corresponding SBD of that BD without any configuration. | corresponding SBD of that BD without any configuration. | |||
3.2.4. BIER | 3.2.4. BIER | |||
When BIER is used to transport multicast packets of a given Tenant | When BIER is used to transport multicast packets of a given Tenant | |||
Domain, and a given EVPN-PE attached to that Tenant Domain is a | Domain, and a given EVPN PE attached to that Tenant Domain is a | |||
possible ingress EVPN-PE for traffic originating outside that Tenant | possible ingress EVPN PE for traffic originating outside that Tenant | |||
Domain, the given EVPN-PE MUST originate an SBD-IMET route, (see | Domain, the given EVPN PE MUST originate an SBD-IMET route (see | |||
Section 3.2.1). | Section 3.2.1). | |||
In addition, IMET routes that are originated for other BDs in the | In addition, IMET routes that are originated for other BDs in the | |||
Tenant Domain MUST carry the SBD-RT. | Tenant Domain MUST carry the SBD-RT. | |||
Each IMET route (including but not limited to the SBD-IMET route) | Each IMET route (including but not limited to the SBD-IMET route) | |||
MUST carry a PMSI Tunnel attribute (PTA). The MPLS label field of | MUST carry a PTA. The MPLS Label field of the PTA MUST specify an | |||
the PTA MUST specify an upstream-assigned MPLS label that maps | upstream-assigned MPLS label that maps uniquely (in the context of | |||
uniquely (in the context of the originating EVPN-PE) to the BD for | the originating EVPN PE) to the BD for which the route is originated. | |||
which the route is originated. | ||||
Suppose an ingress EVPN-PE, PE1, needs to use BIER to tunnel an IP | Suppose an ingress EVPN PE, say PE1, needs to use BIER to tunnel an | |||
multicast frame to a set of egress EVPN-PEs. And suppose the frame's | IP multicast frame to a set of egress EVPN PEs. And suppose the | |||
source BD is BD1. The frame is encapsulated as follows: | frame's source BD is BD1. The frame is encapsulated as follows: | |||
* A four-octet MPLS label stack entry [RFC3032] is prepended to the | * A four-octet MPLS label stack entry [RFC3032] is prepended to the | |||
frame. The Label field is set to the upstream-assigned label that | frame. The Label field is set to the upstream-assigned label that | |||
PE1 has assigned to BD1. | PE1 has assigned to BD1. | |||
* The resulting MPLS packet is then encapsulated in a BIER | * The resulting MPLS packet is then encapsulated in a BIER | |||
encapsulation [RFC8296], [I-D.ietf-bier-evpn]. The BIER BitString | encapsulation [RFC8296] [RFC9624]. The BIER BitString is set to | |||
is set to identify the egress EVPN-PEs. The BIER "proto" field is | identify the egress EVPN PEs. The BIER Proto field is set to the | |||
set to the value for "MPLS packet with upstream-assigned label at | value for "MPLS packet with an upstream-assigned label at top of | |||
top of stack". | the stack". | |||
Note: It is possible that the packet being tunneled from PE1 | Note: It is possible that the packet being tunneled from PE1 | |||
originated outside the Tenant Domain. In this case, the actual | originated outside the Tenant Domain. In this case, the actual | |||
source BD (BD1) is considered to be the SBD, and the | source BD, BD1, is considered to be the SBD, and the upstream- | |||
upstream-assigned label it carries will be the label that PE1 | assigned label it carries will be the label that PE1 assigned to the | |||
assigned to the SBD, and advertised in its SBD-IMET route. | SBD and advertised in its SBD-IMET route. | |||
Suppose an egress PE, PE2, receives such a BIER packet. The BFIR-id | Suppose an egress PE, say PE2, receives such a BIER packet. The | |||
field of the BIER header allows PE2 to determine that the ingress PE | BFIR-id field of the BIER header allows PE2 to determine that the | |||
is PE1. There are then two cases to consider: | ingress PE is PE1. There are then two cases to consider: | |||
1. PE2 has received and installed an IMET route for BD1 from PE1. | 1. PE2 has received and installed an IMET route for BD1 from PE1. | |||
In this case, the BIER packet will be carrying the | In this case, the BIER packet will be carrying the upstream- | |||
upstream-assigned label that is specified in the PTA of that IMET | assigned label that is specified in the PTA of that IMET route. | |||
route. This enables PE2 to determine the "apparent source BD" | This enables PE2 to determine the apparent source BD (as defined | |||
(as defined in Section 2.4). | in Section 2.4). | |||
2. PE2 has not received and installed an IMET route for BD1 from | 2. PE2 has not received and installed an IMET route for BD1 from | |||
PE1. | PE1. | |||
In this case, PE2 will not recognize the upstream-assigned label | In this case, PE2 will not recognize the upstream-assigned label | |||
carried in the BIER packet. PE2 MUST discard the packet. | carried in the BIER packet. PE2 MUST discard the packet. | |||
Further details on the use of BIER to support EVPN can be found in | Further details on the use of BIER to support EVPN can be found in | |||
[I-D.ietf-bier-evpn]. | [RFC9624]. | |||
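The first encapsulation step and the egress lookup can be sketched as follows. This is a non-normative Python illustration. The four-octet layout (20-bit Label, 3-bit TC, 1-bit S, 8-bit TTL) follows [RFC3032]; the function names, the default TTL of 255, the S bit being set, and the table keyed by (ingress PE, label) are assumptions of the sketch. The BIER header itself is not modeled, and the ingress PE is assumed to have already been resolved from the BFIR-id.

```python
import struct
from typing import Dict, Optional, Tuple


def mpls_label_stack_entry(label: int, ttl: int = 255,
                           bottom: bool = True, tc: int = 0) -> bytes:
    """Encode one four-octet MPLS label stack entry."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc or ttl out of range")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def apparent_source_bd(installed: Dict[Tuple[str, int], str],
                       ingress_pe: str, entry: bytes) -> Optional[str]:
    """Map the upstream-assigned label to a BD at the egress PE.

    installed maps (ingress PE, label) to a BD name, built from the
    PTAs of installed IMET routes. None means the label is not
    recognized and the packet is discarded.
    """
    (word,) = struct.unpack("!I", entry[:4])
    return installed.get((ingress_pe, word >> 12))
```

Keying the table by ingress PE as well as label reflects that an upstream-assigned label is unique only in the context of the PE that assigned it.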
3.2.5. Inclusive P2MP Tunnels | 3.2.5. Inclusive P2MP Tunnels | |||
3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive Tunnels | 3.2.5.1. Using the BUM Tunnels as IP Multicast Inclusive Tunnels | |||
The procedures in this section apply only when | The procedures in this section apply only when: | |||
(a) it is desired to use the BUM tunnels to carry IP multicast | a) it is desired to use the BUM tunnels to carry IP multicast | |||
traffic across the backbone, and | traffic across the backbone and | |||
(b) the BUM tunnels are P2MP tunnels (i.e., neither IR, AR, nor BIER | b) the BUM tunnels are P2MP tunnels (i.e., neither IR, AR, nor BIER | |||
are being used to transport the BUM traffic). | are being used to transport the BUM traffic). | |||
In this case, an IP multicast frame (whether inter-subnet or | In this case, an IP multicast frame (whether inter-subnet or intra- | |||
intra-subnet) will be carried across the backbone in the BUM tunnel | subnet) will be carried across the backbone in the BUM tunnel | |||
belonging to its source BD. Each EVPN-PE attached to a given Tenant | belonging to its source BD. Each EVPN PE attached to a given Tenant | |||
Domain needs to join the BUM tunnels for every BD in the Tenant | Domain needs to join the BUM tunnels for every BD in the Tenant | |||
Domain, even those BDs to which the EVPN-PE is not locally attached. | Domain, even those BDs to which the EVPN PE is not locally attached. | |||
This ensures that an IP multicast packet from any source BD can reach | This ensures that an IP multicast packet from any source BD can reach | |||
all PEs attached to the Tenant Domain. | all PEs attached to the Tenant Domain. | |||
Note that this will cause all the BUM traffic from a given BD in a | Note that this will cause all the BUM traffic from a given BD in a | |||
Tenant Domain to be sent to all PEs that attach to that Tenant | Tenant Domain to be sent to all PEs that attach to that Tenant | |||
Domain, even the PEs that don't attach to the given BD. To avoid | Domain, even the PEs that don't attach to the given BD. To avoid | |||
this, it is RECOMMENDED that the BUM tunnels not be used as IP | this, it is RECOMMENDED that the BUM tunnels not be used as IP | |||
Multicast inclusive tunnels, and that the procedures of | multicast inclusive tunnels and that the procedures of | |||
Section 3.2.5.2 be used instead. | Section 3.2.5.2 be used instead. | |||
If a PE is a possible ingress EVPN-PE for traffic originating outside | If a PE is a possible ingress EVPN PE for traffic originating outside | |||
the Tenant Domain, the PE MUST originate an SBD-IMET route (see | the Tenant Domain, the PE MUST originate an SBD-IMET route (see | |||
Section 3.2.1). This route MUST carry a PTA specifying the P2MP | Section 3.2.1). This route MUST carry a PTA specifying the P2MP | |||
tunnel used for transmitting IP multicast packets that originate | tunnel used for transmitting IP multicast packets that originate | |||
outside the tenant domain. All EVPN-PEs of the Tenant Domain MUST | outside the Tenant Domain. All EVPN PEs of the Tenant Domain MUST | |||
join the tunnel specified in the PTA of an SBD-IMET route: | join the tunnel specified in the PTA of an SBD-IMET route: | |||
* If the tunnel is an RSVP-TE P2MP tunnel, the originator of the | * If the tunnel is an RSVP-TE P2MP tunnel, the originator of the | |||
route MUST use RSVP-TE P2MP procedures to add each PE of the | route MUST use RSVP-TE P2MP procedures to add each PE of the | |||
Tenant Domain to the tunnel, even PEs that have not originated an | Tenant Domain to the tunnel, even PEs that have not originated an | |||
SBD-IMET route. | SBD-IMET route. | |||
* If the tunnel is an mLDP or PIM tunnel, each PE importing the | * If the tunnel is an mLDP or PIM tunnel, each PE importing the SBD- | |||
SBD-IMET route MUST add itself to the tunnel, using mLDP or PIM | IMET route MUST add itself to the tunnel, using mLDP or PIM | |||
procedures, respectively. | procedures, respectively. | |||
Whether or not a PE originates an SBD-IMET route, it will of course | Whether or not a PE originates an SBD-IMET route, it will of course | |||
originate an IMET route for each BD to which it is attached. Each of | originate an IMET route for each BD to which it is attached. Each of | |||
these IMET routes MUST carry the SBD-RT, as well as the RT for the BD | these IMET routes MUST carry the SBD-RT, as well as the RT for the BD | |||
to which it belongs. | to which it belongs. | |||
If a received IMET route is not the SBD-IMET route, it will also be | If a received IMET route is not the SBD-IMET route, it will also be | |||
carrying the RT for its source BD. The route's NLRI will carry the | carrying the RT for its source BD. The route's NLRI will carry the | |||
Tag ID for the source BD. From the RT and the Tag ID, any PE | Tag ID for the source BD. From the RT and the Tag ID, any PE | |||
receiving the route can determine the route's source BD. | receiving the route can determine the route's source BD. | |||
If the MPLS label field of the PTA contains zero, the specified P2MP | If the MPLS Label field of the PTA contains zero, the specified P2MP | |||
tunnel is used only to carry frames of a single source BD. | tunnel is used only to carry frames of a single source BD. | |||
If the MPLS label field of the PTA does not contain zero, it MUST | If the MPLS Label field of the PTA does not contain zero, it MUST | |||
contain an upstream-assigned MPLS label that maps uniquely (in the | contain an upstream-assigned MPLS label that maps uniquely (in the | |||
context of the originating EVPN-PE) to the source BD (or, in the case | context of the originating EVPN PE) to the source BD (or in the case | |||
of an SBD-IMET route, to the SBD). The tunnel may then be used to | of an SBD-IMET route, to the SBD). The tunnel may then be used to | |||
carry frames of multiple source BDs. The apparent source BD of a | carry frames of multiple source BDs. The apparent source BD of a | |||
particular packet is inferred from the label carried by the packet. | particular packet is inferred from the label carried by the packet. | |||
IP multicast traffic originating outside the Tenant Domain is | IP multicast traffic originating outside the Tenant Domain is | |||
transmitted with the label corresponding to the SBD, as specified in | transmitted with the label corresponding to the SBD, as specified in | |||
the ingress EVPN-PE's SBD-IMET route. | the ingress EVPN PE's SBD-IMET route. | |||
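The two cases for the MPLS label field of the PTA can be sketched as a pair of lookup tables kept by a receiving PE. This is a non-normative Python illustration; the class, its method names, and the opaque tunnel identifier strings are assumptions of the sketch.

```python
from typing import Dict, Optional, Tuple


class TunnelBindings:
    """Track which BD a packet arriving on a P2MP tunnel belongs to."""

    def __init__(self) -> None:
        # Label field was zero: the tunnel carries one source BD.
        self._by_tunnel: Dict[str, str] = {}
        # Label field was non-zero: the label is upstream-assigned, so
        # it is unique only in the context of the originating PE.
        self._by_label: Dict[Tuple[str, int], str] = {}

    def install(self, ingress_pe: str, tunnel_id: str,
                pta_label: int, bd: str) -> None:
        """Record the binding advertised in an IMET route's PTA."""
        if pta_label == 0:
            self._by_tunnel[tunnel_id] = bd
        else:
            self._by_label[(ingress_pe, pta_label)] = bd

    def apparent_source_bd(self, ingress_pe: str, tunnel_id: str,
                           packet_label: Optional[int]) -> Optional[str]:
        """Infer the apparent source BD of a received packet."""
        if packet_label is None:
            return self._by_tunnel.get(tunnel_id)
        return self._by_label.get((ingress_pe, packet_label))
```

For an SBD-IMET route, the bd argument would name the SBD, which is how traffic originating outside the Tenant Domain is recognized.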
3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise Inclusive | 3.2.5.2. Using Wildcard S-PMSI A-D Routes to Advertise Inclusive | |||
Tunnels Specific to IP Multicast | Tunnels Specific to IP Multicast | |||
The procedures of this section apply when (and only when) it is | The procedures of this section apply when (and only when) it is | |||
desired to transmit IP multicast traffic on an inclusive tunnel, but | desired to transmit IP multicast traffic on an inclusive tunnel but | |||
not on the same tunnel used to transmit BUM traffic. | not on the same tunnel used to transmit BUM traffic. | |||
However, these procedures do NOT apply when the tunnel type is | However, these procedures do NOT apply when the tunnel type is IR or | |||
Ingress Replication or BIER, EXCEPT in the case where it is necessary | BIER, EXCEPT in the case where it is necessary to interwork between | |||
to interwork between non-OISM PEs and OISM PEs, as specified in | non-OISM PEs and OISM PEs, as specified in Section 5. | |||
Section 5. | ||||
Each EVPN-PE attached to the given Tenant Domain MUST originate an | Each EVPN PE attached to the given Tenant Domain MUST originate an | |||
SBD-SPMSI A-D route. The NLRI of that route MUST contain (C-*,C-*) | SBD-SPMSI A-D route. The NLRI of that route MUST contain (C-*,C-*) | |||
(see [RFC6625]). Additional rules for constructing that route are | (see [RFC6625]). Additional rules for constructing that route are | |||
given in Section 3.2.1. | given in Section 3.2.1. | |||
In addition, an EVPN-PE MUST originate an S-PMSI A-D route containing | In addition, an EVPN PE MUST originate an S-PMSI A-D route containing | |||
(C-*,C-*) in its NLRI for each of the other BDs, in the given Tenant | (C-*,C-*) in its NLRI for each of the other BDs, in the given Tenant | |||
Domain, to which it is attached. All such routes MUST carry the | Domain, to which it is attached. All such routes MUST carry the SBD- | |||
SBD-RT. This ensures that those routes are imported by all EVPN-PEs | RT. This ensures that those routes are imported by all EVPN PEs | |||
attached to the Tenant Domain. | attached to the Tenant Domain. | |||
A PE receiving these routes follows the procedures of Section 2.2 to | A PE receiving these routes follows the procedures of Section 2.2 to | |||
determine which BD the route is for. | determine which BD the route is for. | |||
If the MPLS label field of the PTA contains zero, the specified | If the MPLS Label field of the PTA contains zero, the specified | |||
tunnel is used only to carry frames of a single source BD. | tunnel is used only to carry frames of a single source BD. | |||
If the MPLS label field of the PTA does not contain zero, it MUST | If the MPLS Label field of the PTA does not contain zero, it MUST | |||
specify an upstream-assigned MPLS label that maps uniquely (in the | specify an upstream-assigned MPLS label that maps uniquely (in the | |||
context of the originating EVPN-PE) to the source BD. The tunnel may | context of the originating EVPN PE) to the source BD. The tunnel may | |||
be used to carry frames of multiple source BDs, and the apparent | be used to carry frames of multiple source BDs, and the apparent | |||
source BD for a particular packet is inferred from the label carried | source BD for a particular packet is inferred from the label carried | |||
by the packet. | by the packet. | |||
The EVPN-PE advertising these S-PMSI A-D route routes is specifying | The EVPN PE advertising these S-PMSI A-D routes is specifying the | |||
the default tunnel that it will use (as ingress PE) for transmitting | default tunnel that it will use (as ingress PE) for transmitting IP | |||
IP multicast packets. The upstream-assigned label allows an egress | multicast packets. The upstream-assigned label allows an egress PE | |||
PE to determine the apparent source BD of a given packet. | to determine the apparent source BD of a given packet. | |||
3.2.6. Selective Tunnels | 3.2.6. Selective Tunnels | |||
An ingress EVPN-PE for a given multicast flow or set of flows can | An ingress EVPN PE for a given multicast flow or set of flows can | |||
always assign the flow to a particular P2MP tunnel by originating an | always assign the flow to a particular P2MP tunnel by originating an | |||
S-PMSI A-D route whose NLRI identifies the flow or set of flows. The | S-PMSI A-D route whose NLRI identifies the flow or set of flows. The | |||
NLRI of the route could be (C-*,C-G), or (C-S,C-G). The S-PMSI A-D | NLRI of the route could be (C-*,C-G) or (C-S,C-G). The S-PMSI A-D | |||
route MUST carry the SBD-RT, so that it is imported by all EVPN-PEs | route MUST carry the SBD-RT so that it is imported by all EVPN PEs | |||
attached to the Tenant Domain. | attached to the Tenant Domain. | |||
An S-PMSI A-D route is "for" a particular source BD. It MUST carry | An S-PMSI A-D route is for a particular source BD. It MUST carry the | |||
the RT associated with that BD, and it MUST have the Tag ID for that | RT associated with that BD, and it MUST have the Tag ID for that BD | |||
BD in its NLRI. | in its NLRI. | |||
When an EVPN-PE imports an S-PMSI A-D route, it applies the rules of | When an EVPN PE imports an S-PMSI A-D route, it applies the rules of | |||
Section 2.2 to associate the route with a particular BD. | Section 2.2 to associate the route with a particular BD. | |||
Each such route MUST contain a PTA, as specified in Section 3.2.5.2. | Each such route MUST contain a PTA, as specified in Section 3.2.5.2. | |||
An egress EVPN-PE interested in the specified flow or flows MUST join | An egress EVPN PE interested in the specified flow or flows MUST join | |||
the specified tunnel. Procedures for joining the specified tunnel | the specified tunnel. Procedures for joining the specified tunnel | |||
are specific to the tunnel type. (Note that if the tunnel type is | are specific to the tunnel type. (Note that if the tunnel type is | |||
RSVP-TE P2MP LSP, the Leaf Information Required (LIR) flag of the PTA | RSVP-TE P2MP LSP, the Leaf Information Required (LIR) flag of the PTA | |||
SHOULD NOT be set. An ingress OISM PE knows which OISM EVPN PEs are | SHOULD NOT be set. An ingress OISM PE knows which OISM EVPN PEs are | |||
interested in any given flow, and hence can add them to the RSVP-TE | interested in any given flow and hence can add them to the RSVP-TE | |||
P2MP tunnel that carries such flows.) | P2MP tunnel that carries such flows.) | |||
If the PTA does not specify a non-zero MPLS label, the apparent | If the PTA does not specify a non-zero MPLS label, the apparent | |||
source BD of any packets that arrive on that tunnel is considered to | source BD of any packets that arrive on that tunnel is considered to | |||
be the BD associated with the route that carries the PTA. If the PTA | be the BD associated with the route that carries the PTA. If the PTA | |||
does specify a non-zero MPLS label, the apparent source BD of any | does specify a non-zero MPLS label, the apparent source BD of any | |||
packets that arrive on that tunnel carrying the specified label is | packets that arrive on that tunnel carrying the specified label is | |||
considered to be the BD associated with the route that carries the | considered to be the BD associated with the route that carries the | |||
PTA. | PTA. | |||
It should be noted that when either IR or BIER is used, there is no | It should be noted that, when either IR or BIER is used, there is no | |||
need for an ingress PE to use S-PMSI A-D routes to assign specific | need for an ingress PE to use S-PMSI A-D routes to assign specific | |||
flows to selective tunnels. The procedures of Section 3.3, along | flows to selective tunnels. The procedures of Section 3.3, along | |||
with the procedures of Section 3.2.2, Section 3.2.3, or | with the procedures of Sections 3.2.2, 3.2.3, and 3.2.4, provide the | |||
Section 3.2.4, provide the functionality of selective tunnels without | functionality of selective tunnels without the need to use S-PMSI A-D | |||
the need to use S-PMSI A-D routes. | routes. | |||
3.3. Advertising SMET Routes | 3.3. Advertising SMET Routes | |||
[RFC9251] allows an egress EVPN-PE to express its interest in a | [RFC9251] allows an egress EVPN PE to express its interest in a | |||
particular multicast flow or set of flows by originating an SMET | particular multicast flow or set of flows by originating a SMET | |||
route. The NLRI of the SMET route identifies the flow or set of | route. The NLRI of the SMET route identifies the flow or set of | |||
flows as (C-*,C-*) or (C-*,C-G) or (C-S,C-G). | flows as (C-*,C-*), (C-*,C-G), or (C-S,C-G). | |||
Each SMET route belongs to a particular BD. The Tag ID for the BD | Each SMET route belongs to a particular BD. The Tag ID for the BD | |||
appears in the NLRI of the route, and the route carries the RT | appears in the NLRI of the route, and the route carries the RT | |||
associated with that BD. From this <RT, tag> pair, other EVPN-PEs | associated with that BD. From this <RT, tag> pair, other EVPN PEs | |||
can identify the BD to which a received SMET route belongs. | can identify the BD to which a received SMET route belongs. | |||
(Remember, though, that the route may be carrying multiple RTs.) | (Remember, though, that the route may be carrying multiple RTs.) | |||
There are three cases to consider: | There are three cases to consider: | |||
* Case 1: It is known that no BD of a Tenant Domain contains a | Case 1: It is known that no BD of a Tenant Domain contains a | |||
multicast router. | multicast router. | |||
In this case, an egress PE advertises its interest in a flow or | In this case, an egress PE advertises its interest in a flow | |||
set of flows by originating an SMET route that belongs to the SBD. | or set of flows by originating a SMET route that belongs to | |||
We refer to this as an SBD-SMET route. The SBD-SMET route carries | the SBD. We refer to this as an SBD-SMET route. The SBD- | |||
the SBD-RT, and has the Tag ID for the SBD in its NLRI. SMET | SMET route carries the SBD-RT and has the Tag ID for the SBD | |||
routes for the individual BDs are not needed, because there is no | in its NLRI. SMET routes for the individual BDs are not | |||
need for a PE that receives an SMET route to send a corresponding | needed, because there is no need for a PE that receives a | |||
IGMP/MLD Join message on any of its ACs. | SMET route to send a corresponding IGMP/MLD Join message on | |||
any of its ACs. | ||||
* Case 2: It is known that more than one BD of a Tenant Domain may | Case 2: It is known that more than one BD of a Tenant Domain may | |||
contain a multicast router. | contain a multicast router. | |||
This is very like Case 1. An egress PE advertises its interest in | This is much like Case 1. An egress PE advertises its | |||
a flow or set of flows by originating an SBD-SMET route. The | interest in a flow or set of flows by originating an SBD- | |||
SBD-SMET route carries the SBD-RT, and has the Tag ID for the SBD | SMET route. The SBD-SMET route carries the SBD-RT and has | |||
in its NLRI. | the Tag ID for the SBD in its NLRI. | |||
In this case, it is important to be sure that SMET routes for the | In this case, it is important to be sure that SMET routes | |||
individual BDs are not originated. Suppose, for example, that PE1 | for the individual BDs are not originated. For example, | |||
had local receivers for a given flow on both BD1 and BD2, and that | suppose that PE1 had local receivers for a given flow on | |||
it originated SMET routes for both those BDs. Then PEs receiving | both BD1 and BD2 and that it originated SMET routes for both | |||
those SMET routes might send IGMP/MLD Joins on both those BDs. | those BDs. Then, PEs receiving those SMET routes might send | |||
This could cause externally sourced multicast traffic to enter the | IGMP/MLD Joins on both those BDs. This could cause | |||
Tenant Domain at both BDs, which could result in duplication of | externally sourced multicast traffic to enter the Tenant | |||
data. | Domain at both BDs, which could result in duplication of | |||
data. | ||||
Note that if it is possible that more than one BD contains a | Note that if it is possible that more than one BD contains a | |||
tenant multicast router, then in order to receive multicast data | tenant multicast router, then in order to receive multicast | |||
originating from outside EVPN, the PEs MUST follow the procedures | data originating from outside EVPN, the PEs MUST follow the | |||
of Section 6. | procedures of Section 6. | |||
* Case 3: It is known that only a single BD of a Tenant Domain | Case 3: It is known that only a single BD of a Tenant Domain | |||
contains a multicast router. | contains a multicast router. | |||
Suppose that an egress PE is attached to a BD on which there might | Suppose that an egress PE is attached to a BD on which there | |||
be a tenant multicast router. (The tenant router is not | might be a tenant multicast router. (The tenant router is | |||
necessarily on a segment that is attached to that PE.) And | not necessarily on a segment that is attached to that PE.) | |||
suppose that the PE has one or more ACs attached to that BD that | And suppose that the PE has one or more ACs attached to that | |||
are interested in a given multicast flow. In this case, in | BD that are interested in a given multicast flow. In this | |||
addition to the SMET route for the SBD, the egress PE MAY | case, in addition to the SMET route for the SBD, the egress | |||
originate an SMET route for that BD. This will enable the ingress | PE MAY originate a SMET route for that BD. This will enable | |||
PE(s) to send IGMP/MLD messages on ACs for the BD, as specified in | the ingress PE(s) to send IGMP/MLD messages on ACs for the | |||
[RFC9251]. As long as that is the only BD on which there is a | BD, as specified in [RFC9251]. As long as that is the only | |||
tenant multicast router, there is no possibility of duplication of | BD on which there is a tenant multicast router, there is no | |||
data. | possibility of duplication of data. | |||
This document does not specify procedures for dynamically determining | This document does not specify procedures for dynamically determining | |||
which of the three cases applies to a given deployment; the PEs of a | which of the three cases applies to a given deployment; the PEs of a | |||
given Tenant Domain MUST be provisioned to know which case applies. | given Tenant Domain MUST be provisioned to know which case applies. | |||
As detailed in [RFC9251], an SMET route carries flags indicating | As detailed in [RFC9251], a SMET route carries flags indicating | |||
whether IGMP (v1, v2 or v3) or MLD (v1 or v2) messages should be | whether IGMP (v1, v2, or v3) or MLD (v1 or v2) messages should be | |||
triggered on the ACs of the BD to which the SMET route belongs. For | triggered on the ACs of the BD to which the SMET route belongs. For | |||
IGMP v3 and MLD v2, the IE flag also indicates whether the source | IGMP v3 and MLD v2, the Include/Exclude (IE) flag also indicates | |||
information in the SMET route is of an Include Group type or Exclude | whether the source information in the SMET route is of an Include | |||
Group type. If an SBD PE needs to generate IGMP/MLD reports (as is | Group type or Exclude Group type. If an SBD PE needs to generate | |||
the case in section 6.2), or the route is for an (S, G) state, the | IGMP/MLD reports (as is the case in Section 6.2) or the route is | |||
value of the flags MUST be set according to the rules in [RFC9251]. | for an (S, G) state, the value of the flags MUST be set according to | |||
Otherwise, the flags SHOULD be set to 0. | the rules in [RFC9251]. Otherwise, the flags SHOULD be set to 0. | |||
Note that a PE only needs to originate the set of SBD-SMET routes | Note that a PE only needs to originate the set of SBD-SMET routes | |||
that are needed to receive multicast traffic in which it is | that are needed in order to receive multicast traffic that the PE is | |||
interested. Suppose PE1 has ACs attached to BD1 that are interested | interested in. Suppose PE1 has ACs attached to BD1 that are | |||
in (C-*,C-G) traffic, and ACs attached to BD2 that are interested in | interested in (C-*,C-G) traffic and ACs attached to BD2 that are | |||
(C-S,C-G) traffic. A single SBD-SMET route specifying (C-*,C-G) will | interested in (C-S,C-G) traffic. A single SBD-SMET route specifying | |||
attract all the necessary flows. | (C-*,C-G) will attract all the necessary flows. | |||
As another example, suppose the ACs attached to BD1 are interested in | As another example, suppose the ACs attached to BD1 are interested in | |||
(C-*,C-G) but not in (C-S,C-G), while the ACs attached to BD2 are | (C-*,C-G) but not in (C-S,C-G), while the ACs attached to BD2 are | |||
interested in (C-S,C-G). A single SBD-SMET route specifying | interested in (C-S,C-G). A single SBD-SMET route specifying | |||
(C-*,C-G) will pull in all the necessary flows. | (C-*,C-G) will pull in all the necessary flows. | |||
In other words, to determine the set of SBD-SMET routes that have to | In other words, to determine the set of SBD-SMET routes that have to | |||
be sent for a given C-G, the PE has to merge the IGMP/MLD state for | be sent for a given C-G, the PE has to merge the IGMP/MLD state for | |||
all the BDs (of the given Tenant Domain) to which it is attached. | all the BDs (of the given Tenant Domain) to which it is attached. | |||
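The merge described above can be sketched as follows. This is only an illustration under stated assumptions, not part of the specification: the name `sbd_smet_routes`, the dictionary layout, and the use of `None` for the C-* wildcard are hypothetical, and Include/Exclude source filtering is deliberately ignored.

```python
from typing import Dict, Optional, Set, Tuple

# A flow is (source, group); a source of None stands for the C-* wildcard.
# This representation is an assumption made for illustration.
Flow = Tuple[Optional[str], str]


def sbd_smet_routes(interest_per_bd: Dict[str, Set[Flow]]) -> Set[Flow]:
    """Merge per-BD interest into the flows to advertise in SBD-SMET routes.

    Hypothetical helper: if any attached BD wants (C-*,C-G), a single
    (C-*,C-G) route attracts every flow for that group, so (C-S,C-G)
    entries for the same group are dropped from the result.
    """
    merged: Set[Flow] = set()
    for flows in interest_per_bd.values():
        merged |= flows
    wildcard_groups = {group for source, group in merged if source is None}
    return {
        (source, group)
        for source, group in merged
        if source is None or group not in wildcard_groups
    }
```

For the first example in the text, calling the function with BD1 interested in `(None, "G")` and BD2 interested in `("S", "G")` yields only the wildcard entry for G.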
Per [RFC9251], importing an SMET route for a particular BD will cause | Per [RFC9251], importing a SMET route for a particular BD will cause | |||
IGMP/MLD state to be instantiated for the IRB interface to that BD. | the IGMP/MLD state to be instantiated for the IRB interface to that | |||
This applies as well when the BD is the SBD. | BD. This also applies when the BD is the SBD. | |||
However, traffic that originates in one of the actual BDs of a | However, traffic that originates in one of the actual BDs of a | |||
particular Tenant Domain MUST NOT be sent down the IRB interface that | particular Tenant Domain MUST NOT be sent down the IRB interface that | |||
connects the L3 routing instance of that Tenant Domain to the SBD. | connects the L3 routing instance of that Tenant Domain to the SBD. | |||
That would cause duplicate delivery of traffic, since such traffic | That would cause duplicate delivery of traffic, since such traffic | |||
will have already been distributed throughout the Tenant Domain. | will have already been distributed throughout the Tenant Domain. | |||
Therefore, when setting up the IGMP/MLD state based on SBD-SMET | Therefore, when setting up the IGMP/MLD state based on SBD-SMET | |||
routes, care must be taken to ensure that the IRB interface to the | routes, care must be taken to ensure that the IRB interface to the | |||
SBD is not added to the Outgoing Interface (OIF) list if the traffic | SBD is not added to the Outgoing Interface (OIF) list if the traffic | |||
originates within the Tenant Domain. | originates within the Tenant Domain. | |||
There are some multicast scenarios that make use of "anycast | There are some multicast scenarios that make use of anycast sources. | |||
sources". For example, two different sources may share the same | For example, two different sources may share the same anycast IP | |||
anycast IP address, say S1, and each may transmit an (S1,G) multicast | address, say S1, and each may transmit an (S1,G) multicast flow. In | |||
flow. In such a scenario, the two (S1,G) flows are typically | such a scenario, the two (S1,G) flows are typically identical. | |||
identical. Ordinary PIM procedures will cause only one the flows to | Ordinary PIM procedures will cause only one of the flows to be | |||
be delivered to each receiver that has expressed interest in either | delivered to each receiver that has expressed interest in either | |||
(*,G) or (S1,G). However, the OISM procedures described in this | (*,G) or (S1,G). However, the OISM procedures described in this | |||
document will result in both of the (S1,G) flows being distributed in | document will result in both of the (S1,G) flows being distributed in | |||
the Tenant Domain, and duplicate delivery will result. Therefore, if | the Tenant Domain, and duplicate delivery will result. Therefore, if | |||
there are receivers for (*,G) in a given Tenant Domain, there MUST | there are receivers for (*,G) in a given Tenant Domain, there MUST | |||
NOT be anycast sources for G within that Tenant Domain. (This | NOT be anycast sources for G within that Tenant Domain. (This | |||
restriction could be lifted by defining additional procedures; | restriction could be lifted by defining additional procedures; | |||
however that is outside the scope of this document.) | however, that is outside the scope of this document.) | |||
4. Constructing Multicast Forwarding State | 4. Constructing Multicast Forwarding State | |||
4.1. Layer 2 Multicast State | 4.1. Layer 2 Multicast State | |||
An EVPN-PE maintains "layer 2 multicast state" for each BD to which | An EVPN PE maintains Layer 2 multicast state for each BD to which it | |||
it is attached. Note that this is used for forwarding IP multicast | is attached. Note that this is used for forwarding IP multicast | |||
frames based on the inner IP header. The state is learned through | frames based on the inner IP header. The state is learned through | |||
IGMP/MLD snooping [RFC4541] and procedures in this document. | IGMP/MLD snooping [RFC4541] and procedures in this document. | |||
Let PE1 be an EVPN-PE, and BD1 be a BD to which it is attached. At | Let PE1 be an EVPN PE and BD1 be a BD to which it is attached. At | |||
PE1, BD1's layer 2 multicast state for a given (C-S,C-G) or (C-*,C-G) | PE1, BD1's Layer 2 multicast state for a given (C-S,C-G) or (C-*,C-G) | |||
governs the disposition of an IP multicast packet that is received by | governs the disposition of an IP multicast packet that is received by | |||
BD1's layer 2 multicast function on an EVPN-PE. | BD1's Layer 2 multicast function on an EVPN PE. | |||
An IP multicast (S,G) packet is considered to have been received by | An IP multicast (S,G) packet is considered to have been received by | |||
BD1's layer 2 multicast function in PE1 in the following cases: | BD1's Layer 2 multicast function in PE1 in the following cases: | |||
* The packet is the payload of an Ethernet frame received by PE1 | * The packet is the payload of an Ethernet frame received by PE1 | |||
from an AC that attaches to BD1. | from an AC that attaches to BD1. | |||
* The packet is the payload of an Ethernet frame whose apparent | * The packet is the payload of an Ethernet frame whose apparent | |||
source BD is BD1, and which is received by PE1 over a tunnel | source BD is BD1 and which is received by PE1 over a tunnel from | |||
from another EVPN-PE. | another EVPN PE. | |||
* The packet is received from BD1's IRB interface (i.e., has been | * The packet is received from BD1's IRB interface (i.e., has been | |||
transmitted by PE1's L3 routing instance down BD1's IRB | transmitted by PE1's L3 routing instance down BD1's IRB | |||
interface). | interface). | |||
According to the procedures of this document, all transmission of IP | According to the procedures of this document, all transmissions of IP | |||
multicast packets from one EVPN-PE to another is done at layer 2. | multicast packets from one EVPN PE to another are done at Layer 2. | |||
That is, the packets are transmitted as Ethernet frames, according to | That is, the packets are transmitted as Ethernet frames, according to | |||
the layer 2 multicast state. | the Layer 2 multicast state. | |||
Each layer 2 multicast state (S,G) or (*,G) contains a set of "output | Each Layer 2 multicast state (S,G) or (*,G) contains a set of | |||
interfaces" (OIF list). The disposition of an (S,G) multicast frame | outgoing interfaces (an OIF list). The disposition of an (S,G) | |||
received by BD1's layer 2 multicast function is determined as | multicast frame received by BD1's Layer 2 multicast function is | |||
follows: | determined as follows: | |||
* The OIF list is taken from BD1's layer 2 (S,G) state, or if there | * The OIF list is taken from BD1's Layer 2 (S,G) state, or if there | |||
is no such (S,G) state, then from BD1's (*,G) state. (If neither | is no such (S,G) state, then it is taken from BD1's (*,G) state. | |||
state exists, the OIF list is considered to be null.) | (If neither state exists, the OIF list is considered to be null.) | |||
* The rules of Section 4.1.2 are applied to the OIF list. This will | * The rules of Section 4.1.2 are applied to the OIF list. This will | |||
generally result in the frame being transmitted to some, but not | generally result in the frame being transmitted to some, but not | |||
all, elements of the OIF list. | all, elements of the OIF list. | |||
Note that there is no Reverse Path Forwarding (RPF) check at layer 2. | Note that there is no Reverse Path Forwarding (RPF) check at Layer 2. | |||
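The OIF list selection rule above, (S,G) state first, then (*,G) state, else a null list, can be sketched as below. The function name `select_oif_list` and the dictionary keyed by `(source, group)` with `None` as the wildcard source are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Key is (source, group); a source of None stands for "*".
# Hypothetical representation of one BD's Layer 2 multicast state.
StateKey = Tuple[Optional[str], str]


def select_oif_list(
    bd_state: Dict[StateKey, FrozenSet[str]], source: str, group: str
) -> FrozenSet[str]:
    """Return the OIF list used for an (S,G) frame in one BD."""
    if (source, group) in bd_state:
        # An exact (S,G) state takes precedence.
        return bd_state[(source, group)]
    # Fall back to (*,G); if neither state exists the list is null (empty).
    return bd_state.get((None, group), frozenset())
```

Note that no RPF check appears in the lookup, matching the statement that there is none at Layer 2.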
4.1.1. Constructing the OIF List | 4.1.1. Constructing the OIF List | |||
In this document, we have extended the procedures of [RFC9251] so | In this document, we have extended the procedures of [RFC9251] so | |||
that IMET and SMET routes for a particular BD are distributed not | that IMET and SMET routes for a particular BD are distributed not | |||
just to PEs that attach to that BD, but to PEs that attach to any BD | just to PEs that attach to that BD but to PEs that attach to any BD | |||
in the Tenant Domain. In this way, each PE attached to a given | in the Tenant Domain. In this way, each PE attached to a given | |||
Tenant Domain learns, from the other PEs attached to the same Tenant | Tenant Domain learns, from the other PEs attached to the same Tenant | |||
Domain, the set of flows that are of interest to each of those other | Domain, the set of flows that are of interest to each of those other | |||
PEs. (If some PE attached to the Tenant Domain does not support | PEs. (If some PE attached to the Tenant Domain does not support | |||
[RFC9251], it will be assumed to be interested in all flows. Whether | [RFC9251], it will be assumed to be interested in all flows. Whether | |||
a particular remote PE supports [RFC9251] is determined by the | or not a particular remote PE supports [RFC9251] is determined by the | |||
presence of an Extended Community in its IMET route; this is | presence of an Extended Community in its IMET route; this is | |||
specified in [RFC9251].) If a set of remote PEs are interested in a | specified in [RFC9251].) If a set of remote PEs are interested in a | |||
particular flow, the tunnels used to reach those PEs are added to the | particular flow, the tunnels used to reach those PEs are added to the | |||
OIF list of the multicast states corresponding to that flow. | OIF list of the multicast states corresponding to that flow. | |||
An EVPN-PE may run IGMP/MLD snooping procedures [RFC4541] on each of | An EVPN PE may run IGMP/MLD snooping procedures [RFC4541] on each of | |||
its ACs, in order to determine the set of flows of interest to each | its ACs in order to determine the set of flows of interest to each | |||
AC. (An AC is said to be interested in a given flow if it connects | AC. (An AC is said to be interested in a given flow if it connects | |||
to a segment that has tenant systems interested in that flow.) If | to a segment that has tenant systems interested in that flow.) If | |||
IGMP/MLD procedures are not being run on a given AC, that AC is | IGMP/MLD procedures are not being run on a given AC, that AC is | |||
considered to be interested in all flows. For each BD, the set of | considered to be interested in all flows. For each BD, the set of | |||
ACs interested in a given flow is determined, and the ACs of that set | ACs interested in a given flow is determined, and the ACs of that set | |||
are added to the OIF list of that BD's multicast state for that flow. | are added to the OIF list of that BD's multicast state for that flow. | |||
The OIF list for each multicast state must also contain the IRB | The OIF list for each multicast state must also contain the IRB | |||
interface for the BD to which the state belongs. | interface for the BD to which the state belongs. | |||
Implementors should note that the OIF list of a multicast state will | Implementors should note that the OIF list of a multicast state will | |||
change from time to time as ACs and/or remote PEs either become | change from time to time as ACs and/or remote PEs either become | |||
interested in, or lose interest in, particular multicast flows. | interested in or lose interest in particular multicast flows. | |||
4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | 4.1.2. Data Plane: Applying the OIF List to an (S,G) Frame | |||
When an (S,G) multicast frame is received by the layer 2 multicast | When an (S,G) multicast frame is received by the Layer 2 multicast | |||
function of a given EVPN-PE, say PE1, its disposition depends (a) on | function of a given EVPN PE, say PE1, its disposition depends upon | |||
the way it was received, (b) upon the OIF list of the corresponding | (a) the way it was received, (b) the OIF list of the corresponding | |||
multicast state (see Section 4.1.1), (c) upon the "eligibility" of an | multicast state (see Section 4.1.1), (c) the eligibility of an AC to | |||
AC to receive a given frame (see Section 4.1.2.1) and (d) upon its | receive a given frame (see Section 4.1.2.1), and (d) its apparent | |||
apparent source BD (see Section 3.2 for information about determining | source BD (see Section 3.2 for information about determining the | |||
the apparent source BD of a frame received over a tunnel from another | apparent source BD of a frame received over a tunnel from another | |||
PE). | PE). | |||
4.1.2.1. Eligibility of an AC to Receive a Frame | 4.1.2.1. Eligibility of an AC to Receive a Frame | |||
A given (S,G) multicast frame is eligible to be transmitted by a | A given (S,G) multicast frame is eligible to be transmitted by a | |||
given PE, say PE1, on a given AC, say AC1, only if one of the | given PE, say PE1, on a given AC, say AC1, only if one of the | |||
following conditions holds: | following conditions holds: | |||
1. ESI labels are being used, PE1 is the DF for the segment to which | 1. Ethernet Segment Identifier (ESI) labels are being used, PE1 is | |||
AC1 is connected, and the frame did not originate from that same | the DF for the segment to which AC1 is connected, and the frame | |||
segment (as determined by the ESI label), or | did not originate from that same segment (as determined by the | |||
ESI label). | ||||
2. The ingress PE for the frame is a remote PE, say PE2, local bias | 2. The ingress PE for the frame is a remote PE, say PE2, local bias | |||
is being used, and PE2 is not connected to the same segment as | is being used, and PE2 is not connected to the same segment as | |||
AC1. | AC1. | |||
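The two eligibility conditions can be expressed as a predicate. The sketch below is a simplified model, not a definitive implementation: `FrameContext`, its fields, and `ac_is_eligible` are hypothetical names, and segments are identified by plain strings.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class FrameContext:
    """Hypothetical per-frame facts needed for the eligibility check."""
    esi_labels_in_use: bool
    local_bias_in_use: bool
    origin_segment: Optional[str]        # segment named by the ESI label, if any
    ingress_pe_is_remote: bool
    ingress_pe_segments: FrozenSet[str]  # segments the ingress PE attaches to


def ac_is_eligible(ctx: FrameContext, ac_segment: str, local_pe_is_df: bool) -> bool:
    """Return True if the frame may be transmitted on an AC of ac_segment."""
    # Condition 1: ESI labels, this PE is the DF, frame is from another segment.
    cond1 = (
        ctx.esi_labels_in_use
        and local_pe_is_df
        and ctx.origin_segment != ac_segment
    )
    # Condition 2: remote ingress PE, local bias, ingress PE not on this segment.
    cond2 = (
        ctx.ingress_pe_is_remote
        and ctx.local_bias_in_use
        and ac_segment not in ctx.ingress_pe_segments
    )
    return cond1 or cond2
```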
4.1.2.2. Applying the OIF List | 4.1.2.2. Applying the OIF List | |||
Assume a given (S,G) multicast frame has been received by a given PE, | Assume a given (S,G) multicast frame has been received by a given PE, | |||
say PE1. PE1 determines the apparent source BD of the frame, finds | say PE1. PE1 determines the apparent source BD of the frame, finds | |||
the layer 2 (S,G) state for that BD (or the (*,G) state if there is | the Layer 2 (S,G) state for that BD (or the (*,G) state if there is | |||
no (S,G) state), and uses the OIF list from that state. (Note that | no (S,G) state), and uses the OIF list from that state. (Note that | |||
if PE1 is not attached to the actual source BD, the apparent source | if PE1 is not attached to the actual source BD, the apparent source | |||
BD will be the SBD.) | BD will be the SBD.) | |||
Suppose PE1 has determined the frame's apparent source BD to be BD1 | If PE1 has determined the frame's apparent source BD to be BD1 (which | |||
(which may or may not be the SBD.) There are the following cases to | may or may not be the SBD), then the following cases should be | |||
consider: | considered: | |||
1. The frame was received by PE1 from a local AC, say AC1, that | 1. The frame was received by PE1 from a local AC, say AC1, that | |||
attaches to BD1. | attaches to BD1. | |||
a. The frame MUST be sent on all local ACs of BD1 that appear in | a. The frame MUST be sent on all local ACs of BD1 that appear in | |||
the OIF list, except for AC1 itself. | the OIF list, except for AC1 itself. | |||
b. The frame MUST also be delivered to any other EVPN-PEs that | b. The frame MUST also be delivered to any other EVPN PEs that | |||
have interest in it. This is achieved as follows: | have interest in it. This is achieved as follows: | |||
i. If (a) AR is being used, and (b) PE1 is an AR-LEAF, and | i. If (a) AR is being used, (b) PE1 is an AR-LEAF, and (c) | |||
(c) the OIF list is non-null, PE1 MUST send the frame to | the OIF list is non-null, PE1 MUST send the frame to the | |||
the AR-REPLICATOR. | AR-REPLICATOR. | |||
ii. Otherwise the frame MUST be sent on all tunnels in the | ii. Otherwise, the frame MUST be sent on all tunnels in the | |||
OIF list. | OIF list. | |||
c. The frame MUST be sent to the local L3 routing instance by | c. The frame MUST be sent to the local L3 routing instance by | |||
being sent up the IRB interface of BD1. It MUST NOT be sent | being sent up the IRB interface of BD1. It MUST NOT be sent | |||
up any other IRB interfaces. | up any other IRB interfaces. | |||
2. The frame was received by PE1 over a tunnel from another PE. | 2. The frame was received by PE1 over a tunnel from another PE. | |||
(See Section 3.2 for the rules to determine the apparent source | (See Section 3.2 for the rules to determine the apparent source | |||
BD of a packet received from another PE. Note that if PE1 is not | BD of a packet received from another PE. Note that if PE1 is not | |||
attached to the source BD, it will regard the SBD as the apparent | attached to the source BD, it will regard the SBD as the apparent | |||
skipping to change at page 41, line 34 ¶ | skipping to change at line 1875 ¶ | |||
a. The frame MUST be sent on all local ACs in the OIF list that | a. The frame MUST be sent on all local ACs in the OIF list that | |||
connect to BD1 and that are eligible (per Section 4.1.2.1) to | connect to BD1 and that are eligible (per Section 4.1.2.1) to | |||
receive the frame. | receive the frame. | |||
b. The frame MUST be sent up the IRB interface of the apparent | b. The frame MUST be sent up the IRB interface of the apparent | |||
source BD. (Note that this may be the SBD.) The frame MUST | source BD. (Note that this may be the SBD.) The frame MUST | |||
NOT be sent up any other IRB interfaces. | NOT be sent up any other IRB interfaces. | |||
c. If PE1 is not an AR-REPLICATOR, it MUST NOT send the frame to | c. If PE1 is not an AR-REPLICATOR, it MUST NOT send the frame to | |||
any other EVPN-PEs. However, if PE1 is an AR-REPLICATOR, it | any other EVPN PEs. However, if PE1 is an AR-REPLICATOR, it | |||
MUST send the frame to all tunnels in the OIF list, except | MUST send the frame to all tunnels in the OIF list, except | |||
for the tunnel over which the frame was received. | for the tunnel over which the frame was received. | |||
3. The frame was received by PE1 from the BD1 IRB interface (i.e., | 3. The frame was received by PE1 from the BD1 IRB interface (i.e., | |||
the frame has been transmitted by PE1's L3 routing instance down | the frame has been transmitted by PE1's L3 routing instance down | |||
the BD1 IRB interface), and BD1 is NOT the SBD. | the BD1 IRB interface), and BD1 is NOT the SBD. | |||
a. The frame MUST be sent on all local ACs in the OIF list that | a. The frame MUST be sent on all local ACs in the OIF list that | |||
are eligible, as per Section 4.1.2.1, to receive the frame. | are eligible, as per Section 4.1.2.1, to receive the frame. | |||
b. The frame MUST NOT be sent to any other EVPN-PEs. | b. The frame MUST NOT be sent to any other EVPN PEs. | |||
c. The frame MUST NOT be sent up any IRB interfaces. | c. The frame MUST NOT be sent up any IRB interfaces. | |||
4. The frame was received from the SBD IRB interface (i.e., has been | 4. The frame was received from the SBD IRB interface (i.e., has been | |||
transmitted by PE1's L3 routing instance down the SBD IRB | transmitted by PE1's L3 routing instance down the SBD IRB | |||
interface). | interface). | |||
a. The frame MUST be sent on all tunnels in the OIF list. This | a. The frame MUST be sent on all tunnels in the OIF list. This | |||
causes the frame to be delivered to any other EVPN-PEs that | causes the frame to be delivered to any other EVPN PEs that | |||
have interest in it. | have interest in it. | |||
b. The frame MUST NOT be sent on any local ACs. | b. The frame MUST NOT be sent on any local ACs. | |||
c. The frame MUST NOT be sent up any IRB interfaces. | c. The frame MUST NOT be sent up any IRB interfaces. | |||
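The four cases above can be summarized as a single dispatch routine. The following is a sketch under stated assumptions: all names (`Origin`, `Disposition`, `apply_oif_list`) are hypothetical, the OIF list is modeled as separate sets of ACs and tunnels, "non-null" is taken to mean that at least one of those sets is non-empty, and the eligibility test of Section 4.1.2.1 is passed in as a callable.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Optional, Set


class Origin(Enum):
    LOCAL_AC = auto()  # case 1: received from a local AC of the BD
    TUNNEL = auto()    # case 2: received over a tunnel from another PE
    BD_IRB = auto()    # case 3: sent down the IRB of a BD that is not the SBD
    SBD_IRB = auto()   # case 4: sent down the SBD IRB


@dataclass
class Disposition:
    acs: Set[str] = field(default_factory=set)
    tunnels: Set[str] = field(default_factory=set)
    to_ar_replicator: bool = False
    irb_up: Optional[str] = None  # BD whose IRB receives the frame, if any


def apply_oif_list(
    origin: Origin,
    bd: str,
    oif_acs: Set[str],
    oif_tunnels: Set[str],
    *,
    rx_ac: Optional[str] = None,
    rx_tunnel: Optional[str] = None,
    ar_in_use: bool = False,
    is_ar_leaf: bool = False,
    is_ar_replicator: bool = False,
    eligible: Callable[[str], bool] = lambda ac: True,
) -> Disposition:
    out = Disposition()
    if origin is Origin.LOCAL_AC:
        out.acs = oif_acs - {rx_ac}              # 1a: never back out the receiving AC
        if ar_in_use and is_ar_leaf and (oif_acs or oif_tunnels):
            out.to_ar_replicator = True          # 1b(i): assumed meaning of "non-null"
        else:
            out.tunnels = set(oif_tunnels)       # 1b(ii)
        out.irb_up = bd                          # 1c: only this BD's IRB
    elif origin is Origin.TUNNEL:
        out.acs = {ac for ac in oif_acs if eligible(ac)}  # 2a
        out.irb_up = bd                          # 2b: apparent source BD (may be SBD)
        if is_ar_replicator:
            out.tunnels = oif_tunnels - {rx_tunnel}       # 2c
    elif origin is Origin.BD_IRB:
        out.acs = {ac for ac in oif_acs if eligible(ac)}  # 3a; 3b and 3c send nothing
    elif origin is Origin.SBD_IRB:
        out.tunnels = set(oif_tunnels)           # 4a; 4b and 4c send nothing
    return out
```

Returning a `Disposition` value rather than transmitting directly keeps the sketch free of I/O, so each rule can be checked in isolation.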
4.2. Layer 3 Forwarding State | 4.2. Layer 3 Forwarding State | |||
If an EVPN-PE is performing IGMP/MLD procedures on the ACs of a given | If an EVPN PE is performing IGMP/MLD procedures on the ACs of a given | |||
BD, it processes those messages at layer 2 to help form the layer 2 | BD, it processes those messages at Layer 2 to help form the Layer 2 | |||
multicast state. It also sends those messages up that BD's IRB | multicast state. It also sends those messages up that BD's IRB | |||
interface to the L3 routing instance of a particular tenant domain. | interface to the L3 routing instance of a particular Tenant Domain. | |||
This causes (C-S,C-G) or (C-*,C-G) L3 state to be created/updated. | This causes the (C-S,C-G) or (C-*,C-G) L3 state to be created/ | |||
updated. | ||||
A layer 3 multicast state has both an Input Interface (IIF) and an | A Layer 3 multicast state has both an Input Interface (IIF) and an | |||
OIF list. | OIF list. | |||
For a (C-S,C-G) state, if the source BD is present on the PE, the IIF | For a (C-S,C-G) state, if the source BD is present on the PE, the IIF | |||
is set to the IRB interface that attaches to that BD. Otherwise the | is set to the IRB interface that attaches to that BD. Otherwise, the | |||
IIF is set to the SBD IRB interface. | IIF is set to the SBD IRB interface. | |||
For (C-*,C-G) states, traffic can arrive from any BD, so the IIF | For (C-*,C-G) states, traffic can arrive from any BD, so the IIF | |||
needs to be set to a wildcard value meaning "any IRB interface". | needs to be set to a wildcard value meaning "any IRB interface". | |||
The OIF list of these states includes one or more of the IRB | The OIF list of these states includes one or more of the IRB | |||
interfaces of the Tenant Domain. In general, maintenance of the OIF | interfaces of the Tenant Domain. In general, maintenance of the OIF | |||
list does not require any EVPN-specific procedures. However, there | list does not require any EVPN-specific procedures. However, there | |||
is one EVPN-specific rule: | is one EVPN-specific rule: | |||
If the IIF is one of the IRB interfaces (or the wild card meaning | If the IIF is one of the IRB interfaces (or the wildcard meaning | |||
"any IRB interface"), then the SBD IRB interface MUST NOT be added | "any IRB interface"), then the SBD IRB interface MUST NOT be added | |||
to the OIF list. Traffic originating from within a particular | to the OIF list. Traffic originating from within a particular | |||
EVPN Tenant Domain must not be sent down the SBD IRB interface, as | EVPN Tenant Domain must not be sent down the SBD IRB interface, as | |||
such traffic has already been distributed to all EVPN-PEs attached | such traffic has already been distributed to all EVPN PEs attached | |||
to that Tenant Domain. | to that Tenant Domain. | |||
Please also see Section 6.1.1, which states a modification of this | Please also see Section 6.1.1, which states a modification of this | |||
rule for the case where OISM is interworking with external Layer 3 | rule for the case where OISM is interworking with external Layer 3 | |||
multicast routing. | multicast routing. | |||
5. Interworking with non-OISM EVPN-PEs | 5. Interworking with Non-OISM EVPN PEs | |||
It is possible that a given Tenant Domain will be attached to both | It is possible that a given Tenant Domain will be attached to both | |||
OISM PEs and non-OISM PEs. Inter-subnet IP multicast should be | OISM PEs and non-OISM PEs. Inter-subnet IP multicast should be | |||
possible and fully functional even if not all PEs attaching to a | possible and fully functional even if not all PEs attaching to a | |||
Tenant Domain can be upgraded to support OISM functionality. | Tenant Domain can be upgraded to support OISM functionality. | |||
Note that the non-OISM PEs are not required to have IRB support, or | Note that the non-OISM PEs are not required to have IRB support or | |||
support for [RFC9251]. It is however advantageous for the non-OISM | support for [RFC9251]. However, it is advantageous for the non-OISM | |||
PEs to support [RFC9251]. | PEs to support [RFC9251]. | |||
In this section, we will use the following terminology: | In this section, we will use the following terminology: | |||
* PE-S: the ingress PE for an (S,G) flow. | PE-S: The ingress PE for an (S,G) flow. | |||
* PE-R: an egress PE for an (S,G) flow. | PE-R: An egress PE for an (S,G) flow. | |||
* BD-S: the source BD for an (S,G) flow. PE-S must have one or more | BD-S: The source BD for an (S,G) flow. PE-S must have one or more | |||
ACs attached BD-S, at least one of which attaches to host S. | ACs attached to BD-S, at least one of which attaches to host S. | |||
* BD-R: a BD that contains a host interested in the flow. The host | BD-R: A BD that contains a host interested in the flow. The host is | |||
is attached to PE-R via an AC that belongs to BD-R. | attached to PE-R via an AC that belongs to BD-R. | |||
To allow OISM PEs to interwork with non-OISM PEs, a given Tenant | To allow OISM PEs to interwork with non-OISM PEs, a given Tenant | |||
Domain needs to contain one or more "IP Multicast Gateways" (IPMGs). | Domain needs to contain one or more IP Multicast Gateways (IPMGs). | |||
An IPMG is an OISM PE with special responsibilities regarding the | An IPMG is an OISM PE with special responsibilities regarding the | |||
interworking between OISM and non-OISM PEs. | interworking between OISM and non-OISM PEs. | |||
If a PE is functioning as an IPMG, it MUST signal this fact by | If a PE is functioning as an IPMG, it MUST signal this fact by | |||
setting the "IPMG" flag in the Multicast Flags EC that it attaches to | setting the IPMG flag in the Multicast Flags EC that it attaches to | |||
its IMET routes. An IPMG SHOULD attach this EC, with the IPMG flag | its IMET routes. An IPMG SHOULD attach this EC, with the IPMG flag | |||
set, to all IMET routes it originates. Furthermore, if PE1 imports | set, to all IMET routes it originates. Furthermore, if PE1 imports | |||
any IMET route from PE2 that has the EC present with the "IPMG" flag | any IMET route from PE2 that has the EC present with the IPMG flag | |||
set, then PE1 will assume that PE2 is an IPMG. | set, then PE1 will assume that PE2 is an IPMG. | |||
An IPMG Designated Forwarder (IPMG-DF) selection procedure is used to | An IPMG Designated Forwarder (IPMG-DF) selection procedure is used to | |||
ensure that, at any given time, there is exactly one active IPMG-DF | ensure that there is exactly one active IPMG-DF for any given BD at | |||
for any given BD. Details of the IPMG-DF selection procedure are in | any given time. Details of the IPMG-DF selection procedure are in | |||
Section 5.1. The IPMG-DF for a given BD, say BD-S, has special | Section 5.1. The IPMG-DF for a given BD, say BD-S, has special | |||
functions to perform when it receives (S,G) frames on that BD: | functions to perform when it receives (S,G) frames on that BD: | |||
* If the frames are from a non-OISM PE-S: | * If the frames are from a non-OISM PE-S: | |||
- The IPMG-DF forwards them to OISM PEs that do not attach to | - The IPMG-DF forwards them to OISM PEs that do not attach to | |||
BD-S but have interest in (S,G). | BD-S but have interest in (S,G). | |||
Note that OISM PEs that do attach to BD-S will have received | Note that OISM PEs that do attach to BD-S will have received | |||
the frames on the BUM tunnel from the non-OISM PE-S. | the frames on the BUM tunnel from the non-OISM PE-S. | |||
skipping to change at page 44, line 16 | skipping to change at line 2000 | |||
with interest in (S,G), it will receive one copy of the frame | with interest in (S,G), it will receive one copy of the frame | |||
for each such BD. This is necessary because the non-OISM PEs | for each such BD. This is necessary because the non-OISM PEs | |||
cannot move IP multicast traffic from one BD to another. | cannot move IP multicast traffic from one BD to another. | |||
* If the frames are from an OISM PE, the IPMG-DF forwards them to | * If the frames are from an OISM PE, the IPMG-DF forwards them to | |||
non-OISM PEs that have interest in (S,G) on ACs that do not belong | non-OISM PEs that have interest in (S,G) on ACs that do not belong | |||
to BD-S. | to BD-S. | |||
If a non-OISM PE has interest in (S,G) on an AC belonging to BD-S, | If a non-OISM PE has interest in (S,G) on an AC belonging to BD-S, | |||
it will have received a copy of the (S,G) frame, encapsulated for | it will have received a copy of the (S,G) frame, encapsulated for | |||
BD-S, from the OISM PE-S. (See Section 3.2.2.) If the non-OISM | BD-S, from the OISM PE-S (see Section 3.2.2). If the non-OISM PE | |||
PE has interest in (S,G) on one or more ACs belonging to | has interest in (S,G) on one or more ACs belonging to BD- | |||
BD-R1,...,BD-Rk where the BD-Ri are distinct from BD-S, the | R1,...,BD-Rk where the BD-Ri are distinct from BD-S, the IPMG-DF | |||
IPMG-DF needs to send it a copy of the frame for each BD-Ri. | needs to send it a copy of the frame for each BD-Ri. | |||
If an IPMG receives a frame on a BD for which it is not the IPMG-DF, | If an IPMG receives a frame on a BD for which it is not the IPMG-DF, | |||
it just follows normal OISM procedures. | it just follows normal OISM procedures. | |||
This section specifies several sets of procedures: | This section specifies several sets of procedures: | |||
* the procedures that the IPMG-DF for a given BD needs to follow | * the procedures that the IPMG-DF for a given BD needs to follow | |||
when receiving, on that BD, an IP multicast frame from a non-OISM | when receiving, on that BD, an IP multicast frame from a non-OISM | |||
PE; | PE; | |||
* the procedures that the IPMG-DF for a given BD needs to follow | * the procedures that the IPMG-DF for a given BD needs to follow | |||
when receiving, on that BD, an IP multicast frame from an OISM PE; | when receiving, on that BD, an IP multicast frame from an OISM PE; | |||
and | ||||
* the procedures that an OISM PE needs to follow when receiving, on | * the procedures that an OISM PE needs to follow when receiving, on | |||
a given BD, an IP multicast frame from a non-OISM PE, when the | a given BD, an IP multicast frame from a non-OISM PE, when the | |||
OISM PE is not the IPMG-DF for that BD. | OISM PE is not the IPMG-DF for that BD. | |||
To enable OISM/non-OISM interworking in a given Tenant Domain, the | To enable OISM/non-OISM interworking in a given Tenant Domain, the | |||
Tenant Domain MUST have some EVPN-PEs that can function as IPMGs. An | Tenant Domain MUST have some EVPN PEs that can function as IPMGs. An | |||
IPMG must be configured with the SBD. It must also be configured | IPMG must be configured with the SBD. It must also be configured | |||
with every BD of the Tenant Domain that exists on any of the non-OISM | with every BD of the Tenant Domain that exists on any of the non-OISM | |||
PEs of that domain. (Operationally, it may be simpler to configure | PEs of that domain. (Operationally, it may be simpler to configure | |||
the IPMG with all the BDs of the Tenant Domain.) | the IPMG with all the BDs of the Tenant Domain.) | |||
A non-OISM PE of course only needs to be configured with BDs for | Of course, a non-OISM PE only needs to be configured with BDs for | |||
which it has ACs. An OISM PE that is not an IPMG only needs to be | which it has ACs. An OISM PE that is not an IPMG only needs to be | |||
configured with the SBD and with the BDs for which it has ACs. | configured with the SBD and with the BDs for which it has ACs. | |||
An IPMG MUST originate a wildcard SMET route (with (C-*,C-*) in the | An IPMG MUST originate a wildcard SMET route (with (C-*,C-*) in the | |||
NLRI) for each BD in the Tenant Domain. This will cause it to | NLRI) for each BD in the Tenant Domain. This will cause it to | |||
receive all the IP multicast traffic that is sourced in the Tenant | receive all the IP multicast traffic that is sourced in the Tenant | |||
Domain. Note that non-OISM nodes that do not support [RFC9251] will | Domain. Note that non-OISM nodes that do not support [RFC9251] will | |||
send all the multicast traffic from a given BD to all PEs attached to | send all the multicast traffic from a given BD to all PEs attached to | |||
that BD, even if those PEs do not originate an SMET route. | that BD, even if those PEs do not originate a SMET route. | |||
The interworking procedures vary somewhat depending upon whether | The interworking procedures vary somewhat depending upon whether | |||
packets are transmitted from PE to PE via Ingress Replication (IR) or | packets are transmitted from PE to PE via IR or via P2MP tunnels. In | |||
via Point-to-Multipoint (P2MP) tunnels. We do not consider the use | this section, we do not consider the use of BIER due to the low | |||
of BIER in this section, due to the low likelihood of there being a | likelihood of there being a non-OISM PE that supports BIER. | |||
non-OISM PE that supports BIER. | ||||
5.1. IPMG Designated Forwarder | 5.1. IPMG Designated Forwarder | |||
Every PE that is eligible for selection as an IPMG-DF for a | Every PE that is eligible for selection as an IPMG-DF for a | |||
particular BD originates both an IMET route for that BD and an | particular BD originates both an IMET route for that BD and an SBD- | |||
SBD-IMET route. As stated in Section 5, these SBD-IMET routes carry | IMET route. As stated in Section 5, these SBD-IMET routes carry a | |||
a Multicast Flags EC with the IPMG Flag set. | Multicast Flags EC with the IPMG flag set. | |||
These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | |||
Election EC and its use are specified in [RFC8584]. When the route is | Election EC and its use are specified in [RFC8584]. When the route is | |||
originated, the AC-DF bit in the DF Election EC SHOULD not be set. | originated, the AC-DF bit in the DF Election EC SHOULD NOT be set. | |||
This bit is not used when selecting an IPMSG-DF, i.e., it MUST be | This bit is not used when selecting an IPMG-DF, i.e., it MUST be | |||
ignored by the receiver of an SBD-IMET route. | ignored by the receiver of an SBD-IMET route. | |||
In the context of a given Tenant Domain, to select the IPMG-DF for a | In the context of a given Tenant Domain, to select the IPMG-DF for a | |||
particular BD, say BD1, the IPMGs of the Tenant Domain perform the | particular BD, say BD1, the IPMGs of the Tenant Domain perform the | |||
following procedure: | following procedures: | |||
* From the set of received SBD-IMET routes for the given tenant | * From the set of received SBD-IMET routes for the given Tenant | |||
domain, determine the candidate set of PEs that support IPMG | Domain, determine the candidate set of PEs that support IPMG | |||
functionality for that domain. | functionality for that domain. | |||
* Eliminate from that candidate set any PEs from which an IMET route | * From that candidate set, eliminate any PEs from which an IMET | |||
for BD1 has not been received. | route for BD1 has not been received. | |||
* Select a DF Election algorithm as specified in [RFC8584]. Some of | * Select a DF election algorithm as specified in [RFC8584]. Some of | |||
the possible algorithms can be found, e.g., in [RFC8584], | the possible algorithms can be found, e.g., in [RFC8584], | |||
[RFC7432], and [I-D.ietf-bess-evpn-pref-df]. | [RFC7432], and [EVPN-DF]. | |||
* Apply the DF Election Algorithm (see [RFC8584]) to the candidate | * Apply the DF election algorithm (see [RFC8584]) to the candidate | |||
set of PEs. The "winner' becomes the IPMG-DF for BD1. | set of PEs. The winner becomes the IPMG-DF for BD1. | |||
Note that even if a given PE supports MEG Section 6.1.2) and/or PEG | Note that even if a given PE supports MEG (Section 6.1.2) and/or PEG | |||
(Section 6.1.4) functionality, as well as IPMG functionality, its | (Section 6.1.4) functionality, as well as IPMG functionality, its | |||
SBD-IMET routes carry only one DF Election EC. | SBD-IMET routes carry only one DF Election EC. | |||
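
The selection steps of Section 5.1 can be sketched roughly as below. This is a minimal sketch under stated assumptions: it uses the default modulus-based election of [RFC7432] as the example algorithm, and the integer `bd_id` fed to the modulus is a hypothetical per-BD value chosen for illustration. The function and parameter names are likewise hypothetical, and all PE addresses are assumed to belong to a single address family.

```python
import ipaddress


def select_ipmg_df(sbd_imet_ipmg_flag, bd_imet_originators, bd_id):
    """Return the address of the IPMG-DF for one BD, or None if no candidate.

    sbd_imet_ipmg_flag:  dict mapping PE address -> True if that PE's
                         SBD-IMET route carries the IPMG flag
    bd_imet_originators: set of PE addresses that originated an IMET
                         route for the BD in question
    bd_id:               hypothetical integer identifying the BD
    """
    # Step 1: candidate set = PEs advertising IPMG functionality.
    candidates = {pe for pe, flag in sbd_imet_ipmg_flag.items() if flag}
    # Step 2: eliminate PEs with no IMET route for this BD.
    candidates &= set(bd_imet_originators)
    if not candidates:
        return None
    # Steps 3 and 4: apply the chosen election algorithm (here: order the
    # candidates by address and pick the one at index bd_id mod N).
    ordered = sorted(candidates, key=ipaddress.ip_address)
    return ordered[bd_id % len(ordered)]
```

For every IPMG to arrive at the same winner, each one would need to include its own routes in both inputs alongside the routes it has received.
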
5.2. Ingress Replication | 5.2. Ingress Replication | |||
The procedures of this section are used when Ingress Replication is | The procedures of this section are used when IR is used to transmit | |||
used to transmit packets from one PE to another. | packets from one PE to another. | |||
When a non-OISM PE-S transmits a multicast frame from BD-S to another | When a non-OISM PE-S transmits a multicast frame from BD-S to another | |||
PE, PE-R, PE-S will use the encapsulation specified in the BD-S IMET | PE, say PE-R, PE-S will use the encapsulation specified in the BD-S | |||
route that was originated by PE-R. This encapsulation will include | IMET route that was originated by PE-R. This encapsulation will | |||
the label that appears in the "MPLS label" field of the PMSI Tunnel | include the label that appears in the MPLS Label field of the PTA of | |||
attribute (PTA) of the IMET route. If the tunnel type is VXLAN, the | the IMET route. If the tunnel type is VXLAN, the label is actually a | |||
"label" is actually a Virtual Network Identifier (VNI); for other | Virtual Network Identifier (VNI); for other tunnel types, the label | |||
tunnel types, the label is an MPLS label. In either case, we will | is an MPLS label. In either case, the frames are transmitted with a | |||
speak of the transmitted frames as carrying a label that was assigned | label that was assigned to a particular BD by the PE-R to which the | |||
to a particular BD by the PE-R to which the frame is being | frame is being transmitted. | |||
transmitted. | ||||
To support OISM/non-OISM interworking, an OISM PE-R MUST originate, | To support OISM/non-OISM interworking, an OISM PE-R MUST originate, | |||
for each of its BDs, both an IMET route and an S-PMSI (C-*,C-*) A-D | for each of its BDs, both an IMET route and a (C-*,C-*) S-PMSI A-D | |||
route. Note that even when IR is being used, interworking between | route. Note that even when IR is being used, interworking between | |||
OISM and non-OISM PEs requires the OISM PEs to follow the rules of | OISM and non-OISM PEs requires the OISM PEs to follow the rules of | |||
Section 3.2.5.2, as modified below. | Section 3.2.5.2, as modified below. | |||
Non-OISM PEs will not understand S-PMSI A-D routes. So when a | Non-OISM PEs will not understand S-PMSI A-D routes. So when a non- | |||
non-OISM PE-S transmits an IP multicast frame with a particular | OISM PE-S transmits an IP multicast frame with a particular source BD | |||
source BD to an IPMG, it encapsulates the frame using the label | to an IPMG, it encapsulates the frame using the label specified in | |||
specified in that IPMG's BD-S IMET route. (This is just the | that IPMG's BD-S IMET route. (This is just the procedure of | |||
procedure of [RFC7432].) | [RFC7432].) | |||
The (C-*,C-*) S-PMSI A-D route originated by a given OISM PE will | The (C-*,C-*) S-PMSI A-D route originated by a given OISM PE will | |||
have a PTA that specifies IR. | have a PTA that specifies IR. | |||
* If MPLS tunneling is being used, the MPLS label field SHOULD | * If MPLS tunneling is being used, the MPLS Label field SHOULD | |||
contain a non-zero value, and the LIR flag SHOULD be zero. (The | contain a non-zero value, and the LIR flag SHOULD be zero. (The | |||
case where the MPLS label field is zero or the LIR flag is set is | case where the MPLS Label field is zero or the LIR flag is set is | |||
outside the scope of this document.) | outside the scope of this document.) | |||
* If the tunnel encapsulation is VXLAN, the MPLS label field MUST | * If the tunnel encapsulation is VXLAN, the MPLS Label field MUST | |||
contain a non-zero value, and the LIR flag MUST be zero. | contain a non-zero value, and the LIR flag MUST be zero. | |||
When an OISM PE-S transmits an IP multicast frame to an IPMG, it will | When an OISM PE-S transmits an IP multicast frame to an IPMG, it will | |||
use the label specified in that IPMG's (C-*,C-*) S-PMSI A-D route. | use the label specified in that IPMG's (C-*,C-*) S-PMSI A-D route. | |||
When a PE originates both an IMET route and a (C-*,C-*) S-PMSI A-D | When a PE originates both an IMET route and a (C-*,C-*) S-PMSI A-D | |||
route, the values of the MPLS label field in the respective PTAs must | route, the values of the MPLS Label field in the respective PTAs must | |||
be distinct. Further, each MUST map uniquely (in the context of the | be distinct. Further, each MUST map uniquely (in the context of the | |||
originating PE) to the route's BD. | originating PE) to the route's BD. | |||
As a result, an IPMG receiving an MPLS-encapsulated IP multicast | As a result, an IPMG receiving an MPLS-encapsulated IP multicast | |||
frame can always tell by the label whether the frame's ingress PE is | frame can always tell by the label whether the frame's ingress PE is | |||
an OISM PE or a non-OISM PE. When an IPMG receives a VXLAN- | an OISM PE or a non-OISM PE. When an IPMG receives a VXLAN- | |||
encapsulated IP multicast frame it may need to determine the identity | encapsulated IP multicast frame, it may need to determine the | |||
of the ingress PE from the outer IP encapsulation; it can then | identity of the ingress PE from the outer IP encapsulation; it can | |||
determine whether the ingress PE is an OISM PE or a non-OISM PE by | then determine whether the ingress PE is an OISM PE or a non-OISM PE | |||
looking the IMET route from that PE. | by looking at the IMET route from that PE. | |||
Suppose an IPMG receives an IP multicast frame from another EVPN-PE | Suppose an IPMG receives an IP multicast frame from another EVPN PE | |||
in the Tenant Domain, and the IPMG is not the IPMG-DF for the frame's | in the Tenant Domain and the IPMG is not the IPMG-DF for the frame's | |||
source BD. Then the IPMG performs only the ordinary OISM functions; | source BD. Then, the IPMG performs only the ordinary OISM functions; | |||
it does not perform the IPMG-specific functions for that frame. In | it does not perform the IPMG-specific functions for that frame. In | |||
the remainder of this section, when we discuss the procedures applied | the remainder of this section, when we discuss the procedures applied | |||
by an IPMG when it receives an IP multicast frame, we are presuming | by an IPMG when it receives an IP multicast frame, we are presuming | |||
that the source BD of the frame is a BD for which the IPMG is the | that the source BD of the frame is a BD for which the IPMG is the | |||
IPMG-DF. | IPMG-DF. | |||
We have two basic cases to consider: (1) a frame's ingress PE is a | We have two basic cases to consider: (1) a frame's ingress PE is a | |||
non-OISM node, and (2) a frame's ingress PE is an OISM node. | non-OISM node and (2) a frame's ingress PE is an OISM node. | |||
5.2.1. Ingress PE is non-OISM | 5.2.1. Ingress PE is Non-OISM | |||
In this case, a non-OISM PE, PE-S, has received an (S,G) multicast | In this case, a non-OISM PE, say PE-S, has received an (S,G) | |||
frame over an AC that is attached to a particular BD, BD-S. By | multicast frame over an AC that is attached to a particular BD, say | |||
virtue of normal EVPN procedures, PE-S has sent a copy of the frame | BD-S. By virtue of normal EVPN procedures, PE-S has sent a copy of | |||
to every PE-R (both OISM and non-OISM) in the Tenant Domain that is | the frame to every PE-R (both OISM and non-OISM) in the Tenant Domain | |||
attached to BD-S. If the non-OISM node supports [RFC9251], only PEs | that is attached to BD-S. If the non-OISM node supports [RFC9251], | |||
that have expressed interest in (S,G) receive the frame. The IPMG | only PEs that have expressed interest in (S,G) receive the frame. | |||
will have expressed interest via a (C-*,C-*) SMET route and thus | The IPMG will have expressed interest via a (C-*,C-*) SMET route and | |||
receives the frame. | thus receives the frame. | |||
Any OISM PE (including an IPMG) receiving the frame will apply normal | Any OISM PE (including an IPMG) receiving the frame will apply normal | |||
OISM procedures. As a result it will deliver the frame to any of its | OISM procedures. As a result, it will deliver the frame to any of | |||
local ACs (in BD-S or in any other BD) that have interest in (S,G). | its local ACs (in BD-S or in any other BD) that have interest in | |||
(S,G). | ||||
An OISM PE that is also the IPMG-DF for a particular BD, say BD-S, | An OISM PE that is also the IPMG-DF for a particular BD, say BD-S, | |||
has additional procedures that it applies to frames received on BD-S | has additional procedures that it applies to frames received on BD-S | |||
from non-OISM PEs: | from non-OISM PEs: | |||
1. When the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM | 1. When the IPMG-DF for BD-S receives an (S,G) frame from a non-OISM | |||
node, it MUST forward a copy of the frame to every OISM PE that | node, it MUST forward a copy of the frame to every OISM PE that | |||
is NOT attached to BD-S but has interest in (S,G). The copy sent | is NOT attached to BD-S but has interest in (S,G). The copy sent | |||
to a given OISM PE-R must carry the label that PE-R has assigned | to a given OISM PE-R must carry the label that PE-R has assigned | |||
to the SBD in an S-PMSI A-D route. The IPMG MUST NOT do any IP | to the SBD in an S-PMSI A-D route. The IPMG MUST NOT do any IP | |||
processing of the frame's IP payload. TTL decrement and other IP | processing of the frame's IP payload. TTL decrement and other IP | |||
processing will be done by PE-R, per the normal OISM procedures. | processing will be done by PE-R, per the normal OISM procedures. | |||
There is no need for the IPMG to include an ESI label in the | There is no need for the IPMG to include an ESI label in the | |||
frame's tunnel encapsulation, because it is already known that | frame's tunnel encapsulation, because it is already known that | |||
the frame's source BD has no presence on PE-R. There is also no | the frame's source BD has no presence on PE-R. There is also no | |||
need for the IPMG to modify the frame's MAC SA. | need for the IPMG to modify the frame's MAC SA. | |||
2. In addition, when the IPMG-DF for BD-S receives an (S,G) frame | 2. In addition, when the IPMG-DF for BD-S receives an (S,G) frame | |||
from a non-OISM node, it may need to forward copies of the frame | from a non-OISM node, it may need to forward copies of the frame | |||
to other non-OISM nodes. Before it does so, it MUST decapsulate | to other non-OISM nodes. Before it does so, it MUST decapsulate | |||
the (S,G) packet, and do the IP processing (e.g., TTL decrement). | the (S,G) packet and do the IP processing (e.g., TTL decrement). | |||
Suppose PE-R is a non-OISM node that has an AC to BD-R, where | Suppose PE-R is a non-OISM node that has an AC to BD-R, where | |||
BD-R is not the same as BD-S, and that AC has interest in (S,G). | BD-R is not the same as BD-S, and that AC has interest in (S,G). | |||
The IPMG must then encapsulate the (S,G) packet (after the IP | The IPMG must then encapsulate the (S,G) packet (after the IP | |||
processing has been done) in an Ethernet header. The MAC SA | processing has been done) in an Ethernet header. The MAC SA | |||
field will have the MAC address of the IPMG's IRB interface for | field will have the MAC address of the IPMG's IRB interface for | |||
BD-R. The IPMG then sends the frame to PE-R. The tunnel | BD-R. The IPMG then sends the frame to PE-R. The tunnel | |||
encapsulation will carry the label that PE-R advertised in its | encapsulation will carry the label that PE-R advertised in its | |||
IMET route for BD-R. There is no need to include an ESI label, | IMET route for BD-R. There is no need to include an ESI label, | |||
as the source and destination BDs are known to be different. | as the source and destination BDs are known to be different. | |||
Note that if a non-OISM PE-R has several BDs (other than BD-S) | Note that if a non-OISM PE-R has several BDs (other than BD-S) | |||
with local ACs that have interest in (S,G), the IPMG will send it | with local ACs that have interest in (S,G), the IPMG will send it | |||
one copy for each such BD. This is necessary because the | one copy for each such BD. This is necessary because the non- | |||
non-OISM PE cannot move packets from one BD to another. | OISM PE cannot move packets from one BD to another. | |||
There may be deployment scenarios in which every OISM PE is | There may be deployment scenarios in which every OISM PE is | |||
configured with every BD that is present on any non-OISM PE. In such | configured with every BD that is present on any non-OISM PE. In such | |||
scenarios, the procedures of item 1 above will not actually result in | scenarios, the procedures of item 1 above will not actually result in | |||
the transmission of any packets. Hence if it is known a priori that | the transmission of any packets. Hence, if it is known a priori that | |||
this deployment scenario exists for a given tenant domain, the | this deployment scenario exists for a given Tenant Domain, the | |||
procedures of item 1 above can be disabled. | procedures of item 1 above can be disabled. | |||
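
To make items 1 and 2 of Section 5.2.1 concrete, the following sketch computes which copies the IPMG-DF for BD-S would send when the ingress PE is non-OISM. It is illustrative only: the `Copy` type and the per-PE dictionary layout (`oism`, `bds`, `interested_bds`) are hypothetical modeling choices, and label handling is reduced to naming the BD whose label would be used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    dest_pe: str
    bd: str       # BD whose label the copy carries ("SBD" or a BD-R)
    routed: bool  # True if the IPMG does IP processing (e.g., TTL decrement)


def ipmg_df_copies_from_non_oism(bd_s, pes):
    """List the copies the IPMG-DF for bd_s sends for one (S,G) frame.

    pes maps a PE name to a dict with:
      "oism":           True for an OISM PE
      "bds":            set of BDs the PE is attached to
      "interested_bds": set of BDs on which the PE has (S,G) interest
    """
    copies = []
    for pe, info in pes.items():
        if info["oism"]:
            # Item 1: OISM PEs not attached to BD-S get one unmodified copy
            # carrying the label they assigned to the SBD.
            if bd_s not in info["bds"] and info["interested_bds"]:
                copies.append(Copy(pe, "SBD", routed=False))
        else:
            # Item 2: non-OISM PEs get one routed copy per interested BD-R,
            # since they cannot move packets from one BD to another.
            for bd_r in info["interested_bds"]:
                if bd_r != bd_s:
                    copies.append(Copy(pe, bd_r, routed=True))
    return sorted(copies, key=lambda c: (c.dest_pe, c.bd))
```
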
5.2.2. Ingress PE is OISM | 5.2.2. Ingress PE is OISM | |||
In this case, an OISM PE, PE-S, has received an (S,G) multicast frame | In this case, an OISM PE, say PE-S, has received an (S,G) multicast | |||
over an AC that attaches to a particular BD, BD-S. | frame over an AC that attaches to a particular BD, say BD-S. | |||
By virtue of receiving all the IMET routes for BD-S, PE-S will know | By virtue of receiving all the IMET routes for BD-S, PE-S will know | |||
all the PEs attached to BD-S. By virtue of normal OISM procedures: | all the PEs attached to BD-S. By virtue of normal OISM procedures: | |||
* PE-S will send a copy of the frame to every OISM PE-R (including | * PE-S will send a copy of the frame to every OISM PE-R (including | |||
the IPMG) in the Tenant Domain that is attached to BD-S and has | the IPMG) in the Tenant Domain that is attached to BD-S and has | |||
interest in (S,G). The copy sent to a given PE-R carries the | interest in (S,G). The copy sent to a given PE-R carries the | |||
label that that the PE-R has assigned to BD-S in its (C-*,C-*) | label that the PE-R has assigned to BD-S in its (C-*,C-*) S-PMSI | |||
S-PMSI A-D route. | A-D route. | |||
* PE-S will also transmit a copy of the (S,G) frame to every OISM | * PE-S will also transmit a copy of the (S,G) frame to every OISM | |||
PE-R that has interest in (S,G) but is not attached to BD-S. The | PE-R that has interest in (S,G) but is not attached to BD-S. The | |||
copy will contain the label that the PE-R has assigned to the SBD. | copy will contain the label that the PE-R has assigned to the SBD. | |||
(As specified in Section 5.2.1, an IPMG is assumed to have | (As specified in Section 5.2.1, an IPMG is assumed to have | |||
indicated interest in all multicast flows.) | indicated interest in all multicast flows.) | |||
* PE-S will also transmit a copy of the (S,G) frame to every | * PE-S will also transmit a copy of the (S,G) frame to every non- | |||
non-OISM PE-R that is attached to BD-S. It does this using the | OISM PE-R that is attached to BD-S. It does this using the label | |||
label advertised by that PE-R in its IMET route for BD-S. | advertised by that PE-R in its IMET route for BD-S. | |||
The PE-Rs follow their normal procedures. An OISM PE that receives | The PE-Rs follow their normal procedures. An OISM PE that receives | |||
the (S,G) frame on BD-S applies the OISM procedures to deliver the | the (S,G) frame on BD-S applies the OISM procedures to deliver the | |||
frame to its local ACs, as necessary. A non-OISM PE that receives | frame to its local ACs as necessary. A non-OISM PE that receives the | |||
the (S,G) frame on BD-S delivers the frame only to its local BD-S | (S,G) frame on BD-S delivers the frame only to its local BD-S ACs as | |||
ACs, as necessary. | necessary. | |||
Suppose that a non-OISM PE-R has interest in (S,G) on a BD, BD-R, | Suppose that a non-OISM PE-R has interest in (S,G) on a BD that is | |||
that is different than BD-S. If the non-OISM PE-R is attached to | different than BD-S, say BD-R. If the non-OISM PE-R is attached to | |||
BD-S, the OISM PE-S will send it the original (S,G) multicast frame, | BD-S, the OISM PE-S will send it the original (S,G) multicast frame, | |||
but the non-OISM PE-R will not be able to send the frame to ACs that | but the non-OISM PE-R will not be able to send the frame to ACs that | |||
are not in BD-S. If PE-R is not even attached to BD-S, the OISM PE-S | are not in BD-S. If PE-R is not even attached to BD-S, the OISM PE-S | |||
will not send it a copy of the frame at all, because PE-R is not | will not send it a copy of the frame at all, because PE-R is not | |||
attached to the SBD. In these cases, the IPMG needs to relay the | attached to the SBD. In these cases, the IPMG needs to relay the | |||
(S,G) multicast traffic from OISM PE-S to non-OISM PE-R. | (S,G) multicast traffic from OISM PE-S to non-OISM PE-R. | |||
When the IPMG-DF for BD-S receives an (S,G) frame from an OISM PE-S, | When the IPMG-DF for BD-S receives an (S,G) frame from an OISM PE-S, | |||
it has to forward it to every non-OISM PE-R that that has interest in | it has to forward it to every non-OISM PE-R that has interest in | |||
(S,G) on a BD-R that is different than BD-S. The IPMG MUST | (S,G) on a BD-R that is different than BD-S. The IPMG MUST | |||
decapsulate the IP multicast packet, do the IP processing, re- | decapsulate the IP multicast packet, do the IP processing, re- | |||
encapsulate it for BD-R (changing the MAC SA to the IPMG's own MAC | encapsulate it for BD-R (changing the MAC SA to the IPMG's own MAC | |||
address for BD-R), and send a copy of the frame to PE-R. Note that a | address for BD-R), and send a copy of the frame to PE-R. Note that a | |||
given non-OISM PE-R will receive multiple copies of the frame, if it | given non-OISM PE-R will receive multiple copies of the frame if it | |||
has multiple BDs on which there is interest in the frame. | has multiple BDs on which there is interest in the frame. | |||
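
The label choice made by an OISM PE-S in Section 5.2.2 can be summarized by the sketch below. It is an informal illustration: the dictionary layout and the descriptive strings returned in place of real labels are hypothetical, and an IPMG is modeled simply as an OISM PE whose `interested` flag is always true.

```python
def ingress_oism_copies(bd_s, pes):
    """Map each PE that receives a copy from an OISM PE-S to its label source.

    pes maps a PE name to a dict with:
      "oism":       True for an OISM PE
      "bds":        set of BDs the PE is attached to
      "interested": True if the PE has signaled interest in (S,G)
    """
    out = {}
    for pe, info in pes.items():
        attached = bd_s in info["bds"]
        if info["oism"]:
            if not info["interested"]:
                continue
            # Attached OISM PEs use the BD-S label from their (C-*,C-*)
            # S-PMSI A-D route; the others use the label assigned to the SBD.
            out[pe] = f"S-PMSI A-D label for {bd_s}" if attached else "SBD label"
        elif attached:
            # Non-OISM PEs attached to BD-S use the label from their IMET route.
            out[pe] = f"IMET label for {bd_s}"
    return out
```

A non-OISM PE that is not attached to BD-S does not appear in the result, which is the gap the IPMG-DF relay described above fills.
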
5.3. P2MP Tunnels | 5.3. P2MP Tunnels | |||
When IR is used to distribute the multicast traffic among the | When IR is used to distribute the multicast traffic among the EVPN | |||
EVPN-PEs, the procedures of Section 5.2 ensure that there will be no | PEs, the procedures described in Section 5.2 ensure that there will | |||
duplicate delivery of multicast traffic. That is, no egress PE will | be no duplicate delivery of multicast traffic. That is, no egress PE | |||
ever send a frame twice on any given AC. If P2MP tunnels are being | will ever send a frame twice on any given AC. If P2MP tunnels are | |||
used to distribute the multicast traffic, it is necessary to have | being used to distribute the multicast traffic, it is necessary to | |||
additional procedures to prevent duplicate delivery. | have additional procedures to prevent duplicate delivery. | |||
At the present time, it is not clear that there will be a use case in | At the present time, it is not clear that there will be a use case in | |||
which OISM nodes need to interwork with non-OISM nodes that use P2MP | which OISM nodes need to interwork with non-OISM nodes that use P2MP | |||
tunnels. If it is determined that there is such a use case, | tunnels. If it is determined that there is such a use case, | |||
procedures for P2MP may be specified in a separate document. | procedures for P2MP may be specified in a separate document. | |||
6. Traffic to/from Outside the EVPN Tenant Domain | 6. Traffic to/from Outside the EVPN Tenant Domain | |||
In this section, we discuss scenarios where a multicast source | In this section, we discuss scenarios where a multicast source | |||
outside a given EVPN Tenant Domain sends traffic to receivers inside | outside a given EVPN Tenant Domain sends traffic to receivers inside | |||
the domain (as well as, possibly, to receivers outside the domain). | the domain (as well as, possibly, to receivers outside the domain). | |||
This requires the OISM procedures to interwork with various layer 3 | This requires the OISM procedures to interwork with various Layer 3 | |||
multicast routing procedures. | multicast routing procedures. | |||
We assume in this section that the Tenant Domain is not being used as | In this section, we assume that the Tenant Domain is not being used | |||
an intermediate transit network for multicast traffic; that is, we do | as an intermediate transit network for multicast traffic; that is, we | |||
not consider the case where the Tenant Domain contains multicast | do not consider the case where the Tenant Domain contains multicast | |||
routers that will receive traffic from sources outside the domain and | routers that will receive traffic from sources outside the domain and | |||
forward the traffic to receivers outside the domain. The transit | forward the traffic to receivers outside the domain. The transit | |||
scenario is considered in Section 7. | scenario is considered in Section 7. | |||
We can divide the non-transit scenarios into two classes: | We can divide the non-transit scenarios into two classes: | |||
1. One or more of the EVPN PE routers provide the functionality | 1. One or more of the EVPN PE routers provide the functionality | |||
needed to interwork with layer 3 multicast routing procedures. | needed to interwork with Layer 3 multicast routing procedures. | |||
2. A single BD in the Tenant Domain contains external multicast | 2. A single BD in the Tenant Domain contains external multicast | |||
routers ("tenant multicast routers"), and those tenant multicast | routers (tenant multicast routers), and those tenant multicast | |||
routers are used to interwork, on behalf of the entire Tenant | routers are used to interwork, on behalf of the entire Tenant | |||
Domain, with layer 3 multicast routing procedures. | Domain, with Layer 3 multicast routing procedures. | |||
6.1. Layer 3 Interworking via EVPN OISM PEs | 6.1. Layer 3 Interworking via EVPN OISM PEs | |||
6.1.1. General Principles | 6.1.1. General Principles | |||
Sometimes it is necessary to interwork an EVPN Tenant Domain with an | Sometimes it is necessary to interwork an EVPN Tenant Domain with an | |||
external layer 3 multicast domain (the "external domain"), e.g., a | external Layer 3 multicast domain (the external domain), e.g., a PIM | |||
PIM or MVPN domain. This is needed to allow EVPN tenant systems to | or MVPN domain. This is needed to allow EVPN tenant systems to | |||
receive multicast traffic from sources ("external sources") outside | receive multicast traffic from sources (external sources) outside the | |||
the EVPN Tenant Domain. It is also needed to allow receivers | EVPN Tenant Domain. It is also needed to allow receivers (external | |||
("external receivers") outside the EVPN Tenant Domain to receive | receivers) outside the EVPN Tenant Domain to receive traffic from | |||
traffic from sources inside the Tenant Domain. | sources inside the Tenant Domain. | |||
In order to allow interworking between an EVPN Tenant Domain and an | In order to allow interworking between an EVPN Tenant Domain and an | |||
external domain, one or more OISM PEs must be "L3 Gateways". An L3 | external domain, one or more OISM PEs must be L3 Gateways. An L3 | |||
Gateway participates both in the OISM procedures and in the L3 | Gateway participates both in the OISM procedures and in the L3 | |||
multicast routing procedures of the external domain, as shown in the | multicast routing procedures of the external domain, as shown in the | |||
following figure. | following figure. | |||
src1 rcvr1 | src1 rcvr1 | |||
| | | | | | |||
R1 RP R2 | R1 RP R2 | |||
PIM/MVPN | PIM/MVPN | |||
domain | Domain | |||
+---+ +---+ | +---+ +---+ | |||
-----|GW1|----------------------|GW2|---- | -----|GW1|----------------------|GW2|---- | |||
+---+ +---+ | +---+ +---+ | |||
| \ \ / / | | | \ \ / / | | |||
| \ \ / / | | | \ \ / / | | |||
BD1 BD2 SBD SBD BD2 BD1 | BD1 BD2 SBD SBD BD2 BD1 | |||
EVPN Domain | EVPN Domain | |||
SBD SBD | SBD SBD | |||
/ \ | / \ | |||
/ \ | / \ | |||
+---+ +---+ | +---+ +---+ | |||
|PE1| |PE2| | |PE1| |PE2| | |||
+---+ +---+ | +---+ +---+ | |||
| \ / | | | \ / | | |||
BD1 BD2 BD2 BD1 | BD1 BD2 BD2 BD1 | |||
| | | | | | | | | | |||
src2 rcvr2 src3 rcvr3 | src2 rcvr2 src3 rcvr3 | |||
Figure 1: Interworking via OISM PEs | ||||
An L3 Gateway that has interest in receiving (S,G) traffic must be | An L3 Gateway that has interest in receiving (S,G) traffic must be | |||
able to determine the best route to S. If an L3 Gateway has interest | able to determine the best route to S. If an L3 Gateway has interest | |||
in (*,G), it must be able to determine the best route to G's RP. In | in (*,G), it must be able to determine the best route to G's RP. In | |||
these interworking scenarios, the L3 Gateway must be running a layer | these interworking scenarios, the L3 Gateway must be running a Layer | |||
3 unicast routing protocol. Via this protocol, it imports unicast | 3 unicast routing protocol. Via this protocol, it imports unicast | |||
routes (either IP routes or VPN-IP routes) from routers other than | routes (either IP routes or VPN-IP routes) from routers other than | |||
EVPN PEs. And since there may be multicast sources inside the EVPN | EVPN PEs. And since there may be multicast sources inside the EVPN | |||
Tenant Domain, the EVPN PEs also need to export, either as IP routes | Tenant Domain, the EVPN PEs also need to export, either as IP routes | |||
or as VPN-IP routes (depending upon the external domain), unicast | or as VPN-IP routes (depending upon the external domain), unicast | |||
routes to those sources. | routes to those sources. | |||
When selecting the best route to a multicast source or RP, an L3 | When selecting the best route to a multicast source or RP, an L3 | |||
Gateway might have a choice between an EVPN route and an IP/VPN-IP | Gateway might have a choice between an EVPN route and an IP/VPN-IP | |||
route. When such a choice exists, the L3 Gateway SHOULD always | route. When such a choice exists, the L3 Gateway SHOULD always | |||
prefer the EVPN route. This will ensure that when traffic originates | prefer the EVPN route. This will ensure that when traffic originates | |||
in the Tenant Domain and has a receiver in the Tenant Domain, the | in the Tenant Domain and has a receiver in the Tenant Domain, the | |||
path to that receiver will remain within the EVPN Tenant Domain, even | path to that receiver will remain within the EVPN Tenant Domain, even | |||
if the source is also reachable via a routed path. This also | if the source is also reachable via a routed path. This also | |||
provides protection against sub-optimal routing that might occur if | provides protection against sub-optimal routing that might occur if | |||
two EVPN PEs export IP/VPN-IP routes and each imports the other's IP/ | two EVPN PEs export IP/VPN-IP routes and each imports the other's IP/ | |||
VPN-IP routes. | VPN-IP routes. | |||
Section 4.2 discusses the way layer 3 multicast states are | Section 4.2 discusses the way Layer 3 multicast states are | |||
constructed by OISM PEs. These layer 3 multicast states have IRB | constructed by OISM PEs. These Layer 3 multicast states have IRB | |||
interfaces as their IIF and OIF list entries, and are the basis for | interfaces as their IIF and OIF list entries and are the basis for | |||
interworking OISM with other layer 3 multicast procedures such as | interworking OISM with other Layer 3 multicast procedures such as | |||
MVPN or PIM. From the perspective of the layer 3 multicast | MVPN or PIM. From the perspective of the Layer 3 multicast | |||
procedures running in a given L3 Gateway, an EVPN Tenant Domain is a | procedures running in a given L3 Gateway, an EVPN Tenant Domain is a | |||
set of IRB interfaces. | set of IRB interfaces. | |||
When interworking an EVPN Tenant Domain with an external domain, the | When interworking an EVPN Tenant Domain with an external domain, the | |||
L3 Gateway's layer 3 multicast states will not only have IRB | L3 Gateway's Layer 3 multicast states will not only have IRB | |||
interfaces as IIF and OIF list entries, but also other "interfaces" | interfaces as IIF and OIF list entries but also other interfaces that | |||
that lead outside the Tenant Domain. For example, when interworking | lead outside the Tenant Domain. For example, when interworking with | |||
with MVPN, the multicast states may have MVPN tunnels as well as IRB | MVPN, the multicast states may have MVPN tunnels as well as IRB | |||
interfaces as IIF or OIF list members. When interworking with PIM, | interfaces as IIF or OIF list members. When interworking with PIM, | |||
the multicast states may have PIM-enabled non-IRB interfaces as IIF | the multicast states may have PIM-enabled non-IRB interfaces as IIF | |||
or OIF list members. | or OIF list members. | |||
As long as a Tenant Domain is not being used as an intermediate | As long as a Tenant Domain is not being used as an intermediate | |||
transit network for IP multicast traffic, it is not necessary to | transit network for IP multicast traffic, it is not necessary to | |||
enable PIM on its IRB interfaces. | enable PIM on its IRB interfaces. | |||
In general, an L3 Gateway has the following responsibilities: | In general, an L3 Gateway has the following responsibilities: | |||
skipping to change at page 52, line 41 ¶ | skipping to change at line 2395 ¶ | |||
* It imports, from the external domain, unicast routes to multicast | * It imports, from the external domain, unicast routes to multicast | |||
sources that are in the external domain. | sources that are in the external domain. | |||
* It executes the procedures necessary to draw externally sourced | * It executes the procedures necessary to draw externally sourced | |||
multicast traffic that is of interest to locally attached | multicast traffic that is of interest to locally attached | |||
receivers in the EVPN Tenant Domain. When such traffic is | receivers in the EVPN Tenant Domain. When such traffic is | |||
received, the traffic is sent down the IRB interfaces of the BDs | received, the traffic is sent down the IRB interfaces of the BDs | |||
on which the locally attached receivers reside. | on which the locally attached receivers reside. | |||
One of the L3 Gateways in a given Tenant Domain becomes the "DR" for | One of the L3 Gateways in a given Tenant Domain becomes the DR for | |||
the SBD. (See Section 6.1.2.4.) This L3 gateway has the following | the SBD (see Section 6.1.2.4). This L3 Gateway has the following | |||
additional responsibilities: | additional responsibilities: | |||
* It exports, to the external domain, unicast routes to multicast | * It exports, to the external domain, unicast routes to multicast | |||
sources in the EVPN Tenant Domain that are not locally attached to | sources in the EVPN Tenant Domain that are not locally attached to | |||
any L3 gateway. | any L3 Gateway. | |||
* It imports, from the external domain, unicast routes to multicast | * It imports, from the external domain, unicast routes to multicast | |||
sources that are in the external domain. | sources that are in the external domain. | |||
* It executes the procedures necessary to draw externally sourced | * It executes the procedures necessary to draw externally sourced | |||
multicast traffic that is of interest to receivers in the EVPN | multicast traffic that is of interest to receivers in the EVPN | |||
Tenant Domain that are not locally attached to an L3 gateway. | Tenant Domain that are not locally attached to an L3 Gateway. | |||
When such traffic is received, the traffic is sent down the SBD | When such traffic is received, the traffic is sent down the SBD | |||
IRB interface. OISM procedures already described in this document | IRB interface. OISM procedures already described in this document | |||
will then ensure that the IP multicast traffic gets distributed | will then ensure that the IP multicast traffic gets distributed | |||
throughout the Tenant Domain to any EVPN PEs that have interest in | throughout the Tenant Domain to any EVPN PEs that have interest in | |||
it. Thus to an OISM PE that is not an L3 gateway the externally | it. Thus, to an OISM PE that is not an L3 Gateway, the externally | |||
sourced traffic will appear to have been sourced on the SBD. | sourced traffic will appear to have been sourced on the SBD. | |||
In order for this to work, some special care is needed when an L3 | In order for this to work, some special care is needed when an L3 | |||
gateway creates or modifies a layer 3 (*,G) multicast state. Suppose | Gateway creates or modifies a Layer 3 (*,G) multicast state. Suppose | |||
group G has both external sources (sources outside the EVPN Tenant | group G has both external sources (sources outside the EVPN Tenant | |||
Domain) and internal sources (sources inside the EVPN tenant domain). | Domain) and internal sources (sources inside the EVPN Tenant Domain). | |||
Section 4.2 states that when there are internal sources, the SBD IRB | Section 4.2 states that when there are internal sources, the SBD IRB | |||
interface must not be added to the OIF list of the (*,G) state. | interface must not be added to the OIF list of the (*,G) state. | |||
Traffic from internal sources will already have been delivered to all | Traffic from internal sources will already have been delivered to all | |||
the EVPN PEs that have interest in it. However, if the OIF list of | the EVPN PEs that have interest in it. However, if the OIF list of | |||
the (*,G) state does not contain its SBD IRB interface, then traffic | the (*,G) state does not contain its SBD IRB interface, then traffic | |||
from external sources will not get delivered to other EVPN PEs. | from external sources will not get delivered to other EVPN PEs. | |||
One way of handling this is the following. When an L3 gateway | One way of handling this is the following. When an L3 Gateway | |||
receives (S,G) traffic from other than an IRB interface, and the | receives (S,G) traffic that is from an interface other than IRB, and | |||
traffic corresponds to a layer 3 (*,G) state, the L3 gateway can | the traffic corresponds to a Layer 3 (*,G) state, the L3 Gateway can | |||
create (S,G) state. The IIF will be set to the external interface | create (S,G) state. The IIF will be set to the external interface | |||
over which the traffic is expected. The OIF list will contain the | over which the traffic is expected. The OIF list will contain the | |||
SBD IRB interface, as well as the IRB interfaces of any other BDs | SBD IRB interface, as well as the IRB interfaces of any other BDs | |||
attached to the PEG DR that have locally attached receivers with | attached to the PEG DR that have locally attached receivers with | |||
interest in the (S,G) traffic. The (S,G) state will ensure that the | interest in the (S,G) traffic. The (S,G) state will ensure that the | |||
external traffic is sent down the SBD IRB interface. The following | external traffic is sent down the SBD IRB interface. The following | |||
text will assume this procedure; however other implementation | text will assume this procedure; however, other implementation | |||
techniques may also be possible. | techniques may also be possible. | |||
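A minimal Python sketch of the procedure just described follows.  It
is one possible reading, not a normative implementation.  The
McastState type, the dictionary keyed by (source, group), and all
interface names are hypothetical.  The sketch also assumes that the
interface on which the packet arrived is the expected (RPF) external
interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class McastState:
    iif: Optional[str]                      # incoming interface
    oifs: set = field(default_factory=set)  # outgoing interface list


def on_non_irb_packet(states, src, group, in_if, irb_ifs, sbd_irb,
                      local_irbs_with_interest):
    """Create (S,G) state for traffic that matched only (*,G) state.

    states: dict keyed by (source or "*", group).
    irb_ifs: set of all IRB interface names on this L3 Gateway.
    local_irbs_with_interest: IRB interfaces of BDs that have locally
        attached receivers interested in the (S,G) traffic.
    Returns the (S,G) state, or None if the procedure does not apply.
    """
    if in_if in irb_ifs:
        return None  # traffic from an IRB interface is out of scope
    if (src, group) in states:
        return states[(src, group)]  # (S,G) state already exists
    if ("*", group) not in states:
        return None  # no Layer 3 (*,G) state to derive from
    # OIF list: the SBD IRB interface plus interested local BDs.
    oifs = {sbd_irb} | set(local_irbs_with_interest)
    states[(src, group)] = McastState(iif=in_if, oifs=oifs)
    return states[(src, group)]
```

The new (S,G) state always lists the SBD IRB interface, so external
traffic reaches other EVPN PEs even though the (*,G) OIF list omits
that interface.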
If a particular BD is attached to several L3 Gateways, one of the L3 | If a particular BD is attached to several L3 Gateways, one of the L3 | |||
Gateways becomes the DR for that BD. (See Section 6.1.2.4.) If the | Gateways becomes the DR for that BD (see Section 6.1.2.4). If the | |||
interworking scenario requires FHR functionality, it is generally the | interworking scenario requires FHR functionality, it is generally the | |||
DR for a particular BD that is responsible for performing that | DR for a particular BD that is responsible for performing that | |||
functionality on behalf of the source hosts on that BD. (E.g., if | functionality on behalf of the source hosts on that BD (e.g., if the | |||
the interworking scenario requires that PIM Register messages be sent | interworking scenario requires that PIM Register messages be sent by | |||
by an FHR, the DR for a given BD would send the PIM Register messages | an FHR, the DR for a given BD would send the PIM Register messages | |||
for sources on that BD.) Note though that the DR for the SBD does | for sources on that BD). Although, note that the DR for the SBD does | |||
not perform FHR functionality on behalf of external sources. | not perform FHR functionality on behalf of external sources. | |||
An optional alternative is to have each L3 gateway perform FHR | An optional alternative is to have each L3 Gateway perform FHR | |||
functionality for locally attached sources. Then the DR would only | functionality for locally attached sources. Then, the DR would only | |||
have to perform FHR functionality on behalf of sources that are | have to perform FHR functionality on behalf of sources that are | |||
locally attached to itself AND sources that are not attached to any | locally attached to itself AND sources that are not attached to any | |||
L3 gateway. | L3 Gateway. | |||
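The default rule and the optional alternative can be compared with a
short Python sketch.  The function, its arguments, and the bd_dr
table are hypothetical names used only for illustration.  The sketch
covers sources inside the Tenant Domain only, since the DR for the SBD
does not perform FHR functionality for external sources.

```python
def fhr_for_source(source_bd, attached_gateway, bd_dr,
                   per_gateway_fhr=False):
    """Pick the L3 Gateway that performs FHR duties for a source.

    source_bd: the BD on which the source host resides.
    attached_gateway: the L3 Gateway the source is locally attached
        to, or None if it is not attached to any L3 Gateway.
    bd_dr: dict mapping bd -> the L3 Gateway elected DR for that BD.
    per_gateway_fhr: True selects the optional alternative in which
        each L3 Gateway handles its own locally attached sources.
    """
    if per_gateway_fhr and attached_gateway is not None:
        return attached_gateway
    # Default rule, and fallback for sources not attached to any
    # L3 Gateway: the DR for the source's BD.
    return bd_dr[source_bd]
```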
N.B.: If it is possible that more than one BD contains a tenant | Note that if it is possible that more than one BD contains a tenant | |||
multicast router, then a PE receiving an SMET route for that BD MUST | multicast router, then a PE receiving a SMET route for that BD MUST | |||
NOT reconstruct IGMP/MLD Join Reports from the SMET route, and MUST | NOT reconstruct IGMP/MLD Join Reports from the SMET route and MUST | |||
NOT transmit any such IGMP/MLD Join Reports on its local ACs | NOT transmit any such IGMP/MLD Join Reports on its local ACs | |||
attaching to that BD. Otherwise, multicast traffic may be | attaching to that BD. Otherwise, multicast traffic may be | |||
duplicated. | duplicated. | |||
6.1.2. Interworking with MVPN | 6.1.2. Interworking with MVPN | |||
In this section, we specify the procedures necessary to allow EVPN | In this section, we specify the procedures necessary to allow EVPN | |||
PEs running OISM procedures to interwork with L3VPN PEs that run BGP- | PEs running OISM procedures to interwork with L3VPN PEs that run BGP- | |||
based MVPN [RFC6514] procedures. More specifically, the procedures | based MVPN [RFC6514] procedures. More specifically, the procedures | |||
herein allow a given EVPN Tenant Domain to become part of an L3VPN/ | herein allow a given EVPN Tenant Domain to become part of an L3VPN/ | |||
MVPN, and support multicast flows where either: | MVPN and support multicast flows where either of the following | |||
occurs: | ||||
* The source of a given multicast flow is attached to an Ethernet | * The source of a given multicast flow is attached to an Ethernet | |||
segment whose BD is part of an EVPN Tenant Domain, and one or more | segment whose BD is part of an EVPN Tenant Domain, and one or more | |||
receivers of the flow are attached to the network via L3VPN/MVPN. | receivers of the flow are attached to the network via L3VPN/MVPN. | |||
(Other receivers may be attached to the network via EVPN.) | (Other receivers may be attached to the network via EVPN.) | |||
* The source of a given multicast flow is attached to the network | * The source of a given multicast flow is attached to the network | |||
via L3VPN/MVPN, and one or more receivers of the flow are attached | via L3VPN/MVPN, and one or more receivers of the flow are attached | |||
to an Ethernet segment that is part of an EVPN tenant domain. | to an Ethernet segment that is part of an EVPN Tenant Domain. | |||
(Other receivers may be attached via L3VPN/MVPN.) | (Other receivers may be attached via L3VPN/MVPN.) | |||
In this interworking model, existing L3VPN/MVPN PEs are unaware that | In this interworking model, existing L3VPN/MVPN PEs are unaware that | |||
certain sources or receivers are part of an EVPN Tenant Domain. The | certain sources or receivers are part of an EVPN Tenant Domain. The | |||
existing L3VPN/MVPN nodes run only their standard procedures and are | existing L3VPN/MVPN nodes run only their standard procedures and are | |||
entirely unaware of EVPN. Interworking is achieved by having some or | entirely unaware of EVPN. Interworking is achieved by having some or | |||
all of the EVPN PEs function as L3 Gateways running L3VPN/MVPN | all of the EVPN PEs function as L3 Gateways running L3VPN/MVPN | |||
procedures, as detailed in the following sub-sections. | procedures, as detailed in the following subsections. | |||
In this section, we assume that there are no tenant multicast routers | In this section, we assume that there are no tenant multicast routers | |||
on any of the EVPN-attached Ethernet segments. (There may of course | on any of the EVPN-attached Ethernet segments. (Of course, there may | |||
be multicast routers in the L3VPN.) Consideration of the case where | be multicast routers in the L3VPN.) Consideration of the case where | |||
there are tenant multicast routers is deferred till Section 7.) | there are tenant multicast routers is addressed in Section 7. | |||
To support MVPN/EVPN interworking, we introduce the notion of an | To support MVPN/EVPN interworking, we introduce the notion of an | |||
MVPN/EVPN Gateway, or MEG. | MVPN/EVPN Gateway (MEG). | |||
A MEG is an L3 Gateway (see Section 6.1.1), hence is both an OISM PE | A MEG is an L3 Gateway (see Section 6.1.1); hence, it is both an OISM | |||
and an L3VPN/MVPN PE. For a given EVPN Tenant Domain, it will have | PE and an L3VPN/MVPN PE. For a given EVPN Tenant Domain, it will | |||
an IP-VRF. If the Tenant Domain is part of an L3VPN/MVPN, the IP-VRF | have an IP-VRF. If the Tenant Domain is part of an L3VPN/MVPN, the | |||
also serves as an L3VPN VRF [RFC4364]. The IRB interfaces of the | IP-VRF also serves as an L3VPN VRF [RFC4364]. The IRB interfaces of | |||
IP-VRF are considered to be "VRF interfaces" of the L3VPN VRF. The | the IP-VRF are considered to be VRF interfaces of the L3VPN VRF. The | |||
L3VPN VRF may also have other local VRF interfaces that are not EVPN | L3VPN VRF may also have other local VRF interfaces that are not EVPN | |||
IRB interfaces. | IRB interfaces. | |||
The VRF on the MEG will import VPN-IP routes [RFC4364] from other | The VRF on the MEG will import VPN-IP routes [RFC4364] from other | |||
L3VPN Provider Edge (PE) routers. It will also export VPN-IP routes | L3VPN PE routers. It will also export VPN-IP routes to other L3VPN | |||
to other L3VPN PE routers. In order to do so, it must be | PE routers. In order to do so, it must be appropriately configured | |||
appropriately configured with the Route Targets used in the L3VPN to | with the RTs used in the L3VPN to control the distribution of the | |||
control the distribution of the VPN-IP routes. These Route Targets | VPN-IP routes. In general, these RTs will be different than the RTs | |||
will in general be different than the Route Targets used for | used for controlling the distribution of EVPN routes, as there is no | |||
controlling the distribution of EVPN routes, as there is no need to | need to distribute EVPN routes to L3VPN-only PEs and no reason to | |||
distribute EVPN routes to L3VPN-only PEs and no reason to distribute | distribute L3VPN/MVPN routes to EVPN-only PEs. | |||
L3VPN/MVPN routes to EVPN-only PEs. | ||||
Note that the RDs in the imported VPN-IP routes will not necessarily | Note that the RDs in the imported VPN-IP routes will not necessarily | |||
conform to the EVPN rules (as specified in [RFC7432]) for creating | conform to the EVPN rules (as specified in [RFC7432]) for creating | |||
RDs. Therefore a MEG MUST NOT expect the RDs of the VPN-IP routes to | RDs. Therefore, a MEG MUST NOT expect the RDs of the VPN-IP routes | |||
be of any particular format other than what is required by the L3VPN/ | to be of any particular format other than what is required by the | |||
MVPN specifications. | L3VPN/MVPN specifications. | |||
The VPN-IP routes that a MEG exports to L3VPN are subnet routes and/ | The VPN-IP routes that a MEG exports to L3VPN are subnet routes and/ | |||
or host routes for the multicast sources that are part of the EVPN | or host routes for the multicast sources that are part of the EVPN | |||
tenant domain. The exact set of routes that need to be exported is | Tenant Domain. The exact set of routes that need to be exported is | |||
discussed in Section 6.1.2.2. | discussed in Section 6.1.2.2. | |||
Each IMET route originated by a MEG SHOULD carry a Multicast Flags | Each IMET route originated by a MEG SHOULD carry a Multicast Flags | |||
Extended Community with the "MEG" flag set, indicating that the | Extended Community with the MEG flag set, indicating that the | |||
originator of the IMET route is a MEG. However, PE1 will consider | originator of the IMET route is a MEG. However, PE1 will consider | |||
PE2 to be a MEG if PE1 imports at least one IMET route from PE2 that | PE2 to be a MEG if PE1 imports at least one IMET route from PE2 that | |||
carries the Multicast Flags EC with the MEG flag set. | carries the Multicast Flags EC with the MEG flag set. | |||
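The "at least one IMET route" test can be sketched as follows.  The
ImetRoute type and its fields are hypothetical, and MEG_FLAG is a
placeholder value chosen for the example; it is not the bit position
assigned for the MEG flag.

```python
from dataclasses import dataclass
from typing import Optional

MEG_FLAG = 0x01  # placeholder bit, NOT the assigned flag position


@dataclass(frozen=True)
class ImetRoute:
    originator: str                        # PE that originated the route
    bd: str                                # BD the route belongs to
    multicast_flags: Optional[int] = None  # None: no Multicast Flags EC


def is_meg(pe, imported_imet_routes):
    """True if any imported IMET route from pe has the MEG flag set."""
    return any(
        r.originator == pe
        and r.multicast_flags is not None
        and bool(r.multicast_flags & MEG_FLAG)
        for r in imported_imet_routes
    )
```

A single flagged route is sufficient, so a PE is still treated as a
MEG when some of its other IMET routes omit the flag.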
All the MEGs of a given Tenant Domain attach to the SBD of that | All the MEGs of a given Tenant Domain attach to the SBD of that | |||
domain, and one of them is selected to be the SBD's Designated Router | domain, and one of them is selected to be the SBD's Designated Router | |||
(the "MEG SBD-DR") for the domain. The selection procedure is | (the MEG SBD-DR) for the domain. The selection procedure is | |||
discussed in Section 6.1.2.4. | discussed in Section 6.1.2.4. | |||
In this model of operation, MVPN procedures and EVPN procedures are | In this model of operation, MVPN procedures and EVPN procedures are | |||
largely independent. In particular, there is no assumption that MVPN | largely independent. In particular, there is no assumption that MVPN | |||
and EVPN use the same kind of tunnels. Thus no special procedures | and EVPN use the same kind of tunnels. Thus, no special procedures | |||
are needed to handle the common scenarios where, e.g., EVPN uses | are needed to handle the common scenarios where, e.g., EVPN uses | |||
VXLAN tunnels but MVPN uses MPLS P2MP tunnels, or where EVPN uses | VXLAN tunnels but MVPN uses MPLS P2MP tunnels, or where EVPN uses IR | |||
Ingress Replication but MVPN uses MPLS P2MP tunnels. | but MVPN uses MPLS P2MP tunnels. | |||
Similarly, no special procedures are needed to prevent duplicate data | Similarly, no special procedures are needed to prevent duplicate data | |||
delivery on Ethernet segments that are multi-homed. | delivery on Ethernet segments that are multihomed. | |||
The MEG does have some special procedures (described below) for | The MEG does have some special procedures (described below) for | |||
interworking between EVPN and MVPN; these have to do with selection | interworking between EVPN and MVPN; these have to do with selection | |||
of the Upstream PE for a given multicast source, with the exporting | of the Upstream PE for a given multicast source, with the exporting | |||
of VPN-IP routes, and with the generation of MVPN C-multicast routes | of VPN-IP routes and with the generation of MVPN C-multicast routes | |||
triggered by the installation of SMET routes. | triggered by the installation of SMET routes. | |||
6.1.2.1. MVPN Sources with EVPN Receivers | 6.1.2.1. MVPN Sources with EVPN Receivers | |||
6.1.2.1.1. Identifying MVPN Sources | 6.1.2.1.1. Identifying MVPN Sources | |||
Consider a multicast source S. It is possible that a MEG will import | Consider a multicast source S. It is possible that a MEG will import | |||
both an EVPN unicast route to S and a VPN-IP route (or an ordinary IP | both an EVPN unicast route to S and a VPN-IP route (or an ordinary IP | |||
route), where the prefix length of each route is the same. In order | route), where the prefix length of each route is the same. In order | |||
to draw (S,G) multicast traffic for any group G, the MEG SHOULD use | to draw (S,G) multicast traffic for any group G, the MEG SHOULD use | |||
the EVPN route rather than the VPN-IP or IP route to determine the | the EVPN route rather than the VPN-IP or IP route to determine the | |||
"Upstream PE" (see section 5 of [RFC6513]). | Upstream PE (see Section 5 of [RFC6513]). | |||
Doing so ensures that when an EVPN tenant system desires to receive a | Doing so ensures that when an EVPN tenant system desires to receive a | |||
multicast flow from another EVPN tenant system, the traffic from the | multicast flow from another EVPN tenant system, the traffic from the | |||
source to that receiver stays within the EVPN domain. This prevents | source to that receiver stays within the EVPN domain. This prevents | |||
problems that might arise if there is a unicast route via L3VPN to S, | problems that might arise if there is a unicast route via L3VPN to S | |||
but no multicast routers along the routed path. This also prevents | but no multicast routers along the routed path. This also prevents | |||
problems that might arise as a result of the fact that the MEGs will | problems that might arise as a result of the fact that the MEGs will | |||
import each other's VPN-IP routes. | import each other's VPN-IP routes. | |||
In the Section 6.1.2.1.2, we describe the procedures to be used when | In Section 6.1.2.1.2, we describe the procedures to be used when the | |||
the selected route to S is a VPN-IP route. | selected route to S is a VPN-IP route. | |||
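The preference for the EVPN route among routes of equal prefix length
can be sketched in Python.  The Route type and the kind strings are
hypothetical.  Applying a longest-prefix match before the EVPN
preference is an assumption of this sketch and is not stated above.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str    # e.g., "10.1.1.5/32"
    kind: str      # "evpn", "vpn-ip", or "ip" (hypothetical labels)
    next_hop: str


def select_route_to_source(source, routes):
    """Pick the route used to determine the Upstream PE for source.

    Among routes covering the source, the longest prefix wins
    (assumption); among equal prefix lengths, an EVPN route is
    preferred over a VPN-IP or IP route.
    """
    addr = ipaddress.ip_address(source)
    matching = [r for r in routes
                if addr in ipaddress.ip_network(r.prefix)]
    if not matching:
        return None
    return max(
        matching,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen,
                       r.kind == "evpn"),
    )
```

If the function returns a VPN-IP route, the procedures of the next
section apply.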
6.1.2.1.2. Joining a Flow from an MVPN Source | 6.1.2.1.2. Joining a Flow from an MVPN Source | |||
Consider a tenant system, R, on a particular BD, BD-R. Suppose R | Consider a tenant system, say R, on a particular BD, say BD-R. | |||
wants to receive (S,G) multicast traffic, where source S is not | Suppose R wants to receive (S,G) multicast traffic, where source S is | |||
attached to any PE in the EVPN Tenant Domain, but is attached to an | not attached to any PE in the EVPN Tenant Domain but is attached to | |||
MVPN PE. | an MVPN PE. | |||
* Suppose R is on a singly homed Ethernet segment of BD-R, and that | * Suppose R is on a singly homed Ethernet segment of BD-R and that | |||
segment is attached to PE1, where PE1 is a MEG. PE1 learns via | segment is attached to PE1, where PE1 is a MEG. PE1 learns via | |||
IGMP/MLD listening that R is interested in (S,G). PE1 determines | IGMP/MLD listening that R is interested in (S,G). PE1 determines | |||
from its VRF that there is no route to S within the Tenant Domain | from its VRF that there is no route to S within the Tenant Domain | |||
(i.e., no EVPN RT-2 route matching on S's IP address), but that | (i.e., no EVPN RT-2 route matching on S's IP address) but that | |||
there is a route to S via L3VPN (i.e., the VRF contains a subnet | there is a route to S via L3VPN (i.e., the VRF contains a subnet | |||
or host route to S that was received as a VPN-IP route). PE1 thus | or host route to S that was received as a VPN-IP route). Thus, | |||
originates (if it hasn't already) an MVPN C-multicast Source Tree | PE1 originates (if it hasn't already) an MVPN C-multicast Source | |||
Join(S,G) route. The route is constructed according to normal | Tree Join (S,G) route. The route is constructed according to | |||
MVPN procedures. | normal MVPN procedures. | |||
The layer 2 multicast state is constructed as specified in | The Layer 2 multicast state is constructed as specified in | |||
Section 4.1. | Section 4.1. | |||
In the layer 3 multicast state, the IIF is the appropriate MVPN | In the Layer 3 multicast state, the IIF is the appropriate MVPN | |||
tunnel, and the IRB interface to BD-R is added to the OIF list. | tunnel, and the IRB interface to BD-R is added to the OIF list. | |||
When PE1 receives (S,G) traffic from the appropriate MVPN tunnel, | When PE1 receives (S,G) traffic from the appropriate MVPN tunnel, | |||
it performs IP processing of the traffic, and then sends the | it performs IP processing of the traffic and then sends the | |||
traffic down its IRB interface to BD-R. Following normal OISM | traffic down its IRB interface to BD-R. Following normal OISM | |||
procedures, the (S,G) traffic will be encapsulated for Ethernet | procedures, the (S,G) traffic will be encapsulated for Ethernet | |||
and sent on the AC to which R is attached. | and sent on the AC to which R is attached. | |||
* Suppose R is on a singly homed Ethernet segment of BD-R, and that | * Suppose R is on a singly homed Ethernet segment of BD-R and that | |||
segment is attached to PE1, where PE1 is an OISM PE but is NOT a | segment is attached to PE1, where PE1 is an OISM PE but is NOT a | |||
MEG. PE1 learns via IGMP/MLD listening that R is interested in | MEG. PE1 learns via IGMP/MLD listening that R is interested in | |||
(S,G). PE1 follows normal OISM procedures, originating an SBD- | (S,G). PE1 follows normal OISM procedures, originating an SBD- | |||
SMET route for (S,G); this route will be received by all the MEGs | SMET route for (S,G); this route will be received by all the MEGs | |||
of the Tenant Domain, including the MEG SBD-DR. The MEG SBD-DR | of the Tenant Domain, including the MEG SBD-DR. From PE1's IMET | |||
can determine from PE1's IMET routes whether PE1 is itself a MEG. | routes, the MEG SBD-DR can determine whether or not PE1 is itself | |||
If PE1 is not a MEG, the MEG SBD-DR will originate (if it hasn't | a MEG. If PE1 is not a MEG, the MEG SBD-DR will originate (if it | |||
already) an MVPN C-multicast Source Tree Join(S,G) route. This | hasn't already) an MVPN C-multicast Source Tree Join (S,G) route. | |||
will cause the MEG SBD-DR to receive (S,G) traffic on an MVPN | This will cause the MEG SBD-DR to receive (S,G) traffic on an MVPN | |||
tunnel. | tunnel. | |||
The layer 2 multicast state is constructed as specified in | The Layer 2 multicast state is constructed as specified in | |||
Section 4.1. | Section 4.1. | |||
In the layer 3 multicast state, the IIF is the appropriate MVPN | In the Layer 3 multicast state, the IIF is the appropriate MVPN | |||
tunnel, and the IRB interface to the SBD is added to the OIF list. | tunnel, and the IRB interface to the SBD is added to the OIF list. | |||
When the MEG SBD-DR receives (S,G) traffic on an MVPN tunnel, it | When the MEG SBD-DR receives (S,G) traffic on an MVPN tunnel, it | |||
performs IP processing of the traffic, and the sends the traffic | performs IP processing of the traffic and then sends the traffic | |||
down its IRB interface to the SBD. Following normal OISM | down its IRB interface to the SBD. Following normal OISM | |||
procedures, the traffic will be encapsulated for Ethernet and | procedures, the traffic will be encapsulated for Ethernet and | |||
delivered to all PEs in the Tenant Domain that have interest in | delivered to all PEs in the Tenant Domain that have interest in | |||
(S,G), including PE1. | (S,G), including PE1. | |||
* If R is on a multi-homed Ethernet segment of BD-R, one of the PEs | * If R is on a multihomed Ethernet segment of BD-R, one of the PEs | |||
attached to the segment will be its DF (following normal EVPN | attached to the segment will be its DF (following normal EVPN | |||
procedures), and the DF will know (via IGMP/MLD listening or the | procedures), and the DF will know (via IGMP/MLD listening or the | |||
procedures of [RFC9251]) that a tenant system reachable via one of | procedures of [RFC9251]) that a tenant system reachable via one of | |||
its local ACs to BD-R is interested in (S,G) traffic. The DF is | its local ACs to BD-R is interested in (S,G) traffic. The DF is | |||
responsible for originating an SBD-SMET route for (S,G), following | responsible for originating an SBD-SMET route for (S,G), following | |||
normal OISM procedures. If the DF is a MEG, it MUST originate the | normal OISM procedures. If the DF is a MEG, it MUST originate the | |||
corresponding MVPN C-multicast Source Tree Join(S,G) route; if the | corresponding MVPN C-multicast Source Tree Join (S,G) route; if | |||
DF is not a MEG, the MEG SBD-DR MUST originate the C-multicast | the DF is not a MEG, the MEG SBD-DR MUST originate the | |||
route when it receives the SMET route. | C-multicast route when it receives the SMET route. | |||
Optionally, if the non-DF is a MEG, it MAY originate the | Optionally, if the non-DF is a MEG, it MAY originate the | |||
corresponding MVPN C-multicast Source Tree Join(S,G) route. This | corresponding MVPN C-multicast Source Tree Join (S,G) route. This | |||
will cause the traffic to flow to both the DF and the non-DF, but | will cause the traffic to flow to both the DF and the non-DF, but | |||
only the DF will forward the traffic out an AC. This allows for | only the DF will forward the traffic out an AC. This allows for | |||
quicker recovery if the DF's local AC to R fails. | quicker recovery if the DF's local AC to R fails. | |||
* If R is attached to a non-OISM PE, it will receive the traffic via | * If R is attached to a non-OISM PE, it will receive the traffic via | |||
an IPMG, as specified in Section 5. | an IPMG, as specified in Section 5. | |||
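The receiver-side cases above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of the specification. The PE record, the function name join_originators, and the flag non_df_may_join are hypothetical names introduced here. For a singly homed segment, the attached PE is passed as the DF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PE:
    """Hypothetical record describing a PE's capabilities."""
    name: str
    is_oism: bool = True
    is_meg: bool = False


def join_originators(attached, df, meg_sbd_dr, non_df_may_join=False):
    """Return the names of the PEs that originate the MVPN C-multicast
    Source Tree Join (S,G) for a receiver behind the PEs in `attached`.

    `df` is the DF for the receiver's segment (the only attached PE when
    the segment is singly homed); `meg_sbd_dr` is the name of the MEG SBD-DR.
    """
    if not df.is_oism:
        # Receiver behind a non-OISM PE: served via an IPMG (Section 5),
        # which this sketch does not model.
        return set()
    # A DF that is a MEG originates the Join itself; otherwise the MEG
    # SBD-DR does so when it receives the DF's SBD-SMET route.
    originators = {df.name if df.is_meg else meg_sbd_dr}
    if non_df_may_join:
        # Optional behavior: non-DF MEGs on the segment also join, so that
        # traffic is already flowing to them if the DF's AC fails.
        originators |= {
            pe.name for pe in attached if pe.is_meg and pe.name != df.name
        }
    return originators
```

With PE1 and PE3 as MEGs on the same segment and PE1 as the DF, calling join_originators with non_df_may_join=True yields both names, while only PE1 forwards the traffic out the AC.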
If an EVPN-attached receiver is interested in (*,G) traffic, and if | If an EVPN-attached receiver is interested in (*,G) traffic, and if | |||
it is possible for there to be sources of (*,G) traffic that are | it is possible for there to be sources of (*,G) traffic that are | |||
attached only to L3VPN nodes, the MEGs will have to know the group- | attached only to L3VPN nodes, the MEGs will have to know the group- | |||
to-RP mappings. That will enable them to originate MVPN C-multicast | to-RP mappings. That will enable them to originate MVPN C-multicast | |||
Shared Tree Join(*,G) routes and to send them towards the RP. (Since | Shared Tree Join (*,G) routes and to send them toward the RP. (Since | |||
we are assuming in this section that there are no tenant multicast | we are assuming in this section that there are no tenant multicast | |||
routers attached to the EVPN Tenant Domain, the RP must be attached | routers attached to the EVPN Tenant Domain, the RP must be attached | |||
via L3VPN. Alternatively, the MEG itself could be configured to | via L3VPN. Alternatively, the MEG itself could be configured to | |||
function as an RP for group G.) | function as an RP for group G.) | |||
The layer 2 multicast states are constructed as specified in | The Layer 2 multicast states are constructed as specified in | |||
Section 4.1. | Section 4.1. | |||
In the layer 3 (*,G) multicast state, the IIF is the appropriate MVPN | In the Layer 3 (*,G) multicast state, the IIF is the appropriate MVPN | |||
tunnel. A MEG will add to the (*,G) OIF list its IRB interfaces for | tunnel. A MEG will add its IRB interfaces to the (*,G) OIF list for | |||
any BDs containing locally attached receivers. If there are | any BDs containing locally attached receivers. If there are | |||
receivers attached to other EVPN PEs, then whenever (S,G) traffic | receivers attached to other EVPN PEs, then whenever (S,G) traffic | |||
from an external source matches a (*,G) state, the MEG will create | from an external source matches a (*,G) state, the MEG will create | |||
(S,G) state, with the MVPN tunnel as the IIF, the OIF list copied | (S,G) state, with the MVPN tunnel as the IIF, the OIF list copied | |||
from the (*,G) state, and the SBD IRB interface added to the OIF | from the (*,G) state, and the SBD IRB interface added to the OIF | |||
list. (Please see the discussion in Section 6.1.1 regarding the | list. (Please see the discussion in Section 6.1.1 regarding the | |||
inclusion of the SBD IRB interface in a (*,G) state; the SBD IRB | inclusion of the SBD IRB interface in a (*,G) state; the SBD IRB | |||
interface is used in the OIF list only for traffic from external | interface is only used in the OIF list for traffic from external | |||
sources.) | sources.) | |||
Normal MVPN procedures will then result in the MEG getting the (*,G) | Normal MVPN procedures will then result in the MEG getting the (*,G) | |||
traffic from all the multicast sources for G that are attached via | traffic from all the multicast sources for G that are attached via | |||
L3VPN. This traffic arrives on MVPN tunnels. When the MEG removes | L3VPN. This traffic arrives on MVPN tunnels. When the MEG removes | |||
the traffic from these tunnels, it does the IP processing. If there | the traffic from these tunnels, it does the IP processing. If there | |||
are any receivers on a given BD, BD-R, that are attached via local | are any receivers on a given BD, say BD-R, that are attached via | |||
EVPN ACs, the MEG sends the traffic down its BD-R IRB interface. If | local EVPN ACs, the MEG sends the traffic down its BD-R IRB | |||
there are any other EVPN PEs that are interested in the (*,G) | interface. If there are any other EVPN PEs that are interested in | |||
traffic, the MEG sends the traffic down the SBD IRB interface. | the (*,G) traffic, the MEG sends the traffic down the SBD IRB | |||
Normal OISM procedures then distribute the traffic as needed to other | interface. Normal OISM procedures then distribute the traffic as | |||
EVPN-PEs. | needed to other EVPN PEs. | |||
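As a rough illustration of how a MEG might derive (S,G) state from (*,G) state when traffic from an external source arrives, consider the Python sketch below. It is not part of the specification. The interface name "irb-SBD", the dictionary layout, and the function name are assumptions made for this example.

```python
SBD_IRB = "irb-SBD"  # hypothetical name for the SBD IRB interface


def sg_state_from_star_g(star_g_oifs, mvpn_tunnel, remote_pes_interested):
    """Build Layer 3 (S,G) state for traffic from an external source that
    matched a (*,G) state on a MEG."""
    oifs = list(star_g_oifs)  # copy; the (*,G) state itself is left unchanged
    if remote_pes_interested and SBD_IRB not in oifs:
        # Receivers behind other EVPN PEs are reached via the SBD.
        oifs.append(SBD_IRB)
    return {"iif": mvpn_tunnel, "oifs": oifs}
```

The (*,G) OIF list is copied rather than modified, which reflects the point that the SBD IRB interface is used only for traffic from external sources.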
6.1.2.2. EVPN Sources with MVPN Receivers | 6.1.2.2. EVPN Sources with MVPN Receivers | |||
6.1.2.2.1. General procedures | ||||
6.1.2.2.1. General Procedures | ||||
Consider the case where an EVPN tenant system S is sending IP | Consider the case where an EVPN tenant system S is sending IP | |||
multicast traffic to group G, and there is a receiver R for the (S,G) | multicast traffic to group G and there is a receiver R for the (S,G) | |||
traffic that is attached to the L3VPN, but not attached to the EVPN | traffic that is attached to the L3VPN but not attached to the EVPN | |||
Tenant Domain. (We assume in this document that the L3VPN/MVPN-only | Tenant Domain. (In this document, we assume that the L3VPN-/MVPN- | |||
nodes will not have any special procedures to deal with the case | only nodes will not have any special procedures to deal with the case | |||
where a source is inside an EVPN domain.) | where a source is inside an EVPN domain.) | |||
In this case, an L3VPN PE through which R can be reached has to send | In this case, an L3VPN PE through which R can be reached has to send | |||
an MVPN C-multicast Join(S,G) route to one of the MEGs that is | an MVPN C-multicast Join (S,G) route to one of the MEGs that is | |||
attached to the EVPN Tenant Domain. For this to happen, the L3VPN PE | attached to the EVPN Tenant Domain. For this to happen, the L3VPN PE | |||
must have imported a VPN-IP route for S (either a host route or a | must have imported a VPN-IP route for S (either a host route or a | |||
subnet route) from a MEG. | subnet route) from a MEG. | |||
If a MEG determines that there is a multicast source transmitting on | If a MEG determines that there is a multicast source transmitting on | |||
one of its ACs, the MEG SHOULD originate a VPN-IP host route for that | one of its ACs, the MEG SHOULD originate a VPN-IP host route for that | |||
source. This determination SHOULD be made by examining the IP | source. This determination SHOULD be made by examining the IP | |||
multicast traffic that arrives on the ACs. (It MAY be made by | multicast traffic that arrives on the ACs. (It MAY be made by | |||
provisioning.) A MEG SHOULD NOT export a VPN-IP host route for any | provisioning.) A MEG SHOULD NOT export a VPN-IP host route for any | |||
IP address that is not known to be a multicast source (unless it has | IP address that is not known to be a multicast source (unless it has | |||
some other reason for exporting such a route). The VPN-IP host route | some other reason for exporting such a route). The VPN-IP host route | |||
for a given multicast source MUST be withdrawn if the source goes | for a given multicast source MUST be withdrawn if the source goes | |||
silent for a configurable period of time, or if it can be determined | silent for a configurable period of time or if it can be determined | |||
that the source is no longer reachable via a local AC. | that the source is no longer reachable via a local AC. | |||
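The host-route policy in the preceding paragraph can be pictured as a small tracker keyed by source address. This Python sketch is only one possible shape for such logic. The class name, the method names, and the use of caller-supplied timestamps in seconds are all assumptions of the example, and detection of a source becoming unreachable via a local AC is not modeled.

```python
class HostRouteTracker:
    """Tracks multicast sources seen on local ACs (hypothetical helper)."""

    def __init__(self, silence_timeout):
        self.silence_timeout = silence_timeout  # configurable, in seconds
        self.last_seen = {}  # source address -> time of last multicast packet

    def saw_multicast(self, source, now):
        """Record multicast traffic from `source` on a local AC.

        Returns True when a VPN-IP host route should be originated, i.e.,
        the source was not previously known to be active.
        """
        is_new = source not in self.last_seen
        self.last_seen[source] = now
        return is_new

    def expire(self, now):
        """Return the sources whose host routes are to be withdrawn because
        they have been silent for at least the configured period."""
        silent = [
            src for src, seen in self.last_seen.items()
            if now - seen >= self.silence_timeout
        ]
        for src in silent:
            del self.last_seen[src]
        return silent
```

Addresses that never send multicast traffic never enter the tracker, so no host route is exported for them.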
A MEG SHOULD also originate a VPN-IP subnet route for each of the BDs | A MEG SHOULD also originate a VPN-IP subnet route for each of the BDs | |||
in the Tenant Domain. | in the Tenant Domain. | |||
VPN-IP routes exported by a MEG must carry any attributes or extended | VPN-IP routes exported by a MEG must carry any attributes or Extended | |||
communities that are required by L3VPN and MVPN. In particular, a | Communities that are required by L3VPN and MVPN. In particular, a | |||
VPN-IP route exported by a MEG must carry a VRF Route Import Extended | VPN-IP route exported by a MEG must carry a VRF Route Import Extended | |||
Community corresponding to the IP-VRF from which it is imported, and | Community corresponding to the IP-VRF from which it is imported and a | |||
a Source AS Extended Community. | Source AS Extended Community. | |||
As a result, if S is attached to a MEG, the L3VPN nodes will direct | As a result, if S is attached to a MEG, the L3VPN nodes will direct | |||
their MVPN C-multicast Join routes to that MEG. Normal MVPN | their MVPN C-multicast Join routes to that MEG. Normal MVPN | |||
procedures will cause the traffic to be delivered to the L3VPN nodes. | procedures will cause the traffic to be delivered to the L3VPN nodes. | |||
The layer 3 multicast state for (S,G) will have the MVPN tunnel on | The Layer 3 multicast state for (S,G) will have the MVPN tunnel on | |||
its OIF list. The IIF will be the IRB interface leading to the BD | its OIF list. The IIF will be the IRB interface leading to the BD | |||
containing S. | containing S. | |||
If S is not attached to a MEG, the L3VPN nodes will direct their | If S is not attached to a MEG, the L3VPN nodes will direct their | |||
C-multicast Join routes to whichever MEG appears to be on the best | C-multicast Join routes to whichever MEG appears to be on the best | |||
route to S's subnet. Upon receiving the C-multicast Join, that MEG | route to S's subnet. Upon receiving the C-multicast Join, that MEG | |||
will originate an EVPN SMET route for (S,G). As a result, the MEG | will originate an EVPN SMET route for (S,G). As a result, the MEG | |||
will receive the (S,G) traffic at layer 2 via the OISM procedures. | will receive the (S,G) traffic at Layer 2 via the OISM procedures. | |||
The (S,G) traffic will be sent up the appropriate IRB interface, and | The (S,G) traffic will be sent up the appropriate IRB interface, and | |||
the layer 3 MVPN procedures will ensure that the traffic is delivered | the Layer 3 MVPN procedures will ensure that the traffic is delivered | |||
to the L3VPN nodes that have requested it. The layer 3 multicast | to the L3VPN nodes that have requested it. The Layer 3 multicast | |||
state for (S,G) will have the MVPN tunnel in the OIF list, and the | state for (S,G) will have the MVPN tunnel in the OIF list, and the | |||
IIF will be one of the following: | IIF will be one of the following: | |||
* If S belongs to a BD that is attached to the MEG, the IIF will be | * If S belongs to a BD that is attached to the MEG, the IIF will be | |||
the IRB interface to that BD; | the IRB interface to that BD. | |||
* Otherwise the IIF will be the SBD IRB interface. | * Otherwise, the IIF will be the SBD IRB interface. | |||
Note that this works even if S is attached to a non-OISM PE, per the | Note that this works even if S is attached to a non-OISM PE, per the | |||
procedures of Section 5. | procedures of Section 5. | |||
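The choice of IIF described above, for a MEG that has received a C-multicast Join for an EVPN source, can be sketched as follows. The interface naming scheme ("irb-" followed by the BD name, and "irb-SBD") and the function name are hypothetical, chosen only for this illustration.

```python
def l3_state_for_evpn_source(source_bd, local_bds, mvpn_tunnel):
    """Layer 3 (S,G) state on a MEG serving MVPN receivers of an EVPN source.

    `source_bd` is the BD that S belongs to; `local_bds` is the set of BDs
    attached to this MEG.
    """
    if source_bd in local_bds:
        iif = f"irb-{source_bd}"  # IRB interface to the BD containing S
    else:
        iif = "irb-SBD"           # S's BD is not attached to this MEG
    return {"iif": iif, "oifs": [mvpn_tunnel]}
```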
6.1.2.2.2. Any-Source Multicast (ASM) Groups | 6.1.2.2.2. Any-Source Multicast (ASM) Groups | |||
Suppose the MEG SBD-DR learns that one of the PEs in its Tenant | Suppose the MEG SBD-DR learns that one of the PEs in its Tenant | |||
Domain is interested in (*,G), traffic, where G is an Any-Source | Domain is interested in (*,G) traffic, where G is an ASM group. If | |||
Multicast (ASM) group. If there are no tenant multicast routers, the | there are no tenant multicast routers, the MEG SBD-DR SHOULD perform | |||
MEG SBD-DR SHOULD perform the "First Hop Router" (FHR) functionality | the First Hop Router (FHR) functionality for group G on behalf of the | |||
for group G on behalf of the Tenant Domain, as described in | Tenant Domain, as described in [RFC7761]. This means that the MEG | |||
[RFC7761]. This means that the MEG SBD-DR must know the identity of | SBD-DR must know the identity of the RP for each group, must send | |||
the Rendezvous Point (RP) for each group, must send Register messages | Register messages to the RP, etc. | |||
to the Rendezvous Point, etc. | ||||
If the MEG SBD-DR is to be the FHR for the Tenant Domain, it must see | If the MEG SBD-DR is to be the FHR for the Tenant Domain, it must see | |||
all the multicast traffic that is sourced from within the domain and | all the multicast traffic that is sourced from within the domain and | |||
destined to an ASM group address. The MEG can ensure this by | destined to an ASM group address. The MEG can ensure this by | |||
originating an SBD-SMET route for (*,*). | originating an SBD-SMET route for (*,*). | |||
(As a possible optimization, an SBD-SMET route for (*, "any ASM | (As a possible optimization, an SBD-SMET route for (*, any ASM group) | |||
group") may be defined in a separate document.) | may be defined in a separate document.) | |||
In some deployment scenarios, it may be preferred that the MEG that | In some deployment scenarios, it may be preferred that the MEG that | |||
receives the (S,G) traffic over an AC be the one providing the FHR | receives the (S,G) traffic over an AC be the one providing the FHR | |||
functionality. This behavior is OPTIONAL. If this option is used, | functionality. This behavior is OPTIONAL. If this option is used, | |||
it MUST be ensured that the MEG DR does not provide the FHR | it MUST be ensured that the MEG DR does not provide the FHR | |||
functionality for (S,G) traffic that is attached to another MEG; FHR | functionality for (S,G) traffic that is attached to another MEG; FHR | |||
functionality for (S,G) traffic from a particular source S MUST be | functionality for (S,G) traffic from a particular source S MUST be | |||
provided by only a single router. | provided by only a single router. | |||
Other deployment scenarios are also possible. For example, one might | Other deployment scenarios are also possible. For example, one might | |||
want to configure the MEGs themselves to be RPs. In this case, the | want to configure the MEGs themselves to be RPs. In this case, the | |||
RPs would have to exchange with each other information about which | RPs would have to exchange with each other information about which | |||
sources are active. The method for exchanging such information is | sources are active. The method for exchanging such information is | |||
outside the scope of this document. | outside the scope of this document. | |||
6.1.2.2.3. Source on Multihomed Segment | 6.1.2.2.3. Source on Multihomed Segment | |||
Suppose S is attached to a segment that is all-active multi-homed to | Suppose S is attached to a segment that is all-active multihomed to | |||
PEl and PE2. If S is transmitting to two groups, say G1 and G2, it | PE1 and PE2. If S is transmitting to two groups, say G1 and G2, it | |||
is possible that PE1 will receive the (S,G1) traffic from S while PE2 | is possible that PE1 will receive the (S,G1) traffic from S, whereas | |||
receives the (S,G2) traffic from S. | PE2 will receive the (S,G2) traffic from S. | |||
This creates an issue for MVPN/EVPN interworking, because there is no | This creates an issue for MVPN/EVPN interworking, because there is no | |||
way to cause L3VPN/MVPN nodes to select PE1 as the ingress PE for | way to cause L3VPN/MVPN nodes to select PE1 as the ingress PE for | |||
(S,G1) traffic while selecting PE2 as the ingress PE for (S,G2) | (S,G1) traffic while selecting PE2 as the ingress PE for (S,G2) | |||
traffic. | traffic. | |||
However, the following procedure ensures that the IP multicast | However, the following procedure ensures that the IP multicast | |||
traffic will still flow, even if the L3VPN/MVPN nodes picks the | traffic will still flow, even if the L3VPN/MVPN nodes pick the wrong | |||
"wrong" EVPN-PE as the Upstream PE for (say) the (S,G1) traffic. | EVPN PE as the Upstream PE for, e.g., the (S,G1) traffic. | |||
Suppose S is on an Ethernet segment, belonging to BD1, that is | Suppose S is on an Ethernet segment, belonging to BD1, that is | |||
multi-homed to both PE1 and PE2, where PE1 is a MEG. And suppose | multihomed to both PE1 and PE2, where PE1 is a MEG. And suppose that | |||
that IP multicast traffic from S to G travels over the AC that | IP multicast traffic from S to G travels over the AC that attaches | |||
attaches the segment to PE2. If PE1 receives a C-multicast Source | the segment to PE2. If PE1 receives a C-multicast Source Tree Join | |||
Tree Join (S,G) route, it MUST originate an SMET route for (S,G). | (S,G) route, it MUST originate a SMET route for (S,G). Normal OISM | |||
Normal OISM procedures will then cause PE2 to send the (S,G) traffic | procedures will then cause PE2 to send the (S,G) traffic to PE1 on an | |||
to PE1 on an EVPN IP multicast tunnel. Normal OISM procedures will | EVPN IP multicast tunnel. Normal OISM procedures will also cause PE1 | |||
also cause PE1 to send the (S,G) traffic up its BD1 IRB interface. | to send the (S,G) traffic up its BD1 IRB interface. Normal MVPN | |||
Normal MVPN procedures will then cause PE1 to forward the traffic on | procedures will then cause PE1 to forward the traffic on an MVPN | |||
an MVPN tunnel. In this case, the routing is not optimal, but the | tunnel. In this case, the routing is not optimal, but the traffic | |||
traffic does flow correctly. | does flow correctly. | |||
6.1.2.3. Obtaining Optimal Routing of Traffic Between MVPN and EVPN | 6.1.2.3. Obtaining Optimal Routing of Traffic between MVPN and EVPN | |||
The routing of IP multicast traffic between MVPN nodes and EVPN nodes | The routing of IP multicast traffic between MVPN nodes and EVPN nodes | |||
will be optimal as long as there is a MEG along the optimal route. | will be optimal as long as there is a MEG along the optimal route. | |||
There are various deployment strategies that can be used to obtain | There are various deployment strategies that can be used to obtain | |||
optimal routing between MVPN and EVPN. | optimal routing between MVPN and EVPN. | |||
In one such scenario, a Tenant Domain will have a small number of | In one such scenario, a Tenant Domain will have a small number of | |||
strategically placed MEGs. For example, a Data Center may have a | strategically placed MEGs. For example, a data center may have a | |||
small number of MEGs that connect it to a wide-area network. Then | small number of MEGs that connect it to a wide-area network. Then, | |||
the optimal route into or out of the Data Center would be through the | the optimal route into or out of the data center would be through the | |||
MEGs. | MEGs. | |||
In this scenario, the MEGs do not need to originate VPN-IP host | In this scenario, the MEGs do not need to originate VPN-IP host | |||
routes for the multicast sources, they only need to originate VPN-IP | routes for the multicast sources; they only need to originate VPN-IP | |||
subnet routes. The internal structure of the EVPN is completely | subnet routes. The internal structure of the EVPN is completely | |||
hidden from the MVPN node. EVPN actions such as MAC Mobility and | hidden from the MVPN node. EVPN actions, such as MAC Mobility and | |||
Mass Withdrawal [RFC7432] have zero impact on the MVPN control plane. | Mass Withdrawal [RFC7432], have zero impact on the MVPN control | |||
plane. | ||||
While this deployment scenario provides the most optimal routing and | While this deployment scenario provides the most optimal routing and | |||
has the least impact on the installed base of MVPN nodes, it does | has the least impact on the installed base of MVPN nodes, it does | |||
complicate network planning considerations. | complicate network planning considerations. | |||
Another way of providing routing that is close to optimal is to turn | Another way of providing routing that is close to optimal is to turn | |||
each EVPN PE into a MEG. Then routing of MVPN-to-EVPN traffic is | each EVPN PE into a MEG. Then, routing of MVPN-to-EVPN traffic is | |||
optimal. However, routing of EVPN-to-MVPN traffic is not guaranteed | optimal. However, routing of EVPN-to-MVPN traffic is not guaranteed | |||
to be optimal when a source host is on a multi-homed Ethernet segment | to be optimal when a source host is on a multihomed Ethernet segment | |||
(as discussed in Section 6.1.2.2). | (as discussed in Section 6.1.2.2). | |||
The obvious disadvantage of this method is that it requires every | The obvious disadvantage of this method is that it requires every | |||
EVPN PE to be a MEG. | EVPN PE to be a MEG. | |||
The procedures specified in this document allow an operator to add | The procedures specified in this document allow an operator to add | |||
MEG functionality to any subset of his EVPN OISM PEs. This allows an | MEG functionality to any subset of its EVPN OISM PEs. This allows an | |||
operator to make whatever trade-offs deemed appropriate between | operator to make whatever trade-offs deemed appropriate between | |||
optimal routing and MEG deployment. | optimal routing and MEG deployment. | |||
6.1.2.4. Selecting the MEG SBD-DR | 6.1.2.4. Selecting the MEG SBD-DR | |||
Every PE that is eligible for selection as the MEG SBD-DR originates | Every PE that is eligible for selection as the MEG SBD-DR originates | |||
an SBD-IMET route. As stated in Section 5, these SBD-IMET routes | an SBD-IMET route. As stated in Section 5, these SBD-IMET routes | |||
carry a Multicast Flags EC with the MEG Flag set. | carry a Multicast Flags EC with the MEG flag set. | |||
These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | These SBD-IMET routes SHOULD also carry a DF Election EC. The DF | |||
Election EC and its use are specified in [RFC8584]. When the route | Election EC and its use are specified in [RFC8584]. When the route | |||
is originated, the AC-DF bit in the DF Election EC SHOULD be set to | is originated, the AC-DF bit in the DF Election EC SHOULD be set to | |||
zero. This bit is not used when selecting a MEG SBD-DR, i.e., it | zero. This bit is not used when selecting a MEG SBD-DR, i.e., it | |||
MUST be ignored by the receiver of an SBD-IMET route. | MUST be ignored by the receiver of an SBD-IMET route. | |||
In the context of a given Tenant Domain, to select the MEG SBD-DR, | In the context of a given Tenant Domain, to select the MEG SBD-DR, | |||
the MEGs of the Tenant Domain perform the following procedure: | the MEGs of the Tenant Domain perform the following procedure: | |||
* From the set of received SBD-IMET routes for the given tenant | * From the set of received SBD-IMET routes for the given Tenant | |||
domain, determine the candidate set of PEs that support MEG | Domain, determine the candidate set of PEs that support MEG | |||
functionality for that domain. | functionality for that domain. | |||
* Select a DF Election algorithm as specified in [RFC8584]. Some of | * Select a DF election algorithm as specified in [RFC8584]. Some of | |||
the possible algorithms can be found, e.g., in [RFC7432], | the possible algorithms can be found, e.g., in [RFC7432], | |||
[RFC8584], and [I-D.ietf-bess-evpn-pref-df]. | [RFC8584], and [EVPN-DF]. | |||
* Apply the DF Election Algorithm (see [RFC8584]) to the candidate | * Apply the DF election algorithm (see [RFC8584]) to the candidate | |||
set of PEs. The "winner" becomes the MEG SBD-DR. | set of PEs. The winner becomes the MEG SBD-DR. | |||
Note that if a given PE supports IPMG (Section 6.1.2) or PEG | Note that if a given PE supports IPMG (Section 6.1.2) or PEG | |||
(Section 6.1.4) functionality as well as MEG functionality, its | (Section 6.1.4) functionality as well as MEG functionality, its SBD- | |||
SBD-IMET routes carry only one DF Election EC. | IMET routes carry only one DF Election EC. | |||
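To make the selection procedure concrete, the Python sketch below builds the candidate set from SBD-IMET routes and applies the default modulo-based DF election of [RFC7432]: candidates are ordered by IP address, and the PE at ordinal (V mod N) wins. This is one possible algorithm, not the only one permitted. The route dictionary keys, the function name, and the choice of the value V (passed here as ethernet_tag) are assumptions of the example.

```python
import ipaddress


def elect_meg_sbd_dr(sbd_imet_routes, ethernet_tag):
    """Pick the MEG SBD-DR among the originators of SBD-IMET routes.

    Each route is a dict with hypothetical keys "originator" (an IP address
    string) and "meg_flag" (bool). All originators are assumed to be of the
    same address family, since mixed families cannot be ordered together.
    """
    # Step 1: the candidate set is the PEs advertising the MEG flag.
    candidates = sorted(
        {
            ipaddress.ip_address(route["originator"])
            for route in sbd_imet_routes
            if route["meg_flag"]
        }
    )
    if not candidates:
        return None
    # Steps 2 and 3: default election, ordinal = V mod N over the list
    # ordered by increasing IP address.
    return str(candidates[ethernet_tag % len(candidates)])
```

Because every MEG runs the same computation over the same set of routes, all of them arrive at the same winner without further signaling.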
6.1.3. Interworking with 'Global Table Multicast' | 6.1.3. Interworking with Global Table Multicast | |||
If multicast service to the outside sources and/or receivers is | If multicast service to the outside sources and/or receivers is | |||
provided via the BGP-based "Global Table Multicast" (GTM) procedures | provided via the BGP-based Global Table Multicast (GTM) procedures of | |||
of [RFC7716], the procedures of Section 6.1.2 can easily be adapted | [RFC7716], the procedures of Section 6.1.2 can easily be adapted for | |||
for EVPN/GTM interworking. The way to adapt the MVPN procedures to | EVPN/GTM interworking. The way to adapt the MVPN procedures to GTM | |||
GTM is explained in [RFC7716]. | is explained in [RFC7716]. | |||
6.1.4. Interworking with PIM | 6.1.4. Interworking with PIM | |||
As we have been discussing, there may be receivers in an EVPN tenant | As discussed, there may be receivers in an EVPN Tenant Domain that | |||
domain that are interested in multicast flows whose sources are | are interested in multicast flows whose sources are outside the EVPN | |||
outside the EVPN Tenant Domain. Or there may be receivers outside an | Tenant Domain. Or there may be receivers outside an EVPN Tenant | |||
EVPN Tenant Domain that are interested in multicast flows whose | Domain that are interested in multicast flows whose sources are | |||
sources are inside the Tenant Domain. | inside the Tenant Domain. | |||
If the outside sources and/or receivers are part of an MVPN, | If the outside sources and/or receivers are part of an MVPN, see the | |||
interworking procedures are covered in Section 6.1.2. | procedures for interworking that are covered in Section 6.1.2. | |||
There are also cases where an external source or receiver is | There are also cases where an external source or receiver is | |||
attached via IP, and the layer 3 multicast routing is done via PIM. | attached via IP and the Layer 3 multicast routing is done via PIM. | |||
In this case, the interworking between the "PIM domain" and the EVPN | In this case, the interworking between the PIM domain and the EVPN | |||
tenant domain is done at L3 Gateways that perform "PIM/EVPN Gateway" | Tenant Domain is done at L3 Gateways that perform PIM/EVPN Gateway | |||
(PEG) functionality. A PEG is very similar to a MEG, except that its | (PEG) functionality. A PEG is very similar to a MEG, except that its | |||
layer 3 multicast routing is done via PIM rather than via BGP. | Layer 3 multicast routing is done via PIM rather than via BGP. | |||
If external sources or receivers for a given group are attached to a | If external sources or receivers for a given group are attached to a | |||
PEG via a layer 3 interface, that interface should be treated as a | PEG via a Layer 3 interface, that interface should be treated as a | |||
VRF interface attached to the Tenant Domain's L3VPN VRF. The layer 3 | VRF interface attached to the Tenant Domain's L3VPN VRF. The Layer 3 | |||
multicast routing instance for that Tenant Domain will either run PIM | multicast routing instance for that Tenant Domain will either run PIM | |||
on the VRF interface or will listen for IGMP/MLD messages on that | on the VRF interface or listen for IGMP/MLD messages on that | |||
interface. If the external receiver is attached elsewhere on an IP | interface. If the external receiver is attached elsewhere on an IP | |||
network, the PE has to enable PIM on its interfaces to the backbone | network, the PE has to enable PIM on its interfaces to the backbone | |||
network. In both cases, the PE needs to perform PEG functionality, | network. In both cases, the PE needs to perform PEG functionality, | |||
and its IMET routes must carry the Multicast Flags EC with the PEG | and its IMET routes must carry the Multicast Flags EC with the PEG | |||
flag set. | flag set. | |||
For each BD on which there is a multicast source or receiver, one of | For each BD on which there is a multicast source or receiver, one of | |||
the PEGs will becomes the PEG DR. DR selection can be done using the | the PEGs will become the PEG DR. DR selection can be done using the | |||
same procedures specified in Section 6.1.2.4, except with "PEG" | same procedures specified in Section 6.1.2.4, except with PEG | |||
substituted for "MEG". | substituted for MEG. | |||
As long as there are no tenant multicast routers within the EVPN | As long as there are no tenant multicast routers within the EVPN | |||
Tenant Domain, the PEGs do not need to run PIM on their IRB | Tenant Domain, the PEGs do not need to run PIM on their IRB | |||
interfaces. | interfaces. | |||
6.1.4.1. Source Inside EVPN Domain | 6.1.4.1. Source Inside EVPN Domain | |||
If a PEG receives a PIM Join(S,G) from outside the EVPN tenant | If a PEG receives a PIM Join (S,G) from outside the EVPN Tenant | |||
domain, it may find it necessary to create (S,G) state. The PE needs | Domain, it may find it necessary to create (S,G) state. The PE needs | |||
to determine whether S is within the Tenant Domain. If S is not | to determine whether S is within the Tenant Domain. If S is not | |||
within the EVPN Tenant Domain, the PE carries out normal layer 3 | within the EVPN Tenant Domain, the PE carries out normal Layer 3 | |||
multicast routing procedures. If S is within the EVPN tenant domain, | multicast routing procedures. If S is within the EVPN Tenant Domain, | |||
the IIF of the (S,G) state is set as follows: | the IIF of the (S,G) state is set as follows: | |||
* if S is on a BD that is attached to the PE, the IIF is the PE's | * If S is on a BD that is attached to the PE, the IIF is the PE's | |||
IRB interface to that BD; | IRB interface to that BD. | |||
* if S is not on a BD that is attached to the PE, the IIF is the | * If S is not on a BD that is attached to the PE, the IIF is the | |||
PE's IRB interface to the SBD. | PE's IRB interface to the SBD. | |||
When the PE creates such an (S,G) state, it MUST originate (if it | When the PE creates such an (S,G) state, it MUST originate (if it | |||
hasn't already) an SBD-SMET route for (S,G). This will cause it to | hasn't already) an SBD-SMET route for (S,G). This will cause it to | |||
pull the (S,G) traffic via layer 2. When the traffic arrives over an | pull the (S,G) traffic via Layer 2. When the traffic arrives over an | |||
EVPN tunnel, it gets sent up an IRB interface where the layer 3 | EVPN tunnel, it gets sent up an IRB interface where the Layer 3 | |||
multicast routing determines the packet's disposition. The SBD-SMET | multicast routing determines the packet's disposition. The SBD-SMET | |||
route is withdrawn when the (S,G) state no longer exists (unless | route is withdrawn when the (S,G) state no longer exists (unless | |||
there is some other reason for not withdrawing it). | there is some other reason for not withdrawing it). | |||
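As an informal illustration only (not part of the specification), the IIF selection and SBD-SMET origination described above might be sketched in Python as follows. The class name, the interface labels such as "irb-BD1", and the idea of passing the source's BD as an argument are all hypothetical; a real PE would derive them from its own routing state. The sketch also ignores the "some other reason for not withdrawing" caveat.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

SBD = "SBD"  # hypothetical label for the Supplementary Broadcast Domain


def select_iif(source_bd: Optional[str], attached_bds: Set[str]) -> Optional[str]:
    """Pick the IIF for (S,G) state when S may be inside the Tenant Domain.

    source_bd is the BD that S is on, or None if S is outside the Tenant
    Domain (in which case normal Layer 3 procedures choose the IIF).
    """
    if source_bd is None:
        return None
    if source_bd in attached_bds:
        return f"irb-{source_bd}"  # S is on a BD attached to this PE
    return f"irb-{SBD}"            # S is on a BD not attached to this PE


@dataclass
class Peg:
    attached_bds: Set[str]
    sg_state: Dict[Tuple[str, str], str] = field(default_factory=dict)
    sbd_smet_routes: Set[Tuple[str, str]] = field(default_factory=set)

    def on_external_join(self, s: str, g: str, source_bd: Optional[str]) -> Optional[str]:
        """Handle an externally received PIM Join (S,G)."""
        iif = select_iif(source_bd, self.attached_bds)
        if iif is None:
            return None  # S is outside the Tenant Domain: normal PIM handling
        self.sg_state[(s, g)] = iif
        # A set makes the origination idempotent ("if it hasn't already").
        self.sbd_smet_routes.add((s, g))
        return iif

    def on_state_removed(self, s: str, g: str) -> None:
        """Withdraw the SBD-SMET route once the (S,G) state is gone."""
        self.sg_state.pop((s, g), None)
        self.sbd_smet_routes.discard((s, g))
```

Using a set for the originated routes is one simple way to model the "if it hasn't already" condition; an implementation could equally use reference counting when several triggers share one route.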
If there are no tenant multicast routers within the EVPN tenant | If there are no tenant multicast routers within the EVPN Tenant | |||
domain, there cannot be an RP in the Tenant Domain, so a PEG does not | Domain, there cannot be an RP in the Tenant Domain, so a PEG does not | |||
have to handle externally arriving PIM Join(*,G) messages. | have to handle externally arriving PIM Join (*,G) messages. | |||
The PEG DR for a particular BD MUST act as a First Hop Router for | The PEG DR for a particular BD MUST act as a First Hop Router for | |||
that BD. It will examine all (S,G) traffic on the BD, and whenever G | that BD. It will examine all (S,G) traffic on the BD, and whenever G | |||
is an ASM group, the PEG DR will send Register messages to the RP for | is an ASM group, the PEG DR will send Register messages to the RP for | |||
G. This means that the PEG DR will need to pull all the (S,G) | G. This means that the PEG DR will need to pull all the (S,G) | |||
traffic originating on a given BD, by originating an SMET (*,*) route | traffic originating on a given BD by originating a SMET (*,*) route | |||
for that BD. If a PEG DR is the DR for all the BDs, in SHOULD | for that BD. If a PEG DR is the DR for all the BDs, it SHOULD | |||
originate just an SBD-SMET (*,*) route rather than an SMET (*,*) | originate just an SBD-SMET (*,*) route rather than a SMET (*,*) route | |||
route for each BD. | for each BD. | |||
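The choice between per-BD SMET (*,*) routes and a single SBD-SMET (*,*) route could be sketched as below. This is a hedged illustration: the function name and the string descriptions of the routes are hypothetical, and the sketch treats the SHOULD as if it were always followed.

```python
from typing import List, Set


def wildcard_smet_routes(dr_bds: Set[str], all_bds: Set[str]) -> List[str]:
    """Describe the (*,*) routes a PEG originates as PEG DR.

    dr_bds  -- BDs for which this PEG is the PEG DR
    all_bds -- all BDs of the Tenant Domain (excluding the SBD)
    """
    if not dr_bds:
        return []                      # not a DR anywhere: nothing to pull
    if dr_bds == all_bds:
        return ["SBD-SMET (*,*)"]      # DR for every BD: one route suffices
    return [f"SMET (*,*) for {bd}" for bd in sorted(dr_bds)]
```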
The rules for exporting IP routes to multicast sources are the same | The rules for exporting IP routes to multicast sources are the same | |||
as those specified for MEGs in Section 6.1.2.2, except that the | as those specified for MEGs in Section 6.1.2.2, except that the | |||
exported routes will be IP routes rather than VPN-IP routes, and it | exported routes will be IP routes rather than VPN-IP routes, and it | |||
is not necessary to attach the VRF Route Import EC or the Source AS | is not necessary to attach the VRF Route Import EC or the Source AS | |||
EC. | EC. | |||
When a source is on a multi-homed segment, the same issue discussed | When a source is on a multihomed segment, the same issue discussed in | |||
in Section 6.1.2.2.3 exists. Suppose S is on an Ethernet segment, | Section 6.1.2.2.3 exists. Suppose S is on an Ethernet segment, | |||
belonging to BD1, that is multi-homed to both PE1 and PE2, where PE1 | belonging to BD1, that is multihomed to both PE1 and PE2, where PE1 | |||
is a PEG. And suppose that IP multicast traffic from S to G travels | is a PEG. And suppose that IP multicast traffic from S to G travels | |||
over the AC that attaches the segment to PE2. If PE1 receives an | over the AC that attaches the segment to PE2. If PE1 receives an | |||
external PIM Join (S,G) message, it MUST originate an SMET route for | external PIM Join (S,G) message, it MUST originate a SMET route for | |||
(S,G). Normal OISM procedures will cause PE2 to send the (S,G) | (S,G). Normal OISM procedures will cause PE2 to send the (S,G) | |||
traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM | traffic to PE1 on an EVPN IP multicast tunnel. Normal OISM | |||
procedures will also cause PE1 to send the (S,G) traffic up its BD1 | procedures will also cause PE1 to send the (S,G) traffic up its BD1 | |||
IRB interface. Normal PIM procedures will then cause PE1 to forward | IRB interface. Normal PIM procedures will then cause PE1 to forward | |||
the traffic along a PIM tree. In this case, the routing is not | the traffic along a PIM tree. In this case, the routing is not | |||
optimal, but the traffic does flow correctly. | optimal, but the traffic does flow correctly. | |||
6.1.4.2. Source Outside EVPN Domain | 6.1.4.2. Source Outside EVPN Domain | |||
By means of normal OISM procedures, a PEG learns whether there are | By means of normal OISM procedures, a PEG learns whether there are | |||
receivers in the Tenant Domain that are interested in receiving (*,G) | receivers in the Tenant Domain that are interested in receiving (*,G) | |||
or (S,G) traffic. The PEG must determine whether S (or the RP for G) | or (S,G) traffic. The PEG must determine whether or not S (or the RP | |||
is outside the EVPN Tenant Domain. If so, and if there is a receiver | for G) is outside the EVPN Tenant Domain. If so, and if there is a | |||
on BD1 interested in receiving such traffic, the PEG DR for BD1 is | receiver on BD1 interested in receiving such traffic, the PEG DR for | |||
responsible for originating a PIM Join(S,G) or Join(*,G) control | BD1 is responsible for originating a PIM Join (S,G) or Join (*,G) | |||
message. | control message. | |||
An alternative would be to allow any PEG that is directly attached to | An alternative would be to allow any PEG that is directly attached to | |||
a receiver to originate the PIM Joins. Then the PEG DR would only | a receiver to originate the PIM Joins. Then, the PEG DR would only | |||
have to originate PIM Joins on behalf of receivers that are not | have to originate PIM Joins on behalf of receivers that are not | |||
attached to a PEG. However, if this is done, it is necessary for the | attached to a PEG. However, if this is done, it is necessary for the | |||
PEGs to run PIM on all their IRB interfaces, so that the PIM Assert | PEGs to run PIM on all their IRB interfaces so that the PIM Assert | |||
procedures can be used to prevent duplicate delivery to a given BD. | procedures can be used to prevent duplicate delivery to a given BD. | |||
The IIF for the layer 3 (S,G) or (*,G) state is determined by normal | The IIF for the Layer 3 (S,G) or (*,G) state is determined by normal | |||
PIM procedures. If a receiver is on BD1, and the PEG DR is attached | PIM procedures. If a receiver is on BD1, and the PEG DR is attached | |||
to BD1, its IRB interface to BD1 is added to the OIF list. This | to BD1, its IRB interface to BD1 is added to the OIF list. This | |||
ensures that any receivers locally attached to the PEG DR will | ensures that any receivers locally attached to the PEG DR will | |||
receive the traffic. If there are receivers attached to other EVPN | receive the traffic. If there are receivers attached to other EVPN | |||
PEs, then whenever (S,G) traffic from an external source matches a | PEs, then whenever (S,G) traffic from an external source matches a | |||
(*,G) state, the PEG will create (S,G) state. The IIF will be set to | (*,G) state, the PEG will create (S,G) state. The IIF will be set to | |||
whatever external interface the traffic is expected to arrive on | whatever external interface the traffic is expected to arrive on | |||
(copied from the (*,G) state), the OIF list is copied from the (*,G) | (copied from the (*,G) state), the OIF list is copied from the (*,G) | |||
state, and the SBD IRB interface is added to the OIF list. | state, and the SBD IRB interface is added to the OIF list. | |||
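The derivation of (S,G) state from matching (*,G) state can be illustrated with a short sketch. The type and interface names here are hypothetical, chosen only to make the copy-and-extend step explicit.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MrouteState:
    iif: str               # expected incoming interface
    oifs: FrozenSet[str]   # outgoing interface list


def derive_sg_state(star_g: MrouteState, sbd_irb: str = "irb-SBD") -> MrouteState:
    """Build (S,G) state when external (S,G) traffic matches (*,G) state.

    The IIF and OIF list are copied from the (*,G) state, and the SBD IRB
    interface is added so that receivers behind other EVPN PEs are reached.
    """
    return MrouteState(iif=star_g.iif, oifs=star_g.oifs | {sbd_irb})
```

For example, a (*,G) state with a hypothetical external IIF "ext0" and OIF list {"irb-BD1"} yields (S,G) state with the same IIF and OIF list {"irb-BD1", "irb-SBD"}.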
6.2. Interworking with PIM via an External PIM Router | 6.2. Interworking with PIM via an External PIM Router | |||
Section 6.1 describes how to use an OISM PE router as the gateway to | Section 6.1 describes how to use an OISM PE router as the gateway to | |||
a non-EVPN multicast domain, when the EVPN tenant domain is not being | a non-EVPN multicast domain when the EVPN Tenant Domain is not being | |||
used as an intermediate transit network for multicast. An | used as an intermediate transit network for multicast. An | |||
alternative approach is to have one or more external PIM routers | alternative approach is to have one or more external PIM routers | |||
(perhaps operated by a tenant) on one of the BDs of the tenant | (perhaps operated by a tenant) on one of the BDs of the Tenant | |||
domain. We will refer to this BD as the "gateway BD". | Domain. We will refer to this BD as the "gateway BD". | |||
In this model: | In this model: | |||
* The EVPN Tenant Domain is treated as a stub network attached to | * The EVPN Tenant Domain is treated as a stub network attached to | |||
the external PIM routers. | the external PIM routers. | |||
* The external PIM routers follow normal PIM procedures, and provide | * The external PIM routers follow normal PIM procedures and provide | |||
the FHR and LHR functionality for the entire Tenant Domain. | the FHR and LHR functionality for the entire Tenant Domain. | |||
* The OISM PEs do not run PIM. | * The OISM PEs do not run PIM. | |||
* There MUST NOT be more than one gateway BD. | * There MUST NOT be more than one gateway BD. | |||
* If an OISM PE not attached to the gateway BD has interest in a | * If an OISM PE not attached to the gateway BD has interest in a | |||
given multicast flow, it conveys that interest, following normal | given multicast flow, it conveys that interest, following normal | |||
OISM procedures, by originating an SBD-SMET route for that flow. | OISM procedures, by originating an SBD-SMET route for that flow. | |||
* If a PE attached to the gateway BD receives an SBD-SMET, it may | * If a PE attached to the gateway BD receives an SBD-SMET, it may | |||
need to generate and transmit a corresponding IGMP/MLD Join on one | need to generate and transmit a corresponding IGMP/MLD Join on one | |||
or more of its ACs. (Procedures for generating an IGMP/MLD Join | or more of its ACs. (Procedures for generating an IGMP/MLD Join | |||
as a result of receiving an SMET route are given in [RFC9251].) | as a result of receiving a SMET route are given in [RFC9251].) | |||
The PE MUST know which BD is the Gateway BD and MUST NOT transmit | The PE MUST know which BD is the gateway BD and MUST NOT transmit | |||
an IGMP/MLD Join to any other BDs. Furthermore, even if a | an IGMP/MLD Join to any other BDs. Furthermore, even if a | |||
particular AC is part of that BD, the PE SHOULD NOT transmit an | particular AC is part of that BD, the PE SHOULD NOT transmit an | |||
IGMP/MLD Join on that AC unless there is an external PIM router | IGMP/MLD Join on that AC unless there is an external PIM router | |||
attached via that AC. | attached via that AC. | |||
As a result, IGMP/MLD messages will be received by the external | As a result, IGMP/MLD messages will be received by the external | |||
PIM routers on the gateway BD, and those external PIM routers will | PIM routers on the gateway BD, and those external PIM routers will | |||
send PIM Join messages externally as required. Traffic for the | send PIM Join messages externally as required. Traffic for the | |||
given multicast flow will then be received by one of the external | given multicast flow will then be received by one of the external | |||
PIM routers, and that traffic will be forwarded by that router to | PIM routers, and that traffic will be forwarded by that router to | |||
the gateway BD. | the gateway BD. | |||
The normal OISM procedures will then cause the given multicast | The normal OISM procedures will then cause the given multicast | |||
flow to be tunneled to any PEs of the EVPN Tenant Domain that have | flow to be tunneled to any PEs of the EVPN Tenant Domain that have | |||
interest in the flow. PEs attached to the gateway BD will see the | interest in the flow. PEs attached to the gateway BD will see the | |||
flow as originating from the gateway BD and other PEs will see the | flow as originating from the gateway BD, and other PEs will see | |||
flow as originating from the SBD. | the flow as originating from the SBD. | |||
* An OISM PE attached to a gateway BD MUST set its layer 2 multicast | * An OISM PE attached to a gateway BD MUST set its Layer 2 multicast | |||
state to indicate that each AC to the gateway BD has interest in | state to indicate that each AC to the gateway BD has interest in | |||
all multicast flows. It MUST also originate an SMET route for | all multicast flows. It MUST also originate a SMET route for | |||
(*,*). The procedures for originating SMET routes are discussed | (*,*). The procedures for originating SMET routes are discussed | |||
in Section 2.5. | in Section 2.5. | |||
This will cause the OISM PEs attached to the gateway BD to receive | This will cause the OISM PEs attached to the gateway BD to receive | |||
all the IP multicast traffic that is sourced within the EVPN | all the IP multicast traffic that is sourced within the EVPN | |||
tenant domain, and to transmit that traffic to the gateway BD, | Tenant Domain and to transmit that traffic to the gateway BD, | |||
where the external PIM routers will receive it. This enables the | where the external PIM routers will receive it. This enables the | |||
external PIM routers to perform FHR functions on behalf of the | external PIM routers to perform FHR functions on behalf of the | |||
entire Tenant Domain. (Of course, if the gateway BD has a | entire Tenant Domain. (Of course, if the gateway BD has a | |||
multi-homed segment, only the PE that is the DF for that segment | multihomed segment, only the PE that is the DF for that segment | |||
will transmit the multicast traffic to the segment.) | will transmit the multicast traffic to the segment.) | |||
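The rule for where a PE transmits an IGMP/MLD Join after receiving an SBD-SMET route in the gateway BD model might be sketched as follows. The data structure and its field names are assumptions made for illustration; in particular, how a PE learns that an external PIM router sits behind an AC is left open here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AttachmentCircuit:
    name: str
    bd: str
    has_external_pim_router: bool  # assumed to be known, e.g. by provisioning


def acs_for_proxy_join(acs: Iterable[AttachmentCircuit], gateway_bd: str) -> List[str]:
    """Return the ACs on which a Join may be sent for a received SBD-SMET.

    ACs outside the gateway BD are excluded (MUST NOT); ACs of the gateway
    BD without an external PIM router are also excluded (SHOULD NOT).
    """
    return [
        ac.name
        for ac in acs
        if ac.bd == gateway_bd and ac.has_external_pim_router
    ]
```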
7. Using an EVPN Tenant Domain as an Intermediate (Transit) Network for | 7. Using an EVPN Tenant Domain as an Intermediate (Transit) Network for | |||
Multicast traffic | Multicast Traffic | |||
In this section, we consider the scenario where one or more BDs of an | In this section, we consider the scenario where one or more BDs of an | |||
EVPN Tenant Domain are being used to carry IP multicast traffic for | EVPN Tenant Domain are being used to carry IP multicast traffic for | |||
which the source and at least one receiver are not part of the tenant | which the source and at least one receiver are not part of the Tenant | |||
domain. That is, one or more BDs of the Tenant Domain are | Domain. That is, one or more BDs of the Tenant Domain are | |||
intermediate "links" of a larger multicast tree created by PIM. | intermediate links of a larger multicast tree created by PIM. | |||
We define a "tenant multicast router" as a multicast router, running | We define a "tenant multicast router" as a multicast router, running | |||
PIM, that is: | PIM, that: | |||
1. attached to one or more BDs of the Tenant Domain, but | 1. is attached to one or more BDs of the Tenant Domain but | |||
2. is not an EVPN PE router. | 2. is not an EVPN PE router. | |||
In order an EVPN Tenant Domain to be used as a transit network for IP | In order for an EVPN Tenant Domain to be used as a transit network | |||
multicast, one or more of its BDs must have tenant multicast routers, | for IP multicast, one or more of its BDs must have tenant multicast | |||
and an OISM PE that attaching to such a BD MUST be provisioned to | routers, and an OISM PE attached to such a BD MUST be provisioned to | |||
enable PIM on its IRB interface to that BD. (This is true even if | enable PIM on its IRB interface to that BD. (This is true even if | |||
none of the tenant routers is on a segment attached to the PE.) | none of the tenant routers is on a segment attached to the PE.) | |||
Further, all the OISM PEs (even ones not attached to a BD with tenant | Further, all the OISM PEs (even ones not attached to a BD with tenant | |||
multicast routers) MUST be provisioned to enable PIM on their SBD IRB | multicast routers) MUST be provisioned to enable PIM on their SBD IRB | |||
interfaces. | interfaces. | |||
If PIM is enabled on a particular BD, the DR Selection procedure of | If PIM is enabled on a particular BD, the DR selection procedure of | |||
Section 6.1.2.4 MUST be replaced by the normal PIM DR Election | Section 6.1.2.4 MUST be replaced by the normal PIM DR Election | |||
procedure of [RFC7761]. Note that this may result in one of the | procedure of [RFC7761]. Note that this may result in one of the | |||
tenant routers being selected as the DR, rather than one of the OISM | tenant routers being selected as the DR rather than one of the OISM | |||
PE routers. In this case, First Hop Router and Last Hop Router | PE routers. In this case, First Hop Router and Last Hop Router | |||
functionality will not be performed by any of the EVPN PEs. | functionality will not be performed by any of the EVPN PEs. | |||
A PIM control message on a particular BD is considered to be a | A PIM control message on a particular BD is considered to be a link- | |||
link-local multicast message, and as such is sent transparently from | local multicast message and, as such, is sent transparently from PE | |||
PE to PE via the BUM tunnel for that BD. This is true whether the | to PE via the BUM tunnel for that BD. This is true whether the | |||
control message was received from an AC, or whether it was received | control message was received from an AC or from the local Layer 3 | |||
from the local layer 3 routing instance via an IRB interface. | routing instance via an IRB interface. | |||
A PIM Join/Prune message contains three fields that are relevant to | A PIM Join/Prune message contains three fields that are relevant to | |||
the present discussion: | the present discussion: | |||
* Upstream Neighbor | * Upstream Neighbor | |||
* Group Address (G) | * Group Address (G) | |||
* Source Address (S), omitted in the case of (*,G) Join/Prune | * Source Address (S), omitted in the case of (*,G) Join/Prune | |||
messages. | messages | |||
We will generally speak of a PIM Join as a "Join(S,G)" or a | We will generally speak of a PIM Join as a Join (S,G) or a Join (*,G) | |||
"Join(*,G)" message, and will use the term "Join(X,G)" to mean | message and will use the term "Join (X,G)" to mean either "Join | |||
"either Join(S,G) or Join(*,G)". In the context of a Join(X,G), we | (S,G)" or "Join (*,G)". In the context of a Join (X,G), we will use | |||
will use the term "X" to mean "S in the case of (S,G), or G's RP in | the term "X" to mean "S" in the case of (S,G) or "G's RP" in the case | |||
the case of (*,G)". | of (*,G). | |||
Suppose BD1 contains two tenant multicast routers, C1 and C2. | Suppose BD1 contains two tenant multicast routers, say C1 and C2. | |||
Suppose C1 is on a segment attached to PE1, and C2 is on a segment | Suppose C1 is on a segment attached to PE1 and C2 is on a segment | |||
attached to PE2. When C1 sends a PIM Join(X,G) to BD1, the Upstream | attached to PE2. When C1 sends a PIM Join (X,G) to BD1, the Upstream | |||
Neighbor field might be set to either PE1, PE2, or C2. C1 chooses | Neighbor field might be set to PE1, PE2, or C2. C1 chooses the | |||
the Upstream Neighbor based on its unicast routing. Typically, it | Upstream Neighbor based on its unicast routing. Typically, it will | |||
will choose as the Upstream Neighbor the PIM router on BD1 that is | choose the PIM router on BD1 that is closest (according to the | |||
"closest" (according to the unicast routing) to X. Note that this | unicast routing) to X as the Upstream Neighbor. Note that this will | |||
will not necessarily be PE1. PE1 may not even be visible to the | not necessarily be PE1. PE1 may not even be visible to the unicast | |||
unicast routing algorithm used by the tenant routers. Even if it is, | routing algorithm used by the tenant routers. Even if it is, it is | |||
it is unlikely to be the PIM router that is closest to X. So we need | unlikely to be the PIM router that is closest to X. So we need to | |||
to consider the following two cases: | consider the following two cases: | |||
1. C1 sends a PIM Join(X,G) to BD1, with PE1 as the Upstream | 1. C1 sends a PIM Join (X,G) to BD1, with PE1 as the Upstream | |||
Neighbor. | Neighbor. | |||
PE1's PIM routing instance will see the Join arrive on the | PE1's PIM routing instance will see the Join arrive on the | |||
BD1 IRB interface. If X is not within the Tenant Domain, PE1 | BD1 IRB interface. If X is not within the Tenant Domain, PE1 | |||
handles the Join according to normal PIM procedures. This will | handles the Join according to normal PIM procedures. This will | |||
generally result in PE1 selecting an Upstream Neighbor and | generally result in PE1 selecting an Upstream Neighbor and | |||
sending it a Join(X,G). | sending it a Join (X,G). | |||
If X is within the Tenant Domain, but is attached to some other | If X is within the Tenant Domain but is attached to some other | |||
PE, PE1 sends (if it hasn't already) an SBD-SMET route for (X,G). | PE, PE1 sends (if it hasn't already) an SBD-SMET route for (X,G). | |||
The IIF of the layer 3 (X,G) state will be the SBD IRB interface, | The IIF of the Layer 3 (X,G) state will be the SBD IRB interface, | |||
and the OIF list will include the IRB interface to BD1. | and the OIF list will include the IRB interface to BD1. | |||
The SBD-SMET route will pull the (X,G) traffic to PE1, and the | The SBD-SMET route will pull the (X,G) traffic to PE1, and the | |||
(X,G) state will result in the (X,G) traffic being forwarded to | (X,G) state will result in the (X,G) traffic being forwarded to | |||
C1. | C1. | |||
If X is within the Tenant Domain, but is attached to PE1 itself, | If X is within the Tenant Domain but is attached to PE1 itself, | |||
no SBD-SMET route is sent. The IIF of the layer 3 (X,G) state | no SBD-SMET route is sent. The IIF of the Layer 3 (X,G) state | |||
will be the IRB interface to X's BD, and the OIF list will | will be the IRB interface to X's BD, and the OIF list will | |||
include the IRB interface to BD1. | include the IRB interface to BD1. | |||
2. C1 sends a PIM Join(X,G) to BD1, with either PE2 or C2 as the | 2. C1 sends a PIM Join (X,G) to BD1, with either PE2 or C2 as the | |||
Upstream Neighbor. | Upstream Neighbor. | |||
PE1's PIM routing instance will see the Join arrive on the | PE1's PIM routing instance will see the Join arrive on the | |||
BD1 IRB interface. If neither X nor Upstream Neighbor is within | BD1 IRB interface. If neither X nor Upstream Neighbor is within | |||
the tenant domain, PE1 handles the Join according to normal PIM | the Tenant Domain, PE1 handles the Join according to normal PIM | |||
procedures. This will NOT result in PE1 sending a Join(X,G). | procedures. This will NOT result in PE1 sending a Join (X,G). | |||
If either X or Upstream Neighbor is within the Tenant Domain, PE1 | If either X or Upstream Neighbor is within the Tenant Domain, PE1 | |||
sends (if it hasn't already) an SBD-SMET route for (X,G). The | sends (if it hasn't already) an SBD-SMET route for (X,G). The | |||
IIF of the layer 3 (X,G) state will be the SBD IRB interface, and | IIF of the Layer 3 (X,G) state will be the SBD IRB interface, and | |||
the OIF list will include the IRB interface to BD1. | the OIF list will include the IRB interface to BD1. | |||
The SBD-SMET route will pull the (X,G) traffic to PE1, and the | The SBD-SMET route will pull the (X,G) traffic to PE1, and the | |||
(X,G) state will result in the (X,G) traffic being forwarded to | (X,G) state will result in the (X,G) traffic being forwarded to | |||
C1. | C1. | |||
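The two cases above can be condensed into a small decision function. This is a sketch under stated assumptions, not an implementation of the specification: the function, its boolean inputs, and the interface labels are hypothetical, and it models only what PE1 does with a Join (X,G) that arrives on its BD1 IRB interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JoinDecision:
    normal_pim: bool          # True: fall back to normal PIM procedures
    originate_sbd_smet: bool  # True: originate SBD-SMET (X,G) if not already done
    iif: Optional[str]        # None: left to normal PIM
    oif: Optional[str]        # interface added to the OIF list, if any


def handle_join(
    upstream_is_this_pe: bool,
    x_in_domain: bool,
    x_local_bd: Optional[str] = None,
    upstream_in_domain: bool = False,
    arrival_bd: str = "BD1",
) -> JoinDecision:
    """Decide how PE1 treats a Join (X,G) seen on its IRB interface.

    x_local_bd is X's BD when X is attached to this PE, else None.
    """
    oif = f"irb-{arrival_bd}"
    if upstream_is_this_pe:                       # case 1
        if not x_in_domain:
            return JoinDecision(True, False, None, None)
        if x_local_bd is not None:                # X attached to PE1 itself
            return JoinDecision(False, False, f"irb-{x_local_bd}", oif)
        return JoinDecision(False, True, "irb-SBD", oif)
    # case 2: the Upstream Neighbor is another PE or a tenant router
    if not x_in_domain and not upstream_in_domain:
        return JoinDecision(True, False, None, None)
    return JoinDecision(False, True, "irb-SBD", oif)
```

In case 2, falling back to normal PIM does not lead PE1 to send a Join (X,G) of its own, since PE1 is not the addressed Upstream Neighbor.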
8. IANA Considerations | 8. IANA Considerations | |||
IANA is requested to assign new flags in the "Multicast Flags | IANA has assigned new flags in the "Multicast Flags Extended | |||
Extended Community Flags" registry. These flags are: | Community" registry under the "Border Gateway Protocol (BGP) Extended | |||
Communities" registry as shown below. | ||||
* IPMG | ||||
* MEG | ||||
* PEG | ||||
* OISM SBD | +=====+================+===========+===================+ | |||
| Bit | Name | Reference | Change Controller | | ||||
+=====+================+===========+===================+ | ||||
| 7 | OISM SBD | RFC 9625 | IETF | | ||||
+-----+----------------+-----------+-------------------+ | ||||
| 9 | IPMG | RFC 9625 | IETF | | ||||
+-----+----------------+-----------+-------------------+ | ||||
| 10 | MEG | RFC 9625 | IETF | | ||||
+-----+----------------+-----------+-------------------+ | ||||
| 11 | PEG | RFC 9625 | IETF | | ||||
+-----+----------------+-----------+-------------------+ | ||||
| 12 | OISM-supported | RFC 9625 | IETF | | ||||
+-----+----------------+-----------+-------------------+ | ||||
* OISM-supported | Table 1: Multicast Flags Extended Community Registry | |||
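For readers who want to see how the bit assignments in the table might map onto a flags value, here is a rough sketch. It assumes a 16-bit Flags field in which bit 0 is the most significant bit; that width and bit ordering are assumptions of this example and should be checked against the definition of the Multicast Flags Extended Community in [RFC9251]. The function and constant names are hypothetical.

```python
from typing import Iterable

# Bit positions taken from the table above.
OISM_FLAG_BITS = {
    "OISM SBD": 7,
    "IPMG": 9,
    "MEG": 10,
    "PEG": 11,
    "OISM-supported": 12,
}

FLAGS_WIDTH = 16  # assumed width of the Flags field, in bits


def encode_flags(names: Iterable[str], msb_first: bool = True) -> int:
    """Combine named flags into an integer Flags value.

    With msb_first=True (the assumption in this sketch), bit 0 is the
    most significant bit of the field.
    """
    value = 0
    for name in names:
        bit = OISM_FLAG_BITS[name]
        shift = (FLAGS_WIDTH - 1 - bit) if msb_first else bit
        value |= 1 << shift
    return value
```

Under the stated bit-ordering assumption, a PEG that supports OISM would set the PEG and OISM-supported flags, giving `encode_flags(["PEG", "OISM-supported"]) == 0x0018`.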
9. Security Considerations | 9. Security Considerations | |||
This document uses protocols and procedures defined in the normative | This document uses protocols and procedures defined in the normative | |||
references, and inherits the security considerations of those | references and inherits the security considerations of those | |||
references. | references. | |||
This document adds flags or Extended Communities (ECs) to a number of | This document adds flags or Extended Communities (ECs) to a number of | |||
BGP routes, in order to signal that particular nodes support the | BGP routes in order to signal that particular nodes support the OISM, | |||
OISM, IPMG, MEG, and/or PEG functionalities that are defined in this | IPMG, MEG, and/or PEG functionalities that are defined in this | |||
document. Incorrect addition, removal, or modification of those | document. Incorrect addition, removal, or modification of those | |||
flags and/or ECs will cause the procedures defined herein to | flags and/or ECs will cause the procedures defined herein to | |||
malfunction, in which case loss or diversion of data traffic is | malfunction, in which case loss or diversion of data traffic is | |||
possible. Implementations should provide tools to easily debug | possible. Implementations should provide tools to easily debug | |||
configuration mistakes that cause the signaling of incorrect | configuration mistakes that cause the signaling of incorrect | |||
information. | information. | |||
The interworking with non-OISM networks described in sections 5 and | The interworking with non-OISM networks described in Sections 5 and 6 | |||
6, require gateway functions in multiple redundant PEs, among which | requires gateway functions in multiple redundant PEs, among which one | |||
one of them is elected as Designated Forwarder for a given BD (or | of them is elected as Designated Forwarder for a given BD (or SBD). | |||
SBD). The election of the MEG or PEG Designated Router, as well as | The election of the MEG or PEG DR, as well as the IPMG Designated | |||
the IPMG Designated Forwarder makes use of the RFC8584 Designated | Forwarder, makes use of the Designated Forwarder election procedures | |||
Forwarder election procedures. An attacker with access to one of | [RFC8584]. An attacker with access to one of these Gateways may | |||
these Gateways may influence such election and therefore modify the | influence such election and therefore modify the forwarding of | |||
forwarding of multicast traffic between the OISM network and the | multicast traffic between the OISM network and the external domain. | |||
external domain. The operator should be especially careful with the | The operator should be especially careful with the protection of | |||
protection of these gateways by making sure the management interfaces | these gateways by making sure the management interfaces to access the | |||
to access the gateways are only allowed to authorized operators. | gateways are only allowed to authorized operators. | |||
The document also introduces the concept of per-Tenant-Domain | The document also introduces the concept of per-Tenant-Domain | |||
dissemination for the SMET routes, as opposed to per-BD distribution | dissemination for the SMET routes, as opposed to per-BD distribution | |||
in [RFC9251]. That is, e.g., an SMET route triggered by the | in [RFC9251]. That is, a SMET route triggered by the reception of an | |||
reception of an IGMP/MLD join in BD-1 on PE1, needs to be distributed | IGMP/MLD Join in BD-1 on PE1 needs to be distributed and imported by | |||
and imported by all PEs of the Tenant Domain, even to those PEs that | all PEs of the Tenant Domain, even to those PEs that are not attached | |||
are not attached to BD-1. This means that an attacker with access to | to BD-1. This means that an attacker with access to only one BD in a | |||
only one BD in a PE of the Tenant Domain, might force the | PE of the Tenant Domain might force the advertisement of SMET routes | |||
advertisement of SMET routes and impact the resources of all the PEs | and impact the resources of all the PEs of the Tenant Domain, as | |||
of the Tenant Domain, as opposed to only the PEs of that particular | opposed to only the PEs of that particular BD (as in [RFC9251]). The | |||
BD (as in RFC9251). The implementation should provide ways to | implementation should provide ways to filter/control the client IGMP/ | |||
filter/control the client IGMP/MLD reports that are received from the | MLD reports that are received from the attached hosts. | |||
attached hosts. | ||||
10. Acknowledgements | ||||
The authors thank Vikram Nagarajan and Princy Elizabeth for their | ||||
work on Section 6.2 and Section 3.2.3.1. The authors also benefited | ||||
tremendously from discussions with Aldrin Isaac on EVPN multicast | ||||
optimizations. | ||||
11. References | ||||
11.1. Normative References | ||||
[I-D.ietf-bess-evpn-bum-procedure-updates] | 10. References | |||
Zhang, Z. J., Lin, W., Rabadan, J., Patel, K., and A. | ||||
Sajassi, "Updates on EVPN BUM Procedures", Work in | ||||
Progress, Internet-Draft, draft-ietf-bess-evpn-bum- | ||||
procedure-updates-14, 18 November 2021, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-bess- | ||||
evpn-bum-procedure-updates-14>. | ||||
[I-D.ietf-bess-evpn-optimized-ir] | 10.1. Normative References | |||
Rabadan, J., Sathappan, S., Lin, W., Katiyar, M., and A. | ||||
Sajassi, "Optimized Ingress Replication Solution for | ||||
Ethernet VPN (EVPN)", Work in Progress, Internet-Draft, | ||||
draft-ietf-bess-evpn-optimized-ir-12, 25 January 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-bess- | ||||
evpn-optimized-ir-12>. | ||||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., | |||
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack | |||
Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, | Encoding", RFC 3032, DOI 10.17487/RFC3032, January 2001, | |||
<https://www.rfc-editor.org/info/rfc3032>. | <https://www.rfc-editor.org/info/rfc3032>. | |||
skipping to change at page 72, line 16 ¶ | skipping to change at line 3303 ¶ | |||
A. Sajassi, "IP Prefix Advertisement in Ethernet VPN | A. Sajassi, "IP Prefix Advertisement in Ethernet VPN | |||
(EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021, | (EVPN)", RFC 9136, DOI 10.17487/RFC9136, October 2021, | |||
<https://www.rfc-editor.org/info/rfc9136>. | <https://www.rfc-editor.org/info/rfc9136>. | |||
[RFC9251] Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J., | [RFC9251] Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J., | |||
and W. Lin, "Internet Group Management Protocol (IGMP) and | and W. Lin, "Internet Group Management Protocol (IGMP) and | |||
Multicast Listener Discovery (MLD) Proxies for Ethernet | Multicast Listener Discovery (MLD) Proxies for Ethernet | |||
VPN (EVPN)", RFC 9251, DOI 10.17487/RFC9251, June 2022, | VPN (EVPN)", RFC 9251, DOI 10.17487/RFC9251, June 2022, | |||
<https://www.rfc-editor.org/info/rfc9251>. | <https://www.rfc-editor.org/info/rfc9251>. | |||
11.2. Informative References | [RFC9572] Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A. | |||
Sajassi, "Updates to EVPN Broadcast, Unknown Unicast, or | ||||
Multicast (BUM) Procedures", RFC 9572, | ||||
DOI 10.17487/RFC9572, May 2024, | ||||
<https://www.rfc-editor.org/info/rfc9572>. | ||||
[I-D.ietf-bess-evpn-pref-df] | [RFC9574] Rabadan, J., Ed., Sathappan, S., Lin, W., Katiyar, M., and | |||
Rabadan, J., Sathappan, S., Lin, W., Drake, J., and A. | A. Sajassi, "Optimized Ingress Replication Solution for | |||
Ethernet VPNs (EVPNs)", RFC 9574, DOI 10.17487/RFC9574, | ||||
May 2024, <https://www.rfc-editor.org/info/rfc9574>. | ||||
10.2. Informative References | ||||
[EVPN-DF] Rabadan, J., Sathappan, S., Lin, W., Drake, J., and A. | ||||
Sajassi, "Preference-based EVPN DF Election", Work in | Sajassi, "Preference-based EVPN DF Election", Work in | |||
Progress, Internet-Draft, draft-ietf-bess-evpn-pref-df-13, | Progress, Internet-Draft, draft-ietf-bess-evpn-pref-df-13, | |||
9 October 2023, <https://datatracker.ietf.org/doc/html/ | 9 October 2023, <https://datatracker.ietf.org/doc/html/ | |||
draft-ietf-bess-evpn-pref-df-13>. | draft-ietf-bess-evpn-pref-df-13>. | |||
[I-D.ietf-bier-evpn] | ||||
Zhang, Z. J., Przygienda, T., Sajassi, A., and J. Rabadan, | ||||
"EVPN BUM Using BIER", Work in Progress, Internet-Draft, | ||||
draft-ietf-bier-evpn-14, 2 January 2024, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-bier- | ||||
evpn-14>. | ||||
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private | |||
Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February | |||
2006, <https://www.rfc-editor.org/info/rfc4364>. | 2006, <https://www.rfc-editor.org/info/rfc4364>. | |||
[RFC4541] Christensen, M., Kimball, K., and F. Solensky, | [RFC4541] Christensen, M., Kimball, K., and F. Solensky, | |||
"Considerations for Internet Group Management Protocol | "Considerations for Internet Group Management Protocol | |||
(IGMP) and Multicast Listener Discovery (MLD) Snooping | (IGMP) and Multicast Listener Discovery (MLD) Snooping | |||
Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006, | Switches", RFC 4541, DOI 10.17487/RFC4541, May 2006, | |||
<https://www.rfc-editor.org/info/rfc4541>. | <https://www.rfc-editor.org/info/rfc4541>. | |||
skipping to change at page 73, line 28 ¶ | skipping to change at line 3364 ¶ | |||
Multicast - Sparse Mode (PIM-SM): Protocol Specification | Multicast - Sparse Mode (PIM-SM): Protocol Specification | |||
(Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March | (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March | |||
2016, <https://www.rfc-editor.org/info/rfc7761>. | 2016, <https://www.rfc-editor.org/info/rfc7761>. | |||
[RFC8296] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | [RFC8296] Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A., | |||
Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation | Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation | |||
for Bit Index Explicit Replication (BIER) in MPLS and Non- | for Bit Index Explicit Replication (BIER) in MPLS and Non- | |||
MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January | MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January | |||
2018, <https://www.rfc-editor.org/info/rfc8296>. | 2018, <https://www.rfc-editor.org/info/rfc8296>. | |||
[RFC9624] Zhang, Z., Przygienda, T., Sajassi, A., and J. Rabadan, | ||||
"EVPN Broadcast, Unknown Unicast, or Multicast (BUM) Using | ||||
Bit Index Explicit Replication (BIER)", RFC 9624, | ||||
DOI 10.17487/RFC9624, August 2024, | ||||
<https://www.rfc-editor.org/info/rfc9624>. | ||||
Appendix A. Integrated Routing and Bridging | Appendix A. Integrated Routing and Bridging | |||
This Appendix provides a short tutorial on the interaction of routing | This appendix provides a short tutorial on the interaction of routing | |||
and bridging. First it shows the traditional model, where bridging | and bridging. First, it shows a model, where bridging and routing | |||
and routing are performed in separate devices. Then it shows the | are performed in separate devices. Then, it shows the model | |||
model specified in [RFC9135], where a single device contains both | specified in [RFC9135], where a single device contains both routing | |||
routing and bridging functions. The latter model is presupposed in | and bridging functions. The latter model is presupposed in the body | |||
the body of this document. | of this document. | |||
Figure 1 shows a "traditional" router that only does routing and has | Figure 2 shows the model where a router only does routing and has no | |||
no L2 bridging capabilities. There are two LANs, LAN1 and LAN2. | L2 bridging capabilities. There are two LANs: LAN1 and LAN2. LAN1 | |||
LAN1 is realized by switch1, LAN2 by switch2. The router has an | is realized by switch1, and LAN2 is realized by switch2. The router | |||
interface, "lan1" that attaches to LAN1 (via switch1) and an | has an interface, lan1, that attaches to LAN1 (via switch1) and an | |||
interface "lan2" that attachs to LAN2 (via switch2). Each intreface | interface, lan2, that attaches to LAN2 (via switch2). Each interface | |||
is configured, as an IP interface, with an IP address and a subnet | is configured, as an IP interface, with an IP address and a subnet | |||
mask. | mask. | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | lan1| |lan2 | | | | | lan1| |lan2 | | | |||
H1 -----+Switch1+--------+ Router1+--------+Switch2+------H3 | H1 -----+Switch1+--------+ Router1+--------+Switch2+------H3 | |||
| | | | | | | | | | | | | | |||
H2 -----| | | | | | | H2 -----| | | | | | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
|_________________| |__________________| | |_________________| |__________________| | |||
LAN1 LAN2 | LAN1 LAN2 | |||
Figure 1: Conventional Router with LAN Interfaces | Figure 2: Conventional Router with LAN Interfaces | |||
IP traffic (unicast or multicast) that remains within a single subnet | IP traffic (unicast or multicast) that remains within a single subnet | |||
never reaches the router. For instance, if H1 emits an Ethernet | never reaches the router. For instance, if H1 emits an Ethernet | |||
frame with H2's MAC address in the Ethernet destination address | frame with H2's MAC address in the Ethernet Destination Address | |||
field, the frame will go from H1 to Switch1 to H2, without ever | field, the frame will go from H1 to Switch1 to H2 without ever | |||
reaching the router. Since the frame is never seen by a router, the | reaching the router. Since the frame is never seen by a router, the | |||
IP datagram within the frame remains entirely unchanged, e.g., its | IP datagram within the frame remains entirely unchanged, e.g., its | |||
TTL is not decremented. The Ethernet Source and Destination MAC | TTL is not decremented. The Ethernet Source and Destination MAC | |||
addresses are not changed either. | addresses are not changed either. | |||
If H1 wants to send a unicast IP datagram to H3, which is on a | If H1 wants to send a unicast IP datagram to H3, which is on a | |||
different subnet, H1 has to be configured with the IP address of a | different subnet, H1 has to be configured with the IP address of a | |||
"default router". Let's assume that H1 is configured with an IP | default router. Let's assume that H1 is configured with an IP | |||
address of Router1 as its default router address. H1 compares H3's | address of Router1 as its default router address. H1 compares H3's | |||
IP address with its own IP address and IP subnet mask, and determines | IP address with its own IP address and IP subnet mask and determines | |||
that H3 is on a different subnet. So the packet has to be routed. | that H3 is on a different subnet. So the packet has to be routed. | |||
H1 uses ARP to map Router1's IP address to a MAC address on LAN1. H1 | H1 uses ARP to map Router1's IP address to a MAC address on LAN1. H1 | |||
then encapsulates the datagram in an Ethernet frame, using router1's | then encapsulates the datagram in an Ethernet frame, using Router1's | |||
MAC address as the destination MAC address, and sends the frame to | MAC address as the destination MAC address, and sends the frame to | |||
Router1. | Router1. | |||
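The on-subnet test that H1 performs before choosing between direct delivery and its default router can be sketched as follows. This is a minimal illustration only; the addresses and the helper name needs_routing are hypothetical and do not appear in the document.

```python
import ipaddress


def needs_routing(host_interface: str, destination: str) -> bool:
    """Return True when the destination lies outside the host's own subnet.

    host_interface is the host's address with its prefix length,
    e.g. "192.0.2.10/25" (hypothetical addressing).
    """
    iface = ipaddress.ip_interface(host_interface)
    return ipaddress.ip_address(destination) not in iface.network


# Hypothetical addressing: H1 and H2 share 192.0.2.0/25, H3 is in 192.0.2.128/25.
H1 = "192.0.2.10/25"
print(needs_routing(H1, "192.0.2.20"))   # H2: same subnet, frame goes to H2's MAC
print(needs_routing(H1, "192.0.2.200"))  # H3: other subnet, frame goes to Router1's MAC
```

When the function returns True, the host would resolve the default router's MAC address via ARP rather than the destination's.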
Router1 then receives the frame over its lan1 interface. Router1 | Router1 then receives the frame over its lan1 interface. Router1 | |||
sees that the frame is addressed to it, so it removes the Ethernet | sees that the frame is addressed to it, so it removes the Ethernet | |||
encapsulation and processes the IP datagram. The datagram is not | encapsulation and processes the IP datagram. The datagram is not | |||
addressed to Router1, so it must be forwarded further. Router1 does | addressed to Router1, so it must be forwarded further. Router1 does | |||
a lookup of the datagram's IP destination field, and determines that | a lookup of the datagram's IP Destination Address field and | |||
the destination (H3) can be reached via Router1's lan2 interface. | determines that the destination (H3) can be reached via Router1's | |||
Router1 now performs the IP processing of the datagram: it decrements | lan2 interface. Router1 now performs the IP processing of the | |||
the IP TTL, adjusts the IP header checksum (if present), may fragment | datagram: it decrements the IP TTL, adjusts the IP header checksum | |||
the packet is necessary, etc. Then the datagram (or its fragments) | (if present), may fragment the packet as necessary, etc. Then, the | |||
are encapsulated in an Ethernet header, with Router1's MAC address on | datagram (or its fragments) is encapsulated in an Ethernet header, | |||
LAN2 as the MAC Source Address, and H3's MAC address on LAN2 (which | with Router1's MAC address on LAN2 as the MAC Source Address and H3's | |||
Router1 determines via ARP) as the MAC Destination Address. Finally | MAC address on LAN2 (which Router1 determines via ARP) as the | |||
the packet is sent on the lan2 interface. | Destination MAC Address. Finally, the packet is sent on the lan2 | |||
interface. | ||||
If H1 has an IP multicast datagram to send (i.e., an IP datagram | If H1 has an IP multicast datagram to send (i.e., an IP datagram | |||
whose Destination Address field is an IP Multicast Address), it | whose Destination Address field is an IP Multicast Address), it | |||
encapsulates it in an Ethernet frame whose MAC Destination Address is | encapsulates it in an Ethernet frame whose Destination MAC Address is | |||
computed from the IP Destination Address. | computed from the IP Destination Address. | |||
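The text does not spell out how the MAC address is computed. A minimal sketch, assuming the standard mappings (the low-order 23 bits of an IPv4 group address placed under the 01:00:5e prefix, and the low-order 32 bits of an IPv6 group address placed under the 33:33 prefix), is shown below; the function name multicast_mac is hypothetical.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Derive the Ethernet destination MAC for an IP multicast group address."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IP multicast address")
    if addr.version == 4:
        # 01:00:5e prefix followed by the low-order 23 bits of the group address.
        low23 = int(addr) & 0x7FFFFF
        octets = bytes([0x01, 0x00, 0x5E]) + low23.to_bytes(3, "big")
    else:
        # 33:33 prefix followed by the low-order 32 bits of the group address.
        octets = bytes([0x33, 0x33]) + addr.packed[-4:]
    return ":".join(f"{b:02x}" for b in octets)


print(multicast_mac("239.1.1.1"))
print(multicast_mac("ff02::1"))
```

Because only 23 bits of an IPv4 group address are carried over, several group addresses share one MAC address (for example, 239.1.1.1 and 224.129.1.1), so a receiver still has to check the IP Destination Address.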
If H2 is a receiver for that multicast address, H2 will receive a | If H2 is a receiver for that multicast address, H2 will receive a | |||
copy of the frame, unchanged, from H1. The MAC Source Address in the | copy of the frame, unchanged, from H1. The MAC Source Address in the | |||
Ethernet encapsulation does not change, the IP TTL field does not get | Ethernet encapsulation does not change, the IP TTL field does not get | |||
decremented, etc. | decremented, etc. | |||
If H3 is a receiver for that multicast address, the datagram must be | If H3 is a receiver for that multicast address, the datagram must be | |||
routed to H3. In order for this to happen, Router1 must be | routed to H3. In order for this to happen, Router1 must be | |||
configured as a multicast router, and it must accept traffic sent to | configured as a multicast router, and it must accept traffic sent to | |||
Ethernet multicast addresses. Router1 will receive H1's multicast | Ethernet multicast addresses. Router1 will receive H1's multicast | |||
frame on its lan1 interface, will remove the Ethernet encapsulation, | frame on its lan1 interface, remove the Ethernet encapsulation, and | |||
and will determine how to dispatch the IP datagram based on Router1's | determine how to dispatch the IP datagram based on Router1's | |||
multicast forwarding states. If Router1 knows that there is a | multicast forwarding states. If Router1 knows that there is a | |||
receiver for the multicast datagram on LAN2, it makes a copy of the | receiver for the multicast datagram on LAN2, it makes a copy of the | |||
datagram, decrements the TTL (and performs any other necessary IP | datagram, decrements the TTL (and performs any other necessary IP | |||
processing), then encapsulates the datagram in Ethernet frame for | processing), and then encapsulates the datagram in the Ethernet frame | |||
LAN2. The MAC Source Address for this frame will be Router1's MAC | for LAN2. The MAC Source Address for this frame will be Router1's | |||
Source Address on LAN2. The MAC Destination Address is computed from | MAC Source Address on LAN2. The Destination MAC Address is computed | |||
the IP Destination Address. Finally, the frame is sent on Router1's | from the IP Destination Address. Finally, the frame is sent on | |||
LAN2 interface. | Router1's LAN2 interface. | |||
Figure 2 shows an Integrated Router/Bridge that supports the routing/ | Figure 3 shows an integrated router/bridge that supports the routing/ | |||
bridging integration model of [RFC9135]. | bridging integration model of [RFC9135]. | |||
+------------------------------------------+ | +------------------------------------------+ | |||
| Integrated Router/Bridge | | | Integrated Router/Bridge | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | IRB1| L3 |IRB2 | | | | | IRB1| L3 |IRB2 | | | |||
H1 -----+ BD1 +--------+Routing +--------+ BD2 +------H3 | H1 -----+ BD1 +--------+Routing +--------+ BD2 +------H3 | |||
| | |Instance| | | | | | |Instance| | | | |||
H2 -----| | | | | | | H2 -----| | | | | | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
|___________________| |____________________| | |___________________| |____________________| | |||
LAN1 LAN2 | LAN1 LAN2 | |||
Figure 2: Integrated Router/Bridge | Figure 3: Integrated Router/Bridge | |||
In Figure 2, a single device consists of one or more "L3 Routing | In Figure 3, a single device consists of one or more L3 Routing | |||
Instances". The routing/forwarding table of a given routing | Instances. The routing/forwarding table of a given routing instance | |||
instance is known as an IP-VRF [RFC9135]. In the context of EVPN, it | is known as an IP-VRF [RFC9135]. In the context of EVPN, it is | |||
is convenient to think of each routing instance as representing the | convenient to think of each routing instance as representing the | |||
routing of a particular tenant. Each IP-VRF is attached to one or | routing of a particular tenant. Each IP-VRF is attached to one or | |||
more interfaces. | more interfaces. | |||
When several EVPN PEs have a routing instance of the same tenant | When several EVPN PEs have a routing instance of the same Tenant | |||
domain, those PEs advertise IP routes to the attached hosts. This is | Domain, those PEs advertise IP routes to the attached hosts. This is | |||
done as specified in [RFC9135]. | done as specified in [RFC9135]. | |||
The integrated router/bridge shown in Figure 2 also attaches to a | The integrated router/bridge shown in Figure 3 also attaches to a | |||
number of "Broadcast Domains" (BDs). Each BD performs the functions | number of Broadcast Domains (BDs). Each BD performs the functions | |||
that are performed by the bridges in Figure 1. To the L3 routing | that are performed by the bridges in Figure 2. To the L3 routing | |||
instance, each BD appears to be a LAN. The interface attaching a | instance, each BD appears to be a LAN. The interface attaching a | |||
particular BD to a particular IP-VRF is known as an "IRB Interface". | particular BD to a particular IP-VRF is known as an "IRB interface". | |||
From the perspective of L3 routing, each BD is a subnet. Thus, each | ||||
From the perspective of L3 routing, each BD is a subnet. Thus each | ||||
IRB interface is configured with a MAC address (which is the router's | IRB interface is configured with a MAC address (which is the router's | |||
MAC address on the corresponding LAN), as well as an IP address and | MAC address on the corresponding LAN), as well as an IP address and | |||
subnet mask. | subnet mask. | |||
The integrated router/bridge shown in Figure 2 may have multiple ACs | The integrated router/bridge shown in Figure 3 may have multiple ACs | |||
to each BD. These ACs are visible only to the bridging function, not | to each BD. These ACs are visible only to the bridging function, not | |||
to the routing instance. To the L3 routing instance, there is just | to the routing instance. To the L3 routing instance, there is just | |||
one "interface" to each BD. | one interface to each BD. | |||
If the L3 routing instance represents the IP routing of a particular | If the L3 routing instance represents the IP routing of a particular | |||
tenant, the BDs attached to that routing instance are BDs belonging | tenant, the BDs attached to that routing instance are BDs belonging | |||
to that same tenant. | to that same tenant. | |||
Bridging and routing now proceed exactly as in the case of Figure 1, | Bridging and routing now proceed exactly as in the case of Figure 2, | |||
except that BD1 replaces Switch1, BD2 replaces Switch2, interface | except that BD1 replaces Switch1, BD2 replaces Switch2, interface | |||
IRB1 replaces interface lan1, and interface IRB2 replaces interface | IRB1 replaces interface lan1, and interface IRB2 replaces interface | |||
lan2. | lan2. | |||
It is important to understand that an IRB interface connects an L3 | It is important to understand that an IRB interface connects an L3 | |||
routing instance to a BD, NOT to a "MAC-VRF". (See [RFC7432] for the | routing instance to a BD, NOT to a MAC-VRF (see [RFC7432] for the | |||
definition of "MAC-VRF".) A MAC-VRF may contain several BDs, as long | definition of MAC-VRF). A MAC-VRF may contain several BDs, as long | |||
as no MAC address appears in more than one BD. From the perspective | as no MAC address appears in more than one BD. From the perspective | |||
of the L3 routing instance, each individual BD is an individual IP | of the L3 routing instance, each individual BD is an individual IP | |||
subnet; whether each BD has its own MAC-VRF or not is irrelevant to | subnet; whether or not each BD has its own MAC-VRF is irrelevant to | |||
the L3 routing instance. | the L3 routing instance. | |||
Figure 3 illustrates IRB when a pair of BDs (subnets) are attached to | Figure 4 illustrates IRB when a pair of BDs (subnets) are attached to | |||
two different PE routers. In this example, each BD has two segments, | two different PE routers. In this example, each BD has two segments, | |||
and one segment of each BD is attached to one PE router. | and one segment of each BD is attached to one PE router. | |||
+------------------------------------------+ | +------------------------------------------+ | |||
| Integrated Router/Bridges | | | Integrated Router/Bridges | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | IRB1| |IRB2 | | | | | IRB1| |IRB2 | | | |||
H1 -----+ BD1 +--------+ PE1 +--------+ BD2 +------H3 | H1 -----+ BD1 +--------+ PE1 +--------+ BD2 +------H3 | |||
|(Seg-1)| |(L3 Rtg)| |(Seg-1)| | |(Seg-1)| |(L3 Rtg)| |(Seg-1)| | |||
skipping to change at page 77, line 25 ¶ | skipping to change at line 3542 ¶ | |||
LAN1 | LAN2 | LAN1 | LAN2 | |||
| | | | |||
| | | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
| | IRB1| |IRB2 | | | | | IRB1| |IRB2 | | | |||
H4 -----+ BD1 +--------+ PE2 +--------+ BD2 +------H5 | H4 -----+ BD1 +--------+ PE2 +--------+ BD2 +------H5 | |||
|(Seg-2)| |(L3 Rtg)| |(Seg-2)| | |(Seg-2)| |(L3 Rtg)| |(Seg-2)| | |||
| | | | | | | | | | | | | | |||
+-------+ +--------+ +-------+ | +-------+ +--------+ +-------+ | |||
Figure 3: Integrated Router/Bridges with Distributed Subnet | Figure 4: Integrated Router/Bridges with Distributed Subnet | |||
If H1 needs to send an IP packet to H4, it determines from its IP | If H1 needs to send an IP packet to H4, it determines from its IP | |||
address and subnet mask that H4 is on the same subnet as H1. | address and subnet mask that H4 is on the same subnet as H1. | |||
Although H1 and H4 are not attached to the same PE router, EVPN | Although H1 and H4 are not attached to the same PE router, EVPN | |||
provides Ethernet communication among all hosts that are on the same | provides Ethernet communication among all hosts that are on the same | |||
BD. H1 thus uses ARP to find H4's MAC address, and sends an Ethernet | BD. Thus, H1 uses ARP to find H4's MAC address and sends an Ethernet | |||
frame with H4's MAC address in the Destination MAC address field. | frame with H4's MAC address in the Destination MAC Address field. | |||
The frame is received at PE1, but since the Destination MAC address | The frame is received at PE1, but since the Destination MAC address | |||
is not PE1's MAC address, PE1 assumes that the frame is to remain on | is not PE1's MAC address, PE1 assumes that the frame is to remain on | |||
BD1. Therefore the packet inside the frame is NOT decapsulated, and | BD1. Therefore, the packet inside the frame is NOT decapsulated and | |||
is NOT send up the IRB interface to PE1's routing instance. Rather, | is NOT sent up the IRB interface to PE1's routing instance. Rather, | |||
standard EVPN intra-subnet procedures (as detailed in [RFC7432]) are | standard EVPN intra-subnet procedures (as detailed in [RFC7432]) are | |||
used to deliver the frame to PE2, which then sends it to H4. | used to deliver the frame to PE2, which then sends it to H4. | |||
If H1 needs to send an IP packet to H5, it determines from its IP | If H1 needs to send an IP packet to H5, it determines from its IP | |||
address and subnet mask that H5 is NOT on the same subnet as H1. | address and subnet mask that H5 is NOT on the same subnet as H1. | |||
Assuming that H1 has been configured with the IP address of PE1 as | Assuming that H1 has been configured with the IP address of PE1 as | |||
its default router, H1 sends the packet in an Ethernet frame with | its default router, H1 sends the packet in an Ethernet frame with | |||
PE1's MAC address in its Destination MAC Address field. PE1 receives | PE1's MAC address in its Destination MAC Address field. PE1 receives | |||
the frame, and sees that the frame is addressed to it. PE1 thus | the frame and sees that the frame is addressed to it. Thus, PE1 | |||
sends the frame up its IRB1 interface to the L3 routing instance. | sends the frame up its IRB1 interface to the L3 routing instance. | |||
Appropriate IP processing is done, e.g., TTL decrement. The L3 | Appropriate IP processing is done, e.g., TTL decrement. The L3 | |||
routing instance determines that the "next hop" for H5 is PE2, so the | routing instance determines that the next hop for H5 is PE2, so the | |||
packet is encapsulated (e.g., in MPLS) and sent across the backbone | packet is encapsulated (e.g., in MPLS) and sent across the backbone | |||
to PE2's routing instance. PE2 will see that the packet's | to PE2's routing instance. PE2 will see that the packet's | |||
destination, H5, is on BD2 segment-2, and will send the packet down | destination, H5, is on BD2 segment-2 and will send the packet down | |||
its IRB2 interface. This causes the IP packet to be encapsulated in | its IRB2 interface. This causes the IP packet to be encapsulated in | |||
an Ethernet frame with PE2's MAC address (on BD2) in the Source | an Ethernet frame with PE2's MAC address (on BD2) in the Source | |||
Address field and H5's MAC address in the Destination Address field. | Address field and H5's MAC address in the Destination Address field. | |||
Note that if H1 has an IP packet to send to H3, the forwarding of the | Note that if H1 has an IP packet to send to H3, the forwarding of the | |||
packet is handled entirely within PE1. PE1's routing instance sees | packet is handled entirely within PE1. PE1's routing instance sees | |||
the packet arrive on its IRB1 interface, and then transmits the | the packet arrive on its IRB1 interface and then transmits the packet | |||
packet by sending it down its IRB2 interface. | by sending it down its IRB2 interface. | |||
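The three cases above (H1 to H4, H1 to H5, and H1 to H3) reduce to one decision at PE1: a frame whose destination MAC is not PE1's IRB MAC stays on the BD, and any other frame is handed to the L3 routing instance. The sketch below is illustrative only; the MAC address, the IP addresses, the host-route table, and the function name pe1_forward are all hypothetical.

```python
# Hypothetical MAC address of PE1's IRB1 interface.
PE1_IRB1_MAC = "00:00:5e:00:53:01"

# Hypothetical host routes in PE1's IP-VRF: destination -> (kind, target).
HOST_ROUTES = {
    "198.51.100.3": ("local", "IRB2"),  # H3, attached to PE1 on BD2
    "198.51.100.5": ("remote", "PE2"),  # H5, reached across the backbone
}


def pe1_forward(dst_mac: str, dst_ip: str, ttl: int):
    """Return (action, target, ttl) for a frame received by PE1 on BD1."""
    if dst_mac.lower() != PE1_IRB1_MAC:
        # Not addressed to the router: bridge within BD1, IP header untouched.
        return ("bridge", "BD1", ttl)
    # Addressed to the router: send up IRB1, do IP processing, then route.
    kind, target = HOST_ROUTES[dst_ip]
    return ("route-" + kind, target, ttl - 1)


# H1 -> H4 (same BD): frame carries H4's MAC, so it is bridged.
print(pe1_forward("00:00:5e:00:53:44", "192.0.2.4", 64))
# H1 -> H5 (other subnet, behind PE2): routed across the backbone.
print(pe1_forward(PE1_IRB1_MAC, "198.51.100.5", 64))
# H1 -> H3 (other subnet, same PE): routed down IRB2 locally.
print(pe1_forward(PE1_IRB1_MAC, "198.51.100.3", 64))
```

Only the routed cases decrement the TTL, which matches the distinction the text draws between bridged and routed traffic.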
Often, all the hosts in a particular Tenant Domain will be | Often, all the hosts in a particular Tenant Domain will be | |||
provisioned with the same value of the default router IP address. | provisioned with the same value of the default router IP address. | |||
This IP address can be provisioned as an "anycast address" in all the | This IP address can be provisioned as an anycast address in all the | |||
EVPN PEs attached to that Tenant Domain. Thus although all hosts are | EVPN PEs attached to that Tenant Domain. Thus, although all hosts | |||
provisioned with the same "default router address", the actual | are provisioned with the same default router address, the actual | |||
default router for a given host will be one of the PEs attached to | default router for a given host will be one of the PEs attached to | |||
the same Ethernet segment as the host. This provisioning method | the same Ethernet segment as the host. This provisioning method | |||
ensures that IP packets from a given host are handled by the closest | ensures that IP packets from a given host are handled by the closest | |||
EVPN PE that supports IRB. | EVPN PE that supports IRB. | |||
In the topology of Figure 3, one could imagine that H1 is configured | In the topology of Figure 4, one could imagine that H1 is configured | |||
with a default router address that belongs to PE2 but not to PE1. | with a default router address that belongs to PE2 but not to PE1. | |||
Inter-subnet routing would still work, but IP packets from H1 to H3 | Inter-subnet routing would still work, but IP packets from H1 to H3 | |||
would then follow the non-optimal path H1-->PE1-->PE2-->PE1-->H3. | would then follow the non-optimal path H1-->PE1-->PE2-->PE1-->H3. | |||
Sending traffic on this sort of path, where it leaves a router and | Sending traffic on this sort of path, where it leaves a router and | |||
then comes back to the same router, is sometimes known as | then comes back to the same router, is sometimes known as | |||
"hairpinning". Similarly, if PE2 supports IRB but PE1 does not, the | "hairpinning". Similarly, if PE2 supports IRB but PE1 does not, the | |||
same non-optimal path from H1 to H3 would have to be followed. To | same non-optimal path from H1 to H3 would have to be followed. To | |||
avoid hairpinning, each EVPN PE needs to support IRB. | avoid hairpinning, each EVPN PE needs to support IRB. | |||
It is worth pointing out the way IRB interfaces interact with | It is worth pointing out the way IRB interfaces interact with | |||
multicast traffic. Referring again to Figure 3, suppose PE1 and PE2 | multicast traffic. Referring again to Figure 4, suppose PE1 and PE2 | |||
are functioning as IP multicast routers. Also Suppose that H3 | are functioning as IP multicast routers. Also, suppose that H3 | |||
transmits a multicast packet, and both H1 and H4 are interested in | transmits a multicast packet and both H1 and H4 are interested in | |||
receiving that packet. PE1 will receive the packet from H3 via its | receiving that packet. PE1 will receive the packet from H3 via its | |||
IRB2 interface. The Ethernet encapsulation from BD2 is removed, the | IRB2 interface. The Ethernet encapsulation from BD2 is removed, the | |||
IP header processing is done, and the packet is then reencapsulated | IP header processing is done, and the packet is then re-encapsulated | |||
for BD1, with PE1's MAC address in the MAC Source Address field. | for BD1, with PE1's MAC address in the MAC Source Address field. | |||
Then the packet is sent down the IRB1 interface. Layer 2 procedures | Then, the packet is sent down the IRB1 interface. Layer 2 procedures | |||
(as defined in [RFC7432] would then be used to deliver a copy of the | (as defined in [RFC7432]) would then be used to deliver a copy of the | |||
packet locally to H1, and remotely to H4. | packet locally to H1 and remotely to H4. | |||
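The IRB multicast handling described in the paragraph above can be sketched roughly as follows. This is an illustrative model only, not a procedure taken from the document: the class names, the function names, the MAC and IP addresses, and the assumption that "IP header processing" amounts to a TTL check and decrement are all hypothetical. The group-to-MAC mapping follows the usual IPv4 convention of copying the low-order 23 bits of the group address.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class IPPacket:
    src: str        # source host IP address (hypothetical value in the example)
    group: str      # IPv4 multicast group address
    ttl: int
    payload: bytes


@dataclass(frozen=True)
class EthernetFrame:
    bd: str         # broadcast domain the frame belongs to, e.g. "BD2"
    src_mac: str
    dst_mac: str
    packet: IPPacket


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet MAC (low-order 23 bits)."""
    octets = [int(x) for x in group.split(".")]
    return "01:00:5e:%02x:%02x:%02x" % (octets[1] & 0x7F, octets[2], octets[3])


def route_via_irb(frame: EthernetFrame, out_bd: str, pe_mac: str) -> Optional[EthernetFrame]:
    """Model a PE routing a multicast packet from one BD to another via IRB."""
    packet = frame.packet              # Ethernet encapsulation of the source BD is removed
    if packet.ttl <= 1:                # assumed IP header processing: drop on TTL expiry
        return None
    routed = replace(packet, ttl=packet.ttl - 1)
    # Re-encapsulate for the outgoing BD, with the PE's MAC as the MAC Source Address.
    return EthernetFrame(bd=out_bd, src_mac=pe_mac,
                         dst_mac=multicast_mac(routed.group), packet=routed)


# H3 (on BD2) sends to a group; PE1 routes the packet down its IRB1 interface into BD1.
PE1_MAC = "00:00:5e:00:53:01"          # hypothetical MAC address for PE1
h3_frame = EthernetFrame("BD2", "00:00:5e:00:53:03", multicast_mac("239.1.2.3"),
                         IPPacket("10.2.0.3", "239.1.2.3", 64, b"data"))
bd1_frame = route_via_irb(h3_frame, "BD1", PE1_MAC)
```

In this sketch, `bd1_frame` is what Layer 2 procedures would then deliver locally to H1 and remotely to H4; the receivers see PE1's MAC address, not H3's, as the frame source.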
Please be aware that his document modifies the semantics, described | Please be aware that this document modifies the semantics, described | |||
in the previous paragraph, of sending/receiving multicast traffic on | in the previous paragraph, of sending/receiving multicast traffic on | |||
an IRB interface. This is explained in Section 1.5.1 and subsequent | an IRB interface. This is explained in Section 1.5.1 and subsequent | |||
sections. | sections. | |||
Acknowledgements | ||||
The authors thank Vikram Nagarajan and Princy Elizabeth for their | ||||
work on Sections 6.2 and 3.2.3.1. The authors also benefited | ||||
tremendously from discussions with Aldrin Isaac on EVPN multicast | ||||
optimizations. | ||||
Authors' Addresses | Authors' Addresses | |||
Wen Lin | Wen Lin | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
10 Technology Park Drive | 10 Technology Park Drive | |||
Westford, Massachusetts 01886 | Westford, MA 01886 | |||
United States | United States of America | |||
Email: wlin@juniper.net | Email: wlin@juniper.net | |||
Zhaohui Zhang | Zhaohui Zhang | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
10 Technology Park Drive | 10 Technology Park Drive | |||
Westford, Massachusetts 01886 | Westford, MA 01886 | |||
United States | United States of America | |||
Email: zzhang@juniper.net | Email: zzhang@juniper.net | |||
John Drake | John Drake | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
1194 N. Mathilda Ave | 1194 N. Mathilda Ave | |||
Sunnyvale, CA 94089 | Sunnyvale, CA 94089 | |||
United States | United States of America | |||
Email: jdrake@juniper.net | Email: jdrake@juniper.net | |||
Eric C. Rosen (editor) | Eric C. Rosen (editor) | |||
Juniper Networks, Inc. | Juniper Networks, Inc. | |||
10 Technology Park Drive | 10 Technology Park Drive | |||
Westford, Massachusetts 01886 | Westford, MA 01886 | |||
United States | United States of America | |||
Email: erosen52@gmail.com | Email: erosen52@gmail.com | |||
Jorge Rabadan | Jorge Rabadan | |||
Nokia | Nokia | |||
777 E. Middlefield Road | 777 E. Middlefield Road | |||
Mountain View, CA 94043 | Mountain View, CA 94043 | |||
United States | United States of America | |||
Email: jorge.rabadan@nokia.com | Email: jorge.rabadan@nokia.com | |||
Ali Sajassi | Ali Sajassi | |||
Cisco Systems | Cisco Systems | |||
170 West Tasman Drive | 170 West Tasman Drive | |||
San Jose, CA 95134 | San Jose, CA 95134 | |||
United States | United States of America | |||
Email: sajassi@cisco.com | Email: sajassi@cisco.com | |||
End of changes. 552 change blocks. | ||||
1362 lines changed or deleted | 1341 lines changed or added | |||