Internet Engineering Task Force (IETF)                     D. Voyer, Ed.
Request for Comments: 9524                                   Bell Canada
Category: Standards Track                                    C. Filsfils
ISSN: 2070-1721                                                R. Parekh
                                                     Cisco Systems, Inc.
                                                              H. Bidgoli
                                                                   Nokia
                                                                Z. Zhang
                                                        Juniper Networks
                                                           February 2024

      Segment Routing Replication for Multipoint Service Delivery

Abstract

This document describes the Segment Routing Replication segment for multipoint service delivery. A Replication segment allows a packet to be replicated from a replication node to downstream nodes.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9524.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
  1.1.  Terminology
  1.2.  Use Cases
2.  Replication Segment
  2.1.  SR-MPLS Data Plane
  2.2.  SRv6 Data Plane
    2.2.1.  End.Replicate: Replicate and/or Decapsulate
    2.2.2.  OAM Operations
    2.2.3.  ICMPv6 Error Messages
3.  IANA Considerations
4.  Security Considerations
5.  References
  5.1.  Normative References
  5.2.  Informative References
Appendix A.  Illustration of a Replication Segment
  A.1.  SR-MPLS
  A.2.  SRv6
    A.2.1.  Pinging a Replication-SID
Acknowledgements
Contributors
Authors' Addresses

1.  Introduction

The Replication segment is a new type of segment for Segment Routing (SR) [RFC8402], which allows a node (henceforth called a "replication node") to replicate packets to a set of other nodes (called "downstream nodes") in an SR domain. A Replication segment can replicate packets to directly connected nodes or to downstream nodes (without the need for state on the transit routers). This document focuses on specifying the behavior of a Replication segment for both Segment Routing with Multiprotocol Label Switching (SR-MPLS) [RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The examples in Appendix A illustrate the behavior of a Replication segment in an SR domain.

The use of two or more Replication segments stitched together to form a tree using a control plane is left to be specified in other documents. The management of IP multicast groups, building IP multicast trees, and performing multicast congestion control are out of scope of this document.

1.1.  Terminology

This section defines terms introduced and used frequently in this document. Refer to the Terminology sections of [RFC8402], [RFC8754], and [RFC8986] for other terms used in SR.

Replication segment:  A segment in an SR domain that replicates packets. See Section 2 for details.

Replication node:  A node in an SR domain that replicates packets based on a Replication segment.

Downstream nodes:  A Replication segment replicates packets to a set of nodes. These nodes are downstream nodes.

Replication state:  State held for a Replication segment at a replication node. It is conceptually a list of Replication branches to downstream nodes.
The list can be empty.

Replication-SID:  Data plane identifier of a Replication segment. This is an SR-MPLS label or SRv6 Segment Identifier (SID).

SRH:  IPv6 Segment Routing Header [RFC8754].

Point-to-Multipoint (P2MP) Service:  A service that has one ingress node and one or more egress nodes. A packet is delivered to all the egress nodes.

Root node:  An ingress node of a P2MP service.

Leaf node:  An egress node of a P2MP service.

Bud node:  A node that is both a replication node and a leaf node.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

1.2.  Use Cases

In the simplest use case, a single Replication segment includes the ingress node of a multipoint service and the egress nodes of the service as all the downstream nodes. This achieves Ingress Replication [RFC7988] that has been widely used for Multicast VPN (MVPN) [RFC6513] and Ethernet VPN (EVPN) [RFC7432] bridging of Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This Replication segment on ingress and egress nodes can either be provisioned locally or using dynamic autodiscovery procedures for MVPN and EVPN. Note SRv6 [RFC8986] has End.DT2M replication behavior for EVPN BUM traffic.

Replication segments can also be used to form trees by stitching Replication segments on a root node, intermediate replication nodes, and leaf nodes for efficient delivery of MVPN and EVPN BUM traffic.

2.  Replication Segment

In an SR domain, a Replication segment is a logical construct that connects a replication node to a set of downstream nodes. A Replication segment is a local segment instantiated at a Replication node. It can be either provisioned locally on a node or programmed by a control plane.

Replication segments can be stitched together to form a tree by either local provisioning on nodes or using a control plane. The procedures for doing this are out of scope of this document. One such control plane using a PCE with the SR P2MP policy is specified in [P2MP-POLICY]. However, if local provisioning is used to stitch Replication segments, then a chain of Replication segments SHOULD NOT form a loop. If a control plane is used to stitch Replication segments, the control plane specification MUST prevent loops or detect and mitigate loops in steady state.

A Replication segment is identified by the tuple <Replication-ID, Node-ID>, where:

Replication-ID:  An identifier for a Replication segment that is unique in context of the replication node.

Node-ID:  The address of the replication node for the Replication segment. Note that the root of a multipoint service is also a Replication node.

Replication-ID is a variable-length field. In the simplest case, it can be a 32-bit number, but it can be extended or modified as required based on the specific use of a Replication segment. This is out of scope for this document. The length of the Replication-ID is specified in the signaling mechanism used for the Replication segment. Examples of such signaling and extensions are described in [P2MP-POLICY]. When the PCE signals a Replication segment to its node, the <Replication-ID, Node-ID> tuple identifies the segment.

A Replication segment includes the following elements:

Replication-SID:  The Segment Identifier of a Replication segment. This is an SR-MPLS label or an SRv6 SID [RFC8402].

Downstream nodes:  Set of nodes in an SR domain to which a packet is replicated by the Replication segment.

Replication state:  See below.
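As an informal illustration only, the identifying tuple and the elements listed above can be modeled as a small data structure. This is a minimal sketch: the class and field names are hypothetical, SIDs are held as plain text, and the Replication-ID is assumed to be an integer (the simplest case mentioned above). It is not part of the specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReplicationBranch:
    """One entry of the Replication state (hypothetical field names)."""
    downstream_node: str             # reachability toward the downstream node
    downstream_replication_sid: str  # Replication-SID at the downstream node


@dataclass
class ReplicationSegment:
    replication_id: int              # unique in the context of the replication node
    node_id: str                     # address of the replication node
    replication_sid: str             # SR-MPLS label or SRv6 SID, kept as text here
    replication_state: List[ReplicationBranch] = field(default_factory=list)

    def identifier(self) -> Tuple[int, str]:
        """Return the <Replication-ID, Node-ID> tuple that identifies the segment."""
        return (self.replication_id, self.node_id)

    def downstream_nodes(self) -> List[str]:
        """Return the downstream nodes named by the Replication state."""
        return [branch.downstream_node for branch in self.replication_state]


# Example values are invented for illustration.
segment = ReplicationSegment(
    replication_id=1,
    node_id="R1",
    replication_sid="R-SID1",
    replication_state=[
        ReplicationBranch("R2", "R-SID2"),
        ReplicationBranch("R6", "R-SID6"),
    ],
)
print(segment.identifier(), segment.downstream_nodes())
```

A segment created without any branches has an empty Replication state, which matches the statement in Section 1.1 that the list can be empty.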
The downstream nodes and Replication state (RS) of a Replication segment can change over time, depending on the network state and leaf nodes of a multipoint service that the segment is part of.

The Replication-SID identifies the Replication segment in the forwarding plane. At a replication node, the Replication-SID operates on the RS of the Replication segment.

RS is a list of Replication branches to the downstream nodes. In this document, each branch is abstracted to a <downstream node, downstream Replication-SID> tuple. <downstream node> represents the reachability from the replication node to the downstream node. In its simplest form, this MAY be specified as an interface or next-hop if the downstream node is adjacent to the replication node. The reachability may be specified in terms of a Flexible Algorithm path (including the default algorithm) [RFC9350] or specified by an SR-explicit path represented either by a SID list (of one or more SIDs) or by a Segment Routing Policy [RFC9256]. The downstream Replication-SID is the Replication-SID of the Replication segment at the downstream node.

A packet is steered into a Replication segment at a replication node in two ways:

*  When the active segment [RFC8402] is a locally instantiated Replication-SID.

*  By the root of a multipoint service based on local configuration that is outside the scope of this document.

In either case, the packet is replicated to each downstream node in the associated RS.

If a downstream node is an egress (leaf) of the multipoint service, no further replication is needed. The leaf node's Replication segment has an indicator for the leaf role, and it does not have any RS (i.e., the list of Replication branches is empty). The Replication-SID at a leaf node MAY be used to identify the multipoint service. Notice that the segment on the leaf node is still referred to as a "Replication segment" for the purpose of generalization.

A node can be a bud node (i.e., it is a replication node and a leaf node of a multipoint service [P2MP-POLICY]). The Replication segment of a bud node has a list of Replication branches as well as a leaf role indicator.

In principle, it is possible for different Replication segments to replicate packets to the same Replication segment on a downstream node. However, such usage is intentionally left out of scope of this document.

2.1.  SR-MPLS Data Plane

When the active segment is a Replication-SID, the processing results in a POP [RFC8402] operation and the lookup of the associated RS. For each replication in the RS, the operation is a PUSH [RFC8402] of the downstream Replication-SID and an optional segment list onto the packet to steer the packet to the downstream node.

The operation performed on the incoming Replication-SID is NEXT [RFC8402] at a leaf or bud node where delivery of payload off the tree is per local configuration. For some usages, this may involve looking at the next SID, for example, to get the necessary context.

When the root of a multipoint service steers a packet to a Replication segment, it results in a replication to each downstream node in the associated RS. The operation is a PUSH of the Replication-SID and an optional segment list onto the packet, which is forwarded to the downstream node.
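The POP, PUSH, and NEXT operations described above can be sketched as follows. This is a simplified model under stated assumptions, not an implementation of a forwarding plane: a label stack is a Python list with the top label at index 0, the names Branch, State, and process are hypothetical, the label values are invented, and a packet whose top label has no local state is simply dropped.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Branch(NamedTuple):
    segment_list: Tuple[int, ...]   # optional labels steering the copy; may be empty
    replication_sid: int            # downstream Replication-SID label


class State(NamedTuple):
    branches: Tuple[Branch, ...]    # empty on a leaf node
    is_leaf: bool                   # leaf role indicator (also set on a bud node)


def process(stack: List[int],
            table: Dict[int, State]) -> Tuple[List[List[int]], Optional[List[int]]]:
    """Handle a packet whose top label (index 0) is a local Replication-SID.

    Returns the label stacks of the replicated copies and, on a leaf or bud
    node, the remaining stack handed to local delivery (otherwise None).
    """
    state = table.get(stack[0])
    if state is None:
        return [], None                      # assumption: unknown label is dropped
    rest = stack[1:]                         # POP the incoming Replication-SID
    copies = [list(b.segment_list) + [b.replication_sid] + rest  # PUSH per branch
              for b in state.branches]
    local = rest if state.is_leaf else None  # NEXT at a leaf or bud node
    return copies, local


# Hypothetical labels: 100 is a Replication-SID on a transit node and 300 is a
# Replication-SID on a leaf node; 999 stands for a context label after it.
table = {
    100: State((Branch((), 200), Branch((16004, 24047), 300)), is_leaf=False),
    300: State((), is_leaf=True),
}
print(process([100, 999], table))
print(process([300, 999], table))
```

Labels that follow the Replication-SID are carried unchanged in every copy, so a leaf or bud node can still use them as context when it delivers the payload off the tree.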
The following applies to a Replication-SID in MPLS encapsulation:

*  SIDs MAY be inserted before the downstream SR-MPLS Replication-SID in order to guide a packet from a non-adjacent SR node to a replication node.

*  A replication node MAY replicate a packet to a non-adjacent downstream node using SIDs it inserts in the copy preceding the downstream Replication-SID. The downstream node may be a leaf node of the Replication segment, another replication node, or both in the case of a bud node.

*  A replication node MAY use an Anycast-SID or a Border Gateway Protocol (BGP) PeerSet-SID in the segment list to send a replicated packet to one downstream replication node in a set of Anycast nodes. This occurs if and only if all nodes in the set have an identical Replication-SID and reach the same set of receivers.

*  For some use cases, there MAY be SIDs after the Replication-SID in the segment list of a packet. These SIDs are used only by the leaf and bud nodes to forward a packet off the tree independent of the Replication-SID. Coordination regarding the absence or presence and value of context information for leaf and bud nodes is outside the scope of this document.

2.2.  SRv6 Data Plane

For SRv6 [RFC8986], this document specifies "Endpoint with replication and/or decapsulate" behavior (End.Replicate for short) to replicate a packet and forward the replicas according to an RS.

When processing a packet destined to a local Replication-SID, the packet is replicated according to the associated RS to downstream nodes and/or locally delivered off the tree when this is a leaf or bud node.

For replication, the outer header is reused, and the downstream Replication-SID, from RS, is written into the outer IPv6 header Destination Address (DA). If required, an optional segment list may be used on some branches using H.Encaps.Red [RFC8986] (while some other branches may not need that). Note that this H.Encaps.Red is independent of the Replication segment: it is just used to steer the replicated packet on a traffic-engineered path to a downstream node. The penultimate segment in the encapsulating IPv6 header will execute the Ultimate Segment Decapsulation (USD) flavor [RFC8986] of End/End.X behavior and forward the inner (replicated) packet to the downstream node. If H.Encaps.Red is used to steer a replicated packet to a downstream node, the operator must ensure the MTU on path to the downstream node is sufficient to account for additional SRv6 encapsulation. This also applies when the Replication segment is for the root node, whose upstream node has placed the Replication-SID in the header.

A local application on root (e.g., MVPN [RFC6513] or EVPN [RFC7432]) may also apply H.Encaps.Red and then steer the resulting traffic into the Replication segment. Again, note that H.Encaps.Red is independent of the Replication segment: it is the action of the application (e.g., MVPN or EVPN service). If the service is on a root node, then the two H.Encaps mentioned, one for the service and the other in the previous paragraph for replication to the downstream node, SHOULD be combined for optimization (to avoid extra IPv6 encapsulation).

When processing a packet destined to a local Replication-SID, the IPv6 Hop Limit MUST be decremented and MUST be non-zero to replicate the packet. A root node that encapsulates a payload can set the IPv6 Hop Limit based on a local policy.
This local policy SHOULD set the IPv6 Hop Limit so that a replicated packet can reach the furthest leaf node. A root node can also have a local policy to set the IPv6 Hop Limit from the payload. In this case, the IPv6 Hop Limit may not be sufficient to get the replicated packet to all the leaf nodes. Non-replication nodes (i.e., nodes that forward replicated packets based on the IPv6 locator unicast prefix) can decrement the IPv6 Hop Limit to zero and originate ICMPv6 error packets to the root node. This can result in a storm of ICMPv6 packets (see Section 2.2.3) to the root node. To avoid this, a Replication segment has an optional IPv6 Hop Limit Threshold. If this threshold is set, a replication node MUST discard an incoming packet with a local Replication-SID if the IPv6 Hop Limit in the packet is less than the threshold and log this in a rate-limited manner. The IPv6 Hop Limit Threshold SHOULD be set so that an incoming packet can be replicated to the furthest leaf node.

For leaf and bud nodes, local delivery off the tree is per Replication-SID or the next SID (if present in the SRH). For some usages, this may involve getting the necessary context either from the next SID (e.g., MVPN with a shared tree) or from the Replication-SID itself (e.g., MVPN with a non-shared tree). In both cases, the context association is achieved with signaling and is out of scope of this document.

The following applies to a Replication-SID in SRv6 encapsulation:

*  There MAY be SIDs preceding the SRv6 Replication-SID in order to guide a packet from a non-adjacent SR node to a replication node via an explicit path.

*  A replication node MAY steer a replicated packet on an explicit path to a non-adjacent downstream node using SIDs it inserts in the copy preceding the downstream Replication-SID.
   The downstream node may be a leaf node of the Replication segment, another replication node, or both in the case of a bud node.

*  For SRv6, as described in above paragraphs, the insertion of SIDs prior to the Replication-SID entails a new IPv6 encapsulation with the SRH. However, this can be optimized on the root node or for compressed SRv6 SIDs.

*  The locator of the Replication-SID is sufficient to guide a packet on the shortest path between non-adjacent nodes for default or Flexible Algorithms.

*  A replication node MAY use an Anycast-SID or a BGP PeerSet-SID in the segment list to send a replicated packet to one downstream replication node in an Anycast set. This occurs if and only if all nodes in the set have an identical Replication-SID and reach the same set of receivers.

*  There MAY be SIDs after the Replication-SID in the SRH of a packet. These SIDs are used to provide additional context for processing a packet locally at the node where the Replication-SID is the active segment. Coordination regarding the absence or presence and value of context information for leaf and bud nodes is outside the scope of this document.

2.2.1.  End.Replicate: Replicate and/or Decapsulate

The "Endpoint with replication and/or decapsulate" (End.Replicate for short) is a variant of End behavior. The pseudocode in this section follows the convention introduced in [RFC8986].

An RS conceptually contains the following elements:

   Replication state:
   {
     Node-Role: {Head, Transit, Leaf, Bud};
     IPv6 Hop Limit Threshold; # default is zero
     # On Leaf, replication list is zero length
     Replication-List:
     {
       downstream node: <Node-Identifier>;
       downstream Replication-SID: R-SID;
       # Segment-List may be empty
       Segment-List: [SID-1, .... SID-N];
     }
   }

Below is the Replicate function on a packet for Replication state (RS).

   S01. Replicate(RS, packet)
   S02. {
   S03.    For each Replication R in RS.Replication-List {
   S04.       Make a copy of the packet
   S05.       Set IPv6 DA = R.R-SID
   S06.       If R.Segment-List is not empty {
   S07.          # Head node may optimize below encapsulation and
   S08.          # the encapsulation of packet in a single encapsulation
   S09.          Execute H.Encaps or H.Encaps.Red with R.Segment-List
                 on packet copy # RFC 8986, Sections 5.1 and 5.2
   S10.       }
   S11.       Submit the packet to the egress IPv6 FIB lookup and
              transmission to the new destination
   S12.    }
   S13. }

Notes:

*  The IPv6 DA in the copy of a packet is set from the local state and not from the SRH.

*  In S05, S06, and S09, R-SID and Segment-List are the elements of the branch R selected in S03, as defined in the Replication-List above.

When N receives a packet whose IPv6 DA is S and S is a local End.Replicate SID, N does:

   S01. Lookup FUNCT portion of S to get Replication state (RS)
   S02. If (IPv6 Hop Limit <= 1) {
   S03.    Discard the packet
   S04.    # ICMPv6 Time Exceeded is not permitted (see Section 2.2.3)
   S05. }
   S06. If RS is not found {
   S07.    Discard the packet
   S08. }
   S09. If (IPv6 Hop Limit < RS.IPv6 Hop Limit Threshold) {
   S10.    Discard the packet
   S11.    # Rate-limited logging
   S12. }
   S13. Decrement IPv6 Hop Limit by 1
   S14. If (IPv6 NH == SRH and SRH TLVs present) {
   S15.    Process SRH TLVs if allowed by local configuration
   S16. }
   S17. Call Replicate(RS, packet)
   S18. If (RS.Node-Role == Leaf OR RS.Node-Role == Bud) {
   S19.    If (IPv6 NH == SRH and Segments Left > 0) {
   S20.       Derive packet processing context (PPC) from Segment List
   S21.       If (Segments Left != 0) {
   S22.          Discard the packet
   S23.          # ICMPv6 Parameter Problem message with Code 0
   S24.          # (Erroneous header field encountered)
   S25.          # is not permitted (Section 2.2.3)
   S26.       }
   S27.    } Else {
   S28.       Derive packet processing context (PPC)
              from FUNCT of Replication-SID
   S29.    }
   S30.    Process the next header
   S31. }

The processing of the Upper-Layer header of a packet matching the End.Replicate SID at a leaf or bud node is as follows:

   S01. If (Upper-Layer header type == 4(IPv4) OR
            Upper-Layer header type == 41(IPv6) ) {
   S02.    Remove the outer IPv6 header with all its extension headers
   S03.    Process the packet in context of PPC
   S04. } Else If (Upper-Layer header type == 143(Ethernet) ) {
   S05.    Remove the outer IPv6 header with all its extension headers
   S06.    Process the Ethernet Frame in context of PPC
   S07. } Else If (Upper-Layer header type is allowed by
                   local configuration) {
   S08.    Proceed to process the Upper-Layer header
   S09. } Else {
   S10.    Discard the packet
   S11.    # ICMPv6 Parameter Problem message with Code 4
   S12.    # (SR Upper-Layer header Error)
   S13.    # is not permitted (Section 2.2.3)
   S14. }

Notes:

*  The behavior above MAY result in a packet with a partially processed segment list in the SRH under some circumstances. For example, a head node may encode a context-SID in an SRH. As per the pseudocode above, a replication node that receives a packet with a local Replication-SID will not process the SRH segment list and will just forward a copy with an unmodified SRH to downstream nodes.

*  The packet processing context is usually a FIB table "T".

If configured to process TLVs, processing the Replication-SID may modify the "variable-length data" of TLV types that change en route. Therefore, TLVs that change en route are mutable. The remainder of the SRH (Segments Left, Flags, Tag, Segment List, and TLVs that do not change en route) are immutable while processing this SID.

2.2.1.1.  Hashed Message Authentication Code (HMAC) SRH TLV

If a root node encodes a context-SID in an SRH with an optional HMAC SRH TLV [RFC8754], it MUST set the 'D' bit as defined in Section 2.1.2 of [RFC8754] because the Replication-SID is not part of the segment list in the SRH.
HMAC generation and verification is as specified in [RFC8754]. Verification of an HMAC TLV is determined by local configuration. If verification fails, an implementation of a Replication-SID MUST NOT originate an ICMPv6 Parameter Problem message with Code 0. The failure SHOULD be logged (rate-limited) and the packet SHOULD be discarded.

2.2.2.  OAM Operations

[RFC9259] specifies procedures for Operations, Administration, and Maintenance (OAM) like ping and traceroute on SRv6 SIDs.

Assuming the source node knows the Replication-SID a priori, it is possible to ping a Replication-SID of a leaf or bud node directly by putting it in the IPv6 DA without an SRH or in an SRH as the last segment. While it is not possible to ping a Replication-SID of a transit node because transit nodes do not process Upper-Layer headers, it is still possible to ping a Replication-SID of a leaf or bud node of a tree via the Replication-SID of intermediate transit nodes. The source of the ping MUST compute the ICMPv6 Echo Request checksum using the Replication-SID of the leaf or bud node as the DA. The source can then send the Echo Request packet to a transit node's Replication-SID. The transit node replicates the packet by replacing the IPv6 DA until the packet reaches the leaf or bud node, which responds with an ICMPv6 Echo Reply. Note that a transit replication node may replicate Echo Request packets to other leaf or bud nodes. These nodes will drop the Echo Request due to an incorrect checksum. Procedures to prevent the misdelivery of an Echo Request may be addressed in a future document. Appendix A.2.1 illustrates examples of a ping to a Replication-SID.
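The checksum requirement above can be illustrated with a short sketch. The ICMPv6 checksum covers a pseudo-header that contains the source address, the DA, the upper-layer length, and the Next Header value 58 [RFC4443], so a source that builds the Echo Request with the leaf or bud node's Replication-SID as the DA produces a message that verifies only at that node. The function names and the addresses below are hypothetical (the addresses are taken from the 2001:db8::/32 documentation prefix); only the pseudo-header layout and the Echo Request format come from the ICMPv6 specification.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58
ECHO_REQUEST = 128


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, length: int) -> bytes:
    """IPv6 pseudo-header: source, destination, length, 3 zero bytes, Next Header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", length, ICMPV6_NEXT_HEADER))


def echo_request(src: str, leaf_replication_sid: str, ident: int, seq: int,
                 payload: bytes = b"") -> bytes:
    """Build an Echo Request whose checksum uses the leaf's Replication-SID as DA."""
    body = struct.pack("!BBHHH", ECHO_REQUEST, 0, 0, ident, seq) + payload
    total = ones_complement_sum(pseudo_header(src, leaf_replication_sid, len(body)) + body)
    checksum = ~total & 0xFFFF
    return struct.pack("!BBHHH", ECHO_REQUEST, 0, checksum, ident, seq) + payload


def verifies(src: str, dst: str, message: bytes) -> bool:
    """Receiver-side check: the sum over pseudo-header and message must be 0xFFFF."""
    return ones_complement_sum(pseudo_header(src, dst, len(message)) + message) == 0xFFFF


# Hypothetical addresses for the source, a leaf node SID, and a transit node SID.
source = "2001:db8::1"
leaf_sid = "2001:db8:cccc:7:fa::"
transit_sid = "2001:db8:cccc:2:fa::"
message = echo_request(source, leaf_sid, ident=1, seq=1, payload=b"ping")
print(verifies(source, leaf_sid, message), verifies(source, transit_sid, message))
```

In this sketch the message verifies against the leaf SID but not against the transit SID, which is consistent with the text above: other leaf or bud nodes that receive a replicated copy see a different DA and drop the Echo Request because of the checksum mismatch.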
Traceroute to a leaf or bud node Replication-SID is not possible due to restrictions prohibiting the origination of the ICMPv6 Time Exceeded error message for a Replication-SID as described in Section 2.2.3.

2.2.3.  ICMPv6 Error Messages

Section 2.4 of [RFC4443] states an ICMPv6 error message MUST NOT be originated as a result of receiving a packet destined to an IPv6 multicast address. This is to prevent a source node from being overwhelmed by a storm of ICMPv6 error messages resulting from replicated IPv6 packets. There are two exceptions:

1.  The Packet Too Big message for Path MTU discovery, and

2.  The ICMPv6 Parameter Problem message with Code 2 reporting an unrecognized IPv6 option.

An implementation of a Replication segment for SRv6 MUST enforce these same restrictions and exceptions.

3.  IANA Considerations

IANA has assigned the following codepoint for End.Replicate behavior in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" registry group.

   +=======+========+===================+===========+============+
   | Value | Hex    | Endpoint Behavior | Reference | Change     |
   |       |        |                   |           | Controller |
   +=======+========+===================+===========+============+
   | 75    | 0x004B | End.Replicate     | RFC 9524  | IETF       |
   +-------+--------+-------------------+-----------+------------+

                  Table 1: SRv6 Endpoint Behavior

4.  Security Considerations

The SID behaviors defined in this document are deployed within an SR domain [RFC8402]. An SR domain needs protection from outside attackers (as described in [RFC8754]). The following is a brief reminder of the same:

*  For SR-MPLS deployments:

   -  Disable MPLS on external interfaces of each edge node or any other technique to filter labeled traffic ingress on these interfaces.

*  For SRv6 deployments:

   -  Allocate all the SIDs from an IPv6 prefix block S/s and configure each external interface of each edge node of the domain with an inbound Infrastructure Access Control List (IACL) that drops any incoming packet with a DA in S/s.

   -  Additionally, an IACL may be applied to all nodes (k) provisioning SIDs as defined in this specification:

      o  Assign all interface addresses from within IPv6 prefix A/a. At node k, all SIDs local to k are assigned from prefix Sk/sk. Configure each internal interface of each SR node k in the SR domain with an inbound IACL that drops any incoming packet with a DA in Sk/sk if the source address is not in A/a.

   -  Deny traffic with spoofed source addresses by implementing recommendations in BCP 84 [RFC3704].

   -  Additionally, the block S/s from which SIDs are allocated may be an address that is not globally routable such as a Unique Local Address (ULA) or the prefix defined in [SIDS-SRv6].

Failure to protect the SR-MPLS domain by correctly provisioning MPLS support per interface permits attackers from outside the domain to send packets that use the replication services provisioned within the domain.

Failure to protect the SRv6 domain with IACLs on external interfaces combined with failure to implement the recommendations of BCP 38 [RFC2827] or apply IACLs on nodes provisioning SIDs permits attackers from outside the SR domain to send packets that use the replication services provisioned within the domain.

Given the definition of the Replication segment in this document, an attacker subverting the ingress filters above cannot take advantage of a stack of Replication segments to perform amplification attacks nor link exhaustion attacks. Replication segment trees always terminate at a leaf or bud node resulting in a decapsulation. However, this does allow an attacker to inject traffic to the receivers within a P2MP service.

This document introduces an SR segment endpoint behavior that replicates and decapsulates an inner payload for both the MPLS and IPv6 data planes. Similar to any MPLS end-of-stack label, or SRv6 END.D* behavior, if the protections described above are not implemented, an attacker can perform an attack via the decapsulating segment (including the one described in this document).

Incorrect provisioning of Replication segments can result in a chain of Replication segments forming a loop. This can happen if Replication segments are provisioned on SR nodes without using a control plane. In this case, replicated packets can create a storm until MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements to zero. A control plane such as PCE can be used to prevent loops. The control plane protocols (like Path Computation Element Communication Protocol (PCEP), BGP, etc.) used to instantiate Replication segments can leverage their own security mechanisms such as encryption, authentication filtering, etc.

For SRv6, Section 2.2.3 describes an exception for the ICMPv6 Parameter Problem message with Code 2. If an attacker sends a packet destined to a Replication-SID with the source address of a node and with an extension header using the unknown option type marked as mandatory, then a large number of ICMPv6 Parameter Problem messages can cause a denial-of-service attack on the source node. Although this document does not specify any extension headers, any future extension of this document that does so is susceptible to this security concern.

If an attacker can forge an IPv6 packet with:

*  the source address of a node,

*  a Replication-SID as the DA, and

*  an IPv6 Hop Limit such that nodes that forward replicated packets on an IPv6 locator unicast prefix, decrement the Hop Limit to zero,

then these nodes can cause a storm of ICMPv6 error packets to overwhelm the source node under attack.
The IPv6 Hop Limit Threshold check described in Section 2.2 can help mitigate such attacks. 5. References 5.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>. [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", STD 89, RFC 4443, DOI 10.17487/RFC4443, March 2006, <https://www.rfc-editor.org/info/rfc4443>. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>. [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, July 2018, <https://www.rfc-editor.org/info/rfc8402>. [RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020, <https://www.rfc-editor.org/info/rfc8754>. [RFC8986] Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 (SRv6) Network Programming", RFC 8986, DOI 10.17487/RFC8986, February 2021, <https://www.rfc-editor.org/info/rfc8986>. [RFC9259] Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. Chen, "Operations, Administration, and Maintenance (OAM) in Segment Routing over IPv6 (SRv6)", RFC 9259, DOI 10.17487/RFC9259, June 2022, <https://www.rfc-editor.org/info/rfc9259>. 5.2. Informative References [P2MP-POLICY] Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H., and Z. J. Zhang, "Segment Routing Point-to-Multipoint Policy", Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp- policy-07, 11 October 2023, <https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr- p2mp-policy-07>. 
[PGM-ILLUSTRATION] Filsfils, C., Camarillo, P., Ed., Li, Z., Matsushima, S., Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and J. Leddy, "Illustrations for SRv6 Network Programming", Work in Progress, Internet-Draft, draft-filsfils-spring-srv6-net-pgm-illustration-04, 30 March 2021, <https://datatracker.ietf.org/doc/html/draft-filsfils-spring-srv6-net-pgm-illustration-04>.

[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, May 2000, <https://www.rfc-editor.org/info/rfc2827>.

[RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March 2004, <https://www.rfc-editor.org/info/rfc3704>.

[RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February 2012, <https://www.rfc-editor.org/info/rfc6513>.

[RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 2015, <https://www.rfc-editor.org/info/rfc7432>.

[RFC7988] Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress Replication Tunnels in Multicast VPN", RFC 7988, DOI 10.17487/RFC7988, October 2016, <https://www.rfc-editor.org/info/rfc7988>.

[RFC8660] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing with the MPLS Data Plane", RFC 8660, DOI 10.17487/RFC8660, December 2019, <https://www.rfc-editor.org/info/rfc8660>.

[RFC9256] Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, A., and P. Mattes, "Segment Routing Policy Architecture", RFC 9256, DOI 10.17487/RFC9256, July 2022, <https://www.rfc-editor.org/info/rfc9256>.

[RFC9350] Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., and A. Gulko, "IGP Flexible Algorithm", RFC 9350, DOI 10.17487/RFC9350, February 2023, <https://www.rfc-editor.org/info/rfc9350>.

[SIDS-SRv6] Krishnan, S., "Segment Identifiers in SRv6", Work in Progress, Internet-Draft, draft-ietf-6man-sids-05, 8 January 2024, <https://datatracker.ietf.org/doc/html/draft-ietf-6man-sids-05>.

Appendix A. Illustration of a Replication Segment

This section illustrates an example of a single Replication segment. Examples showing Replication segments stitched together to form a P2MP tree (based on SR P2MP policy) are in [P2MP-POLICY].

Consider the following topology:

                 R3------R6
                /         \
        R1----R2----R5-----R7
                \         /
                 +--R4---+

Figure 1: Topology for Illustration of a Replication Segment

A.1. SR-MPLS

In this example, the Node-SID of a node Rn is N-SIDn and the Adj-SID from node Rm to node Rn is A-SIDmn. The interface between Rm and Rn is Lmn. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing Replication-SID R-SID sent on interface Lmn.

Assume a Replication segment identified with R-ID at Replication node R1 and downstream nodes R2, R6, and R7. The Replication-SID at node n is R-SIDn. A packet replicated from R1 to R7 has to traverse R4.

The Replication segments at nodes R1, R2, R6, and R7 are shown below. Note nodes R3, R4, and R5 do not have a Replication segment.

Replication segment at R1:

Replication segment <R-ID,R1>:
  Replication-SID: R-SID1
  Replication state:
    R2: <R-SID2->L12>
    R6: <N-SID6, R-SID6>
    R7: <N-SID4, A-SID47, R-SID7>

Replication to R2 steers the packet directly to R2 on interface L12. Replication to R6, using N-SID6, steers the packet via the shortest path to that node. Replication to R7 is steered via R4, using N-SID4 and then adjacency SID A-SID47 to R7.
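As a rough illustration, the Replication state at R1 can be modeled as a table of branches, each carrying the label stack to push and, for a directly connected downstream node, the outgoing interface. The sketch below is a minimal Python model under stated assumptions, not an implementation of any router API: the names Branch, R1_STATE, and replicate are hypothetical, labels are kept as the symbolic names used in this example, and the shortest-path lookup that resolves the outgoing interface for N-SID6 and N-SID4 is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Branch:
    """One Replication branch (hypothetical model of a Replication state entry)."""
    downstream: str                  # downstream node, e.g. "R2"
    sids: Tuple[str, ...]            # label stack to push, top of stack first
    interface: Optional[str] = None  # set only for a directly connected branch


# Replication state of segment <R-ID,R1>, copied from the listing above.
R1_STATE = (
    Branch("R2", ("R-SID2",), interface="L12"),
    Branch("R6", ("N-SID6", "R-SID6")),
    Branch("R7", ("N-SID4", "A-SID47", "R-SID7")),
)


def replicate(payload, state):
    """Return one (downstream, interface, label_stack, payload) copy per branch."""
    return [(b.downstream, b.interface, list(b.sids), payload) for b in state]


copies = replicate("payload", R1_STATE)
for downstream, interface, stack, _ in copies:
    # A branch without an interface is forwarded according to its top label.
    via = interface if interface else "shortest path for " + stack[0]
    print(f"to {downstream}: push {stack} via {via}")
```

In this model, every copy carries the Replication-SID of its downstream node at the bottom of the stack, and any labels above it only steer the copy toward that node.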
Replication segment at R2:

Replication segment <R-ID,R2>:
  Replication-SID: R-SID2
  Replication state:
    R2: <Leaf>

Replication segment at R6:

Replication segment <R-ID,R6>:
  Replication-SID: R-SID6
  Replication state:
    R6: <Leaf>

Replication segment at R7:

Replication segment <R-ID,R7>:
  Replication-SID: R-SID7
  Replication state:
    R7: <Leaf>

When a packet is steered into the Replication segment at R1:

* R1 performs the PUSH operation with just the <R-SID2> label for the replicated copy and sends it to R2 on interface L12, since R1 is directly connected to R2. R2, as leaf, performs the NEXT operation, pops the R-SID2 label, and delivers the payload.

* R1 performs the PUSH operation with the <N-SID6, R-SID6> label stack for the replicated copy to R6 and sends it to R2, which is the nexthop on the shortest path to R6. R2 performs the CONTINUE operation on N-SID6 and forwards it to R3. R3 is the penultimate hop for N-SID6; it performs penultimate hop popping, which corresponds to the NEXT operation. The packet is then sent to R6 with <R-SID6> in the label stack. R6, as leaf, performs the NEXT operation, pops the R-SID6 label, and delivers the payload.

* R1 performs the PUSH operation with the <N-SID4, A-SID47, R-SID7> label stack for the replicated copy to R7 and sends it to R2, which is the nexthop on the shortest path to R4. R2 is the penultimate hop for N-SID4; it performs penultimate hop popping, which corresponds to the NEXT operation. The packet is then sent to R4 with <A-SID47, R-SID7> in the label stack. R4 performs the NEXT operation, pops A-SID47, and delivers the packet to R7 with <R-SID7> in the label stack. R7, as leaf, performs the NEXT operation, pops the R-SID7 label, and delivers the payload.

A.2. SRv6

For SRv6, we use the SID allocation scheme, reproduced below, from "Illustrations for SRv6 Network Programming" [PGM-ILLUSTRATION]:

* 2001:db8::/32 is an IPv6 block allocated by a Regional Internet Registry (RIR) to the operator.

* 2001:db8:0::/48 is dedicated to the internal address space.

* 2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space.

* We assume a location expressed in 64 bits and a function expressed in 16 bits.

* Node k has a classic IPv6 loopback address 2001:db8::k/128, which is advertised in the Interior Gateway Protocol (IGP).

* Node k has 2001:db8:cccc:k::/64 for its local SID space. Its SIDs will be explicitly assigned from that block.

* Node k advertises 2001:db8:cccc:k::/64 in its IGP.

* Function :1:: (function 1, for short) represents the End function with the Penultimate Segment Pop (PSP) of the SRH [RFC8986] and USD support.

* Function :Cn:: (function Cn, for short) represents the End.X function to Node n with PSP and USD support.

Each node k has:

* An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an End function with additional support for PSP and USD.

* An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an End.X function to neighbor j with additional support for PSP and USD.

* An explicit SID instantiation 2001:db8:cccc:k:Fk::/128 bound to an End.Replicate function.

Assume a Replication segment identified with R-ID at Replication node R1 and downstream nodes R2, R6, and R7. The Replication-SID at node k, bound to an End.Replicate function, is 2001:db8:cccc:k:Fk::/128. A packet replicated from R1 to R7 has to traverse R4.

The Replication segments at nodes R1, R2, R6, and R7 are shown below. Note nodes R3, R4, and R5 do not have a Replication segment. The state representation uses "R-SID->Lmn" to represent a packet replication with outgoing Replication-SID R-SID sent on interface Lmn. "SL" represents an optional segment list used to steer a replicated packet on a specific path to a downstream node.

Replication segment at R1:

Replication segment <R-ID,R1>:
  Replication-SID: 2001:db8:cccc:1:F1::0
  Replication state:
    R2: <2001:db8:cccc:2:F2::0->L12>
    R6: <2001:db8:cccc:6:F6::0>
    R7: <2001:db8:cccc:4:C7::0>, SL: <2001:db8:cccc:7:F7::0>

Replication to R2 steers the packet directly to R2 on interface L12. Replication to R6, using 2001:db8:cccc:6:F6::0, steers the packet via the shortest path to that node. Replication to R7 is steered via R4, using H.Encaps.Red with End.X SID 2001:db8:cccc:4:C7::0 at R4 to R7.

Replication segment at R2:

Replication segment <R-ID,R2>:
  Replication-SID: 2001:db8:cccc:2:F2::0
  Replication state:
    R2: <Leaf>

Replication segment at R6:

Replication segment <R-ID,R6>:
  Replication-SID: 2001:db8:cccc:6:F6::0
  Replication state:
    R6: <Leaf>

Replication segment at R7:

Replication segment <R-ID,R7>:
  Replication-SID: 2001:db8:cccc:7:F7::0
  Replication state:
    R7: <Leaf>

When a packet, (A,B2), is steered into the Replication segment at R1:

* R1 creates an encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:2:F2::0) (A, B2), and sends it to R2 on interface L12, since R1 is directly connected to R2. R2, as leaf, removes the outer IPv6 header and delivers the payload.

* R1 creates an encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:6:F6::0) (A, B2) and then forwards the resulting packet on the shortest path to 2001:db8:cccc:6::/64. R2 and R3 forward the packet using 2001:db8:cccc:6::/64. R6, as leaf, removes the outer IPv6 header and delivers the payload.

* R1 has to steer the packet to downstream node R7 via node R4.
  It can do this in one of two ways:

  - R1 creates an encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) and then performs H.Encaps.Red using the SL to create the (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) packet. It sends this packet to R2, which is the nexthop on the shortest path to 2001:db8:cccc:4::/64. R2 forwards the packet to R4 using 2001:db8:cccc:4::/64. R4 executes the End.X function on 2001:db8:cccc:4:C7::0, performs a USD action, removes the outer IPv6 encapsulation, and sends the resulting packet (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as leaf, removes the outer IPv6 header and delivers the payload.

  - R1 is the root of the Replication segment. Therefore, it can combine the above encapsulations to create an encapsulated replicated copy (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8:cccc:7:F7::0; SL=1) (A, B2) and sends it to R2, which is the nexthop on the shortest path to 2001:db8:cccc:4::/64. R2 forwards the packet to R4 using 2001:db8:cccc:4::/64. R4 executes the End.X function on 2001:db8:cccc:4:C7::0, performs a PSP action, removes the SRH, and sends the resulting packet (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as leaf, removes the outer IPv6 header and delivers the payload.

A.2.1. Pinging a Replication-SID

This section illustrates the ping of a Replication-SID.

Node R1 pings the Replication-SID of node R6 directly by sending the following packet:

1. R1 to R6: (2001:db8::1, 2001:db8:cccc:6:F6::0; NH=ICMPv6) (ICMPv6 Echo Request).

2. Node R6 as a leaf processes the upper-layer ICMPv6 Echo Request and responds with an ICMPv6 Echo Reply.

Node R1 pings the Replication-SID of R7 via R4 by sending the following packet with the SRH:

1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:C7::0) (2001:db8:cccc:7:F7::0; SL=1; NH=ICMPv6) (ICMPv6 Echo Request).

2. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 Echo Request).

3. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request and responds with an ICMPv6 Echo Reply.

Assume node R4 is a transit replication node with Replication-SID 2001:db8:cccc:4:F4::0 replicating to R7. Node R1 pings the Replication-SID of R7 via the Replication-SID of R4 as follows:

1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:F4::0; NH=ICMPv6) (ICMPv6 Echo Request).

2. R4 replicates to R7 by replacing the IPv6 DA with the Replication-SID of R7 from its Replication state.

3. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 Echo Request).

4. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request and responds with an ICMPv6 Echo Reply.

Acknowledgements

The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes, and Jingrong Xie for their valuable inputs.

Contributors

Clayton Hassen
Bell Canada
Vancouver
Canada
Email: clayton.hassen@bell.ca

Kurtis Gillis
Bell Canada
Halifax
Canada
Email: kurtis.gillis@bell.ca

Arvind Venkateswaran
Cisco Systems, Inc.
San Jose, CA
United States of America
Email: arvvenka@cisco.com

Zafar Ali
Cisco Systems, Inc.
United States of America
Email: zali@cisco.com

Swadesh Agrawal
Cisco Systems, Inc.
San Jose, CA
United States of America
Email: swaagraw@cisco.com

Jayant Kotalwar
Nokia
Mountain View, CA
United States of America
Email: jayant.kotalwar@nokia.com

Tanmoy Kundu
Nokia
Mountain View, CA
United States of America
Email: tanmoy.kundu@nokia.com

Andrew Stone
Nokia
Ottawa
Canada
Email: andrew.stone@nokia.com

Tarek Saad
Cisco Systems, Inc.
Canada
Email: tsaad@cisco.com

Kamran Raza
Cisco Systems, Inc.
Canada
Email: skraza@cisco.com

Jingrong Xie
Huawei Technologies
Beijing
China
Email: xiejingrong@huawei.com

Authors' Addresses

Daniel Voyer (editor)
Bell Canada
Montreal
Canada
Email: daniel.voyer@bell.ca

Clarence Filsfils
Cisco Systems, Inc.
Brussels
Belgium
Email: cfilsfil@cisco.com

Rishabh Parekh
Cisco Systems, Inc.
San Jose, CA
United States of America
Email: riparekh@cisco.com

Hooman Bidgoli
Nokia
Ottawa
Canada
Email: hooman.bidgoli@nokia.com

Zhaohui Zhang
Juniper Networks
Email: zzhang@juniper.net