Label Switched Path (LSP) Ping/Trace Multipath Support for Link Aggregation Group (LAG) Interfaces
Cisco Systems, nobo@cisco.com
Cisco Systems, swallow@cisco.com
Orange, stephane.litkowski@orange.com
Orange, bruno.decraene@orange.com
Juniper Networks, jdrake@juniper.net
MPLS Working Group
Internet Engineering Task Force

Keywords: MPLS, LSP Ping, LAG

This document defines an extension to the Multiprotocol Label Switching (MPLS) Label Switched Path (LSP) Ping and Traceroute to describe Multipath Information for Link Aggregation Group (LAG) member links separately, thus allowing MPLS LSP Ping and Traceroute to discover and exercise specific paths of layer 2 Equal-Cost Multipath (ECMP) over LAG interfaces.

This document updates RFC 4379 and RFC 6424.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

The following acronyms and terms are used in this document:
MPLS - Multiprotocol Label Switching.
LSP - Label Switched Path.
LSR - Label Switching Router.
ECMP - Equal-Cost Multipath.
LAG - Link Aggregation Group.
Initiating LSR - LSR which sends the MPLS echo request.
Responder LSR - LSR which receives the MPLS echo request and sends the MPLS echo reply.

The Multiprotocol Label Switching (MPLS) Label Switched Path (LSP) Ping and Traceroute are powerful tools designed to diagnose all available layer 3 paths of LSPs, i.e., they provide diagnostic coverage of layer 3 Equal-Cost Multipath (ECMP). In many MPLS networks, Link Aggregation Group (LAG) interfaces as defined in IEEE Std. 802.1AX, which provide layer 2 ECMP, are often used for various reasons. MPLS LSP Ping and Traceroute were not designed to discover and exercise specific paths of layer 2 ECMP. This raises a limitation in the following scenario, where LSP X traverses LAG Y:
o  MPLS switching of LSP X over one or more member links of LAG Y is succeeding.
o  MPLS switching of LSP X over one or more member links of LAG Y is failing.
o  The MPLS echo request for LSP X over LAG Y is load balanced over a member link which is MPLS switching successfully.
In the above scenario, MPLS LSP Ping and Traceroute will not be able to detect the MPLS switching failure of the problematic member link(s) of the LAG. In other words, the lack of layer 2 ECMP discovery and exercise capability can leave MPLS LSP Ping and Traceroute blind to MPLS switching failures over a LAG interface that are impacting MPLS traffic. It is thus desirable to extend MPLS LSP Ping and Traceroute to have deterministic diagnostic coverage of LAG interfaces.

This document defines an extension to the MPLS LSP Ping and Traceroute to describe Multipath Information for LAG member links separately, thus allowing MPLS LSP Ping and Traceroute to discover and exercise specific paths of layer 2 ECMP over LAG interfaces. The reader is expected to be familiar with the mechanics of MPLS LSP Ping and Traceroute described in Section 3.3 of [RFC4379] and the Downstream Detailed Mapping TLV (DDMAP) described in Section 3.3 of [RFC6424]. The MPLS echo request carries a DDMAP and an optional TLV to indicate that separate load balancing information for each layer 2 nexthop over a LAG is desired in the MPLS echo reply. The responder LSR places the same optional TLV in the MPLS echo reply to acknowledge the request to the initiator. It also adds, for each downstream LAG member, load balance information (i.e., multipath information and interface index). For example:
When node A initiates LSP Traceroute to node E, node B returns to node A load balance information for the following entries:

o  Downstream C over Non-LAG (upper path).
o  First Downstream C over LAG (middle path).
o  Second Downstream C over LAG (middle path).
o  Downstream D over Non-LAG (lower path).
This document defines:
o  a mechanism to discover L2 ECMP multipath information;
o  a mechanism to validate L2 ECMP traversal in some LAG provisioning models;
o  the LAG Interface Info TLV;
o  the LAG Description Indicator flag;
o  the Interface Index Sub-TLV;
o  the Detailed Interface and Label Stack TLV.

The MPLS echo request carries a DDMAP and the LAG Interface Info TLV to indicate that separate load balancing information for each layer 2 nexthop over a LAG is desired in the MPLS echo reply. The responder LSR:
o  MUST add the LAG Interface Info TLV in the MPLS echo reply to acknowledge the request to the initiator. The Downstream LAG Info Accommodation flag MUST be set in LAG Interface Info Flags.

o  For each downstream that is a LAG interface:

   *  MUST add a DDMAP in the MPLS echo reply.
   *  MUST set the LAG Description Indicator flag in DS Flags of the DDMAP.
   *  All fields and Sub-TLVs, except for the Multipath Data Sub-TLV and the Interface Index Sub-TLV, are set/added to the DDMAP to describe this LAG interface, as per [RFC6424].
   *  For each LAG member link of this LAG interface:

      +  MUST add an Interface Index Sub-TLV with the LAG Member Link Indicator flag set in Interface Index Flags, describing this LAG member link.
      +  MUST add a Multipath Data Sub-TLV for this LAG member link, if the received DDMAP requested multipath information.
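The per-member steps above can be sketched in Python. This is a minimal illustration under assumptions, not an implementation of the wire encoding: the `MemberLink` type, the function name, and the tuple representation of Sub-TLVs are all hypothetical, and the multipath data is treated as an opaque byte string.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MemberLink:
    """Hypothetical view of one LAG member link on the responder LSR."""
    interface_index: int
    multipath_data: bytes  # opaque multipath information for this member


def lag_member_sub_tlvs(
    members: List[MemberLink], multipath_requested: bool
) -> List[Tuple[str, object]]:
    """Return the per-member Sub-TLVs for one LAG DDMAP, in emission order."""
    sub_tlvs: List[Tuple[str, object]] = []
    for member in members:
        # Interface Index Sub-TLV with the LAG Member Link Indicator flag set.
        sub_tlvs.append(
            ("InterfaceIndex",
             {"lag_member_link_indicator": True, "index": member.interface_index})
        )
        # Multipath Data Sub-TLV only if the received DDMAP requested it;
        # it directly follows the Interface Index Sub-TLV of the same member.
        if multipath_requested:
            sub_tlvs.append(("MultipathData", member.multipath_data))
    return sub_tlvs
```

For a LAG with two member links and multipath information requested, the sketch yields four Sub-TLVs, alternating Interface Index and Multipath Data, so each member's pair stays adjacent.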
Each LAG member link is described with an Interface Index Sub-TLV and, if multipath information is requested, with a Multipath Data Sub-TLV. If both Sub-TLVs are placed in the DDMAP to describe a LAG member link, the Interface Index Sub-TLV MUST be added first, with the Multipath Data Sub-TLV immediately following.

For example, a responder LSR possessing a LAG interface with two member links would send the following DDMAP for this LAG interface:
These procedures allow the initiating LSR to:

o  Identify whether the responder LSR understands this mechanism.
o  Identify whether each DDMAP describes a LAG interface or a non-LAG interface.
o  Obtain the multipath information that is expected to traverse the specific LAG member link described by the interface index.

The MPLS echo request is sent with a DDMAP with the DS Flags I bit set and the optional LAG Interface Info TLV, to request the Detailed Interface and Label Stack TLV with additional LAG member link information (i.e., interface index) in the MPLS echo reply. The responder LSR MUST:
o  Add the LAG Interface Info TLV in the MPLS echo reply to acknowledge the request to the initiator. The Upstream LAG Info Accommodation flag MUST be set in LAG Interface Info Flags.
o  Add the Detailed Interface and Label Stack TLV in the MPLS echo reply.
o  Add the Incoming Interface Index Sub-TLV for LAG interfaces. The LAG Member Link Indicator flag MUST be set in Interface Index Flags, and the incoming Interface Index MUST be set to the LAG member link which received the MPLS echo request.
The described procedures allow the initiating LSR to know:

o  The expected load balance information of every LAG member link, at the LSR with TTL=n.
o  The actual incoming interface at the LSR with TTL=n+1, including the interface index of the LAG member link if the incoming interface is a LAG interface.
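One way an initiating LSR might use these two pieces of information is sketched below. It is a hypothetical check, not a procedure defined by this document: the function name and the `observed` mapping are assumptions. It flags member links at TTL=n whose probes were reported on the same incoming interface index at TTL=n+1, in which case traversal of each member link cannot be told apart.

```python
from typing import Dict, List


def find_ambiguous_members(observed: Dict[int, int]) -> List[int]:
    """Find member links whose probes arrived on a shared incoming index.

    `observed` maps the interface index of a LAG member link at TTL=n
    (from the Interface Index Sub-TLV) to the incoming interface index
    reported by the LSR at TTL=n+1 for the probe aimed at that member.
    """
    by_incoming: Dict[int, List[int]] = {}
    for outgoing, incoming in observed.items():
        by_incoming.setdefault(incoming, []).append(outgoing)

    ambiguous: List[int] = []
    for outgoing_members in by_incoming.values():
        # More than one member link mapping to the same incoming index
        # means their traversal cannot be distinguished at TTL=n+1.
        if len(outgoing_members) > 1:
            ambiguous.extend(outgoing_members)
    return sorted(ambiguous)
```

An empty result means every exercised member link was seen on a distinct incoming interface index. A non-empty result does not by itself prove a fault, since it can also be caused by the LAG provisioning model.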
Note that the defined procedures provide a deterministic result for LAG interfaces that are connected back-to-back between routers (i.e., with no L2 switch in between). If there is an L2 switch between the LSR at TTL=n and the LSR at TTL=n+1, there is no guarantee that traversal of every LAG member link at TTL=n will result in reaching a different interface index at TTL=n+1. Issues resulting from a LAG with an L2 switch in between are further described later in this document. The LAG provisioning models in the operated network should be considered when analyzing the output of LSP Traceroute exercising L2 ECMPs.

The LAG Interface Info object is a new TLV that MAY be included in the MPLS echo request message. An MPLS echo request MUST NOT include more than one LAG Interface Info object. Presence of the LAG Interface Info object is a request that the responder LSR describe upstream and downstream LAG interfaces according to the procedures defined in this document. If the responder LSR is able to accommodate this request, then the LAG Interface Info object MUST be included in the MPLS echo reply message.

The LAG Interface Info TLV Type is TBD1. Length is 4. The Value field of the LAG Interface Info TLV has the following format:
LAG Interface Info Flags
The LAG Interface Info Flags field is a bit vector with the following format.
Two flags are defined: U and D. The remaining flags MUST be set to zero when sending and ignored on receipt. Both U and D flags MUST be cleared in MPLS echo request message when sending, and ignored on receipt. Either or both U and D flags MAY be set in MPLS echo reply message.
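The sending and receiving rules for the U and D flags can be sketched as follows. This models the flags as booleans only and deliberately does not encode the wire layout. The mapping of U to the Upstream LAG Info Accommodation flag and of D to the Downstream LAG Info Accommodation flag is an assumption drawn from the flag names used elsewhere in this document, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LagInterfaceInfoFlags:
    """U and D flags of the LAG Interface Info TLV (wire layout not modeled)."""
    u: bool = False  # assumed: Upstream LAG Info Accommodation
    d: bool = False  # assumed: Downstream LAG Info Accommodation


def flags_for_request() -> LagInterfaceInfoFlags:
    """Both flags MUST be cleared when sending an MPLS echo request."""
    return LagInterfaceInfoFlags(u=False, d=False)


def flags_seen_in_request(received: LagInterfaceInfoFlags) -> LagInterfaceInfoFlags:
    """Flags in a received echo request are ignored, so treat them as cleared."""
    return LagInterfaceInfoFlags()


def flags_for_reply(upstream_described: bool,
                    downstream_described: bool) -> LagInterfaceInfoFlags:
    """Either or both flags MAY be set in an MPLS echo reply."""
    return LagInterfaceInfoFlags(u=upstream_described, d=downstream_described)
```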
One flag, G, is added in DS Flags field of the DDMAP TLV. In the MPLS echo request message, G flag MUST be cleared when sending, and ignored on receipt. In the MPLS echo reply message, G flag MUST be set if the DDMAP TLV describes a LAG interface. It MUST be cleared otherwise.
DS Flags
The G flag is added as Bit Number 3 of the DS Flags bit vector.
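As an illustration of the G flag rules, the sketch below derives a bit mask for Bit Number 3 and applies the request and reply rules. It assumes that DS Flags is a one-octet field and that bit 0 is the most significant bit, as is conventional in RFC packet diagrams. Those two assumptions, and the function names, are not stated in this text.

```python
DS_FLAGS_WIDTH = 8   # assumption: DS Flags is a one-octet field
G_BIT_NUMBER = 3     # from the text: G is Bit Number 3

# Assumption: bit 0 is the most significant bit of the field.
DS_FLAG_G = 1 << (DS_FLAGS_WIDTH - 1 - G_BIT_NUMBER)


def ds_flags_for_request(ds_flags: int) -> int:
    """G MUST be cleared when sending an MPLS echo request."""
    return ds_flags & ~DS_FLAG_G & 0xFF


def ds_flags_for_reply(ds_flags: int, describes_lag: bool) -> int:
    """G MUST be set if the DDMAP describes a LAG interface, else cleared."""
    if describes_lag:
        return (ds_flags | DS_FLAG_G) & 0xFF
    return ds_flags & ~DS_FLAG_G & 0xFF


def describes_lag_interface(ds_flags: int) -> bool:
    """Initiator-side test of the G flag in a received echo reply."""
    return bool(ds_flags & DS_FLAG_G)
```

Other bits of the field are passed through unchanged, so the helpers can be applied to a DS Flags value that already carries other flags.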
The Interface Index object is a Sub-TLV that MAY be included in a DDMAP TLV. Zero or more Interface Index objects MAY appear in a DDMAP TLV. The Interface Index Sub-TLV describes the index assigned by the upstream LSR to the interface.

The Interface Index Sub-TLV Type is TBD2. Length is 8, and the Value field has the following format:
Interface Index Flags
The Interface Index Flags field is a bit vector with the following format.
One flag is defined: M. The remaining flags MUST be set to zero when sending and ignored on receipt.
Interface Index
Index assigned by the LSR to this interface.

The Detailed Interface and Label Stack object is a TLV that MAY be included in an MPLS echo reply message to report the interface on which the MPLS echo request message was received and the label stack that was on the packet when it was received. A responder LSR MUST NOT insert more than one instance of this TLV. This TLV allows the initiating LSR to obtain the exact interface and label stack information as it appears at the responder LSR.

The Detailed Interface and Label Stack TLV Type is TBD3. Length is K + Sub-TLV Length, and the Value field has the following format:
The Detailed Interface and Label Stack TLV format is derived from the Interface and Label Stack TLV format (from [RFC4379]). Two changes are introduced. First, the label stack, which is of variable length, is converted into a sub-TLV. Second, a new sub-TLV is added to describe an interface index. The fields of the Detailed Interface and Label Stack TLV have the same use and meaning as in [RFC4379]. A summary of the fields taken from the Interface and Label Stack TLV follows:
Address Type
The Address Type indicates if the interface is numbered or unnumbered. It also determines the length of the IP Address and Interface fields. The resulting total for the initial part of the TLV is listed in the table below as "K Octets". The Address Type is set to one of the following values:
IP Address and Interface
IPv4 addresses and interface indices are encoded in 4 octets; IPv6 addresses are encoded in 16 octets.

If the interface upon which the echo request message was received is numbered, then the Address Type MUST be set to IPv4 Numbered or IPv6 Numbered, the IP Address MUST be set to either the LSR's Router ID or the interface address, and the Interface MUST be set to the interface address.

If the interface is unnumbered, the Address Type MUST be either IPv4 Unnumbered or IPv6 Unnumbered, the IP Address MUST be the LSR's Router ID, and the Interface MUST be set to the index assigned to the interface.

Note: Usage of IPv6 Unnumbered has the same issue as , described in Section 3.4.2 of . A solution should be considered and applied to both and this document.

Sub-TLV Length
Total length in octets of the sub-TLVs associated with this TLV.

This section defines the sub-TLVs that MAY be included as part of the Detailed Interface and Label Stack TLV.
The Incoming Label Stack sub-TLV contains the label stack as received by the LSR. If any TTL values have been changed by this LSR, they SHOULD be restored.

The Incoming Label Stack Sub-TLV Type is 1. Length is variable, and the Value field has the following format:
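The format figure is not reproduced in this text. As a hedged illustration only, the sketch below assumes that the Value field is a sequence of 4-octet label stack entries in the standard MPLS layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack, 8-bit TTL), as used for the label stack in the Interface and Label Stack TLV. The type and function names are hypothetical.

```python
import struct
from typing import List, NamedTuple


class LabelStackEntry(NamedTuple):
    label: int             # 20-bit label value
    tc: int                # 3-bit traffic class (formerly EXP)
    bottom_of_stack: bool  # S bit
    ttl: int               # 8-bit TTL


def parse_incoming_label_stack(value: bytes) -> List[LabelStackEntry]:
    """Parse an assumed sequence of 4-octet MPLS label stack entries."""
    if len(value) % 4 != 0:
        raise ValueError("label stack length must be a multiple of 4 octets")
    entries = []
    # Each entry is one 32-bit word in network byte order.
    for (word,) in struct.iter_unpack("!I", value):
        entries.append(
            LabelStackEntry(
                label=word >> 12,
                tc=(word >> 9) & 0x7,
                bottom_of_stack=bool((word >> 8) & 0x1),
                ttl=word & 0xFF,
            )
        )
    return entries
```

Under this assumption, a Sub-TLV Length of 8 would correspond to a two-entry label stack.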
The Incoming Interface Index object is a Sub-TLV that MAY be included in a Detailed Interface and Label Stack TLV. The Incoming Interface Index Sub-TLV describes the index assigned by this LSR to the interface which received the MPLS echo request message.

The Incoming Interface Index Sub-TLV Type is 2. Length is 8, and the Value field has the following format:
Interface Index Flags
The Interface Index Flags field is a bit vector with the following format.
One flag is defined: M. The remaining flags MUST be set to zero when sent and ignored on receipt.
Interface Index
Index assigned by the LSR to this interface.

This document extends the LSP Traceroute mechanism to discover and exercise layer 2 ECMP paths. Additional processing is required on the initiating LSR and the responder LSR, especially to compute and handle the increased amount of multipath information. Because of this additional processing, it is critical that the proper security measures described in [RFC4379] and [RFC6424] are followed.

The IANA is requested to assign a new value TBD1 for the LAG Interface Info TLV from the "Multiprotocol Label Switching Architecture (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry.
The IANA is requested to assign new value TBD2 for Interface Index Sub-TLV from the "Multiprotocol Label Switching Architecture (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry, "Sub-TLVs for TLV Types 20" sub-registry.
The IANA is requested to assign new value TBD3 for Detailed Interface and Label Stack TLV from the "Multiprotocol Label Switching Architecture (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry.
[RFC4379] defines the Downstream Mapping TLV (DSMAP), which has Type 2 assigned from the "Multi-Protocol Label Switching (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry. [RFC6424] defines the Downstream Detailed Mapping TLV (DDMAP), which has Type 20 assigned from the "Multi-Protocol Label Switching (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry. DSMAP has been deprecated by DDMAP, but both TLVs share a field: "DS Flags". This document requires allocation of a new value in the "DS Flags" field, which is not maintained by IANA today. Therefore, this document requests IANA to create a new registry to maintain the "DS Flags" field. Initial values for this registry, "DS Flags", are described below.
Assignments of DS Flags are via Standards Action or IESG Approval.

Note that "DS Flags" is a field included in two TLVs defined in the "Multi-Protocol Label Switching (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry: Downstream Mapping TLV (value 2) and Downstream Detailed Mapping TLV (value 20). Modification of the "DS Flags" registry will affect both TLVs.

Also note that another document requests creation of a new registry for "DS Flags", with new values being added for Bit Number 4 and 5. If that document becomes an RFC and the "DS Flags" IANA registry is created as a result, then this document simply requests that Bit Number 3 (G: LAG Description Indicator) be added to the registry.

The IANA is requested to make a new "Sub-TLVs for TLV Type TBD3" sub-registry under the "Multiprotocol Label Switching Architecture (MPLS) Label Switched Paths (LSPs) Ping Parameters - TLVs" registry.
Initial values for this sub-registry, "Sub-TLVs for TLV Types TBD3", are described below.
Assignments of Sub-Types are via Standards Action or IESG Approval.

TBD

Multi-Protocol Label Switching (MPLS) Label Switched Paths (LSPs) Ping Parameters, IANA.

IEEE Standard for Local and metropolitan area networks - Link Aggregation, IEEE Std. 802.1AX.

Several flavors of "LAG with L2 switch" provisioning models are described in this section, along with the MPLS data plane ECMP traversal validation issues of each.
The issue with this LAG provisioning model is that packets traversing a LAG member from R1 to S1 can be load balanced again by S1 towards R2. Therefore, MPLS echo request messages traversing a specific LAG member from R1 to S1 can reach R2 via any LAG member, and the sender of the MPLS echo request messages has no knowledge of this and no way to control this traversal. In the worst case, MPLS echo request messages with specific entropies chosen to exercise every LAG member from R1 to S1 can all reach R2 via the same LAG member. Thus it is impossible for the MPLS echo request sender to verify that packets intended to traverse a specific LAG member from R1 to S1 actually traversed that LAG member, or to deterministically exercise "receive" processing of every LAG member on R2.
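The worst case described above can be illustrated with a toy model. Both selection functions below are hypothetical stand-ins for load balancing algorithms, chosen only so that the two hops decide independently of each other. Neither represents the behavior of any real router or switch.

```python
def r1_member(entropy: int, members: int = 2) -> int:
    """Hypothetical member selection on R1 for the R1-to-S1 LAG."""
    return entropy % members


def s1_member(entropy: int, members: int = 2) -> int:
    """Hypothetical, independent member selection on S1 for the S1-to-R2 LAG."""
    return (entropy // members) % members


# Entropy values picked so that each R1-to-S1 member is exercised once.
probes = [0, 1]
r1_choices = [r1_member(e) for e in probes]  # members used from R1 to S1
s1_choices = [s1_member(e) for e in probes]  # members used from S1 to R2
```

In this toy model the two probes leave R1 on different member links but both arrive at R2 on the same member link, which is the situation the sender can neither observe nor control.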
The number of LAG members differs on the two sides of the L2 switch. The issue with this LAG provisioning model is the same as with the previous model: the sender of MPLS echo request messages has no knowledge of the L2 load balancing algorithm or of entropy values that would control the traversal.
The issue with this LAG provisioning model is that there is no way for the MPLS echo request sender to deterministically exercise both LAG members from S1 to R2. Without that, "receive" processing of R2 on each LAG member cannot be verified.
The MPLS echo request sender knows how to traverse both LAG members from R1 to S1. However, both types of packets will terminate on the non-LAG interface at R2. It becomes impossible for the MPLS echo request sender to know whether MPLS echo request messages intended to traverse a specific LAG member from R1 to S1 did indeed traverse that LAG member.