rfc9268.original | rfc9268.txt | |||
---|---|---|---|---|
Network Working Group R. Hinden | Internet Engineering Task Force (IETF) R. Hinden | |||
Internet-Draft Check Point Software | Request for Comments: 9268 Check Point Software | |||
Intended status: Experimental G. Fairhurst | Category: Experimental G. Fairhurst | |||
Expires: 11 November 2022 University of Aberdeen | ISSN: 2070-1721 University of Aberdeen | |||
10 May 2022 | August 2022 | |||
IPv6 Minimum Path MTU Hop-by-Hop Option | IPv6 Minimum Path MTU Hop-by-Hop Option | |||
draft-ietf-6man-mtu-option-15 | ||||
Abstract | Abstract | |||
This document specifies a new IPv6 Hop-by-Hop option that is used to | This document specifies a new IPv6 Hop-by-Hop Option that is used to | |||
record the minimum Path MTU along the forward path between a source | record the Minimum Path MTU (PMTU) along the forward path between a | |||
host and a destination host. The recorded value can then be | source host and a destination host. The recorded value can then be | |||
communicated back to the source using the return Path MTU field in | communicated back to the source using the return Path MTU field in | |||
the option. | the Option. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This document is not an Internet Standards Track specification; it is | |||
provisions of BCP 78 and BCP 79. | published for examination, experimental implementation, and | |||
evaluation. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document defines an Experimental Protocol for the Internet | |||
and may be updated, replaced, or obsoleted by other documents at any | community. This document is a product of the Internet Engineering | |||
time. It is inappropriate to use Internet-Drafts as reference | Task Force (IETF). It represents the consensus of the IETF | |||
material or to cite them other than as "work in progress." | community. It has received public review and has been approved for | |||
publication by the Internet Engineering Steering Group (IESG). Not | ||||
all documents approved by the IESG are candidates for any level of | ||||
Internet Standard; see Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 11 November 2022. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9268. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Example Operation | 1.1. Example Operation | |||
1.2. Use of the IPv6 Hop-by-Hop Options Header | 1.2. Use of the IPv6 Hop-by-Hop Options Header | |||
2. Motivation and Problem Solved | 2. Motivation and Problem Solved | |||
3. Requirements Language | 3. Requirements Language | |||
4. Applicability Statements | 4. Applicability Statements | |||
5. IPv6 Minimum Path MTU Hop-by-Hop Option | 5. IPv6 Minimum Path MTU Hop-by-Hop Option | |||
6. Router, Host, and Transport Layer Behaviors | 6. Router, Host, and Transport Layer Behaviors | |||
6.1. Router Behavior | 6.1. Router Behavior | |||
6.2. Host Operating System Behavior | 6.2. Host Operating System Behavior | |||
6.3. Transport Layer Behavior | 6.3. Transport Layer Behavior | |||
6.3.1. Including the Option in an Outgoing Packet | 6.3.1. Including the Option in an Outgoing Packet | |||
6.3.2. Validation of the Packet that includes the Option | 6.3.2. Validation of the Packet that Includes the Option | |||
6.3.3. Receiving the Option | 6.3.3. Receiving the Option | |||
6.3.4. Using the Rtn-PMTU Field | 6.3.4. Using the Rtn-PMTU Field | |||
6.3.5. Detecting Path Changes | 6.3.5. Detecting Path Changes | |||
6.3.6. Detection of Dropping Packets that include the | 6.3.6. Detection of Dropping Packets that Include the Option | |||
Option | 7. IANA Considerations | |||
7. IANA Considerations | 8. Security Considerations | |||
8. Security Considerations | 8.1. Router Option Processing | |||
8.1. Router Option Processing | 8.2. Network-Layer Host Processing | |||
8.2. Network Layer Host Processing | 8.3. Validating Use of the Option Data | |||
8.3. Validating use of the Option Data | 8.4. Direct Use of the Rtn-PMTU Value | |||
8.4. Direct use of the Rtn-PMTU Value | 8.5. Using the Rtn-PMTU Value as a Hint for Probing | |||
8.5. Using the Rtn-PMTU Value as a Hint for Probing | 8.6. Impact of Middleboxes | |||
8.6. Impact of Middleboxes | 9. Experiment Goals | |||
9. Experiment Goals | 10. Implementation Status | |||
10. Implementation Status | 11. References | |||
11. Acknowledgments | 11.1. Normative References | |||
12. Change log [RFC Editor: Please remove] | 11.2. Informative References | |||
13. References | Appendix A. Examples of Usage | |||
13.1. Normative References | Acknowledgments | |||
13.2. Informative References | Authors' Addresses | |||
Appendix A. Examples of Usage | ||||
Authors' Addresses | ||||
1. Introduction | 1. Introduction | |||
This document specifies a new IPv6 Hop-by-Hop (HBH) Option to record | This document specifies a new IPv6 Hop-by-Hop (HBH) Option to record | |||
the minimum Maximum Transmission Unit (MTU) along the forward path | the minimum Maximum Transmission Unit (MTU) along the forward path | |||
between a source and a destination host. The source host creates a | between a source and a destination host. The source host creates a | |||
packet with this option and initializes the Min-PMTU field with the | packet with this Option and initializes the Min-PMTU field with the | |||
value of the MTU for the outbound link that will be used to forward | value of the MTU for the outbound link that will be used to forward | |||
the packet towards the destination host. | the packet towards the destination host. | |||
At each subsequent hop where the option is processed, the router | At each subsequent hop where the Option is processed, the router | |||
compares the value of the Min-PMTU Field in the option and the MTU of | compares the value of the Min-PMTU field in the Option and the MTU of | |||
its outgoing link. If the MTU of the link is less than the Min-PMTU, | its outgoing link. If the MTU of the link is less than the Min-PMTU, | |||
it rewrites the value in the option data with the smaller value. | it rewrites the value in the Option Data with the smaller value. | |||
When the packet arrives at the destination host, the host can send | When the packet arrives at the destination host, the host can send | |||
the value of the minimum reported MTU for the path back to the source | the value of the minimum Reported MTU for the path back to the source | |||
host using the Rtn-PMTU field in the option. The source host can | host using the Rtn-PMTU field in the Option. The source host can | |||
then use this value as input to the method that sets the Path MTU | then use this value as input to the method that sets the Path MTU | |||
(PMTU) used by upper layer protocols. | (PMTU) used by upper-layer protocols. | |||
The IPv6 Minimum Path MTU Hop-by-Hop (MinPMTU HBH) Option is designed | The IPv6 Minimum Path MTU Hop-by-Hop (MinPMTU HBH) Option is designed | |||
to work with packet sizes that can be specified in the IPv6 header. | to work with packet sizes that can be specified in the IPv6 header. | |||
The maximum packet size that can be specified in an IPv6 header is | The maximum packet size that can be specified in an IPv6 header is | |||
65,535 octets (2^^16 - 1). | 65,535 octets (2^16 - 1). | |||
This method has the potential to complete Path MTU discovery in a | This method has the potential to complete Path MTU Discovery (PMTUD) | |||
single round trip time, even over paths that have successive links | in a single round-trip time, even over paths that have successive | |||
each with a lower MTU. | links, each with a lower MTU. | |||
The mechanism defined in this document is focused on Unicast, it does | The mechanism defined in this document is focused on unicast; it does | |||
not describe Multicast. That is left for future work. | not describe multicast. That is left for future work. | |||
1.1. Example Operation | 1.1. Example Operation | |||
The figure below illustrates the operation of the method. In this | The figure below illustrates the operation of the method. In this | |||
case, the path between the source host and the destination host | case, the path between the source host and the destination host | |||
comprises three links, the source has a link MTU of size MTU-S, the | comprises three links: the source has a link MTU of size MTU-S, the | |||
link between routers R1 and R2 has an MTU of size 9000 bytes, and the | link between routers R1 and R2 has an MTU of size 9000 bytes, and the | |||
final link to the destination has an MTU of size MTU-D. | final link to the destination has an MTU of size MTU-D. | |||
+--------+ +----+ +----+ +-------+ | +--------+ +----+ +----+ +-------+ | |||
| | | | | | | | | | | | | | | | | | |||
| Sender +---------+ R1 +--------+ R2 +-------- + Dest. | | | Sender +---------+ R1 +--------+ R2 +-------- + Dest. | | |||
| | | | | | | | | | | | | | | | | | |||
+--------+ MTU-S +----+ 9000B +----+ MTU-D +-------+ | +--------+ MTU-S +----+ 9000B +----+ MTU-D +-------+ | |||
Figure 1 | Figure 1: An Example Path between the Source Host and the | |||
Destination Host | ||||
Three scenarios are described: | Three scenarios are described: | |||
* Scenario 1, considers all links to have an 9000 byte MTU and the | * Scenario 1 considers all links to have a 9000 byte MTU, and the | |||
method is supported by both routers. The initial Min-PMTU is not | method is supported by both routers. The initial Min-PMTU is not | |||
modified along the path, and therefore the PMTU is 9000 bytes. | modified along the path. Therefore, the PMTU is 9000 bytes. | |||
* Scenario 2, considers the link between R2 and destination host | * Scenario 2 considers the link between R2 and the destination host | |||
(MTU-D) to have an MTU of 1500 bytes. This is the smallest MTU, | (MTU-D) to have an MTU of 1500 bytes. This is the smallest MTU. | |||
router R2 updates the Min-PMTU to 1500 bytes and the method | Router R2 updates the Min-PMTU to 1500 bytes, and the method | |||
correctly updates the PMTU to 1500 bytes. Had there been another | correctly updates the PMTU to 1500 bytes. Had there been another | |||
smaller MTU at a link further along the path that also supports | smaller MTU at a link further along the path that also supports | |||
the method, the lower MTU would also have been detected. | the method, the lower MTU would also have been detected. | |||
* Scenario 3, considers the case where the router preceding the | * Scenario 3 considers the case where the router preceding the | |||
smallest link (R2) does not support the method, and the link to | smallest link (R2) does not support the method, and the link to | |||
the destination host (MTU-D) has an MTU of 1500 bytes. Therefore, | the destination host (MTU-D) has an MTU of 1500 bytes. Therefore, | |||
router R2 does not update the Min-PMTU to 1500 bytes. The method | router R2 does not update the Min-PMTU to 1500 bytes. The method | |||
then fails to detect the actual PMTU. | then fails to detect the actual PMTU. | |||
In Scenarios 2 and 3, a lower PMTU would also fail to be detected in | In Scenarios 2 and 3, a lower PMTU would also fail to be detected in | |||
the case where PMTUD had been used and an ICMPv6 Packet Too Big (PTB) | the case where PMTUD had been used and an ICMPv6 Packet Too Big (PTB) | |||
message had not been delivered to the sender [RFC8201]. | message had not been delivered to the sender [RFC8201]. | |||
These scenarios are summarized in the table below. "H" in R1 and/or | These scenarios are summarized in the table below. "H" in R1 and/or | |||
R2 columns means the router understands the MinPMTU HBH option. | R2 columns means the router understands the MinPMTU HBH Option. | |||
+-+-----+-----+----+----+----------+-----------------------+ | +===+========+========+====+====+==========+=======================+ | |||
| |MTU-S|MTU-D| R1 | R2 | Rec PMTU | Note | | | | MTU-S | MTU-D | R1 | R2 | Rec PMTU | Note | | |||
+-+-----+-----+----+----+----------+-----------------------+ | +===+========+========+====+====+==========+=======================+ | |||
|1|9000B|9000B| H | H | 9000 B | Endpoints attempt to | | | 1 | 9000 B | 9000 B | H | H | 9000 B | Endpoints attempt to | | |||
| | | | | | use a 9000 B PMTU. | | | | | | | | | use a 9000 B PMTU. | | |||
+-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
|2|9000B|1500B| H | H | 1500 B | Endpoints attempt to | | | 2 | 9000 B | 1500 B | H | H | 1500 B | Endpoints attempt to | | |||
| | | | | | | use a 1500 B PMTU. | | | | | | | | | use a 1500 B PMTU. | | |||
+-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
|3|9000B|1500B| H | - | 9000 B | Endpoints attempt to | | | 3 | 9000 B | 1500 B | H | - | 9000 B | Endpoints attempt to | | |||
| | | | | | | use a 9000 B PMTU, | | | | | | | | | use a 9000 B PMTU but | | |||
| | | | | | | but need to implement | | | | | | | | | need to implement a | | |||
| | | | | | | a method to fall back | | | | | | | | | method to fall back | | |||
| | | | | | | to discover and use a | | | | | | | | | to discover and use a | | |||
| | | | | | | 1500 B PMTU. | | | | | | | | | 1500 B PMTU. | | |||
+-+-----+-----+----+----+----------+-----------------------+ | +---+--------+--------+----+----+----------+-----------------------+ | |||
Figure 2 | Table 1: Three Scenarios That Arise from Using the Path Shown in | |||
Figure 1 | ||||
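The three scenarios above can be reproduced with a small simulation. This is a minimal sketch, not part of the specification: the function name `record_min_pmtu` and the representation of each router as an `(outgoing_link_mtu, supports_option)` pair are assumptions made for illustration.

```python
from typing import List, Tuple


def record_min_pmtu(source_link_mtu: int, hops: List[Tuple[int, bool]]) -> int:
    """Return the Min-PMTU value that arrives at the destination host.

    source_link_mtu: MTU of the sender's outbound link (MTU-S).
    hops: one (outgoing_link_mtu, supports_option) pair per router, in
          path order. Both names are hypothetical, chosen for this sketch.
    """
    # The source initializes Min-PMTU with the MTU of its outbound link.
    min_pmtu = source_link_mtu
    for outgoing_link_mtu, supports_option in hops:
        # Only routers that process the Option can lower the recorded value.
        if supports_option and outgoing_link_mtu < min_pmtu:
            min_pmtu = outgoing_link_mtu
    return min_pmtu


# R1 forwards onto the 9000 B link; R2 forwards onto the MTU-D link.
scenario_1 = record_min_pmtu(9000, [(9000, True), (9000, True)])
scenario_2 = record_min_pmtu(9000, [(9000, True), (1500, True)])
scenario_3 = record_min_pmtu(9000, [(9000, True), (1500, False)])
```

In the third call, R2 does not process the Option, so the recorded value stays at 9000 bytes even though the final link only supports 1500 bytes. This is why the endpoints still need a fallback method in that scenario.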
1.2. Use of the IPv6 Hop-by-Hop Options Header | 1.2. Use of the IPv6 Hop-by-Hop Options Header | |||
IPv6 as specified in [RFC8200] allows nodes to optionally process the | As specified in [RFC8200], IPv6 allows nodes to optionally process | |||
Hop-by-Hop header. Specifically, from Section 4: | the Hop-by-Hop header. Specifically, from Section 4 of [RFC8200]: | |||
* The Hop-by-Hop Options header is not inserted or deleted, but may | ||||
be examined or processed by any node along a packet's delivery | ||||
path, until the packet reaches the node (or each of the set of | ||||
nodes, in the case of multicast) identified in the Destination | ||||
Address field of the IPv6 header. The Hop-by-Hop Options header, | ||||
when present, must immediately follow the IPv6 header. Its | ||||
presence is indicated by the value zero in the Next Header field | ||||
of the IPv6 header. | ||||
* NOTE: While [RFC2460] required that all nodes must examine and | | The Hop-by-Hop Options header is not inserted or deleted, but may | |||
process the Hop-by-Hop Options header, it is now expected that | | be examined or processed by any node along a packet's delivery | |||
nodes along a packet's delivery path only examine and process the | | path, until the packet reaches the node (or each of the set of | |||
Hop-by-Hop Options header if explicitly configured to do so. | | nodes, in the case of multicast) identified in the Destination | |||
| Address field of the IPv6 header. The Hop-by-Hop Options header, | ||||
| when present, must immediately follow the IPv6 header. Its | ||||
| presence is indicated by the value zero in the Next Header field | ||||
| of the IPv6 header. | ||||
| | ||||
| NOTE: While [RFC2460] required that all nodes must examine and | ||||
| process the Hop-by-Hop Options header, it is now expected that | ||||
| nodes along a packet's delivery path only examine and process the | ||||
| Hop-by-Hop Options header if explicitly configured to do so. | ||||
The Hop-by-Hop Option defined in this document is designed to take | The Hop-by-Hop Option defined in this document is designed to take | |||
advantage of this property of how Hop-by-Hop options are processed. | advantage of this property of how Hop-by-Hop Options are processed. | |||
Nodes that do not support this Option SHOULD ignore it. This can | Nodes that do not support this Option SHOULD ignore it. This can | |||
mean that the Min-PMTU value does not account for all links along a | mean that the Min-PMTU value does not account for all links along a | |||
path. | path. | |||
2. Motivation and Problem Solved | 2. Motivation and Problem Solved | |||
The current state of Path MTU Discovery on the Internet is | The current state of Path MTU Discovery on the Internet is | |||
problematic. The mechanisms defined in [RFC8201] are known to not | problematic. The mechanisms defined in [RFC8201] are known to not | |||
work well in all environments. They fail to work in various cases, | work well in all environments. They fail to work in various cases, | |||
including when nodes in the middle of the network do not send ICMPv6 | including when nodes in the middle of the network do not send ICMPv6 | |||
PTB messages, or rate-limit ICMPv6 messages, or do not have a | PTB messages or rate-limit ICMPv6 messages or do not have a return | |||
return path to the source host. | path to the source host. This results in many transport-layer | |||
connections being configured to use smaller packets (e.g., 1280 | ||||
This results in many transport layer connections being configured to | bytes) by default and makes it difficult to take advantage of paths | |||
use smaller packets (e.g., 1280 bytes) by default and makes it | with a larger PMTU where they do exist. Applications that send large | |||
difficult to take advantage of paths with a larger PMTU where they do | packets are forced to use IPv6 fragmentation [RFC8200], which can | |||
exist. Applications that send large packets are forced to use IPv6 | reduce the reliability of Internet communication [RFC8900]. | |||
Fragmentation [RFC8200], which can reduce the reliability of Internet | ||||
communication [RFC8900]. | ||||
Encapsulations and network-layer tunnels further reduce the payload | Encapsulations and network-layer tunnels further reduce the payload | |||
size available for a transport protocol to use. Also, some use-cases | size available for a transport protocol to use. Also, some use cases | |||
increase packet overhead, for example, Network Virtualization Using | increase packet overhead, for example, Network Virtualization Using | |||
Generic Routing Encapsulation (NVGRE) [RFC7637] encapsulates L2 | Generic Routing Encapsulation (NVGRE) [RFC7637] encapsulates Layer 2 | |||
packets in an outer IP header and does not allow IP Fragmentation. | (L2) packets in an outer IP header and does not allow IP | |||
fragmentation. | ||||
Sending larger packets can improve host performance, e.g., avoiding | Sending larger packets can improve host performance, e.g., avoiding | |||
limits to packet processing by the packet rate. For example, the | limits to packet processing by the packet rate. An example of this | |||
packet per second rate required to reach wire speed on a 10G link | is how the packet-per-second rate required to reach wire speed on a | |||
with 1280 byte packets is about 977K packets per second (pps), vs. | 10G link with 1280 byte packets is about 977K packets per second | |||
139K pps for 9000 byte packets. | (pps) vs. 139K pps for 9000 byte packets. | |||
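The packet rates quoted above follow from dividing the link rate by the packet size in bits. The sketch below assumes a link rate of exactly 10^10 bit/s and ignores link-layer framing overhead; the function name is hypothetical.

```python
def packets_per_second(link_bits_per_second: float, packet_bytes: int) -> float:
    """Packets per second needed to fill a link, ignoring framing overhead."""
    return link_bits_per_second / (packet_bytes * 8)


TEN_GIGABIT = 10e9  # assumed link rate in bit/s

small_packet_rate = packets_per_second(TEN_GIGABIT, 1280)  # about 977K pps
large_packet_rate = packets_per_second(TEN_GIGABIT, 9000)  # about 139K pps
```

Under these assumptions, 9000 byte packets reduce the required packet rate by roughly a factor of seven.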
The purpose of this document is to improve the situation by defining | The purpose of this document is to improve the situation by defining | |||
a mechanism that does not rely on reception of ICMPv6 Packet Too Big | a mechanism that does not rely on reception of ICMPv6 PTB messages | |||
messages from nodes in the middle of the network. Instead, this | from nodes in the middle of the network. Instead, this provides | |||
provides information to the destination host about the minimum Path | information to the destination host about the Minimum Path MTU and | |||
MTU, and sends this information back to the source host. This is | sends this information back to the source host. This is expected to | |||
expected to work better than the current RFC8201-based mechanisms. | work better than the current mechanisms based on [RFC8201]. | |||
A similar mechanism was proposed in 1988 for IPv4 in [RFC1063] by | A similar mechanism was proposed in 1988 for IPv4 in [RFC1063] by | |||
Jeff Mogul, C. Kent, Craig Partridge, and Keith McCloghrie. It was | Jeff Mogul, C. Kent, Craig Partridge, and Keith McCloghrie. It was | |||
later obsoleted in 1990 by [RFC1191], the currently deployed approach | later obsoleted in 1990 by [RFC1191], which is the currently deployed | |||
to Path MTU Discovery. In contrast, the method described in this | approach to Path MTU Discovery. In contrast, the method described in | |||
document uses the Hop-by-Hop option of IPv6. It does not replace | this document uses the Hop-by-Hop Option of IPv6. It does not | |||
PMTUD [RFC8201], PLPPMTUD [RFC4821] or Datagram PLPMTUD [RFC8899], | replace PMTUD [RFC8201], Packetization Layer Path MTU Discovery | |||
but rather is designed to complement these methods. | (PLPMTUD) [RFC4821], or Datagram Packetization Layer PMTU Discovery | |||
(DPLPMTUD) [RFC8899] but rather is designed to complement these | ||||
methods. | ||||
3. Requirements Language | 3. Requirements Language | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
4. Applicability Statements | 4. Applicability Statements | |||
The Path MTU option is designed for environments where there is | The Path MTU Option is designed for environments where there is | |||
control over the hosts and nodes that connect them, and where there | control over the hosts and nodes that connect them and where there is | |||
is more than one MTU size in use. For example, in Data Centers and | more than one MTU size in use, for example, in data centers and on | |||
on paths between Data Centers, to allow hosts to better take | paths between data centers to allow hosts to better take advantage of | |||
advantage of a path that is able to support a large PMTU. | a path that is able to support a large PMTU. | |||
The design of the option is sufficiently simple that it can be | The design of the Option is sufficiently simple that it can be | |||
executed on a router's fast path. A successful experiment depends on | executed on a router's fast path. A successful experiment depends on | |||
both implementation by host and router vendors and deployment by | both implementation by host and router vendors and deployment by | |||
operators. The contained use-case of connections within and between | operators. The contained use case of connections within and between | |||
Data Centers could be a driver for deployment. | data centers could be a driver for deployment. | |||
The method could also be useful in other environments, including the | The method could also be useful in other environments, including the | |||
general Internet, and offers advantage when this Hop-by-Hop Option is | general Internet, and offers an advantage when this Hop-by-Hop Option | |||
supported on all paths. The method is more robust when used to probe | is supported on all paths. The method is more robust when used to | |||
the path using packets that do not carry application data and when | probe the path using packets that do not carry application data and | |||
also paired with a method such as Packetization Layer PMTUD [RFC4821] | when also paired with a method like Packetization Layer PMTUD | |||
or Datagram PLPMTUD [RFC8899]. | [RFC4821] or Datagram Packetization Layer PMTU Discovery (DPLPMTUD) | |||
[RFC8899]. | ||||
5. IPv6 Minimum Path MTU Hop-by-Hop Option | 5. IPv6 Minimum Path MTU Hop-by-Hop Option | |||
The Minimum Path MTU Hop-by-Hop Option has the following format: | The Minimum Path MTU Hop-by-Hop Option has the following format: | |||
Option Option Option | Option Option Option | |||
Type Data Len Data | Type Data Len Data | |||
+--------+--------+--------+--------+---------+-------+-+ | +--------+--------+--------+--------+---------+-------+-+ | |||
|BBCTTTTT|00000100| Min-PMTU | Rtn-PMTU |R| | |BBCTTTTT|00000100| Min-PMTU | Rtn-PMTU |R| | |||
+--------+--------+--------+--------+---------+-------+-+ | +--------+--------+--------+--------+---------+-------+-+ | |||
Option Type (see Section 4.2 of [RFC8200]): | Figure 2: Format of the Minimum Path MTU Hop-by-Hop Option | |||
BB 00 Skip over this option and continue processing. | Option Type (see Section 4.2 of [RFC8200]): | |||
C 1 Option data can change en route to the packet's final | BB 00 Skip over this Option and continue processing. | |||
C 1 Option Data can change en route to the packet's final | ||||
destination. | destination. | |||
TTTTT 10000 Option Type assigned from IANA [IANA-HBH]. | TTTTT 10000 Option Type assigned from IANA [IANA-HBH]. | |||
Length: 4 The size of the value field in Option Data | Length: 4 The size of the value field in Option Data | |||
field supports PMTU values from 0 to 65,534 octets, the | field supports PMTU values from 0 to 65,534 | |||
maximum size represented by the Path MTU option. | octets, the maximum size represented by the | |||
Path MTU Option. | ||||
Min-PMTU: n 16-bits. The minimum MTU recorded along the path | Min-PMTU: n 16-bits. The minimum MTU recorded along the path | |||
in octets, reflecting the smallest link MTU that | in octets, reflecting the smallest link MTU that | |||
the packet experienced along the path. | the packet experienced along the path. | |||
A value less than the IPv6 minimum link | A value less than the IPv6 minimum link | |||
MTU [RFC8200] MUST be ignored. | MTU [RFC8200] MUST be ignored. | |||
Rtn-PMTU: n 15-bits. The returned Path MTU field, carrying the 15 | Rtn-PMTU: n 15-bits. The returned Path MTU field, carrying the 15 | |||
most significant bits of the latest received Min-PMTU | most significant bits of the latest received Min-PMTU | |||
field for the forward path. The value zero means that | field for the forward path. The value zero means that | |||
no Reported MTU is being returned. | no Reported MTU is being returned. | |||
R n 1-bit. R-Flag. Set by the source to signal that | R n 1-bit. R-Flag. Set by the source to signal that | |||
the destination host should include the received | the destination host should include the received | |||
Rtn-PMTU field updated by the reported Min-PMTU value | Rtn-PMTU field updated by the reported Min-PMTU value | |||
when the destination host is to send a PMTU Option back | when the destination host is to send a PMTU Option back | |||
to the source host. | to the source host. | |||
Figure 3 | ||||
NOTE: The encoding of the final two octets (Rtn-PMTU and R-Flag) | NOTE: The encoding of the final two octets (Rtn-PMTU and R-Flag) | |||
could be implemented by a mask of the latest received Min-PMTU value | could be implemented by a mask of the latest received Min-PMTU value | |||
with 0xFFFE, discarding the right-most bit and then performing a | with 0xFFFE, discarding the right-most bit and then performing a | |||
logical 'OR' with the R-Flag value of the sender. This encoding fits | logical 'OR' with the R-Flag value of the sender. This encoding fits | |||
in the minimum-sized Hop-by-Hop Option header. | in the minimum-sized Hop-by-Hop Option header. | |||
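The format and the masking described in the NOTE can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function and constant names are hypothetical, and the fields are packed in network byte order. The value 0x30 is the bit pattern BB=00, C=1, TTTTT=10000 given above.

```python
import struct
from typing import Tuple

OPTION_TYPE = 0x30      # BB=00, C=1, TTTTT=10000
OPTION_DATA_LEN = 4     # two 16-bit fields


def encode_option(min_pmtu: int, latest_received_min_pmtu: int, r_flag: bool) -> bytes:
    """Build the 6-octet Option: Type, Data Len, Min-PMTU, Rtn-PMTU plus R."""
    # Mask with 0xFFFE to drop the right-most bit, then OR in the R-Flag.
    rtn_and_flag = (latest_received_min_pmtu & 0xFFFE) | (1 if r_flag else 0)
    return struct.pack("!BBHH", OPTION_TYPE, OPTION_DATA_LEN, min_pmtu, rtn_and_flag)


def decode_option(data: bytes) -> Tuple[int, int, bool]:
    """Return (min_pmtu, rtn_pmtu, r_flag) from a 6-octet Option."""
    opt_type, data_len, min_pmtu, rtn_and_flag = struct.unpack("!BBHH", data)
    if opt_type != OPTION_TYPE or data_len != OPTION_DATA_LEN:
        raise ValueError("not a Minimum Path MTU Option")
    return min_pmtu, rtn_and_flag & 0xFFFE, bool(rtn_and_flag & 1)


encoded = encode_option(min_pmtu=9000, latest_received_min_pmtu=1501, r_flag=True)
decoded = decode_option(encoded)
```

Because only the 15 most significant bits are carried, an odd received value such as 1501 is returned as 1500 in this sketch.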
6. Router, Host, and Transport Layer Behaviors | 6. Router, Host, and Transport Layer Behaviors | |||
6.1. Router Behavior | 6.1. Router Behavior | |||
Routers that are not configured to support Hop-by-Hop Options are not | Routers that are not configured to support Hop-by-Hop Options are not | |||
expected to examine or process the contents of this option [RFC8200]. | expected to examine or process the contents of this Option [RFC8200]. | |||
Routers that support Hop-by-Hop Options, but are not configured to | Routers that support Hop-by-Hop Options but are not configured to | |||
support this option SHOULD skip over this option and continue to | support this Option SHOULD skip over this Option and continue to | |||
processing the header [RFC8200]. | process the header [RFC8200]. | |||
Routers that support this option MUST compare the value of the Min- | Routers that support this Option MUST compare the value of the Min- | |||
PMTU field with the MTU configured for the outgoing link. If the MTU | PMTU field with the MTU configured for the outgoing link. If the MTU | |||
of the outgoing link is less than the Min-PMTU, the router rewrites | of the outgoing link is less than the Min-PMTU, the router rewrites | |||
the Min-PMTU in the Option to use the smaller value. (The router | the Min-PMTU in the Option to use the smaller value. (The router | |||
processing is performed without checking the valid range of the Min- | processing is performed without checking the valid range of the Min- | |||
PMTU or the Rtn-PMTU fields.) | PMTU or the Rtn-PMTU fields.) | |||
A router MUST ignore and MUST NOT change the Rtn-PMTU field or the | A router MUST ignore and MUST NOT change the Rtn-PMTU field or the | |||
R-Flag in the option. | R-Flag in the Option. | |||
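The router rule in Section 6.1 can be illustrated with a short Python sketch. The `MinPmtuOption` class and `router_process` function are hypothetical names introduced only for this example; a real router would operate on the packet's octets in its forwarding path.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MinPmtuOption:
    """Hypothetical in-memory view of the Option's fields."""
    min_pmtu: int
    rtn_pmtu: int
    r_flag: bool


def router_process(option: MinPmtuOption, outgoing_link_mtu: int) -> MinPmtuOption:
    """Lower Min-PMTU if the outgoing link MTU is smaller; touch nothing else."""
    # No range check is applied to Min-PMTU or Rtn-PMTU here.
    if outgoing_link_mtu < option.min_pmtu:
        # Only Min-PMTU is rewritten; Rtn-PMTU and R-Flag are left unchanged.
        return replace(option, min_pmtu=outgoing_link_mtu)
    return option
```

In this sketch, a link MTU equal to or larger than the recorded Min-PMTU leaves the Option as received.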
6.2. Host Operating System Behavior | 6.2. Host Operating System Behavior | |||
The PMTU entry associated with the destination in the host's | The PMTU entry associated with the destination in the host's | |||
destination cache [RFC4861] SHOULD be updated after detecting a | destination cache [RFC4861] SHOULD be updated after detecting a | |||
change using the IPv6 Minimum Path MTU Hop-by-Hop Option. This | change using the IPv6 Minimum Path MTU Hop-by-Hop Option. This | |||
cached value can be used by other flows that share the host's | cached value can be used by other flows that share the host's | |||
destination cache. | destination cache. | |||
The value in the host destination cache SHOULD be used by PLPMTUD to | The value in the host destination cache SHOULD be used by PLPMTUD to | |||
select an initial PMTU for a flow. The cached PMTU is only increased | select an initial PMTU for a flow. The cached PMTU is only increased | |||
by PLPMTUD when the Packetization Layer determines the path actually | by PLPMTUD when the Packetization Layer determines the path actually | |||
supports a larger PMTU [RFC4821] [RFC8899]. | supports a larger PMTU [RFC4821] [RFC8899]. | |||
When requested to send an IPv6 packet with the MinPMTU HBH option, | When requested to send an IPv6 packet with the MinPMTU HBH Option, | |||
the source host includes the option in an outgoing packet. The | the source host includes the Option in an outgoing packet. The | |||
source host MUST fill the Min-PMTU field with the MTU configured for | source host MUST fill the Min-PMTU field with the MTU configured for | |||
the link over which it will send the packet on the next hop towards | the link over which it will send the packet on the next hop towards | |||
the destination host. | the destination host. | |||
When a host includes the option in a packet it sends, the host SHOULD | When a host includes the Option in a packet it sends, the host SHOULD | |||
set the Rtn-PMTU field to the previously cached value of the received | set the Rtn-PMTU field to the previously cached value of the received | |||
Minimum Path MTU for the flow (see | Minimum Path MTU for the flow (see | |||
Section 6.3.3). If this value is not set (for example, because there | Section 6.3.3). If this value is not set (for example, because there | |||
is no cached reported Min-PMTU value), the Rtn-PMTU field value MUST | is no cached reported Min-PMTU value), the Rtn-PMTU field value MUST | |||
be set to zero. | be set to zero. | |||
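As a rough illustration of how a sending host might choose the two PMTU fields, consider the Python sketch below. The function name and the use of `None` to represent "no cached value" are assumptions made for this example only.

```python
from typing import Optional, Tuple


def outgoing_pmtu_fields(link_mtu: int,
                         cached_min_pmtu: Optional[int]) -> Tuple[int, int]:
    """Return (Min-PMTU, Rtn-PMTU) for an Option in an outgoing packet."""
    # Min-PMTU starts as the MTU configured for the link used for the
    # next hop towards the destination host.
    min_pmtu = link_mtu
    # Rtn-PMTU carries the cached Min-PMTU previously received for the
    # flow, or zero when no such value has been cached.
    rtn_pmtu = cached_min_pmtu if cached_min_pmtu is not None else 0
    return min_pmtu, rtn_pmtu
```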
The source host MAY request the destination host to return the | The source host MAY request the destination host to return the | |||
reported Min-PMTU value by setting the R-Flag in the option of an | reported Min-PMTU value by setting the R-Flag in the Option of an | |||
outgoing packet. The R-Flag SHOULD NOT be set when the MinPMTU HBH | outgoing packet. The R-Flag SHOULD NOT be set when the MinPMTU HBH | |||
Option was sent solely to provide requested feedback on the return | Option was sent solely to provide requested feedback on the return | |||
Path MTU to avoid each response generating another response. | Path MTU to avoid each response generating another response. | |||
The destination host controls when to send a packet with this option | The destination host controls when to send a packet with this Option | |||
in response to an R-flag, as well as which packets to include it in. | in response to an R-Flag, as well as which packets to include it in. | |||
The destination host MAY limit the rate at which it sends these | The destination host MAY limit the rate at which it sends these | |||
packets. | packets. | |||
A destination host only sets the R Flag if it wishes the source host | A destination host only sets the R-Flag if it wishes the source host | |||
to also return the discovered PMTU value for the path from the | to also return the discovered PMTU value for the path from the | |||
destination to the source. | destination to the source. | |||
The normal sequence of operation of the R-Flag using the terminology | The normal sequence of operation of the R-Flag using the terminology | |||
from the diagram in Figure 1 is: | from the diagram in Figure 1 is: | |||
1. The source sends a probe to the destination. The sender sets the | 1. The source sends a probe to the destination. The sender sets the | |||
R-Flag. | R-Flag. | |||
2. The destination responds by sending a probe including the | 2. The destination responds by sending a probe including the | |||
received Min-PMTU as the Rtn-PMTU. A destination that does not | received Min-PMTU as the Rtn-PMTU. A destination that does not | |||
wish to probe the return path sets the R-Flag to 0. | wish to probe the return path sets the R-Flag to 0. | |||
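Step 2 of the sequence above could be sketched as follows. This Python fragment is illustrative only: the function name, the dictionary keys, and the `probe_return_path` parameter are hypothetical, and the rate limiting that a destination host may apply is not modeled.

```python
from typing import Dict


def build_response_fields(received_min_pmtu: int,
                          local_link_mtu: int,
                          probe_return_path: bool) -> Dict[str, int]:
    """Fields a destination might place in the Option it sends back."""
    return {
        # The responder is itself a sender, so Min-PMTU starts at the MTU
        # of its own outgoing link.
        "min_pmtu": local_link_mtu,
        # The Min-PMTU received from the source is echoed as Rtn-PMTU.
        "rtn_pmtu": received_min_pmtu,
        # The R-Flag is set only if the responder also wants the source
        # to report the PMTU of the return path; otherwise it is 0.
        "r_flag": 1 if probe_return_path else 0,
    }
```

Leaving the R-Flag at 0 in a packet sent purely as feedback avoids each response triggering a further response.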
6.3. Transport Layer Behavior | 6.3. Transport Layer Behavior | |||
This Hop-by-Hop option is intended to be used with a path MTU | This Hop-by-Hop Option is intended to be used with a Path MTU | |||
discovery method. | Discovery method. | |||
PLPMTUD [RFC9000] uses probe packets for two distinct functions: | PLPMTUD [RFC8899] uses probe packets for two distinct functions: | |||
* Probe packets are used to confirm connectivity. Such probes can | * Probe packets are used to confirm connectivity. Such probes can | |||
be of any size up to the PLPMTU. These probe packets are sent to | be of any size up to the Packetization Layer Path MTU (PLPMTU). | |||
solicit a response use the path to the remote node. These probe | These probe packets are sent to solicit a response using the path | |||
packets can carry the Hop-by-Hop PMTU option, providing the final | to the remote node. These probe packets can carry the Hop-by-Hop | |||
size of the packet does not exceed the current PLPMTU. After | PMTU Option, providing the final size of the packet does not | |||
validating that the packet originates from the path (section | exceed the current PLPMTU. After validating that the packet | |||
4.6.1), the PLPMTUD method can use the reported size from the Hop- | originates from the path (Section 4.6.1 of [RFC8899]), the PLPMTUD | |||
by-Hop option as the next search point when it resumes the search | method can use the reported size from the Hop-by-Hop Option as the | |||
algorithm. (This use resembles the use of the PTB_SIZE | next search point when it resumes the search algorithm. (This use | |||
information in section 4.6.2 of [RFC8899] | resembles the use of the PTB_SIZE information in Section 4.6.2 of | |||
[RFC8899].) | ||||
* A second use of probe packets is to explore if a path supports a | * A second use of probe packets is to explore if a path supports a | |||
packet size greater than the current PLPMTU. If this probe packet | packet size greater than the current PLPMTU. If this probe packet | |||
is successfully delivered (as determined by the source host), then | is successfully delivered (as determined by the source host), then | |||
the PLPMTU is raised to the size of the successful probe. These | the PLPMTU is raised to the size of the successful probe. These | |||
probe packets do not usually set the Path MTU Hop-by-Hop option. | probe packets do not usually set the Path MTU Hop-by-Hop Option. | |||
See Section 1.2 of [RFC8899]. Section 4.1 of [RFC8899] also | ||||
See section 1.2 of [RFC8899]. Section 4.1 of [RFC8899] also | describes ways that a probe packet can be constructed, depending | |||
describes ways that a Probe Packet can be constructed, depending | ||||
on whether the probe packets carry application data. | on whether the probe packets carry application data. | |||
* The PMTU Hop-by-Hop Option Probe can be sent on packets that | The PMTU Hop-by-Hop Option probe can be sent on packets that include | |||
include application data, but needs to be robust to potential loss | application data but needs to be robust to potential loss of the | |||
of the packet (i.e., with the possibility that retransmission | packet (i.e., with the possibility that retransmission might be | |||
might be needed if the packet is lost). | needed if the packet is lost). | |||
* Using a PMTU Probe on packets that do not carry application data | Using a PMTU probe on packets that do not carry application data will | |||
will avoid the need for loss recovery if a router on the path | avoid the need for loss recovery if a router on the path drops | |||
drops packets that set this option. (This avoids the transport | packets that set this Option. (This avoids the transport needing to | |||
needing to retransmit a lost packet that includes this option.) | retransmit a lost packet that includes this Option.) This is the | |||
This is the normal default format for both uses of probes. | normal default format for both uses of probes. | |||
6.3.1. Including the Option in an Outgoing Packet | 6.3.1. Including the Option in an Outgoing Packet | |||
The upper layer protocol can request the MinPMTU HBH option to be | The upper-layer protocol can request the MinPMTU HBH Option to be | |||
included in an outgoing IPv6 packet. A transport protocol (or upper | included in an outgoing IPv6 packet. A transport protocol (or upper- | |||
layer protocol) can include this option only on specific packets used | layer protocol) can include this Option only on specific packets used | |||
to test the path. This option does not need to be included in all | to test the path. This Option does not need to be included in all | |||
packets belonging to a flow. | packets belonging to a flow. | |||
NOTE: Including this option in a large packet (e.g., one larger than | NOTE: Including this Option in a large packet (e.g., one larger than | |||
the present PMTU) is not likely to be useful, since the large packet | the present PMTU) is not likely to be useful, since the large packet | |||
would itself be dropped by any link along the path with a smaller | would itself be dropped by any link along the path with a smaller | |||
MTU, preventing the Min-PMTU information from reaching the | MTU, preventing the Min-PMTU information from reaching the | |||
destination host. | destination host. | |||
Discussion: | Discussion: | |||
* In the case of TCP, the option could be included in a packet that | * In the case of TCP, the Option could be included in a packet that | |||
carries a TCP segment sent after the connection is established. A | carries a TCP segment sent after the connection is established. A | |||
segment without data could be used, to avoid the need to | segment without data could be used to avoid the need to retransmit | |||
retransmit this data if the probe packet is lost. The discovered | this data if the probe packet is lost. The discovered value can | |||
value can be used to inform PLPMTUD [RFC4821]. | be used to inform PLPMTUD [RFC4821]. | |||
NOTE: A TCP SYN can also negotiate the Maximum Segment Size (MSS), | NOTE: A TCP SYN can also negotiate the Maximum Segment Size (MSS), | |||
which acts as an upper limit to the packet size that can be sent | which acts as an upper limit to the packet size that can be sent | |||
by a TCP sender. If this option were to be included in a TCP SYN, | by a TCP sender. If this Option were to be included in a TCP SYN, | |||
it could increase the probability that the SYN segment is lost | it could increase the probability that the SYN segment is lost | |||
when routers on the path drop packets with this option (see | when routers on the path drop packets with this Option (see | |||
Section 6.3.6), which could have an unwanted impact on the result | Section 6.3.6), which could have an unwanted impact on the result | |||
of racing options [I-D.ietf-taps-arch] or feature negotiation. | of racing Options [TAPS-ARCH] or feature negotiation. | |||
* The use with datagram transport protocols (e.g., UDP) is harder to | * The use with datagram transport protocols (e.g., UDP) is harder to | |||
characterize because applications using datagram transports range | characterize because applications using datagram transports range | |||
from very short-lived (low data-volume applications) exchanges, to | from very short-lived (low data-volume applications) exchanges to | |||
longer (bulk) exchanges of packets between the source and | longer (bulk) exchanges of packets between the source and | |||
destination hosts [RFC8085]. | destination hosts [RFC8085]. | |||
* Simple-exchange protocols (i.e., low data-volume applications | * Simple-exchange protocols (i.e., low data-volume applications | |||
[RFC8085] that only send one or a few packets per transaction), | [RFC8085] that only send one or a few packets per transaction) | |||
might assume that the PMTU is symmetrical. That is, the PMTU is | might assume that the PMTU is symmetrical. That is, the PMTU is | |||
the same in both directions, or at least not smaller for the | the same in both directions or at least not smaller for the return | |||
return path. This optimization does not hold when the paths are | path. This optimization does not hold when the paths are not | |||
not symmetric. | symmetric. | |||
* The MinPMTU HBH option can be used with ICMPv6 [RFC4443]. This | * The MinPMTU HBH Option can be used with ICMPv6 [RFC4443]. This | |||
requires a response from the remote node and therefore is | requires a response from the remote node and therefore is | |||
restricted to use with ICMPv6 echo messages. The MinPMTU HBH | restricted to use with ICMPv6 echo messages. The MinPMTU HBH | |||
option could provide additional information about the PMTU that | Option could provide additional information about the PMTU that | |||
might be supported by a path. This could be use as a diagnostic | might be supported by a path. This could be used as a diagnostic | |||
tool to measure the PMTU of a path. As with other uses, the | tool to measure the PMTU of a path. As with other uses, the | |||
actual supported PMTU is only confirmed after receiving a response | actual supported PMTU is only confirmed after receiving a response | |||
to a subsequent probe of the PMTU size. | to a subsequent probe of the PMTU size. | |||
* A datagram transport can utilise DPLPMTUD [RFC8899]. For example, | * A datagram transport can utilize DPLPMTUD [RFC8899]. For example, | |||
QUIC (see section 14.3 of [RFC9000]), can use DPLPMTUD to | QUIC (see Section 14.3 of [RFC9000]) can use DPLPMTUD to determine | |||
determine whether the path to a destination will support a desired | whether the path to a destination will support a desired maximum | |||
maximum datagram size. When using the IPv6 MinPMTU HBH option, | datagram size. When using the IPv6 MinPMTU HBH Option, the Option | |||
the option could be added to an additional QUIC PMTU Probe that is | could be added to an additional QUIC PMTU probe that is of minimal | |||
of minimal size (or one no larger than the currently supported | size (or one no larger than the currently supported PMTU size). | |||
PMTU size). Once the return Path MTU value in the MinPMTU HBH | Once the return Path MTU value in the MinPMTU HBH Option has been | |||
option has been learned, DPLPMTUD can be triggered to test for a | learned, DPLPMTUD can be triggered to test for a larger PLPMTU | |||
larger PLPMTU using an appropriately sized PLPMTU Probe Packet | using an appropriately sized PLPMTU probe packet (see | |||
(see section 5.3.1 of [RFC8899]). | Section 5.3.1 of [RFC8899]). | |||
* The use of this option with DNS and DNSSEC over UDP is expected to | * The use of this Option with DNS and DNSSEC over UDP is expected to | |||
work for paths where the PMTU is symmetric. The DNS server will | work for paths where the PMTU is symmetric. The DNS server will | |||
learn the PMTU from the DNS query messages. If the Rtn-PMTU value | learn the PMTU from the DNS query messages. If the Rtn-PMTU value | |||
is smaller, then a large DNSSEC response might be dropped and the | is smaller, then a large DNSSEC response might be dropped and the | |||
known problems with PMTUD will then occur. DNS and DNSSEC over | known problems with PMTUD will then occur. DNS and DNSSEC over | |||
transport protocols that can carry the PMTU ought to work. | transport protocols that can carry the PMTU ought to work. | |||
* This method also can be used with Anycast to discover the PMTU of | * This method also can be used with anycast to discover the PMTU of | |||
the path, but the use needs to be aware that the Anycast binding | the path, but the use needs to be aware that the anycast binding | |||
might change. | might change. | |||
6.3.2. Validation of the Packet that includes the Option | 6.3.2. Validation of the Packet that Includes the Option | |||
An upper layer protocol (e.g., transport endpoint) using this option | An upper-layer protocol (e.g., transport endpoint) using this Option | |||
needs to provide protection from data injection attacks by off-path | needs to provide protection from data injection attacks by off-path | |||
devices [RFC8085]. This requires a method to assure that the | devices [RFC8085]. This requires a method to assure that the | |||
information in the Option Data is provided by a node on the path. | information in the Option Data is provided by a node on the path. | |||
This validates that the packet forms a part of an existing flow, | This validates that the packet forms a part of an existing flow, | |||
using context available at the upper layer. For example, a TCP | using context available at the upper layer. For example, a TCP | |||
connection or UDP application that maintains the related state and | connection or UDP application that maintains the related state and | |||
uses a randomized ephemeral port would provide this basic validation | uses a randomized ephemeral port would provide this basic validation | |||
to protect from off-path data injection, see Section 5.1 of | to protect from off-path data injection; see Section 5.1 of | |||
[RFC8085]. IPsec [RFC4301] and TLS [RFC8446] provide greater | [RFC8085]. IPsec [RFC4301] and TLS [RFC8446] provide greater | |||
assurance. | assurance. | |||
The upper layer discards any received packet when the packet | The upper layer discards any received packet when the packet | |||
validation fails. When packet validation fails, the upper layer MUST | validation fails. When packet validation fails, the upper layer MUST | |||
also discard the associated Option Data from the MinPMTU HBH option | also discard the associated Option Data from the MinPMTU HBH Option | |||
without further processing. | without further processing. | |||
6.3.3. Receiving the Option | 6.3.3. Receiving the Option | |||
For a connection-oriented upper layer protocol, caching of the | For a connection-oriented upper-layer protocol, caching of the | |||
received Min-PMTU could be implemented by saving the value in the | received Min-PMTU could be implemented by saving the value in the | |||
connection context at the transport layer. A connection-less upper | connection context at the transport layer. A connectionless upper | |||
layer (e.g., one using UDP), requires the upper layer protocol to | layer (e.g., one using UDP) requires the upper-layer protocol to | |||
cache the value for each flow it uses. | cache the value for each flow it uses. | |||
A destination host that receives a MinPMTU HBH Option with the R-Flag | A destination host that receives a MinPMTU HBH Option with the R-Flag | |||
SHOULD include the MinPMTU HBH option in the next outgoing IPv6 | SHOULD include the MinPMTU HBH Option in the next outgoing IPv6 | |||
packet for the corresponding flow. | packet for the corresponding flow. | |||
A simple mechanism could only include this option (with the Rtn-PMTU | A simple mechanism could only include this Option (with the Rtn-PMTU | |||
field set) the first time this option is received or when it notifies | field set) the first time this Option is received or when it notifies | |||
a change in the Minimum Path MTU. This limits the number of packets | a change in the Minimum Path MTU. This limits the number of packets, | |||
including the option packets that are sent. However, this does not | including the Option packets, that are sent. However, this does not | |||
provide robustness to packet loss or recovery after a sender loses | provide robustness to packet loss or recovery after a sender loses | |||
state. | state. | |||
Discussion: | Discussion: | |||
* Some upper layer protocols send packets less frequently than the | * Some upper-layer protocols send packets less frequently than the | |||
rate at which the host receives packets. This provides less | rate at which the host receives packets. This provides less | |||
frequent feedback of the received Rtn-PMTU value. However, a host | frequent feedback of the received Rtn-PMTU value. However, a host | |||
always sends the most recent Rtn-PMTU value. | always sends the most recent Rtn-PMTU value. | |||
6.3.4. Using the Rtn-PMTU Field | 6.3.4. Using the Rtn-PMTU Field | |||
The Rtn-PMTU field provides an indication of the PMTU from on-path | The Rtn-PMTU field provides an indication of the PMTU from on-path | |||
routers. It does not necessarily reflect the actual PMTU between the | routers. It does not necessarily reflect the actual PMTU between the | |||
source and destination hosts. Care therefore needs to be exercised | source and destination hosts. Care therefore needs to be exercised | |||
in using the Rtn-PMTU value. Specifically: | in using the Rtn-PMTU value. Specifically: | |||
* The actual PMTU can be lower than the Rtn-PMTU value because the | * The actual PMTU can be lower than the Rtn-PMTU value because the | |||
Min-PMTU field was not updated by a router on the path that did | Min-PMTU field was not updated by a router on the path that did | |||
not process the option. | not process the Option. | |||
* The actual PMTU may be lower than the Rtn-PMTU value because there | * The actual PMTU may be lower than the Rtn-PMTU value because there | |||
is a layer-2 device with a lower MTU. | is a Layer 2 device with a lower MTU. | |||
* The actual PMTU may be larger than the Rtn-PMTU value because of a | * The actual PMTU may be larger than the Rtn-PMTU value because of a | |||
corrupted, delayed or mis-ordered response. A source host MUST | corrupted, delayed, or misordered response. A source host MUST | |||
ignore a Rtn-PMTU value larger than the MTU configured for the | ignore a Rtn-PMTU value larger than the MTU configured for the | |||
outgoing link. | outgoing link. | |||
* The path might have changed between the time when the probe was | * The path might have changed between the time when the probe was | |||
sent and when the Rtn-PMTU value was received. | sent and when the Rtn-PMTU value was received. | |||
IPv6 requires that every link in the Internet have an MTU of 1280 | IPv6 requires that every link in the Internet have an MTU of 1280 | |||
octets or greater. A node MUST ignore a Rtn-PMTU value less than | octets or greater. A node MUST ignore a Rtn-PMTU value less than | |||
1280 octets [RFC8200]. | 1280 octets [RFC8200]. | |||
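The two bounds stated in this section (ignore a value above the MTU of the outgoing link, and ignore a value below 1280 octets) can be combined in a small check. The Python sketch below is a hedged illustration; the function name and the choice of returning `None` for an ignored value are assumptions of this example.

```python
from typing import Optional

IPV6_MIN_LINK_MTU = 1280  # minimum link MTU required by IPv6


def usable_rtn_pmtu(rtn_pmtu: int, outgoing_link_mtu: int) -> Optional[int]:
    """Return the Rtn-PMTU if it passes both bounds, else None (ignored)."""
    if rtn_pmtu < IPV6_MIN_LINK_MTU:
        return None  # below the IPv6 minimum, including the unset value 0
    if rtn_pmtu > outgoing_link_mtu:
        return None  # larger than the MTU configured for the outgoing link
    return rtn_pmtu
```

A value that passes this check is still only an indication; as described in this section, the path is confirmed only after a probe of that size has been acknowledged.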
To avoid unintentional dropping of packets that exceed the actual | To avoid unintentional dropping of packets that exceed the actual | |||
PMTU (e.g., Scenario 3 in Section 1.1), the source host can delay | PMTU (e.g., Scenario 3 in Section 1.1), the source host can delay | |||
increasing the PMTU until a probe packet with the size of the Rtn- | increasing the PMTU until a probe packet with the size of the Rtn- | |||
PMTU value has been successfully acknowledged by the upper layer, | PMTU value has been successfully acknowledged by the upper layer, | |||
confirming that the path supports the larger PMTU. This probing | confirming that the path supports the larger PMTU. This probing | |||
increases robustness, but adds one additional path round trip time | increases robustness but adds one additional path round-trip time | |||
before the PMTU is updated. This use resembles that of PTB messages | before the PMTU is updated. This use resembles that of PTB messages | |||
in section 4.6 of DPLPMTUD [RFC8899] (with the important difference | in Section 4.6 of DPLPMTUD [RFC8899] (with the important difference | |||
that a PTB message can only seek to lower the PMTU, whereas this | being that a PTB message can only seek to lower the PMTU, whereas | |||
option could trigger a probe packet to seek to increase the PMTU.) | this Option could trigger a probe packet to seek to increase the | |||
PMTU). | ||||
Section 5.2 of [RFC8201] provides guidance on the caching of PMTU | Section 5.2 of [RFC8201] provides guidance on the caching of PMTU | |||
information and also the relation to IPv6 flow labels. | information and also the relation to IPv6 flow labels. | |||
Implementations should consider the impact of Equal Cost Multipath | Implementations should consider the impact of Equal-Cost Multipath | |||
(ECMP) [RFC6438]. Specifically, whether a PMTU ought to be | (ECMP) [RFC6438], specifically, whether a PMTU ought to be maintained | |||
maintained for each transport endpoint, or for each network address. | for each transport endpoint or for each network address. | |||
6.3.5. Detecting Path Changes | 6.3.5. Detecting Path Changes | |||
Path characteristics can change and the actual PMTU could increase or | Path characteristics can change, and the actual PMTU could increase | |||
decrease over time. For instance, following a path change when | or decrease over time, for instance, following a path change when | |||
packets are forwarded over a link with a different MTU than that | packets are forwarded over a link with a different MTU than that | |||
previously used. To bound the delay in discovering an increase in | previously used. To bound the delay in discovering an increase in | |||
the actual PMTU, a host with a link MTU larger than the current PMTU | the actual PMTU, a host with a link MTU larger than the current PMTU | |||
SHOULD periodically send the MinPMTU HBH Option with the R-Flag set. | SHOULD periodically send the MinPMTU HBH Option with the R-Flag set. | |||
DPLPMTUD provides recommendations concerning how this could be | DPLPMTUD provides recommendations concerning how this could be | |||
implemented (see Section 5.3 of [RFC8899]). Since the option | implemented (see Section 5.3 of [RFC8899]). Since the Option | |||
consumes less capacity than a full-sized probe packet, there can be | consumes less capacity than a full-sized probe packet, there can be | |||
advantage in using this to detect a change in the path | an advantage in using this to detect a change in the path | |||
characteristics. | characteristics. | |||
6.3.6. Detection of Dropping Packets that include the Option | 6.3.6. Detection of Dropping Packets that Include the Option | |||
There is evidence that some middleboxes drop packets that include | There is evidence that some middleboxes drop packets that include | |||
Hop-by-Hop options. For example, a firewall might drop a packet that | Hop-by-Hop Options. For example, a firewall might drop a packet that | |||
carries an unknown extension header or option. This practice is | carries an unknown extension header or Option. This practice is | |||
expected to decrease as an option becomes more widely used. It could | expected to decrease as an Option becomes more widely used. It could | |||
result in generation of an ICMPv6 message indicating the problem. | result in the generation of an ICMPv6 message that indicates the | |||
This could be used to (temporarily) suspend use of this option. | problem. This could be used to (temporarily) suspend use of this | |||
Option. | ||||
A middlebox that silently discards a packet with this option results | A middlebox that silently discards a packet with this Option results | |||
in dropping of any packet using the option. This dropping can be | in the dropping of any packet using the Option. This dropping can be | |||
avoided by appropriate configuration in a controlled environment, | avoided by appropriate configuration in a controlled environment, | |||
such as within a data centre, but needs to be considered for Internet | such as within a data center, but it needs to be considered for | |||
usage. Section 6.2 recommends that this option is not used on | Internet usage. Section 6.2 recommends that this Option is not used | |||
packets where loss might adversely impact performance. | on packets where loss might adversely impact performance. | |||
7. IANA Considerations | 7. IANA Considerations | |||
IANA has assigned and registered an IPv6 Hop-by-Hop Option type with | IANA has registered an IPv6 Hop-by-Hop Option type in the | |||
Temporary status from the "Destination Options and Hop-by-Hop | "Destination Options and Hop-by-Hop Options" registry within the | |||
Options" registry [IANA-HBH]. This assignment is shown in Section 5. | "Internet Protocol Version 6 (IPv6) Parameters" registry group | |||
[IANA-HBH]. This assignment is shown in Section 5. | ||||
IANA is requested to update this registry to point to this document | ||||
and remove the Temporary status. | ||||
8. Security Considerations | 8. Security Considerations | |||
This section discusses the security considerations. It first reviews | This section discusses the security considerations. It first reviews | |||
router option processing. It then reviews host processing when | router Option processing. It then reviews host processing when | |||
receiving this option at the network layer. It then considers two | receiving this Option at the network layer. It then considers two | |||
ways in which the Option Data can be processed, followed by two | ways in which the Option Data can be processed, followed by two | |||
approaches for using the Option Data. Finally, it discusses | approaches for using the Option Data. Finally, it discusses | |||
middlebox implications related to use in the general Internet. | middlebox implications related to use in the general Internet. | |||
8.1. Router Option Processing | 8.1. Router Option Processing | |||
This option shares the characteristics of all other IPv6 Hop-by-Hop | This Option shares the characteristics of all other IPv6 Hop-by-Hop | |||
Options, in that if not supported at line rate it could be used to | Options, in that, if not supported at line rate, it could be used to | |||
degrade the performance of a router. This option, while simple, is | degrade the performance of a router. This Option, while simple, is | |||
no different to other uses of IPv6 Hop-by-Hop options. | no different than other uses of IPv6 Hop-by-Hop Options. | |||
It is common for routers to ignore the Hop-by-Hop Option header or | It is common for routers to ignore the Hop-by-Hop Option header or to | |||
drop packets containing a Hop-by-Hop Option header. Routers | drop packets containing a Hop-by-Hop Option header. Routers | |||
implementing IPv6 according to [RFC8200] only examine and process the | implementing IPv6 according to [RFC8200] only examine and process the | |||
Hop-by-Hop Options header if explicitly configured to do so. | Hop-by-Hop Options header if explicitly configured to do so. | |||
8.2. Network Layer Host Processing | 8.2. Network-Layer Host Processing | |||
A malicious attacker can forge a packet directed at a host that | A malicious attacker can forge a packet directed at a host that | |||
carries the MinPMTU HBH option. By design, the fields of this IP | carries the MinPMTU HBH Option. By design, the fields of this IP | |||
option can be modified by the network. | Option can be modified by the network. | |||
For comparison with the ICMPv6 Packet Too Big message used in | For comparison with the ICMPv6 PTB message used in Path MTU Discovery | |||
[RFC8201] Path MTU Discovery: the source host has an inherent trust | [RFC8201]: the source host has an inherent trust relationship with | |||
relationship with the destination host that includes this option. This | the destination host that includes this Option. This trust | |||
trust relationship can be used to help verify the option. In contrast, | relationship can be used to help verify the Option. In contrast, | |||
ICMPv6 Packet Too Big messages are sent from any router on the path to | ICMPv6 PTB messages are sent from any router on the path to the | |||
the destination host, and the source host has no prior knowledge of | destination host, and the source host has no prior knowledge of these | |||
these routers (except for the first hop router). | routers (except for the first hop router). | |||
Reception of this packet will require processing as the network stack | Reception of this packet will require processing as the network stack | |||
parses the packet before the packet is delivered to the upper layer | parses the packet before the packet is delivered to the upper-layer | |||
protocol. This network layer option processing is normally completed | protocol. This network-layer Option processing is normally completed | |||
before any upper layer protocol delivery checks are performed. | before any upper-layer protocol delivery checks are performed. | |||
The network layer does not normally have sufficient information to | The network layer does not normally have sufficient information to | |||
validate that the packet carrying an option originated from the | validate that the packet carrying an Option originated from the | |||
destination (or an on-path node). It also does not typically have | destination (or an on-path node). It also does not typically have | |||
sufficient context to demultiplex the packet to identify the related | sufficient context to demultiplex the packet to identify the related | |||
transport flow. This can mean that any changes resulting from | transport flow. This can mean that any changes resulting from | |||
reception of the option apply to all flows between a pair of | reception of the Option apply to all flows between a pair of | |||
endpoints. | endpoints. | |||
These considerations are no different to other uses of Hop-by-Hop | These considerations are no different than other uses of Hop-by-Hop | |||
options, and this is the use case for PMTUD. The following section | Options, and this is the use case for PMTUD. The following section | |||
describes a mitigation for this attack. | describes a mitigation for this attack. | |||
8.3. Validating use of the Option Data | 8.3. Validating Use of the Option Data | |||
Transport protocols should be designed to provide protection from | Transport protocols should be designed to provide protection from | |||
data injection attacks by off-path devices and mechanisms should be | data injection attacks by off-path devices, and mechanisms should be | |||
described in the Security Considerations for each transport | described in the Security Considerations section for each transport | |||
specification (see Section 5.1 of the UDP Guidelines [RFC8085]). For | specification (see Section 5.1 of "UDP Usage Guidelines" [RFC8085]). | |||
example, a TCP or UDP application that maintains the related state | For example, a TCP or UDP application that maintains the related | |||
and uses a randomized ephemeral port would provide basic protection. | state and uses a randomized ephemeral port would provide basic | |||
TLS [RFC8446] or IPsec [RFC4301] provide cryptographic | protection. TLS [RFC8446] or IPsec [RFC4301] provide cryptographic | |||
authentication. An upper layer protocol that validates each received | authentication. An upper-layer protocol that validates each received | |||
packet discards any packet when this validation fails. In this case, | packet discards any packet when this validation fails. In this case, | |||
the host MUST also discard the associated Option Data from the | the host MUST also discard the associated Option Data from the | |||
MinPMTU HBH option without further processing (Section 6.3). | MinPMTU HBH Option without further processing (Section 6.3). | |||
A network node on the path has visibility of all packets it forwards. | A network node on the path has visibility of all packets it forwards. | |||
By observing the network packet payload, the node might be able to | By observing the network packet payload, the node might be able to | |||
construct a packet that might be validated by the destination host. | construct a packet that might be validated by the destination host. | |||
Such a node would also be able to drop or limit the flow in other | Such a node would also be able to drop or limit the flow in other | |||
ways that could be potentially more disruptive. Authenticating the | ways that could be potentially more disruptive. Authenticating the | |||
packet, for example, using IPsec [RFC4301] or TLS [RFC8446] mitigates | packet, for example, using IPsec [RFC4301] or TLS [RFC8446] mitigates | |||
this attack. Note that AH style authentication [RFC4302] while | this attack. Note that the authentication style of the | |||
authenticating the payload and outer IPv6 header, does not check Hop- | Authentication Header (AH) [RFC4302], while authenticating the | |||
by-Hop options that change on route. | payload and outer IPv6 header, does not check Hop-by-Hop Options that | |||
change on route. | ||||
8.4. Direct use of the Rtn-PMTU Value | 8.4. Direct Use of the Rtn-PMTU Value | |||
The simplest way to utilize the Rtn-PMTU value is to directly use | The simplest way to utilize the Rtn-PMTU value is to directly use | |||
this to update the PMTU. This approach results in a set of security | this to update the PMTU. This approach results in a set of security | |||
issues when the option carries malicious data: | issues when the Option carries malicious data: | |||
* A direct update of the PMTU using the Rtn-PMTU value could result | * A direct update of the PMTU using the Rtn-PMTU value could result | |||
in an attacker inflating or reducing the size of the host PMTU for | in an attacker inflating or reducing the size of the host PMTU for | |||
the destination. Forcing a reduction in the PMTU can decrease the | the destination. Forcing a reduction in the PMTU can decrease the | |||
efficiency of network use, might increase the number of packets/ | efficiency of network use, might increase the number of packets/ | |||
fragments required to send the same volume of payload data, and | fragments required to send the same volume of payload data, and | |||
prevents sending an unfragmented datagram larger than the PMTU. | can prevent sending an unfragmented datagram larger than the PMTU. | |||
Increasing the PMTU can result in black-holing (see Section 1.1 of | Increasing the PMTU can result in a path silently dropping packets | |||
[RFC8899]) when the source host sends packets larger than the | (described as a black hole in [RFC8899]) when the source host | |||
actual PMTU. This persists until the PMTU is next updated. | sends packets larger than the actual PMTU. This persists until | |||
the PMTU is next updated. | ||||
* The method can be used to solicit a response from the destination | * The method can be used to solicit a response from the destination | |||
host. A malicious attacker could forge a packet that causes the | host. A malicious attacker could forge a packet that causes the | |||
destination to add the option to a packet sent to the source host. | destination to add the Option to a packet sent to the source host. | |||
A forged value of Rtn-PMTU in the Option Data might also impact | A forged value of Rtn-PMTU in the Option Data might also impact | |||
the remote endpoint, as described in the previous bullet. This | the remote endpoint, as described in the previous bullet. This | |||
persists until a valid MinPMTU HBH option is received. This | persists until a valid MinPMTU HBH Option is received. This | |||
attack could be mitigated by limiting the sending of the MinPMTU | attack could be mitigated by limiting the sending of the MinPMTU | |||
HBH option in reply to incoming packets that carry the option. | HBH Option in reply to incoming packets that carry the Option. | |||
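The mitigation in the last bullet, limiting how often the Option is sent in reply, could be sketched with a token bucket. This is only an illustration: the class name ReplyLimiter, its parameters, and the rate and burst values are assumptions and do not come from this specification.

```python
import time


class ReplyLimiter:
    """Token bucket limiting how often a MinPMTU HBH Option is sent in reply.

    The rate and burst defaults are illustrative assumptions, not values
    taken from the specification.
    """

    def __init__(self, rate_per_sec=1.0, burst=3, clock=time.monotonic):
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)
        self.last = clock()

    def allow_reply(self):
        """Return True if a reply carrying the Option may be sent now."""
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A host would call allow_reply() before adding the Option to a packet sent in response to a received R-Flag, and simply omit the Option when it returns False.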
8.5. Using the Rtn-PMTU Value as a Hint for Probing | 8.5. Using the Rtn-PMTU Value as a Hint for Probing | |||
Another way to utilize the Rtn-PMTU value is to indirectly trigger a | Another way to utilize the Rtn-PMTU value is to indirectly trigger a | |||
probe to determine if the path supports a PMTU of size Rtn-PMTU. | probe to determine if the path supports a PMTU of size Rtn-PMTU. | |||
This approach needs context for the flow, and hence assumes an upper | This approach needs context for the flow and hence assumes an upper- | |||
layer protocol that validates the packet that carries the option (see | layer protocol that validates the packet that carries the Option (see | |||
Section 8.3). This is the case when used in combination with | Section 8.3). This is the case when used in combination with | |||
DPLPMTUD [RFC8899]. A set of security considerations result when an | DPLPMTUD [RFC8899]. A set of security considerations result when an | |||
option carries malicious data: | Option carries malicious data: | |||
* If the forged packet carries a validated option with a non-zero | * If the forged packet carries a validated Option with a non-zero | |||
Rtn-PMTU field, the upper layer protocol could utilize the | Rtn-PMTU field, the upper-layer protocol could utilize the | |||
information in the Rtn-PMTU field. A Rtn-PMTU larger than the | information in the Rtn-PMTU field. A Rtn-PMTU larger than the | |||
current PMTU can trigger a probe for a new size. | current PMTU can trigger a probe for a new size. | |||
* If the forged packet carries a non-zero Min-PMTU field, the upper | * If the forged packet carries a non-zero Min-PMTU field, the upper- | |||
layer protocol would change the cached information about the path | layer protocol would change the cached information about the path | |||
from the source. The cached information at the destination host | from the source. The cached information at the destination host | |||
will be overwritten when the host receives another packet that | will be overwritten when the host receives another packet that | |||
includes a MinPMTU HBH option corresponding to the flow. | includes a MinPMTU HBH Option corresponding to the flow. | |||
* Processing of the option could cause a destination host to add the | * Processing of the Option could cause a destination host to add the | |||
MinPMTU HBH option to a packet sent to the source host. This | MinPMTU HBH Option to a packet sent to the source host. This | |||
option will carry a Rtn-PMTU value that could have been updated by | Option will carry a Rtn-PMTU value that could have been updated by | |||
the forged packet. The impact of the source host receiving this | the forged packet. The impact of the source host receiving this | |||
resembles that discussed previously. | resembles that discussed previously. | |||
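The hint-based approach can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the function name next_probe_size and its parameters are hypothetical, a Rtn-PMTU of zero is treated as carrying no information, and values below the IPv6 minimum link MTU of 1280 bytes are ignored. It only covers the case described above, in which a Rtn-PMTU larger than the current size triggers a probe; it never updates the PMTU directly.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU required by IPv6 [RFC8200]


def next_probe_size(validated, rtn_pmtu, current_plpmtu, local_mtu):
    """Return a probe size suggested by Rtn-PMTU, or None for no probe.

    validated      -- True only if the upper layer accepted the packet
    rtn_pmtu       -- Rtn-PMTU value from the Option (0: no information)
    current_plpmtu -- packet size currently confirmed for the path
    local_mtu      -- MTU of the outgoing interface
    """
    if not validated:
        # Option Data from a packet that failed validation is discarded.
        return None
    if rtn_pmtu == 0:
        return None
    # Never probe beyond what the local interface can send.
    candidate = min(rtn_pmtu, local_mtu)
    if candidate < IPV6_MIN_MTU or candidate <= current_plpmtu:
        return None
    return candidate
```

Because the returned size is only used for a probe, a forged value costs at most one wasted probe packet; the confirmed size changes only after that probe is acknowledged.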
8.6. Impact of Middleboxes | 8.6. Impact of Middleboxes | |||
There is evidence that some middleboxes drop packets that include | There is evidence that some middleboxes drop packets that include | |||
Hop-by-Hop options. For example, a firewall might drop a packet that | Hop-by-Hop Options. For example, a firewall might drop a packet that | |||
carries an unknown extension header or option. This practice is | carries an unknown extension header or Option. This practice is | |||
expected to decrease as the option becomes more widely used. Methods | expected to decrease as the Option becomes more widely used. Methods | |||
to address this are discussed in Section 6.3.6. | to address this are discussed in Section 6.3.6. | |||
When a forged packet causes a packet to be sent including the MinPMTU | When a forged packet causes a packet that includes the MinPMTU HBH | |||
HBH option, and the return path does not forward packets with this | Option to be sent and the return path does not forward packets with | |||
option, the packet will be dropped (see Section 6.3.6). This attack is | this Option, the packet will be dropped (see Section 6.3.6). This | |||
mitigated by validating the option data before use and by limiting | attack is mitigated by validating the Option Data before use and by | |||
the rate of responses generated. An upper layer could further | limiting the rate of responses generated. An upper layer could | |||
mitigate the impact by responding to an R-Flag by including the | further mitigate the impact by responding to an R-Flag by including | |||
option in a packet that does not carry application data. | the Option in a packet that does not carry application data. | |||
9. Experiment Goals | 9. Experiment Goals | |||
This section describes the experimental goals of this specification. | This section describes the experimental goals of this specification. | |||
A successful deployment of the method depends upon several components | A successful deployment of the method depends upon several components | |||
being implemented and deployed: | being implemented and deployed: | |||
* Support in the sending node (see Section 6.2). This also requires | * Support in the sending node (see Section 6.2). This also requires | |||
corresponding support in upper layer protocols (see Section 6.3). | corresponding support in upper-layer protocols (see Section 6.3). | |||
* Router support in nodes (see Section 6.1). The IETF continues to | * Router support in nodes (see Section 6.1). The IETF continues to | |||
provide recommendations on the use of IPv6 Hop-by-Hop options, for | provide recommendations on the use of IPv6 Hop-by-Hop Options, for | |||
example Section 2.2.2 of [RFC9099]. This document does not update | example, see Section 2.2.2 of [RFC9099]. This document does not | |||
the way router implementations configure support for Hop-by-Hop | update the way router implementations configure support for Hop- | |||
options. | by-Hop Options. | |||
* Support in the receiving node (see Section 6.3.3). | * Support in the receiving node (see Section 6.3.3). | |||
Experience from deployment is an expected input to any decision to | Experience from deployment is an expected input to any decision to | |||
progress this specification from Experimental to IETF Standards | progress this specification from Experimental to IETF Standards | |||
Track. Appropriate inputs might include: | Track. Appropriate inputs might include: | |||
* Reports of implementation experience; | * reports of implementation experience, | |||
* Measurements of the number of paths where the method can be used; | * measurements of the number of paths where the method can be used, or | |||
* Measurements showing the benefit realized or the implications of | * measurements showing the benefit realized or the implications of | |||
using specific methods over specific paths. | using specific methods over specific paths. | |||
10. Implementation Status | 10. Implementation Status | |||
At the time this document was published there were two known | At the time this document was published, there were two known | |||
implementations of the Path MTU Hop-by-Hop option. These are: | implementations of the Path MTU Hop-by-Hop Option. These are: | |||
* Wireshark dissector. This is shipping in production in Wireshark | * Wireshark dissector. This is shipping in production in Wireshark | |||
version 3.2 [WIRESHARK]. | version 3.2 [WIRESHARK]. | |||
* A prototype in the open source version of the FD.io Vector Packet | * A prototype in the open source version of the FD.io Vector Packet | |||
Processing (VPP) technology [VPP]. At the time this document was | Processing (VPP) technology [VPP]. At the time this document was | |||
published, the source code could be found at [VPP_SRC]. | published, the source code could be found at [VPP_SRC]. | |||
11. Acknowledgments | 11. References | |||
Helpful comments were received from Tom Herbert, Tom Jones, Fred | ||||
Templin, Ole Troan, Tianran Zhou, Jen Linkova, Brian Carpenter, Peng | ||||
Shuping, Mark Smith, Fernando Gont, Michael Dougherty, Erik Kline, | ||||
and other members of the 6MAN working group. | ||||
12. Change log [RFC Editor: Please remove] | ||||
draft-ietf-6man-mtu-option-15, 2022-May-10 | ||||
* Correcting an editing mistake in Appendix A. | ||||
* Editorial Change. | ||||
draft-ietf-6man-mtu-option-14, 2022-April-15 | ||||
* Area Director Reviews: | ||||
- Lars Eggert's Review: Fixed "nits". | ||||
- Eric Vyncke's Review: Added that this work is focused on | ||||
Unicast, removed Discussion from Section 6.1, revised text on | ||||
PLPMTUD probing, changed SHOULD to MUST in Section 6.3.4, and | ||||
fixed several NITs. | ||||
- Alvaro Retana's Review: Changed SHOULD language to more general | ||||
text in Section 6.1 | ||||
- ARTART Review: Added new Appendix "Examples of Usage" with | ||||
diagrams showing examples of use. | ||||
- Zaheduzzaman Sarker's Review: Fixed some editorial issues, and | ||||
updated SHOULD language. | ||||
* Editorial Changes. | ||||
draft-ietf-6man-mtu-option-13, 2022-February-28 | ||||
* Area Directorate Reviews: | ||||
- SECDIR Review: Fixed "nit". | ||||
- TSVART Review: Restructured Section 6 including making | ||||
Transport Behavior more prominent, added text about ICMPv6 to | ||||
Section 6.3.1, moved the text about prior work in RFC1063 to | ||||
Section 2. | ||||
- GENART Review: Added text to Section 1 that this option was | ||||
designed to work with packet sizes that can be specified in the | ||||
IPv6 Header. | ||||
* Editorial Changes. | ||||
draft-ietf-6man-mtu-option-12, 2022-January-26 | ||||
* Clarified a few issues raised by the AD review by Erik Kline. | ||||
draft-ietf-6man-mtu-option-11, 2021-September-30 | ||||
* Clarifications and editorial changes to the Security | ||||
Considerations section based on early AD review by Erik Kline. | ||||
draft-ietf-6man-mtu-option-10, 2021-September-27 | ||||
* Clarifications and editorial changes based on second chair review | ||||
by Ole Troan. | ||||
* Editorial changes. | ||||
draft-ietf-6man-mtu-option-09, 2021-September-23 | ||||
* Clarifications and editorial changes based on review by Michael | ||||
Dougherty. | ||||
draft-ietf-6man-mtu-option-08, 2021-September-7 | ||||
* Clarifications and editorial changes based on chair review by Ole | ||||
Troan. | ||||
* Correction and clarifications based on review by Fernando Gont. | ||||
draft-ietf-6man-mtu-option-07, 2021-August-31 | ||||
* Added Experiment Goals section. | ||||
* Added Implementation Status section. | ||||
* Updated the IANA Considerations section to point to this document | ||||
and remove Temporary status. | ||||
* Clarifications and editorial changes based on review by Mark | ||||
Smith. | ||||
draft-ietf-6man-mtu-option-06, 2021-August-7 | ||||
* Transport usage of the mechanism clarified in response to feedback | ||||
and suggestions from Jen Linkova. | ||||
* Restructured Section 6 to improve readability. | ||||
* Editorial changes. | ||||
draft-ietf-6man-mtu-option-05, 2021-April-28 | ||||
* Editorial changes. | ||||
draft-ietf-6man-mtu-option-04, 2020-Oct-23 | ||||
* Fixes for typos. | ||||
draft-ietf-6man-mtu-option-03, 2020-Sept-14 | ||||
* Rewrite to make text and terminology more consistent. | ||||
* Added the notion of validating the packet before use of the HBH | ||||
option data. | ||||
* Method aligned with the way common APIs send/receive HBH option | ||||
data. | ||||
* Added reference to DPLPMTUD and clarified upper layer usage. | ||||
* Completed security considerations section. | ||||
draft-ietf-6man-mtu-option-02, 2020-March-9 | ||||
* Editorial changes to make text and terminology more consistent. | ||||
* Added reference to DPLPMTUD. | ||||
draft-ietf-6man-mtu-option-01, 2019-September-13 | ||||
* Changes to show IANA assigned code point. | ||||
* Editorial changes to make text and terminology more consistent. | ||||
* Added a reference to RFC8200 in Section 2 and a reference to | ||||
RFC6438 in Section 6.3. | ||||
draft-ietf-6man-mtu-option-00, 2019-August-9 | ||||
* First 6man w.g. draft version. | ||||
* Changes to request IANA allocation of code point. | ||||
* Editorial changes. | ||||
draft-hinden-6man-mtu-option-02, 2019-July-5 | ||||
* Changed option format to also include the Returned PMTU value and | ||||
Return flag and made related text changes in Section 6.2 to | ||||
describe this behavior. | ||||
* ICMPv6 Packet Too Big messages are no longer used for feedback to | ||||
the source host. | ||||
* Added to Acknowledgements Section that a similar mechanism was | ||||
proposed for IPv4 in 1988 in [RFC1063]. | ||||
* Editorial changes. | ||||
draft-hinden-6man-mtu-option-01, 2019-March-05 | ||||
* Changed requested status from Standards Track to Experimental to | ||||
allow use of experimental option type (11110) to allow for | ||||
experimentation. Removed request for IANA Option assignment. | ||||
* Added Section 2 "Motivation and Problem Solved" section to better | ||||
describe what the purpose of this document is. | ||||
* Added appendix describing planned experiments and how the results | ||||
will be measured. | ||||
* Editorial changes. | ||||
draft-hinden-6man-mtu-option-00, 2018-Oct-16 | ||||
* Initial draft. | ||||
13. References | ||||
13.1. Normative References | 11.1. Normative References | |||
[IANA-HBH] "Destination Options and Hop-by-Hop Options", | [IANA-HBH] IANA, "Destination Options and Hop-by-Hop Options", | |||
<https://www.iana.org/assignments/ipv6-parameters/ | <https://www.iana.org/assignments/ipv6-parameters/>. | |||
ipv6-parameters.xhtml#ipv6-parameters-2>. | ||||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
(IPv6) Specification", STD 86, RFC 8200, | (IPv6) Specification", STD 86, RFC 8200, | |||
DOI 10.17487/RFC8200, July 2017, | DOI 10.17487/RFC8200, July 2017, | |||
<https://www.rfc-editor.org/info/rfc8200>. | <https://www.rfc-editor.org/info/rfc8200>. | |||
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., | |||
"Path MTU Discovery for IP version 6", STD 87, RFC 8201, | "Path MTU Discovery for IP version 6", STD 87, RFC 8201, | |||
DOI 10.17487/RFC8201, July 2017, | DOI 10.17487/RFC8201, July 2017, | |||
<https://www.rfc-editor.org/info/rfc8201>. | <https://www.rfc-editor.org/info/rfc8201>. | |||
13.2. Informative References | 11.2. Informative References | |||
[I-D.ietf-taps-arch] | ||||
Pauly, T., Trammell, B., Brunstrom, A., Fairhurst, G., and | ||||
C. Perkins, "An Architecture for Transport Services", Work | ||||
in Progress, Internet-Draft, draft-ietf-taps-arch-12, 3 | ||||
January 2022, <https://datatracker.ietf.org/doc/html/ | ||||
draft-ietf-taps-arch-12>. | ||||
[RFC1063] Mogul, J., Kent, C., Partridge, C., and K. McCloghrie, "IP | [RFC1063] Mogul, J C., Kent, C A., Partridge, C., and K. McCloghrie, | |||
MTU discovery options", RFC 1063, DOI 10.17487/RFC1063, | "IP MTU discovery options", RFC 1063, | |||
July 1988, <https://www.rfc-editor.org/info/rfc1063>. | DOI 10.17487/RFC1063, July 1988, | |||
<https://www.rfc-editor.org/info/rfc1063>. | ||||
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, | [RFC1191] Mogul, J C. and S E. Deering, "Path MTU discovery", | |||
DOI 10.17487/RFC1191, November 1990, | RFC 1191, DOI 10.17487/RFC1191, November 1990, | |||
<https://www.rfc-editor.org/info/rfc1191>. | <https://www.rfc-editor.org/info/rfc1191>. | |||
[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
(IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | |||
December 1998, <https://www.rfc-editor.org/info/rfc2460>. | December 1998, <https://www.rfc-editor.org/info/rfc2460>. | |||
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | |||
December 2005, <https://www.rfc-editor.org/info/rfc4301>. | December 2005, <https://www.rfc-editor.org/info/rfc4301>. | |||
skipping to change at page 24, line 10 ¶ | skipping to change at line 930 ¶ | |||
[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
Multiplexed and Secure Transport", RFC 9000, | Multiplexed and Secure Transport", RFC 9000, | |||
DOI 10.17487/RFC9000, May 2021, | DOI 10.17487/RFC9000, May 2021, | |||
<https://www.rfc-editor.org/info/rfc9000>. | <https://www.rfc-editor.org/info/rfc9000>. | |||
[RFC9099] Vyncke, É., Chittimaneni, K., Kaeo, M., and E. Rey, | [RFC9099] Vyncke, É., Chittimaneni, K., Kaeo, M., and E. Rey, | |||
"Operational Security Considerations for IPv6 Networks", | "Operational Security Considerations for IPv6 Networks", | |||
RFC 9099, DOI 10.17487/RFC9099, August 2021, | RFC 9099, DOI 10.17487/RFC9099, August 2021, | |||
<https://www.rfc-editor.org/info/rfc9099>. | <https://www.rfc-editor.org/info/rfc9099>. | |||
[VPP] "VPP/What is VPP?", | [TAPS-ARCH] | |||
Pauly, T., Ed., Trammell, B., Ed., Brunstrom, A., | ||||
Fairhurst, G., and C. Perkins, "An Architecture for | ||||
Transport Services", Work in Progress, Internet-Draft, | ||||
draft-ietf-taps-arch-12, June 2022, | ||||
<https://datatracker.ietf.org/doc/bibxml3/draft-ietf-taps- | ||||
arch.xml>. | ||||
[VPP] FD.io, "VPP/What is VPP?", | ||||
<https://wiki.fd.io/view/VPP/What_is_VPP%3F>. | <https://wiki.fd.io/view/VPP/What_is_VPP%3F>. | |||
[VPP_SRC] "VPP Source", <https://gerrit.fd.io/r/c/vpp/+/21948>. | [VPP_SRC] "vpp", commit 21948, ip: HBH MTU recording for IPv6, | |||
<https://gerrit.fd.io/r/c/vpp/+/21948>. | ||||
[WIRESHARK] | [WIRESHARK] | |||
"Wireshark Network Protocol Analyzer", | "Wireshark Network Protocol Analyzer", | |||
<https://www.wireshark.org>. | <https://www.wireshark.org>. | |||
Appendix A. Examples of Usage | Appendix A. Examples of Usage | |||
This section provides examples that illustrate a use of the MinPMTU | This section provides examples that illustrate a use of the MinPMTU | |||
HBH option by a source using DPLPMTUD to discover the PLPMTU | HBH Option by a source using DPLPMTUD to discover the PLPMTU | |||
supported by a path. They consider a path where the on-path router | supported by a path. They consider a path where the on-path router | |||
has been configured with an outgoing MTU of d'. The source starts by | has been configured with an outgoing MTU of d'. The source starts by | |||
transmission of packets of size a, and then uses DPLPMTUD to seek to | transmission of packets of size a and then uses DPLPMTUD to seek to | |||
increase the size in steps resulting in sizes of b,c,d,e, etc., | increase the size in steps resulting in sizes of b, c, d, e, etc. | |||
(chosen by the search algorithm used by DPLPMTUD). The search | (chosen by the search algorithm used by DPLPMTUD). The search | |||
algorithm terminates with a PLPMTU that is at least d and is less | algorithm terminates with a PLPMTU that is at least d and is less | |||
than or equal to d'. | than or equal to d'. | |||
The first example considers DPLPMTUD without using the MinPMTU HBH | The first example considers DPLPMTUD without using the MinPMTU HBH | |||
option. In this case, DPLPMTUD searches using an increasing size of | Option. In this case, DPLPMTUD searches using a probe packet that | |||
probe packet. Probe packets of size (e) are sent, which are larger | increases in size. Probe packets of size e are sent, which are | |||
than the actual PMTU. In this example, PTB messages are not received | larger than the actual PMTU. In this example, PTB messages are not | |||
from the routers and repeated unsuccessful probes result in the | received from the routers, and repeated unsuccessful probes result in | |||
search phase completing. Packets of data are never sent with a size | the search phase completing. Packets of data are never sent with a | |||
larger than the size of the last confirmed probe packet. ACKs of | size larger than the size of the last confirmed probe packet. | |||
data packets are not shown. | Acknowledgments (ACKs) of data packets are not shown. | |||
----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
----Probe size (b) --------------------------------------> | ----Probe size b ----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (b) ----------------------------> | ----Packets of data size b ------------------------------> | |||
----Probe size (c) --------------------------------------> | ----Probe size c ----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (c) ----------------------------> | ----Packets of data size c ------------------------------> | |||
----Probe size (d) --------------------------------------> | ----Probe size d ----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
... | ... | |||
----Probe size (e) ------------X | ----Probe size e --------------X | |||
X----ICMPv6 PTB (d') --| | X----ICMPv6 PTB d' ----| | |||
----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
----Probe size (e) ------------X (again) | ----Probe size e --------------X (again) | |||
X----ICMPv6 PTB (d') --| | X----ICMPv6 PTB d' ----| | |||
----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
... | ... | |||
etc, until MaxProbes are unsuccessful and search phase completes. | etc. until MaxProbes are unsuccessful and search phase completes. | |||
----Packets of data size (d) ----------------------------> | ----Packets of data size d ------------------------------> | |||
Figure 4 | Figure 3 | |||
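The search behaviour in this first example can be approximated by the following simulation. It is a simplified sketch, not the DPLPMTUD state machine of [RFC8899]: the function name search_plpmtu is hypothetical, probes are treated as lost whenever they exceed the actual PMTU, PTB messages are ignored, and the default of 3 unsuccessful probes is an assumption made for illustration.

```python
def search_plpmtu(sizes, path_mtu, max_probes=3):
    """Simulate a simple increasing-size probe search.

    sizes      -- probe sizes in increasing order (a, b, c, d, e, ...)
    path_mtu   -- actual PMTU (d' in the text); larger probes are lost
    max_probes -- unsuccessful probes of one size before the search ends
    Returns (plpmtu, probes_sent).
    """
    plpmtu = sizes[0]  # size a is assumed to work already
    probes_sent = 0
    for size in sizes[1:]:
        for _ in range(max_probes):
            probes_sent += 1
            if size <= path_mtu:
                plpmtu = size  # probe acknowledged, size confirmed
                break
        else:
            # Every probe of this size was lost: the search phase completes.
            return plpmtu, probes_sent
    return plpmtu, probes_sent
```

With sizes a to e of 1280, 1350, 1400, 1450, and 1500 bytes and d' = 1480, the search confirms d = 1450 after six probes, three of which are the lost probes of size e.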
The second example considers DPLPMTUD with the MinPMTU HBH option set | The second example considers DPLPMTUD with the MinPMTU HBH Option set | |||
on a connectivity probe packet. | on a connectivity probe packet. | |||
The IPv6 option is sent end-to-end, and the Min-PMTU is updated by a | The IPv6 Option is sent end to end, and the Min-PMTU is updated by a | |||
router on the path to d', which is returned in a response that also | router on the path to d', which is returned in a response that also | |||
sets the MinPMTU HBH option. Upon receiving Rtn-PMTU value is | sets the MinPMTU HBH Option. Upon receiving the Rtn-PMTU value, | |||
received, DPLPMTUD immediately sends a probe packet of the target | DPLPMTUD immediately sends a probe packet of the target size d'. If | |||
size (d'). If the probe packet is confirmed for the path, the PLPMTU | the probe packet is confirmed for the path, the PLPMTU is updated, | |||
is updated, allowing the source to use data packets up to size d'. | allowing the source to use data packets up to size d'. (The search | |||
(The search algorithm is allowed to continue to probe to see if the | algorithm is allowed to continue to probe to see if the path supports | |||
path supports a larger size.) Packets of data are never sent with a | a larger size.) Packets of data are never sent with a size larger | |||
size larger than the last confirmed probe size, d'. | than the last confirmed probe size d'. | |||
----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
----Connectivity probe with MinPMTU- | ----Connectivity probe with MinPMTU- | |||
+--updated to minPMTU=d'-----> | +--updated to minPMTU=d'-----> | |||
<-----------------ACK with Rtn-PMTU=d'-------------------- | <-----------------ACK with Rtn-PMTU=d'-------------------- | |||
----Packets of data size (a) ----------------------------> | ----Packets of data size a ------------------------------> | |||
----Probe size (d') -------------------------------------> | ----Probe size d' ---------------------------------------> | |||
<---------------------------------- ACK of probe --------- | <---------------------------------- ACK of probe --------- | |||
-----Packets of data size (d') --------------------------> | -----Packets of data size d' ----------------------------> | |||
Search phase completes. | Search phase completes. | |||
-----Packets of data size (d') --------------------------> | -----Packets of data size d' ----------------------------> | |||
Figure 5 | ||||
The final example considers DPLPMTUD with the MinPMTU HBH option set | Figure 4 | |||
on a connectivity probe packet, but shows the effect when this | ||||
The final example considers DPLPMTUD with the MinPMTU HBH Option set | ||||
on a connectivity probe packet but shows the effect when this | ||||
connectivity probe packet is dropped. | connectivity probe packet is dropped. | |||
In this case, the packet with the MinPMTU HBH option is not received. | In this case, the packet with the MinPMTU HBH Option is not received. | |||
DPLPMTUD searches using probe packets of increasing size, increasing | DPLPMTUD searches using probe packets of increasing size, increasing | |||
the PLPMTU when the probes are confirmed. An ICMPv6 PTB message is | the PLPMTU when the probes are confirmed. An ICMPv6 PTB message is | |||
received when the probed size exceeds the actual PMTU, indicating a | received when the probed size exceeds the actual PMTU, indicating a | |||
PTB_SIZE of d'. DPLPMTUD immediately sends a probe packet of the | PTB_SIZE of d'. DPLPMTUD immediately sends a probe packet of the | |||
target size (d'). If the probe packet is confirmed for the path, the | target size d'. If the probe packet is confirmed for the path, the | |||
PLPMTU is updated, allowing the source to use data packets up to size | PLPMTU is updated, allowing the source to use data packets up to size | |||
d'. If the ICMPv6 PTB message is not received, the DPLPMTU will be | d'. If the ICMPv6 PTB message is not received, the DPLPMTU will be | |||
the last confirmed probe size, d. | the last confirmed probe size, which is d. | |||
----Packets of data size (a) -----------------------------> | ----Packets of data size a -------------------------------> | |||
----Connectivity probe with MinPMTU --------X | ----Connectivity probe with MinPMTU --------X | |||
----Packets of data size (a) -----------------------------> | ----Packets of data size a -------------------------------> | |||
----Probe size (b) ---------------------------------------> | ----Probe size b -----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (b) -----------------------------> | ----Packets of data size b -------------------------------> | |||
----Probe size (c) ---------------------------------------> | ----Probe size c -----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (c) -----------------------------> | ----Packets of data size c -------------------------------> | |||
----Probe size (d) ---------------------------------------> | ----Probe size d -----------------------------------------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
----Packets of data size (d) -----------------------------> | ----Packets of data size d -------------------------------> | |||
----Probe size (e) ----------X | ----Probe size e ------------X | |||
<--ICMPv6 PTB PTB_SIZE(d') -| | <--ICMPv6 PTB PTB_SIZE d' --| | |||
----Packets of data size (d) -----------------------------> | ----Packets of data size d -------------------------------> | |||
----Probe size (d') using target set by PTB_SIZE ---------> | ----Probe size d' using target set by PTB_SIZE -----------> | |||
<---------------------------------- ACK of probe -------- | <---------------------------------- ACK of probe -------- | |||
Search phase completes. | Search phase completes. | |||
----Packets of data size (d') ----------------------------> | ----Packets of data size d' ------------------------------> | |||
Figure 6 | Figure 5 | |||
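The last two examples share one rule: a reported size (Rtn-PMTU from the Option, or PTB_SIZE from an ICMPv6 PTB message) only selects the next probe target, and the PLPMTU changes only once that probe is confirmed. The sketch below illustrates that rule under simplifying assumptions. The class and method names are hypothetical, and hints that do not exceed the current PLPMTU are simply ignored here, which is a simplification of what a full DPLPMTUD implementation does.

```python
class ProbeState:
    """Tracks the confirmed PLPMTU and the pending probe target."""

    def __init__(self, plpmtu):
        self.plpmtu = plpmtu   # last confirmed probe size
        self.target = None     # size of the probe awaiting confirmation

    def on_hint(self, hinted_size):
        """Handle a Rtn-PMTU or PTB_SIZE value; return the probe target.

        The hint is not trusted directly: it only chooses what to probe.
        """
        if hinted_size > self.plpmtu:
            self.target = hinted_size
        return self.target

    def on_probe_ack(self, size):
        """Confirm a probe; only now may data packets grow to this size."""
        if size == self.target:
            self.plpmtu = max(self.plpmtu, size)
            self.target = None

    def max_data_size(self):
        """Data is never sent larger than the last confirmed probe size."""
        return self.plpmtu
```

For example, starting from `ProbeState(1280)`, calling `on_hint(1480)` sets the probe target to 1480 while `max_data_size()` remains 1280; only after `on_probe_ack(1480)` does `max_data_size()` become 1480.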
The number of probe rounds depends on the number of steps needed by | The number of probe rounds depends on the number of steps needed by | |||
the search algorithm, and is typically larger for a larger PMTU. | the search algorithm and is typically larger for a larger PMTU. | |||
Acknowledgments | ||||
Helpful comments were received from Tom Herbert, Tom Jones, Fred | ||||
Templin, Ole Troan, Tianran Zhou, Jen Linkova, Brian Carpenter, Peng | ||||
Shuping, Mark Smith, Fernando Gont, Michael Dougherty, Erik Kline, | ||||
and other members of the 6MAN Working Group. | ||||
Authors' Addresses | Authors' Addresses | |||
Robert M. Hinden | Robert M. Hinden | |||
Check Point Software | Check Point Software | |||
959 Skyway Road | 959 Skyway Road | |||
San Carlos, CA 94070 | San Carlos, CA 94070 | |||
United States of America | United States of America | |||
Email: bob.hinden@gmail.com | Email: bob.hinden@gmail.com | |||
End of changes. 171 change blocks. | ||||
596 lines changed or deleted | 464 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |