<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE rfc [ <!ENTITY nbsp "&#160;"> <!ENTITY zwsp "&#8203;"> <!ENTITY nbhy "&#8209;"> <!ENTITY wj "&#8288;"> ]> <rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" submissionType="IETF" category="std" consensus="true" docName="draft-ietf-ipsecme-iptfs-19" number="9347" obsoletes="" updates="" xml:lang="en" tocInclude="true" symRefs="true" sortRefs="true" version="3"> <!-- xml2rfc v2v3 conversion 3.14.2 --> <front> <title abbrev="IP Traffic Flow Security">Aggregation and Fragmentation Mode for Encapsulating Security Payload (ESP) and Its Use for IP Traffic Flow Security (IP-TFS)</title> <seriesInfo name="RFC" value="9347"/> <author initials="C." surname="Hopps" fullname="Christian Hopps"> <organization>LabN Consulting, L.L.C.</organization> <address> <email>chopps@chopps.org</email> </address> </author> <date year="2023" month="January"/> <area>sec</area> <workgroup>ipsecme</workgroup> <abstract> <t>This document describes a mechanism for aggregation and fragmentation of IP packets when they are being encapsulated in Encapsulating Security Payload (ESP). This new payload type can be used for various purposes, such as decreasing encapsulation overhead for small IP packets; however, the focus in this document is to enhance IP Traffic Flow Security (IP-TFS) by adding Traffic Flow Confidentiality (TFC) to encrypted IP-encapsulated traffic.
TFC is provided by obscuring the size and frequency of IP traffic using a fixed-size, constant-send-rate IPsec tunnel. The solution allows for congestion control, as well as nonconstant send-rate usage.</t> </abstract> </front> <middle> <section anchor="sec-introduction" numbered="true" toc="default"> <name>Introduction</name> <t>Traffic analysis <xref target="RFC4301" format="default"/> <xref target="AppCrypt" format="default"/> is the act of extracting information about data being sent through a network. While encryption <xref target="RFC4303" format="default"/> directly obscures the data, the patterns in the message traffic may still expose information due to variations in its shape and timing <xref target="RFC8546" format="default"/> <xref target="AppCrypt" format="default"/>. Hiding the size and frequency of traffic is referred to as Traffic Flow Confidentiality (TFC), per <xref target="RFC4303" format="default"/>.</t> <t><xref target="RFC4303" format="default"/> provides for TFC by allowing padding to be added to encrypted IP packets and allowing for transmission of all-pad packets (indicated using protocol 59). This method has the major limitation that it can significantly underutilize the available bandwidth.</t> <t>This document defines an aggregation and fragmentation (AGGFRAG) mode for ESP, as well as ESP's use for IP Traffic Flow Security (IP-TFS). This solution provides for full TFC without the aforementioned bandwidth limitation.
This is accomplished by using a constant-send-rate IPsec <xref target="RFC4303" format="default"/> tunnel with fixed-size encapsulating packets; however, these fixed-size packets can contain partial, whole, or multiple IP packets to maximize the bandwidth of the tunnel. A nonconstant send rate is allowed, but the confidentiality properties of its use are outside the scope of this document.</t> <t>For a comparison of the overhead of IP-TFS with the TFC solution prescribed in <xref target="RFC4303" format="default"/>, see <xref target="sec-comparisons-of-ip-tfs" format="default"/>.</t> <t>Additionally, IP-TFS provides for operating fairly within congested networks <xref target="RFC2914" format="default"/>. This is important when the IP-TFS user is not in full control of the domain through which the IP-TFS tunnel path flows.</t> <t>The mechanisms, such as the AGGFRAG mode, defined in this document are generic with the intent of allowing for non-TFS uses, but such uses are outside the scope of this document.</t> <section numbered="true" toc="default"> <name>Terminology &amp; Concepts</name> <t> The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as described in BCP&nbsp;14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.
</t> <t>This document assumes familiarity with IP security concepts, including TFC, as described in <xref target="RFC4301" format="default"/>.</t> </section> </section> <section numbered="true" toc="default"> <name>The AGGFRAG Tunnel</name> <t>As mentioned in <xref target="sec-introduction" format="default"/>, the AGGFRAG mode utilizes an IPsec <xref target="RFC4303" format="default"/> tunnel as its transport. For the purpose of IP-TFS, fixed-size encapsulating packets are sent at a constant rate on the AGGFRAG tunnel.</t> <t>The primary input to the tunnel algorithm is the requested bandwidth to be used by the tunnel. Two values are then required to provide for this bandwidth use: the fixed size of the encapsulating packets and the rate at which to send them.</t> <t>The fixed packet size <bcp14>MAY</bcp14> either be specified manually or be determined through other methods, such as Packetization Layer MTU Discovery (PLMTUD) <xref target="RFC4821" format="default"/> <xref target="RFC8899" format="default"/> or Path MTU Discovery (PMTUD) <xref target="RFC1191" format="default"/> <xref target="RFC8201" format="default"/>. PMTUD is known to have issues, so PLMTUD is considered the more robust option. For PLMTUD, congestion control payloads can be used as in-band probes (see <xref target="sec-congestion-control-aggfrag-payload-payload-format" format="default"/> and <xref target="RFC8899" format="default"/>).</t> <t>Given the encapsulating packet size and the requested bandwidth to be used, the corresponding packet send rate can be calculated.
The packet send rate is the requested bandwidth divided by the size of the encapsulating packet.</t> <t>The egress (receiving) side of the AGGFRAG tunnel <bcp14>MUST</bcp14> allow for and expect the ingress (sending) side of the AGGFRAG tunnel to vary the size and rate of sent encapsulating packets, unless constrained by other policy.</t> <section numbered="true" toc="default"> <name>Tunnel Content</name> <t>As previously mentioned, one issue with the TFC padding solution in <xref target="RFC4303" format="default"/> is the large amount of wasted bandwidth, as only one IP packet can be sent per encapsulating packet. In order to maximize bandwidth, IP-TFS breaks this one-to-one association by introducing an AGGFRAG mode for ESP.</t> <t>The AGGFRAG mode aggregates and fragments the inner IP traffic flow into encapsulating IPsec tunnel packets. For IP-TFS, the IPsec encapsulating tunnel packets are a fixed size. Padding is only added to the tunnel packets if there is no data available to be sent at the time of tunnel packet transmission or if fragmentation has been disabled by the receiver.</t> <t>This is accomplished using a new Encapsulating Security Payload (ESP) <xref target="RFC4303" format="default"/> Next Header field value AGGFRAG_PAYLOAD (<xref target="sec-aggfrag-payload-payload" format="default"/>).</t> <t>Other non-IP-TFS uses of this AGGFRAG mode have been suggested, such as increased performance through packet aggregation, as well as handling MTU issues using fragmentation.
These uses are not defined here but are also not restricted by this document.</t> </section> <section numbered="true" toc="default"> <name>Payload Content</name> <t>The AGGFRAG_PAYLOAD payload content defined in this document consists of a 4- or 24-octet header, followed by either a partial data block, a full data block, or multiple partial or full data blocks. The following diagram illustrates this payload within the ESP packet. See <xref target="sec-aggfrag-payload-payload" format="default"/> for the exact formats of the AGGFRAG_PAYLOAD payload.</t> <figure anchor="sec-layout-of-an-aggfrag-mode-ipsec-packet"> <name>Layout of an AGGFRAG Mode IPsec Packet</name> <artwork name="" type="" align="left" alt=""><![CDATA[
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. Outer Encapsulating Header ...                                .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. ESP Header...                                                 .
+---------------------------------------------------------------+
|    [AGGFRAG sub-type/flags]   :          BlockOffset          |
+---------------------------------------------------------------+
:                  [Optional Congestion Info]                   :
+---------------------------------------------------------------+
|       DataBlocks ...                                          ~
~                                                               ~
~                                                               |
+---------------------------------------------------------------|
. ESP Trailer...                                                .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
]]></artwork> </figure> <t>The <tt>BlockOffset</tt> value is either zero or some offset into or past the end of the <tt>DataBlocks</tt> data.</t> <t>If the <tt>BlockOffset</tt> value is zero, it means that the <tt>DataBlocks</tt> data begins with a new data block.</t> <t>Conversely, if the <tt>BlockOffset</tt> value is non-zero, it points to the start of the new data block, and the initial <tt>DataBlocks</tt> data belongs to the data block that is still being reassembled.</t> <t>If the <tt>BlockOffset</tt> points past the end of the <tt>DataBlocks</tt> data, then the next data block occurs in a subsequent encapsulating packet.</t> <t>Having the <tt>BlockOffset</tt> always point at the next available data block allows for recovering the next inner packet in the presence of outer encapsulating packet loss.</t> <t>An example AGGFRAG mode packet flow can be found in <xref target="sec-example-of-an-encapsulated-ip-packet-flow" format="default"/>.</t> <section numbered="true" toc="default"> <name>Data Blocks</name> <figure anchor="sec-layout-of-a-datablock"> <name>Layout of a Data Block</name> <artwork name="" type="" align="left" alt=""><![CDATA[
+---------------------------------------------------------------+
| Type  | rest of IPv4, IPv6, or pad...
+--------
]]></artwork> </figure> <t>A data block is defined by a 4-bit type code, followed by the data block data.
The type values have been carefully chosen to coincide with the IPv4/IPv6 version field values so that no per-data-block type overhead is required to encapsulate an IP packet. Likewise, the length of the data block is extracted from the encapsulated IPv4's <tt>Total Length</tt> or IPv6's <tt>Payload Length</tt> fields.</t> </section> <section numbered="true" toc="default"> <name>End Padding</name> <t>Since a data block's type is identified in its first 4 bits, the only time padding is required is when there is no data to encapsulate. For this end padding, a <tt>Pad Data Block</tt> is used.</t> </section> <section anchor="sec-fragmentation-sequence-numbers-and-all-pad-payloads" numbered="true" toc="default"> <name>Fragmentation, Sequence Numbers, and All-Pad Payloads</name> <t>In order for a receiver to reassemble fragmented inner packets, the sender <bcp14>MUST</bcp14> send the inner packet fragments back to back in the logical outer packet stream (i.e., using consecutive ESP sequence numbers). However, the sender is allowed to insert "all-pad" payloads (i.e., payloads with a <tt>BlockOffset</tt> of zero and a single pad data block) in between the packets carrying the inner packet fragment payloads.
This interleaving of all-pad payloads allows the sender to always send a tunnel packet, regardless of the encapsulation computational requirements.</t> <t>When a receiver is reassembling an inner packet and receives an "all-pad" payload, it increments the sequence number in which the next inner packet fragment is expected to arrive.</t> <t>Given the above, the receiver will need to handle out-of-order arrival of outer ESP packets prior to reassembly processing. ESP already provides for optionally detecting replay attacks. Detecting replay attacks normally utilizes a window method. A similar sequence-number-based sliding window can be used to correct reordering of the outer packet stream. Receiving a larger (newer) sequence number packet advances the window, and if any older ESP packets whose sequence numbers the window has passed by are received, then the packets are dropped. A good choice for the size of this window depends on the amount of misordering the user is experiencing; however, a value of 3 has been suggested as a default when no more informed choice exists.</t> <t>As the amount of misordering that may be present is hard to predict, the window size <bcp14>SHOULD</bcp14> be configurable by the user. Implementations <bcp14>MAY</bcp14> also dynamically adjust the reordering window based on actual misordering seen in arriving packets.</t> <t>Note that when IP-TFS sends a continuous stream of packets, there is no requirement for an explicit lost packet timer; however, using a lost packet timer is <bcp14>RECOMMENDED</bcp14>. If an implementation does not use a lost packet timer and only considers an outer packet lost when the reorder window moves by it, the inner traffic can be delayed by up to the reorder window size times the per-packet send interval. This delay could be significant for slower send rates or when larger reorder window sizes are in use.
As the lost packet timer affects the delay of inner packet delivery, an implementation or user could choose to set it proportionate to the tunnel rate.</t> <t>While ESP guarantees an increasing sequence number with subsequently sent packets, it does not actually require the sequence numbers to be generated consecutively (e.g., sending only even-numbered sequence numbers would be allowed, as long as they are always increasing). Gaps in the sequence numbers will not work with the mechanisms in this document, so the sequence number stream <bcp14>MUST</bcp14> increase monotonically by 1 for each subsequent packet.</t> <t>When using the AGGFRAG_PAYLOAD in conjunction with replay detection, the window size for both <bcp14>MAY</bcp14> be reduced to the smaller of the two window sizes. This is because packets outside of the smaller window but inside the larger window would still be dropped by the mechanism with the smaller window size. However, there is also no requirement to make these values the same. Indeed, in some cases, such as slow tunnels where a very small or zero reorder window size is appropriate, the user may still want a large replay detection window to log replayed packets. Additionally, large replay windows can be implemented with very little overhead, compared to large reorder windows.</t> <t>Finally, as sequence numbers are reset when switching Security Associations (SAs) (e.g., when rekeying a Child SA), senders <bcp14>MUST NOT</bcp14> send initial fragments of an inner packet using one SA and subsequent fragments in a different SA.</t> <aside> <t>A note on <tt>BlockOffset</tt> values: Senders <bcp14>MUST</bcp14> encode the <tt>BlockOffset</tt> consistently with the immediately preceding non-all-pad payload packet.
Specifically, if the immediately preceding non-all-pad payload packet ended with a Pad Data Block, this <tt>BlockOffset</tt> <bcp14>MUST</bcp14> be zero, as Pad Data Blocks are never fragmented. The <tt>BlockOffset</tt> <bcp14>MUST</bcp14> be consistent with the remaining size implied by the length field from the fragmented inner packet.</t> </aside> <section numbered="true" toc="default"> <name>Optional Extra Padding</name> <t>When the tunnel bandwidth is not being fully utilized, a sender <bcp14>MAY</bcp14> pad out the current encapsulating packet in order to deliver an inner packet unfragmented in the following outer packet. The benefit would be to avoid inner packet fragmentation in the presence of a bursty offered load (non-bursty traffic will naturally not fragment). Senders <bcp14>MAY</bcp14> also choose to allow for a minimum fragment size to be configured (e.g., as a percentage of the AGGFRAG_PAYLOAD payload size) to avoid fragmentation at the cost of tunnel bandwidth. The costs with these methods are complexity and an added delay of inner traffic. The main advantage to avoiding fragmentation is to minimize inner packet loss in the presence of outer packet loss. When this is worthwhile (e.g., how much loss and what type of loss is required, given different inner traffic shapes and utilization, for this to make sense) and what values to use for the allowable/added delay may be worth researching but is outside the scope of this document.</t> <t>While use of padding to avoid fragmentation does not impact interoperability, if padding is used inappropriately, it can reduce the effective throughput of a tunnel.
Senders implementing either of the above approaches will need to take care to not reduce the effective capacity, and overall utility, of the tunnel through the overuse of padding.</t> </section> </section> <section numbered="true" toc="default"> <name>Empty Payload</name> <t>To support reporting of congestion control information (described later) using a non-AGGFRAG_PAYLOAD-enabled SA, it is allowed to send an AGGFRAG_PAYLOAD payload with no data blocks (i.e., the ESP payload length is equal to the AGGFRAG_PAYLOAD header length). This special payload is called an empty payload.</t> <t>Currently, this situation is only applicable in use cases without Internet Key Exchange Protocol Version 2 (IKEv2).</t> </section> <section numbered="true" toc="default"> <name>IP Header Value Mapping</name> <t><xref target="RFC4301" format="default"/> provides some direction on when and how to map various values from an inner IP header to the outer encapsulating header, namely the Don't Fragment (DF) bit <xref target="RFC0791" format="default"/>, the Differentiated Services (DS) field <xref target="RFC2474" format="default"/>, and the Explicit Congestion Notification (ECN) field <xref target="RFC3168" format="default"/>. Unlike in <xref target="RFC4301" format="default"/>, the AGGFRAG mode may, and often will, be encapsulating more than one IP packet per ESP packet.
To deal with this, these mappings are restricted further.</t> <section numbered="true" toc="default"> <name>DF Bit</name> <t>The AGGFRAG mode never maps the inner DF bit, as it is unrelated to the AGGFRAG tunnel functionality; the AGGFRAG mode never needs to IP fragment the inner packets, and the inner packets will not affect the fragmentation of the outer encapsulation packets.</t> </section> <section numbered="true" toc="default"> <name>ECN Value</name> <t>The ECN value need not be mapped, as any congestion related to the constant-send-rate IP-TFS tunnel is unrelated (by design) to the inner traffic flow. The sender <bcp14>MAY</bcp14> still set the ECN value of inner packets based on the normal ECN specification <xref target="RFC3168" format="default"/> <xref target="RFC4301" format="default"/> <xref target="RFC6040" format="default"/>.</t> </section> <section numbered="true" toc="default"> <name>DS Field</name> <t>By default, the DS field <bcp14>SHOULD NOT</bcp14> be copied, although a sender <bcp14>MAY</bcp14> choose to allow for configuration to override this behavior.
A sender <bcp14>SHOULD</bcp14> also allow the DS value to be set by configuration.</t> </section> </section> <section numbered="true" toc="default"> <name>IPv4 Time To Live (TTL), IPv6 Hop Limit, and ICMP Messages</name> <t>How to modify the inner packet IPv4 TTL <xref target="RFC0791" format="default"/> or IPv6 Hop Limit <xref target="RFC8200" format="default"/> is specified in <xref target="RFC4301" format="default"/>.</t> <t><xref target="RFC4301" format="default"/> specifies how to apply policy to authenticated and unauthenticated ICMP error packets (e.g., Destination Unreachable) arriving at or being forwarded through the endpoint, in particular whether to process, ignore, or forward said packets. With one exception, this document does not change the handling of these packets; they should be handled as specified in <xref target="RFC4301" format="default"/>.</t> <t>The one way in which an AGGFRAG tunnel differs in ICMP error packet mechanics is with PMTU. When fragmentation is enabled on the AGGFRAG tunnel, then no ICMP "Too Big" errors need to be generated for arriving ingress traffic, as the arriving inner packets will be naturally fragmented by the AGGFRAG encapsulation.</t> <t>Otherwise, when fragmentation has been disabled on the AGGFRAG tunnel, then the treatment of arriving inner traffic exactly maps to that of a non-AGGFRAG ESP tunnel.
Explicitly, IPv4 packets with DF set and IPv6 packets that cannot fit in their own outer packet payload will generate the appropriate ICMP "Too Big" error, as described in <xref target="RFC4301" format="default"/>, and IPv4 packets without DF set will be IP fragmented, as described in <xref target="RFC4301" format="default"/>.</t> <t>Packets egressing the tunnel continue to be handled as specified in <xref target="RFC4301" format="default"/>.</t> <t>All other aspects of PMTU and the handling of ICMP "Too Big" messages (i.e., with regard to the outer AGGFRAG/ESP tunnel packet size) also remain unchanged from <xref target="RFC4301" format="default"/>.</t> </section> <section numbered="true" toc="default"> <name>Effective MTU of the Tunnel</name> <t>Unlike in <xref target="RFC4301" format="default"/>, there is normally no effective MTU (EMTU) on an AGGFRAG tunnel, as all IP packet sizes are properly transmitted without requiring IP fragmentation prior to tunnel ingress. That said, a sender <bcp14>MAY</bcp14> allow for explicitly configuring an MTU for the tunnel.</t> <t>If fragmentation has been disabled on the AGGFRAG tunnel, then the tunnel's EMTU and behaviors are the same as normal IPsec tunnels <xref target="RFC4301" format="default"/>.</t> </section> </section> <section numbered="true" toc="default"> <name>Exclusive SA Use</name> <t>This document does not specify mixed use of an AGGFRAG_PAYLOAD-enabled SA. A sender <bcp14>MUST</bcp14> only send AGGFRAG_PAYLOAD payloads over an SA configured for AGGFRAG mode.</t> </section> <section numbered="true" toc="default"> <name>Modes of Operation</name> <t>Just as with normal IPsec/ESP SAs, AGGFRAG SAs are unidirectional.
Bidirectional IP-TFS functionality is achieved by setting up 2 AGGFRAG SAs, one in each direction.</t> <t>An AGGFRAG tunnel used for IP-TFS can operate in 2 modes: a non-congestion-controlled mode and a congestion-controlled mode.</t> <section numbered="true" toc="default"> <name>Non-Congestion-Controlled Mode</name> <t>In the non-congestion-controlled mode, IP-TFS sends fixed-size packets over an AGGFRAG tunnel at a constant rate. The packet send rate is constant and is not automatically adjusted, regardless of any network congestion (e.g., packet loss).</t> <t>For similar reasons as given in <xref target="RFC7510" format="default"/>, the non-congestion-controlled mode <bcp14>MUST</bcp14> only be used where the user has full administrative control over any path the tunnel will take and <bcp14>MUST NOT</bcp14> be used if this is not the case. This is required so the user can guarantee the bandwidth and also be sure not to negatively affect network congestion <xref target="RFC2914" format="default"/>. In this case, packet loss should be reported to the administrator (e.g., via syslog, YANG notification, SNMP traps, etc.) so that any failures due to a lack of bandwidth can be corrected. The use of circuit breakers is also <bcp14>RECOMMENDED</bcp14> (<xref target="sec-circuit-breakers" format="default"/>).</t> <t>Users that choose the non-congestion-controlled mode need to understand that this mode will send packets at a constant rate, utilizing a constant, fixed bandwidth, and will not adjust based on congestion.
Thus, if they do not guarantee the bandwidth required by the tunnel, the tunnel's operation, as well as the rest of their network, may be negatively impacted.</t> <t>One expected use case for the non-congestion-controlled mode is to guarantee the full tunnel bandwidth is available and preferred over other non-tunnel traffic. In fact, a typical site-to-site use case might have all of the user traffic utilizing the IP-TFS tunnel.</t> <t>The non-congestion-controlled mode is also appropriate if ESP over TCP is in use <xref target="RFC9329" format="default"/>. However, the use of TCP is considered a highly non-preferred, fallback-only solution for IPsec. This is also one of the reasons that TCP was not chosen as the encapsulation for IP-TFS instead of AGGFRAG.</t> </section> <section anchor="sec-congestion-controlled-mode" numbered="true" toc="default"> <name>Congestion-Controlled Mode</name> <t>With the congestion-controlled mode, IP-TFS adapts to network congestion by lowering the packet send rate to accommodate the congestion, as well as raising the rate when congestion subsides. Since overhead is per packet, by allowing for maximal fixed-size packets and varying the send rate, transport overhead is minimized.</t> <t>The output of the congestion control algorithm will adjust the rate at which the ingress sends packets. While this document does not require a specific congestion control algorithm, best current practice RECOMMENDS that the algorithm conform to <xref target="RFC5348" format="default"/>.
Congestion control principles are documented in <xref target="RFC2914" format="default"/> as well. There is an example in <xref target="RFC4342" format="default"/> of the algorithm in <xref target="RFC5348" format="default"/>, which matches the requirements of IP-TFS (i.e., designed for fixed-size packets and send rate varied based on congestion).</t> <t>The required inputs for the TCP-friendly rate control algorithm described in <xref target="RFC5348" format="default"/> are the receiver's loss event rate and the sender's estimated round-trip time (RTT). These values are provided by IP-TFS using the congestion information header fields described in <xref target="sec-congestion-information" format="default"/>. In particular, these values are sufficient to implement the algorithm described in <xref target="RFC5348" format="default"/>.</t> <t>At a minimum, the congestion information <bcp14>MUST</bcp14> be sent, from the receiver and from the sender, at least once per RTT. Prior to establishing an RTT, the information <bcp14>SHOULD</bcp14> be sent constantly from the sender and the receiver so that an RTT estimate can be established. Not receiving this information over multiple consecutive RTT intervals should be considered a congestion event that causes the sender to adjust its sending rate lower. For example, this is called the "no feedback timeout" in <xref target="RFC4342" format="default"/>, and it is equal to 4 RTT intervals. When a "no feedback timeout" has occurred, the sending rate is halved, as per <xref target="RFC4342" format="default"/>.</t> <t>An implementation <bcp14>MAY</bcp14> choose to always include the congestion information in its AGGFRAG payload header if it is sending it on an IP-TFS-enabled SA.
Since IP-TFS normally will operate with a large packet size, the congestion information should represent a small portion of the available tunnel bandwidth. An implementation choosing to always send the data <bcp14>MAY</bcp14> also choose to only update the <tt>LossEventRate</tt> and <tt>RTT</tt> header field values it sends every <tt>RTT</tt>, though.</t> <t>When choosing a congestion control algorithm (or a selection of algorithms), note that IP-TFS is not providing for reliable delivery of IP traffic, and so per-packet acknowledgements (ACKs) are not required and are not provided.</t> <t>It is worth noting that the variable send rate of a congestion-controlled AGGFRAG tunnel is not private; however, this send rate is being driven by network congestion, and as long as the encapsulated (inner) traffic flow shape and timing are not directly affecting the (outer) network congestion, the variations in the tunnel rate will not weaken the provided inner traffic flow confidentiality.</t> <section anchor="sec-circuit-breakers" numbered="true" toc="default"> <name>Circuit Breakers</name> <t>In addition to congestion control, implementations that support the non-congestion-controlled mode <bcp14>SHOULD</bcp14> implement circuit breakers <xref target="RFC8084" format="default"/> as a recovery method of last resort.
When circuit breakers are enabled, an implementation <bcp14>SHOULD</bcp14> also enable congestion control reports so that circuit breakers have information to act on.</t>
<t>The pseudowire congestion considerations <xref target="RFC7893" format="default"/> are equally applicable to the mechanisms defined in this document, notably the text on inelastic traffic.</t>
<t>One example of a simple, slow-trip circuit breaker that an implementation may provide would utilize 2 values: the amount of persistent loss rate required to trip the circuit breaker and the required length of time this persistent loss rate must be seen to trip the circuit breaker. These 2 values are required configuration from the user. When the circuit breaker is tripped, the tunnel traffic is disabled and an appropriate log message or other management-type alarm is triggered, indicating that operator intervention is required.</t>
</section>
</section>
</section>
<section numbered="true" toc="default">
<name>Summary of Receiver Processing</name>
<t>An AGGFRAG-enabled SA receiver has a few tasks to perform.</t>
<t>The receiver <bcp14>MAY</bcp14> process incoming AGGFRAG_PAYLOAD payloads as soon as they arrive, as much as it can; i.e., if the incoming AGGFRAG_PAYLOAD packet contains complete inner packet(s), the receiver should extract and transmit them immediately. For partial packets, the receiver needs to keep the partial packets in memory until they fall out from the reordering window or until the missing parts of the packets are received, in which case, it will reassemble and transmit them. If the AGGFRAG_PAYLOAD payload contains multiple packets, they <bcp14>SHOULD</bcp14> be sent out in the order they are in the AGGFRAG_PAYLOAD (i.e., keep the original order they were received on the other end). 
The cost of using this method is that an amplification of out-of-order delivery of inner packets can occur due to inner packet aggregation.</t>
<t>Instead of the method described in the previous paragraph, the receiver <bcp14>MAY</bcp14> reorder out-of-order AGGFRAG_PAYLOAD payloads received into in-sequence-order AGGFRAG_PAYLOAD payloads (<xref target="sec-fragmentation-sequence-numbers-and-all-pad-payloads" format="default"/>), and only after it has an in-order AGGFRAG_PAYLOAD payload stream would the receiver transmit the inner packets. Using this method will ensure the inner packets are sent in order. The cost of this method is that a lost packet will cause a delay of up to the lost packet timer interval (or the full reorder window if no lost packet timer is used). Additionally, there can be extra burstiness in the output stream. This burstiness can happen when a lost packet is dropped from the reorder window, and the remaining outer packets in the reorder window are immediately processed and sent out back to back.</t>
<t>Additionally, if congestion control is enabled, the receiver sends congestion control data (<xref target="sec-congestion-control-aggfrag-payload-payload-format" format="default"/>) back to the sender, as described in Sections <xref target="sec-congestion-controlled-mode" format="counter"/> and <xref target="sec-congestion-information" format="counter"/>.</t>
<t>Finally, a note on receiving incorrect <tt>BlockOffset</tt> values: To account for misbehaving senders, a receiver <bcp14>SHOULD</bcp14> gracefully handle the case where the <tt>BlockOffset</tt> of consecutive packets, and/or the inner 
packet they share, do not agree. It <bcp14>MAY</bcp14> drop the inner packet or one or both of the outer packets.</t>
</section>
</section>
<section anchor="sec-congestion-information" numbered="true" toc="default">
<name>Congestion Information</name>
<t>In order to support the congestion-controlled mode, the sender needs to know the loss event rate and to approximate the RTT <xref target="RFC5348" format="default"/>. In order to obtain these values, the receiver sends congestion control information on its SA back to the sender. Thus, to support congestion control, the receiver <bcp14>MUST</bcp14> have a paired SA back to the sender (this is always the case when the tunnel was created using IKEv2). If the SA back to the sender is a non-AGGFRAG_PAYLOAD-enabled SA, then an AGGFRAG_PAYLOAD empty payload (i.e., header only) is used to convey the information.</t>
<t>In order to calculate a loss event rate compatible with <xref target="RFC5348" format="default"/>, the receiver needs to have an RTT estimate. Thus, the sender communicates this estimate in the <tt>RTT</tt> header field. On startup, this value will be zero, as no RTT estimate is yet known.</t>
<t>In order for the sender to estimate its <tt>RTT</tt> value, the sender places a timestamp value in the <tt>TVal</tt> header field. On first receipt of this <tt>TVal</tt>, the receiver records the new <tt>TVal</tt> value, along with the time it arrived locally. 
Subsequent receipt of the same <tt>TVal</tt> <bcp14>MUST NOT</bcp14> update the recorded time.</t>
<t>When the receiver sends its congestion control header, it places this latest recorded <tt>TVal</tt> in the <tt>TEcho</tt> header field, along with 2 delay values: <tt>Echo Delay</tt> and <tt>Transmit Delay</tt>. The <tt>Echo Delay</tt> value is the time delta, in microseconds, between the recorded arrival time of <tt>TVal</tt> and the current clock. The second value, <tt>Transmit Delay</tt>, is the receiver's current transmission delay on the tunnel (i.e., the average time between sending packets on its half of the AGGFRAG tunnel).</t>
<t>When the sender receives back its <tt>TVal</tt> in the <tt>TEcho</tt> header field, it calculates 2 RTT estimates. The first is the actual delay found by subtracting the <tt>TEcho</tt> value from its current clock and then subtracting the <tt>Echo Delay</tt> as well. The second RTT estimate is found by adding the received <tt>Transmit Delay</tt> header value to the sender's own transmission delay (i.e., the average time between sending packets on its half of the AGGFRAG tunnel). The larger of these 2 RTT estimates <bcp14>SHOULD</bcp14> be used as the <tt>RTT</tt> value.</t>
<t>The two RTT estimates are required to handle different combinations of faster or slower tunnel packet paths with faster or slower fixed tunnel rates. 
Choosing the larger of the two values guarantees that the <tt>RTT</tt> is never considered faster than the aggregate transmission delay based on the IP-TFS send rate (the second estimate), as well as never being considered faster than the actual RTT along the tunnel packet path (the first estimate).</t>
<t>The receiver also calculates, and communicates in the <tt>LossEventRate</tt> header field, the loss event rate for use by the sender. This is slightly different from <xref target="RFC4342" format="default"/>, which periodically sends all the loss interval data back to the sender so that it can do the calculation. See <xref target="sec-a-send-and-loss-event-rate-calculation" format="default"/> for a suggested way to calculate the loss event rate value. Initially, this value will be zero (indicating no loss) until enough data has been collected by the receiver to update it.</t>
<section anchor="sec-ecn-support" numbered="true" toc="default">
<name>ECN Support</name>
<t>In addition to normal packet loss information, the AGGFRAG mode supports use of the ECN bits in the encapsulating IP header <xref target="RFC3168" format="default"/> for identifying congestion. If ECN use is enabled and a packet arrives at the egress (receiving) side with the Congestion Experienced (CE) value set, then the receiver considers that packet as being dropped, although it does not drop it. 
The receiver <bcp14>MUST</bcp14> set the E bit in any AGGFRAG_PAYLOAD payload header containing a <tt>LossEventRate</tt> value derived from a CE value being considered.</t>
<t>In <xref target="RFC6040" format="default"/>, which updates <xref target="RFC3168" format="default"/> and <xref target="RFC4301" format="default"/>, behaviors for marking the outer ECN field value based on the ECN field of the inner packet are defined. As the AGGFRAG mode may have multiple inner packets present in a single outer packet, and there is no obvious correct way to map these multiple values to the single outer packet ECN field value, the tunnel ingress endpoint <bcp14>SHOULD</bcp14> operate in the "compatibility" mode, rather than the "default" mode from <xref target="RFC6040" format="default"/>. In particular, this means that the ingress (sending) endpoint of the tunnel always sets the newly constructed outer encapsulating packet header ECN field to Not-ECT <xref target="RFC6040" format="default"/>.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>Configuration of AGGFRAG Tunnels for IP-TFS</name>
<t>IP-TFS is meant to be deployable with a minimal amount of configuration. All IP-TFS-specific configuration should be specified at the unidirectional tunnel ingress (sending) side. It is intended that non-IKEv2 operation is supported, at least, with local static configuration.</t>
<t>YANG and MIB documents have been defined for IP-TFS in <xref target="RFC9348" format="default"/> and <xref target="RFC9349" format="default"/>.</t>
<section numbered="true" toc="default">
<name>Bandwidth</name>
<t>Bandwidth is a local configuration option. 
For the non-congestion-controlled mode, the bandwidth <bcp14>SHOULD</bcp14> be configured. For the congestion-controlled mode, the bandwidth can be configured, or the congestion control algorithm can discover and use the maximum bandwidth available. No standardized configuration method is required.</t>
</section>
<section numbered="true" toc="default">
<name>Fixed Packet Size</name>
<t>The fixed packet size to be used for the tunnel encapsulation packets <bcp14>MAY</bcp14> be configured manually or can be automatically determined using other methods, such as PLMTUD <xref target="RFC4821" format="default"/> <xref target="RFC8899" format="default"/> or PMTUD <xref target="RFC1191" format="default"/> <xref target="RFC8201" format="default"/>. As PMTUD is known to have issues, PLMTUD is considered the more robust option. No standardized configuration method is required.</t>
</section>
<section numbered="true" toc="default">
<name>Congestion Control</name>
<t>Congestion control is a local configuration option. No standardized configuration method is required.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>IKEv2</name>
<section anchor="sec-use-aggfrag-notification-message" numbered="true" toc="default">
<name>USE_AGGFRAG Notification Message</name>
<t>As mentioned previously, AGGFRAG tunnels utilize ESP payloads of type AGGFRAG_PAYLOAD.</t>
<t>When using IKEv2, a new "USE_AGGFRAG" notification message enables the AGGFRAG_PAYLOAD payload on a Child SA pair. 
The method used is similar to how USE_TRANSPORT_MODE is negotiated, as described in <xref target="RFC7296" format="default"/>.</t>
<t>To request use of the AGGFRAG_PAYLOAD payload on the Child SA pair, the initiator includes the USE_AGGFRAG notification in an SA payload requesting a new Child SA (either during the initial IKE_AUTH or during CREATE_CHILD_SA exchanges). If the request is accepted, then the response <bcp14>MUST</bcp14> also include a notification of type USE_AGGFRAG. If the responder declines the request, the Child SA will be established without AGGFRAG_PAYLOAD payload use enabled. If this is unacceptable to the initiator, the initiator <bcp14>MUST</bcp14> delete the Child SA.</t>
<t>As the use of the AGGFRAG_PAYLOAD payload is currently only defined for non-transport-mode tunnels, the USE_AGGFRAG notification <bcp14>MUST NOT</bcp14> be combined with the USE_TRANSPORT_MODE notification.</t>
<t>The USE_AGGFRAG notification contains a 1-octet payload of flags that specify requirements from the sender of the notification. 
If any requirement flags are not understood or cannot be supported by the receiver, then the receiver <bcp14>SHOULD NOT</bcp14> enable use of AGGFRAG_PAYLOAD (either by not responding with the USE_AGGFRAG notification or, in the case of the initiator, by deleting the Child SA if the now-established non-AGGFRAG_PAYLOAD-using SA is unacceptable).</t>
<t>The notification type and payload flag values are defined in <xref target="sec-ikev2-use-aggfrag-notification-message" format="default"/>.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>Packet and Data Formats</name>
<t>The packet and data formats defined below are generic with the intent of allowing for non-IP-TFS uses, but such uses are outside the scope of this document.</t>
<section anchor="sec-aggfrag-payload-payload" numbered="true" toc="default">
<name>AGGFRAG_PAYLOAD Payload</name>
<t>ESP Next Header value: 144</t>
<t>An AGGFRAG payload is identified by the ESP Next Header value AGGFRAG_PAYLOAD, which has the value 144. This value has been reserved in the IP protocol numbers space. The first octet of the payload indicates the format of the remaining payload data.</t>
<figure anchor="sec-aggfrag-payload-payload-format">
<name>AGGFRAG_PAYLOAD Payload Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
  0 1 2 3 4 5 6 7
 +-+-+-+-+-+-+-+-+-+-+-
 |   Sub-type    | ...
 +-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>Sub-type:</dt>
<dd>An 8-bit value indicating the payload format.</dd>
</dl>
<t>This document defines 2 payload sub-types. 
These payload formats are defined in the following sections.</t>
<section numbered="true" toc="default">
<name>Non-Congestion-Control AGGFRAG_PAYLOAD Payload Format</name>
<t>The non-congestion-control AGGFRAG_PAYLOAD payload consists of a 4-octet header, followed by a variable amount of <tt>DataBlocks</tt> data, as shown below.</t>
<figure anchor="sec-non-congestion-control-payload-format">
<name>Non-Congestion-Control Payload Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Sub-Type (0) |   Reserved    |          BlockOffset          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |       DataBlocks ...
 +-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>Sub-type:</dt>
<dd>An octet indicating the payload format. For this non-congestion-control format, the value is 0.</dd>
<dt>Reserved:</dt>
<dd>An octet set to 0 on generation and ignored on receipt.</dd>
<dt>BlockOffset:</dt>
<dd>A 16-bit unsigned integer counting the number of octets of <tt>DataBlocks</tt> data before the start of a new data block. If the start of a new data block occurs in a subsequent payload, the <tt>BlockOffset</tt> will point past the end of the <tt>DataBlocks</tt> data. 
In this case, all the <tt>DataBlocks</tt> data belongs to the current data block being assembled. When the <tt>BlockOffset</tt> extends into subsequent payloads, it continues to only count <tt>DataBlocks</tt> data (i.e., it does not count the non-<tt>DataBlocks</tt> data of subsequent packets, such as header octets).</dd>
<dt>DataBlocks:</dt>
<dd>Variable number of octets that begins with the start of a data block or the continuation of a previous data block, followed by zero or more additional data blocks.</dd>
</dl>
</section>
<section anchor="sec-congestion-control-aggfrag-payload-payload-format" numbered="true" toc="default">
<name>Congestion Control AGGFRAG_PAYLOAD Payload Format</name>
<t>The congestion control AGGFRAG_PAYLOAD payload consists of a 24-octet header, followed by a variable amount of <tt>DataBlocks</tt> data, as shown below.</t>
<figure anchor="sec-congestion-control-payload-format">
<name>Congestion Control Payload Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  Sub-type (1) |  Reserved |P|E|          BlockOffset          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                         LossEventRate                         |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                    RTT                    |   Echo Delay ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      ... Echo Delay   |             Transmit Delay              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                             TVal                              |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                             TEcho                             |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |       DataBlocks ...
 +-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>Sub-type:</dt>
<dd>An octet indicating the payload format. For this congestion control format, the value is 1.</dd>
<dt>Reserved:</dt>
<dd>A 6-bit field set to 0 on generation and ignored on receipt.</dd>
<dt>P:</dt>
<dd>A 1-bit value that, if set, indicates that PLMTUD probing is in progress. This information can be used to avoid treating missing packets as loss events by the congestion control algorithm when running the PLMTUD probe algorithm.</dd>
<dt>E:</dt>
<dd>A 1-bit value that, if set, indicates that Congestion Experienced (CE) ECN bits were received and used in deriving the reported <tt>LossEventRate</tt>.</dd>
<dt>BlockOffset:</dt>
<dd>The same as the <tt>BlockOffset</tt> value in the non-congestion-control payload format.</dd>
<dt>LossEventRate:</dt>
<dd>A 32-bit value specifying the inverse of the current loss event rate, as calculated by the receiver. A value of zero indicates no loss. Otherwise, the loss event rate is <tt>1/LossEventRate</tt>.</dd>
<dt>RTT:</dt>
<dd>A 22-bit value specifying the sender's current RTT estimate in microseconds. The value <bcp14>MAY</bcp14> be zero prior to the sender having calculated an RTT estimate. 
The value <bcp14>SHOULD</bcp14> be set to zero on non-AGGFRAG_PAYLOAD-enabled SAs. If the RTT is equal to or larger than <tt>0x3FFFFF</tt>, the value <bcp14>MUST</bcp14> be set to <tt>0x3FFFFF</tt>.</dd>
<dt>Echo Delay:</dt>
<dd>A 21-bit value specifying the delay in microseconds between the receiver first receiving the <tt>TVal</tt> value and its sending that value back in <tt>TEcho</tt>. If the delay is equal to or larger than <tt>0x1FFFFF</tt>, the value <bcp14>MUST</bcp14> be set to <tt>0x1FFFFF</tt>.</dd>
<dt>Transmit Delay:</dt>
<dd>A 21-bit value specifying the transmission delay in microseconds. This is the fixed (or average) delay on the receiver between it sending packets on the IP-TFS tunnel. If the delay is equal to or larger than <tt>0x1FFFFF</tt>, the value <bcp14>MUST</bcp14> be set to <tt>0x1FFFFF</tt>.</dd>
<dt>TVal:</dt>
<dd>An opaque, 32-bit value that will be echoed back by the receiver in later packets in the <tt>TEcho</tt> field, along with an <tt>Echo Delay</tt> value of how long that echo took.</dd>
<dt>TEcho:</dt>
<dd>The opaque, 32-bit value from a received packet's <tt>TVal</tt> field. 
The received <tt>TVal</tt> is placed in <tt>TEcho</tt>, along with an <tt>Echo Delay</tt> value indicating how long it has been since receiving the <tt>TVal</tt> value.</dd>
<dt>DataBlocks:</dt>
<dd>Variable number of octets that begins with the start of a data block or the continuation of a previous data block, followed by zero or more additional data blocks. For the special case of sending congestion control information on a non-IP-TFS-enabled SA, this field <bcp14>MUST</bcp14> be empty (i.e., be zero octets long).</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>Data Blocks</name>
<figure anchor="sec-data-block-format">
<name>Data Block Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | Type  | IPv4, IPv6, or pad...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>Type:</dt>
<dd>A 4-bit field where 0x0 identifies a pad data block, 0x4 indicates an IPv4 data block, and 0x6 indicates an IPv6 data block.</dd>
</dl>
<section numbered="true" toc="default">
<name>IPv4 Data Block</name>
<figure anchor="sec-ipv4-data-block-format">
<name>IPv4 Data Block Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x4  |  IHL  | TypeOfService |          TotalLength          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | Rest of the inner packet ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<t>These values are the actual values within the encapsulated IPv4 header. 
In other words, the start of this data block is the start of the encapsulated IP packet.</t>
<dl newline="true" spacing="normal">
<dt>Type:</dt>
<dd>A 4-bit value of 0x4 indicating IPv4 (i.e., first nibble of the IPv4 packet).</dd>
<dt>TotalLength:</dt>
<dd>The 16-bit unsigned integer "Total Length" field of the IPv4 inner packet.</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>IPv6 Data Block</name>
<figure anchor="sec-ipv6-data-block-format">
<name>IPv6 Data Block Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x6  | TrafficClass  |               FlowLabel               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |         PayloadLength         | Rest of the inner packet ...
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<t>These values are the actual values within the encapsulated IPv6 header. 
In other words, the start of this data block is the start of the encapsulated IP packet.</t>
<dl newline="true" spacing="normal">
<dt>Type:</dt>
<dd>A 4-bit value of 0x6 indicating IPv6 (i.e., first nibble of the IPv6 packet).</dd>
<dt>PayloadLength:</dt>
<dd>The 16-bit unsigned integer "Payload Length" field of the IPv6 inner packet.</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>Pad Data Block</name>
<figure anchor="sec-pad-data-block-format">
<name>Pad Data Block Format</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
                      1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |  0x0  | Padding ...
 +-+-+-+-+-+-+-+-+-+-+-
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>Type:</dt>
<dd>A 4-bit value of 0x0 indicating a padding data block.</dd>
<dt>Padding:</dt>
<dd>Extends to the end of the encapsulating packet.</dd>
</dl>
</section>
</section>
<section anchor="sec-ikev2-use-aggfrag-notification-message" numbered="true" toc="default">
<name>IKEv2 USE_AGGFRAG Notification Message</name>
<t>As discussed in <xref target="sec-use-aggfrag-notification-message" format="default"/>, a notification message USE_AGGFRAG is used to negotiate use of the ESP AGGFRAG_PAYLOAD Next Header value.</t>
<t>The USE_AGGFRAG Notification Message Status Type is 16442.</t>
<t>The notification payload contains 1 octet of requirement flags. 
There are currently 2 requirement flags defined. This may be revised by later specifications.</t>
<figure anchor="sec-use-aggfrag-requirement-flags">
<name>USE_AGGFRAG Requirement Flags</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
 +-+-+-+-+-+-+-+-+
 |0|0|0|0|0|0|C|D|
 +-+-+-+-+-+-+-+-+
]]></artwork>
</figure>
<dl newline="true" spacing="normal">
<dt>0:</dt>
<dd>6 bits - Reserved. <bcp14>MUST</bcp14> be zero on send, unless defined by later specifications.</dd>
<dt>C:</dt>
<dd>Congestion Control bit. If set, then the sender is requiring that congestion control information <bcp14>MUST</bcp14> be returned to it periodically, as defined in <xref target="sec-congestion-information" format="default"/>.</dd>
<dt>D:</dt>
<dd>Don't Fragment bit. If set, it indicates the sender of the notify message does not support receiving packet fragments (i.e., inner packets <bcp14>MUST</bcp14> be sent using a single <tt>Data Block</tt>). 
This value only applies to what the sender is capable of receiving; the sender <bcp14>MAY</bcp14> still send packet fragments unless similarly restricted by the receiver in its USE_AGGFRAG notification.</dd>
</dl>
</section>
</section>
</section>
<section numbered="true" toc="default">
<name>IANA Considerations</name>
<section numbered="true" toc="default">
<name>ESP Next Header Value</name>
<t>IANA has allocated an IP protocol number from the "Protocol Numbers - Assigned Internet Protocol Numbers" registry as follows.</t>
<dl newline="false" spacing="compact">
<dt>Decimal:</dt>
<dd>144</dd>
<dt>Keyword:</dt>
<dd>AGGFRAG</dd>
<dt>Protocol:</dt>
<dd>AGGFRAG encapsulation payload for ESP</dd>
<dt>Reference:</dt>
<dd>RFC 9347</dd>
</dl>
</section>
<section numbered="true" toc="default">
<name>AGGFRAG_PAYLOAD Sub-Types</name>
<t>IANA has created a registry called "AGGFRAG_PAYLOAD Sub-Types" under a new category named "ESP AGGFRAG_PAYLOAD". 
The registration policy for this registry is "Expert Review" <xref target="RFC8126" format="default"/> <xref target="RFC7120" format="default"/>.</t>
<dl newline="false" spacing="compact">
<dt>Name:</dt>
<dd>AGGFRAG_PAYLOAD Sub-Types</dd>
<dt>Description:</dt>
<dd>AGGFRAG_PAYLOAD Payload Formats</dd>
<dt>Reference:</dt>
<dd>RFC 9347</dd>
</dl>
<t>The initial content for this registry is as follows:</t>
<table align="center">
<name>AGGFRAG_PAYLOAD Sub-Types</name>
<thead>
<tr> <th>Sub-Type</th> <th>Name</th> <th>Reference</th> </tr>
</thead>
<tbody>
<tr> <td>0</td> <td>Non-Congestion-Control Format</td> <td>RFC 9347</td> </tr>
<tr> <td>1</td> <td>Congestion Control Format</td> <td>RFC 9347</td> </tr>
<tr> <td>3-255</td> <td>Reserved</td> <td></td> </tr>
</tbody>
</table>
</section>
<section numbered="true" toc="default">
<name>USE_AGGFRAG Notify Message Status Type</name>
<t>IANA has allocated a status type USE_AGGFRAG from the "IKEv2 Notify Message Types - Status Types" registry.</t>
<dl newline="false" spacing="compact">
<dt>Decimal:</dt>
<dd>16442</dd>
<dt>Name:</dt>
<dd>USE_AGGFRAG</dd>
<dt>Reference:</dt>
<dd>RFC 9347</dd>
</dl>
</section>
</section>
<section numbered="true" toc="default">
<name>Implementation Status</name>
<t>[ RFC Ed.: please remove this 
entire section as well as the reference to RFC7942 prior to publication. ]</t> <t>[Section added during IESG review to help with evaluation]</t> <t>This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in <xref target="RFC7942"/>. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing drafts to RFCs. Please note that the listing of any individual implementation here does not imply endorsement by the IETF. This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist.</t> <t>According to RFC 7942, "this will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature. It is up to the individual working groups to use this information as they see fit".</t> <t>Currently, the author and contributors are aware of one complete implementation and one in-progress implementation of IP-TFS as defined in this document. Both are described below.</t> <section title="Reference Implementation - VPP + Strongswan"> <t>The entire IP-TFS protocol, including congestion control mode, has been implemented in VPP (Vector Packet Processor) and published to GitHub under an open source (Apache 2) license. VPP is a highly efficient forwarding plane implemented in user space, utilizing direct control and polling of physical devices to provide high-speed, low-latency forwarding in Linux. 
By pinning packet processing threads directly to CPU cores for their exclusive use, a high degree of control is given to the protocol designer.</t> <t>The IKEv2 additions were implemented in Strongswan and are licensed using the GNU General Public License used by the Strongswan project.</t> <t>Finally, an extensive automation suite was also created and is included with the open source implementation. It tests the functionality and performance of the implementation and, most importantly, verifies through precise timing tracing and timestamping that the user's offered load is decoupled from the tunnel packets (i.e., the Traffic Flow Security).</t> <t>The verification process utilized the <eref target="https://trex-tgn.cisco.com/">TREX</eref> packet generator for high-bandwidth testing, as well as other tools such as iperf. The test hardware included large servers with 10GE, 40GE, and 100GE network interfaces, as well as small SoC (system on a chip) network appliances and cloud deployments.</t> <t>Tested IP-TFS tunnel rates ranged from 10M all the way to 10GE on the small network appliance; for the large servers, multiple 10GE tunnel rates were tested as well.</t> <t>Offered loads included partial, full, and oversubscribed bandwidths from various flow types consisting of small packets, large packets, random-sized packets, sequentially sized packets, and flows sized according to multiple IMIX variations. 
Timing analysis was done with variable-rate traffic, impulse traffic, and random bursty traffic.</t> <t>The quality of the reference implementation should be considered production level, as it underwent extensive testing and verification.</t> <t>The organization responsible for this implementation is LabN Consulting, L.L.C.</t> <t>URLs to the implementation follow.</t> <t><list style="symbols"> <t><eref target="https://github.com/LabNConsulting/vpp/tree/labn-stable/2009-public">VPP+IPTFS</eref>, <eref target="https://github.com/LabNConsulting/vpp/tree/labn-stable/2009-public/src/plugins/iptfs">iptfs plugin</eref></t> <t><eref target="https://github.com/LabNConsulting/strongswan/tree/labn-5.8-public">Strongswan IKEv2</eref></t> </list></t> <t>The implementation was last updated in April 2021.</t> </section> <section title="In-Progress Linux Kernel Implementation"> <t>A second open source implementation has been started by LabN Consulting, L.L.C., within the Linux IPsec xfrm stack. Development has also been coordinated with the Linux IPsec community, which worked on it during the IETF 114 hackathon.</t> <t>Currently, the quality is alpha level: aggregation-only operation is complete, fragmentation support is underway, and congestion control is to follow.</t> <t>This implementation is licensed under the GNU General Public License and can be found at the following URLs:</t> <t><list style="symbols"> <t>development environment: <eref target="https://github.com/LabNConsulting/iptfs-dev"/></t> <t>linux kernel source: <eref target="https://github.com/LabNConsulting/iptfs-linux"/></t> <t>iproute2 source: <eref target="https://github.com/LabNConsulting/iptfs-iproute2"/></t> </list></t> </section> </section> <section numbered="true" toc="default"> <name>Security Considerations</name> <t>This document describes an aggregation and fragmentation mechanism to efficiently implement TFC for IP traffic. 
This approach is expected to reduce the efficacy of traffic analysis on IPsec communication. Other than the additional security afforded by using this mechanism, IP-TFS utilizes the security protocols <xref target="RFC4303" format="default"/> and <xref target="RFC7296" format="default"/>, and so their security considerations apply to IP-TFS as well.</t> <t>As noted in <xref target="sec-ecn-support" format="default"/>, the ECN bits are not protected by IPsec and thus may constitute a covert channel. For this reason, ECN use <bcp14>SHOULD NOT</bcp14> be enabled by default.</t> <t>As noted previously in <xref target="sec-congestion-controlled-mode" format="default"/>, for TFC to be maintained, the encapsulated traffic flow should not be affecting network congestion in a predictable way, and if it would be, then non-congestion-controlled mode use should be considered instead.</t> </section> </middle> <back><references title="Normative References"> <reference anchor='RFC2119' target='https://www.rfc-editor.org/info/rfc2119'> <front> <title>Key words for use in RFCs to Indicate Requirement Levels</title> <author initials='S.' surname='Bradner' fullname='S. Bradner'><organization /></author> <date year='1997' month='March' /> <abstract><t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. 
This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract> </front> <seriesInfo name='BCP' value='14'/> <seriesInfo name='RFC' value='2119'/> <seriesInfo name='DOI' value='10.17487/RFC2119'/> </reference> <reference anchor='RFC4303' target='https://www.rfc-editor.org/info/rfc4303'> <front> <title>IP Encapsulating Security Payload (ESP)</title> <author initials='S.' surname='Kent' fullname='S. Kent'><organization /></author> <date year='2005' month='December' /> <abstract><t>This document describes an updated version of the Encapsulating Security Payload (ESP) protocol, which is designed to provide a mix of security services in IPv4 and IPv6. ESP is used to provide confidentiality, data origin authentication, connectionless integrity, an anti-replay service (a form of partial sequence integrity), and limited traffic flow confidentiality. This document obsoletes RFC 2406 (November 1998). [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='4303'/> <seriesInfo name='DOI' value='10.17487/RFC4303'/> </reference> <reference anchor='RFC7296' target='https://www.rfc-editor.org/info/rfc7296'> <front> <title>Internet Key Exchange Protocol Version 2 (IKEv2)</title> <author initials='C.' surname='Kaufman' fullname='C. Kaufman'><organization /></author> <author initials='P.' surname='Hoffman' fullname='P. Hoffman'><organization /></author> <author initials='Y.' surname='Nir' fullname='Y. Nir'><organization /></author> <author initials='P.' surname='Eronen' fullname='P. Eronen'><organization /></author> <author initials='T.' surname='Kivinen' fullname='T. Kivinen'><organization /></author> <date year='2014' month='October' /> <abstract><t>This document describes version 2 of the Internet Key Exchange (IKE) protocol. IKE is a component of IPsec used for performing mutual authentication and establishing and maintaining Security Associations (SAs). 
This document obsoletes RFC 5996, and includes all of the errata for it. It advances IKEv2 to be an Internet Standard.</t></abstract> </front> <seriesInfo name='STD' value='79'/> <seriesInfo name='RFC' value='7296'/> <seriesInfo name='DOI' value='10.17487/RFC7296'/> </reference> <reference anchor='RFC8174' target='https://www.rfc-editor.org/info/rfc8174'> <front> <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title> <author initials='B.' surname='Leiba' fullname='B. Leiba'><organization /></author> <date year='2017' month='May' /> <abstract><t>RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.</t></abstract> </front> <seriesInfo name='BCP' value='14'/> <seriesInfo name='RFC' value='8174'/> <seriesInfo name='DOI' value='10.17487/RFC8174'/> </reference><references> <name>References</name> <references> <name>Normative References</name> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4303.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7296.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/> </references><references title="Informative References"><references> <name>Informative References</name> <reference anchor="AppCrypt"> <front> <title>Applied Cryptography: Protocols, Algorithms, and Source Code in C</title> <authorinitials='B.' surname='Schneier' fullname='Bruce Schneier'><organization/></author> <date day="1" month="11" year="2017"/> </front> </reference> <reference anchor='RFC0791' target='https://www.rfc-editor.org/info/rfc791'> <front> <title>Internet Protocol</title> <author initials='J.' surname='Postel' fullname='J. 
Postel'><organization /></author> <date year='1981' month='September' /> </front> <seriesInfo name='STD' value='5'/> <seriesInfo name='RFC' value='791'/> <seriesInfo name='DOI' value='10.17487/RFC0791'/> </reference> <reference anchor='RFC1191' target='https://www.rfc-editor.org/info/rfc1191'> <front> <title>Path MTU discovery</title> <author initials='J.' surname='Mogul' fullname='J. Mogul'><organization /></author> <author initials='S.' surname='Deering' fullname='S. Deering'><organization /></author> <date year='1990' month='November' /> <abstract><t>This memo describes a technique for dynamically discovering the maximum transmission unit (MTU) of an arbitrary internet path. It specifies a small change to the way routers generate one type of ICMP message. For a path that passes through a router that has not been so changed, this technique might not discover the correct Path MTU, but it will always choose a Path MTU as accurate as, and in many cases more accurate than, the Path MTU that would be chosen by current practice. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='1191'/> <seriesInfo name='DOI' value='10.17487/RFC1191'/> </reference> <reference anchor='RFC2474' target='https://www.rfc-editor.org/info/rfc2474'> <front> <title>Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers</title> <author initials='K.' surname='Nichols' fullname='K. Nichols'><organization /></author> <author initials='S.' surname='Blake' fullname='S. Blake'><organization /></author> <author initials='F.' surname='Baker' fullname='F. Baker'><organization /></author> <author initials='D.' surname='Black' fullname='D. Black'><organization /></author> <date year='1998' month='December' /> <abstract><t>This document defines the IP header field, called the DS (for differentiated services) field. 
[STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='2474'/> <seriesInfo name='DOI' value='10.17487/RFC2474'/> </reference> <reference anchor='RFC2914' target='https://www.rfc-editor.org/info/rfc2914'> <front> <title>Congestion Control Principles</title> <author initials='S.' surname='Floyd' fullname='S. Floyd'><organization /></author> <date year='2000' month='September' /> <abstract><t>The goal of this document is to explain the need for congestion control in the Internet, and to discuss what constitutes correct congestion control. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t></abstract> </front> <seriesInfo name='BCP' value='41'/> <seriesInfo name='RFC' value='2914'/> <seriesInfo name='DOI' value='10.17487/RFC2914'/> </reference> <reference anchor='RFC3168' target='https://www.rfc-editor.org/info/rfc3168'> <front> <title>The Addition of Explicit Congestion Notification (ECN) to IP</title> <author initials='K.' surname='Ramakrishnan' fullname='K. Ramakrishnan'><organization /></author> <author initials='S.' surname='Floyd' fullname='S. Floyd'><organization /></author> <author initials='D.' surname='Black' fullname='D. Black'><organization /></author> <date year='2001' month='September' /> <abstract><t>This memo specifies the incorporation of ECN (Explicit Congestion Notification) to TCP and IP, including ECN's use of two bits in the IP header. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='3168'/> <seriesInfo name='DOI' value='10.17487/RFC3168'/> </reference> <reference anchor='RFC4301' target='https://www.rfc-editor.org/info/rfc4301'> <front> <title>Security Architecture for the Internet Protocol</title> <author initials='S.' surname='Kent' fullname='S. Kent'><organization /></author> <author initials='K.' surname='Seo' fullname='K. 
Seo'><organization /></author> <date year='2005' month='December' /> <abstract><t>This document describes an updated version of the "Security Architecture for IP", which is designed to provide security services for traffic at the IP layer. This document obsoletes RFC 2401 (November 1998). [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='4301'/> <seriesInfo name='DOI' value='10.17487/RFC4301'/> </reference> <reference anchor='RFC4342' target='https://www.rfc-editor.org/info/rfc4342'> <front> <title>Profile for Datagram Congestion Control Protocol (DCCP) Congestion Control ID 3: TCP-Friendly Rate Control (TFRC)</title> <author initials='S.' surname='Floyd' fullname='S. Floyd'><organization /></author> <author initials='E.' surname='Kohler' fullname='E. Kohler'><organization /></author> <author initials='J.' surname='Padhye' fullname='J. Padhye'><organization /></author> <date year='2006' month='March' /> <abstract><t>This document contains the profile for Congestion Control Identifier 3, TCP-Friendly Rate Control (TFRC), in the Datagram Congestion Control Protocol (DCCP). CCID 3 should be used by senders that want a TCP-friendly sending rate, possibly with Explicit Congestion Notification (ECN), while minimizing abrupt rate changes. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='4342'/> <seriesInfo name='DOI' value='10.17487/RFC4342'/> </reference> <reference anchor='RFC4821' target='https://www.rfc-editor.org/info/rfc4821'> <front> <title>Packetization Layer Path MTU Discovery</title> <author initials='M.' surname='Mathis' fullname='M. Mathis'><organization /></author> <author initials='J.' surname='Heffner' fullname='J. Heffner'><organization /></author> <date year='2007' month='March' /> <abstract><t>This document describes a robust method for Path MTU Discovery (PMTUD) that relies on TCP or some other Packetization Layer to probe an Internet path with progressively larger packets. 
This method is described as an extension to RFC 1191 and RFC 1981, which specify ICMP-based Path MTU Discovery for IP versions 4 and 6, respectively. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='4821'/> <seriesInfo name='DOI' value='10.17487/RFC4821'/> </reference> <reference anchor='RFC5348' target='https://www.rfc-editor.org/info/rfc5348'> <front> <title>TCP Friendly Rate Control (TFRC): Protocol Specification</title> <author initials='S.' surname='Floyd' fullname='S. Floyd'><organization /></author> <author initials='M.' surname='Handley' fullname='M. Handley'><organization /></author> <author initials='J.' surname='Padhye' fullname='J. Padhye'><organization /></author> <author initials='J.' surname='Widmer' fullname='J. Widmer'><organization /></author> <date year='2008' month='September' /> <abstract><t>This document specifies TCP Friendly Rate Control (TFRC). TFRC is a congestion control mechanism for unicast flows operating in a best-effort Internet environment. It is reasonably fair when competing for bandwidth with TCP flows, but has a much lower variation of throughput over time compared with TCP, making it more suitable for applications such as streaming media where a relatively smooth sending rate is of importance.</t><t>This document obsoletes RFC 3448 and updates RFC 4342. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='5348'/> <seriesInfo name='DOI' value='10.17487/RFC5348'/> </reference> <reference anchor='RFC6040' target='https://www.rfc-editor.org/info/rfc6040'> <front> <title>Tunnelling of Explicit Congestion Notification</title> <author initials='B.' surname='Briscoe' fullname='B. Briscoe'><organization /></author> <date year='2010' month='November' /> <abstract><t>This document redefines how the explicit congestion notification (ECN) field of the IP header should be constructed on entry to and exit from any IP-in-IP tunnel. 
On encapsulation, it updates RFC 3168 to bring all IP-in-IP tunnels (v4 or v6) into line with RFC 4301 IPsec ECN processing. On decapsulation, it updates both RFC 3168 and RFC 4301 to add new behaviours for previously unused combinations of inner and outer headers. The new rules ensure the ECN field is correctly propagated across a tunnel whether it is used to signal one or two severity levels of congestion; whereas before, only one severity level was supported. Tunnel endpoints can be updated in any order without affecting pre-existing uses of the ECN field, thus ensuring backward compatibility. Nonetheless, operators wanting to support two severity levels (e.g., for pre-congestion notification -- PCN) can require compliance with this new specification. A thorough analysis of the reasoning for these changes and the implications is included. In the unlikely event that the new rules do not meet a specific need, RFC 4774 gives guidance on designing alternate ECN semantics, and this document extends that to include tunnelling issues. [STANDARDS-TRACK]</t></abstract> </front> <seriesInfo name='RFC' value='6040'/> <seriesInfo name='DOI' value='10.17487/RFC6040'/> </reference> <reference anchor='RFC7120' target='https://www.rfc-editor.org/info/rfc7120'> <front> <title>Early IANA Allocation of Standards Track Code Points</title> <author initials='M.' surname='Cotton' fullname='M. Cotton'><organization /></author> <date year='2014' month='January' /> <abstract><t>This memo describes the process for early allocation of code points by IANA from registries for which "Specification Required", "RFC Required", "IETF Review", or "Standards Action" policies apply. This process can be used to alleviate the problem where code point allocation is needed to facilitate desired or required implementation and deployment experience prior to publication of an RFC, which would normally trigger code point allocation. 
The procedures in this document are intended to apply only to IETF Stream documents.</t></abstract> </front> <seriesInfo name='BCP' value='100'/> <seriesInfo name='RFC' value='7120'/> <seriesInfo name='DOI' value='10.17487/RFC7120'/> </reference> <reference anchor='RFC7510' target='https://www.rfc-editor.org/info/rfc7510'> <front> <title>Encapsulating MPLS in UDP</title> <author initials='X.' surname='Xu' fullname='X. Xu'><organization /></author> <author initials='N.' surname='Sheth' fullname='N. Sheth'><organization /></author> <author initials='L.' surname='Yong' fullname='L. Yong'><organization /></author> <author initials='R.' surname='Callon' fullname='R. Callon'><organization /></author> <author initials='D.' surname='Black' fullname='D. Black'><organization /></author> <date year='2015' month='April' /> <abstract><t>This document specifies an IP-based encapsulation for MPLS, called MPLS-in-UDP for situations where UDP (User Datagram Protocol) encapsulation is preferred to direct use of MPLS, e.g., to enable UDP-based ECMP (Equal-Cost Multipath) or link aggregation. The MPLS- in-UDP encapsulation technology must only be deployed within a single network (with a single network operator) or networks of an adjacent set of cooperating network operators where traffic is managed to avoid congestion, rather than over the Internet where congestion control is required. Usage restrictions apply to MPLS-in-UDP usage for traffic that is not congestion controlled and to UDP zero checksum usage with IPv6.</t></abstract> </front> <seriesInfo name='RFC' value='7510'/> <seriesInfo name='DOI' value='10.17487/RFC7510'/> </reference> <reference anchor='RFC7893' target='https://www.rfc-editor.org/info/rfc7893'> <front> <title>Pseudowire Congestion Considerations</title> <author initials='Y(J)' surname='Stein' fullname='Y(J) Stein'><organization /></author> <author initials='D.' surname='Black' fullname='D. Black'><organization /></author> <author initials='B.' 
surname='Briscoe' fullname='B. Briscoe'><organization /></author> <date year='2016' month='June' /> <abstract><t>Pseudowires (PWs) have become a common mechanism for tunneling traffic and may be found in unmanaged scenarios competing for network resources both with other PWs and with non-PW traffic, such as TCP/IP flows. Thus, it is worthwhile specifying under what conditions such competition is acceptable, i.e., the PW traffic does not significantly harm other traffic or contribute more than it should to congestion. We conclude that PWs transporting responsive traffic behave as desired without the need for additional mechanisms. For inelastic PWs (such as Time Division Multiplexing (TDM) PWs), we derive a bound under which such PWs consume no more network capacity than a TCP flow. For TDM PWs, we find that the level of congestion at which the PW can no longer deliver acceptable TDM service is never significantly greater, and is typically much lower, than this bound. Therefore, as long as the PW is shut down when it can no longer deliver acceptable TDM service, it will never do significantly more harm than even a single TCP flow. If the TDM service does not automatically shut down, a mechanism to block persistently unacceptable TDM pseudowires is required.</t></abstract> </front> <seriesInfo name='RFC' value='7893'/> <seriesInfo name='DOI' value='10.17487/RFC7893'/> </reference> <reference anchor='RFC7942' target='https://www.rfc-editor.org/info/rfc7942'> <front> <title>Improving Awareness of Running Code: The Implementation Status Section</title> <author initials='Y.' surname='Sheffer' fullname='Y. Sheffer'><organization /></author> <author initials='A.' surname='Farrel' fullname='A. Farrel'><organization /></author> <date year='2016' month='July' /> <abstract><t>This document describes a simple process that allows authors of Internet-Drafts to record the status of known implementations by including an Implementation Status section. 
This will allow reviewers and working groups to assign due consideration to documents that have the benefit of running code, which may serve as evidence of valuable experimentation and feedback that have made the implemented protocols more mature.</t><t>This process is not mandatory. Authors of Internet-Drafts are encouraged to consider using the process for their documents, and working groups are invited to think about applying the process to all of their protocol specifications. This document obsoletes RFC 6982, advancing it to a Best Current Practice.</t></abstract> </front> <seriesInfo name='BCP' value='205'/> <seriesInfo name='RFC' value='7942'/> <seriesInfo name='DOI' value='10.17487/RFC7942'/> </reference> <reference anchor='RFC8084' target='https://www.rfc-editor.org/info/rfc8084'> <front> <title>Network Transport Circuit Breakers</title> <author initials='G.' surname='Fairhurst' fullname='G. Fairhurst'><organization /></author> <date year='2017' month='March' /> <abstract><t>This document explains what is meant by the term "network transport Circuit Breaker". It describes the need for Circuit Breakers (CBs) for network tunnels and applications when using non-congestion- controlled traffic and explains where CBs are, and are not, needed. It also defines requirements for building a CB and the expected outcomes of using a CB within the Internet.</t></abstract> </front> <seriesInfo name='BCP' value='208'/> <seriesInfo name='RFC' value='8084'/> <seriesInfo name='DOI' value='10.17487/RFC8084'/> </reference> <reference anchor='RFC8126' target='https://www.rfc-editor.org/info/rfc8126'> <front> <title>Guidelines for Writing an IANA Considerations Section in RFCs</title> <author initials='M.' surname='Cotton' fullname='M. Cotton'><organization /></author> <author initials='B.' surname='Leiba' fullname='B. Leiba'><organization /></author> <author initials='T.' surname='Narten' fullname='T. 
Narten'><organization /></author> <date year='2017' month='June' /> <abstract><t>Many protocols make use of points of extensibility that use constants to identify various protocol parameters. To ensure that the values in these fields do not have conflicting uses and to promote interoperability, their allocations are often coordinated by a central record keeper. For IETF protocols, that role is filled by the Internet Assigned Numbers Authority (IANA).</t><t>To make assignments in a given registry prudently, guidance describing the conditions under which new values should be assigned, as well as when and how modifications to existing values can be made, is needed. This document defines a framework for the documentation of these guidelines by specification authors, in order to assure that the provided guidance for the IANA Considerations is clear and addresses the various issues that are likely in the operation of a registry.</t><t>This is the third edition of this document; it obsoletes RFC 5226.</t></abstract> </front> <seriesInfo name='BCP' value='26'/> <seriesInfo name='RFC' value='8126'/> <seriesInfo name='DOI' value='10.17487/RFC8126'/> </reference> <reference anchor='RFC8200' target='https://www.rfc-editor.org/info/rfc8200'> <front> <title>Internet Protocol, Version 6 (IPv6) Specification</title> <author initials='S.' surname='Deering' fullname='S. Deering'><organization /></author> <author initials='R.' surname='Hinden' fullname='R. Hinden'><organization /></author> <date year='2017' month='July' /> <abstract><t>This document specifies version 6 of the Internet Protocol (IPv6). It obsoletes RFC 2460.</t></abstract> </front> <seriesInfo name='STD' value='86'/> <seriesInfo name='RFC' value='8200'/> <seriesInfo name='DOI' value='10.17487/RFC8200'/> </reference> <reference anchor='RFC8201' target='https://www.rfc-editor.org/info/rfc8201'> <front> <title>Path MTU Discovery for IP version 6</title> <author initials='J.' surname='McCann' fullname='J. 
McCann'><organization /></author> <author initials='S.' surname='Deering' fullname='S. Deering'><organization /></author> <author initials='J.' surname='Mogul' fullname='J. Mogul'><organization /></author> <author initials='R.' surname='Hinden' fullname='R. Hinden' role='editor'><organization /></author> <date year='2017' month='July' /> <abstract><t>This document describes Path MTU Discovery (PMTUD) for IP version 6. It is largely derived from RFC 1191, which describes Path MTU Discovery for IP version 4. It obsoletes RFC 1981.</t></abstract> </front> <seriesInfo name='STD' value='87'/> <seriesInfo name='RFC' value='8201'/> <seriesInfo name='DOI' value='10.17487/RFC8201'/> </reference> <reference anchor='RFC8229' target='https://www.rfc-editor.org/info/rfc8229'> <front> <title>TCP Encapsulation of IKE and IPsec Packets</title> <author initials='T.' surname='Pauly' fullname='T. Pauly'><organization /></author> <author initials='S.' surname='Touati' fullname='S. Touati'><organization /></author> <author initials='R.' surname='Mantha' fullname='R. Mantha'><organization /></author> <date year='2017' month='August' /> <abstract><t>This document describes a method to transport Internet Key Exchange Protocol (IKE) and IPsec packets over a TCP connection for traversing network middleboxes that may block IKE negotiation over UDP. This method, referred to as "TCP encapsulation", involves sending both IKE packets for Security Association establishment and Encapsulating Security Payload (ESP) packets over a TCP connection. This method is intended to be used as a fallback option when IKE cannot be negotiated over UDP.</t></abstract> </front> <seriesInfo name='RFC' value='8229'/> <seriesInfo name='DOI' value='10.17487/RFC8229'/> </reference> <reference anchor='RFC8546' target='https://www.rfc-editor.org/info/rfc8546'> <front> <title>The Wire Image of a Network Protocol</title> <author initials='B.' surname='Trammell' fullname='B. 
Trammell'><organization /></author> <author initials='M.' surname='Kuehlewind' fullname='M. Kuehlewind'><organization /></author> <date year='2019' month='April' /> <abstract><t>This document defines the wire image, an abstraction of the information available to an on-path non-participant in a networking protocol. This abstraction is intended to shed light on the implications that increased encryption has for network functions that use the wire image.</t></abstract> </front> <seriesInfo name='RFC' value='8546'/> <seriesInfo name='DOI' value='10.17487/RFC8546'/> </reference> <reference anchor='RFC8899' target='https://www.rfc-editor.org/info/rfc8899'> <front> <title>Packetization Layer Path MTU Discovery for Datagram Transports</title> <author initials='G.' surname='Fairhurst' fullname='G. Fairhurst'><organization /></author> <author initials='T.' surname='Jones' fullname='T. Jones'><organization /></author> <author initials='M.' surname='Tüxen' fullname='M. Tüxen'><organization /></author> <author initials='I.' surname='Rüngeler' fullname='I. Rüngeler'><organization /></author> <author initials='T.' surname='Völker' fullname='T. Völker'><organization /></author>initials="B." surname="Schneier" fullname="Bruce Schneier"> <organization/> </author> <dateyear='2020' month='September' /> <abstract><t>This document specifies Datagram Packetization Layer Path MTU Discovery (DPLPMTUD). This is a robust method for Path MTU Discovery (PMTUD) for datagram Packetization Layers (PLs). It allows a PL, or a datagram application that uses a PL, to discover whether a network path can support the current size of datagram. This can be used to detect and reduce the message size when a sender encounters a packet black hole. It can also probe a network path to discover whether the maximum packet size can be increased. This provides functionality for datagram transports that is equivalent to the PLPMTUD specification for TCP, specified in RFC 4821, which it updates. 
It also updates the UDP Usage Guidelines to refer to this method for use with UDP datagrams and updates SCTP.</t><t>The document provides implementation notes for incorporating Datagram PMTUD into IETF datagram transports or applications that use datagram transports.</t><t>This specification updates RFC 4960, RFC 4821, RFC 6951, RFC 8085, and RFC 8261.</t></abstract>year="1996"/> </front><seriesInfo name='RFC' value='8899'/> <seriesInfo name='DOI' value='10.17487/RFC8899'/></reference> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.0791.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1191.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2474.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2914.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4301.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4342.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4821.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5348.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6040.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7120.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7510.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7893.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8084.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8126.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8200.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8201.xml"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9329.xml"/> <xi:include 
href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8546.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8899.xml"/>
<reference anchor='RFC9349' target='https://www.rfc-editor.org/info/rfc9349'>
<front>
<title>Definitions of Managed Objects for IP Traffic Flow Security</title>
<author initials="D." surname="Fedyk" fullname="Don Fedyk"> <organization>LabN Consulting, L.L.C.</organization> </author>
<author initials="E." surname="Kinzie" fullname="Eric Kinzie"> <organization>LabN Consulting, L.L.C.</organization> </author>
<date month="January" year="2023"/>
</front>
<seriesInfo name="RFC" value="9349"/>
<seriesInfo name="DOI" value="10.17487/RFC9349"/>
</reference>
<reference anchor='RFC9348' target='https://www.rfc-editor.org/info/rfc9348'>
<front>
<title>A YANG Data Model for IP Traffic Flow Security</title>
<author initials="D." surname="Fedyk" fullname="Don Fedyk"> <organization>LabN Consulting, L.L.C.</organization> </author>
<author initials="C."
surname="Hopps" fullname="Christian Hopps"> <organization>LabN Consulting, L.L.C.</organization> </author>
<date month="January" year="2023"/>
</front>
<seriesInfo name="RFC" value="9348"/>
<seriesInfo name="DOI" value="10.17487/RFC9348"/>
</reference>
</references>
</references>
<section anchor="sec-example-of-an-encapsulated-ip-packet-flow" numbered="true" toc="default">
<name>Example of an Encapsulated IP Packet Flow</name>
<t>Below, an example inner IP packet flow within the encapsulating tunnel packet stream is shown. Notice how encapsulated IP packets can start and end anywhere, and more than one or less than one may occur in a single encapsulating packet.</t>
<figure anchor="sec-inner-and-outer-packet-flow">
<name>Inner and Outer Packet Flow</name>
<artwork name="" type="" align="left" alt=""><![CDATA[
Offset: 0       Offset: 100     Offset: 2000    Offset: 600
[ ESP1  (1404) ][ ESP2  (1404) ][ ESP3  (1404) ][ ESP4  (1404) ]
[--750--][--750--][60][-240-][--3000----------------------][pad]
]]></artwork>
</figure>
<t>Each outer encapsulating ESP space is a fixed size of 1404 octets, the first 4 octets of which contain the AGGFRAG header. The encapsulated IP packet flow (lengths include the IP header and payload) is as follows: a 750-octet packet, a 750-octet packet, a 60-octet packet, a 240-octet packet, and a 3000-octet packet.</t>
<t>The <tt>BlockOffset</tt> values in the 4 AGGFRAG payload headers for this packet flow would thus be: 0, 100, 2000, and 600, respectively.
The first encapsulating packet (ESP1) has a zero <tt>BlockOffset</tt>, which points at the IP data block immediately following the AGGFRAG header. The following packet's (ESP2) <tt>BlockOffset</tt> points inward 100 octets to the start of the 60-octet data block. The third encapsulating packet (ESP3) contains the middle portion of the 3000-octet data block, so the offset points past its end and into the fourth encapsulating packet. The fourth packet's (ESP4) offset is 600, pointing at the padding that follows the completion of the continued 3000-octet packet.</t>
</section>
<section anchor="sec-a-send-and-loss-event-rate-calculation" numbered="true" toc="default">
<name>A Send and Loss Event Rate Calculation</name>
<t>The current best practice indicates that congestion control <bcp14>SHOULD</bcp14> be done in a TCP-friendly way. A TCP-friendly congestion control algorithm is described in <xref target="RFC5348" format="default"/>. For this IP-TFS use case (as with <xref target="RFC4342" format="default"/>), the (fixed) packet size is used as the segment size for the algorithm.
The main formula in the algorithm for the send rate is then as follows:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
                           1
    X = -----------------------------------------------
        R * (sqrt(2*p/3) + 12*sqrt(3*p/8)*p*(1+32*p^2))
]]></artwork>
<t><tt>X</tt> is the send rate in packets per second, <tt>R</tt> is the round-trip time (RTT) estimate, and <tt>p</tt> is the loss event rate (the inverse of which is provided by the receiver).</t>
<t>In addition, the algorithm in <xref target="RFC5348" format="default"/> also uses an <tt>X_recv</tt> value (the receiver's receive rate). For IP-TFS, one <bcp14>MAY</bcp14> set this value according to the sender's current tunnel send rate (<tt>X</tt>).</t>
<t>The IP-TFS receiver, having the RTT estimate from the sender, can use the same method as described in <xref target="RFC5348" format="default"/> and <xref target="RFC4342" format="default"/> to collect the loss intervals and calculate the loss event rate value using the weighted average as indicated. The receiver communicates the inverse of this value back to the sender in the AGGFRAG_PAYLOAD payload header field <tt>LossEventRate</tt>.</t>
<t>The IP-TFS sender now has both the <tt>R</tt> and <tt>p</tt> values and can calculate the correct sending rate.
If following <xref target="RFC5348" format="default"/>, the sender should also use the slow start mechanism described therein when the IP-TFS SA is first established.</t>
</section>
<section anchor="sec-comparisons-of-ip-tfs" numbered="true" toc="default">
<name>Comparisons of IP-TFS</name>
<section numbered="true" toc="default">
<name>Comparing Overhead</name>
<t>For comparing overhead, the overhead of ESP for both normal and AGGFRAG tunnel packets must be calculated, and so an algorithm for encryption and authentication must be chosen. For the data below, AES-GCM-256 was selected. This leads to an IP+ESP overhead of 54 octets.</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
    54 = 20 (IP) + 8 (ESPH) + 2 (ESPF) + 8 (IV) + 16 (ICV)
]]></artwork>
<t>Additionally, for IP-TFS, non-congestion-control AGGFRAG_PAYLOAD headers were chosen, which adds 4 octets, for a total overhead of 58 octets.</t>
<section numbered="true" toc="default">
<name>IP-TFS Overhead</name>
<t>For comparison, the overhead of an AGGFRAG payload is 58 octets per outer packet. Therefore, the octet overhead per inner packet is 58 divided by the number of outer packets required (fractions allowed).
The overhead as a percentage of inner packet size is a constant based on the Outer MTU size.</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
    OH = 58 / (Outer Payload Size / Inner Packet Size)
    OH % of Inner Packet Size = 100 * OH / Inner Packet Size
    OH % of Inner Packet Size = 5800 / Outer Payload Size
]]></artwork>
<table anchor="sec-ip-tfs-overhead-as-percentage-of-inner-packet-size" align="center">
<name>IP-TFS Overhead as Percentage of Inner Packet Size</name>
<thead>
<tr> <th>Type</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> </tr>
<tr> <th>MTU</th> <th>576</th> <th>1500</th> <th>9000</th> </tr>
<tr> <th>PSize</th> <th>518</th> <th>1442</th> <th>8942</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>576</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>1500</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>9000</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
</tbody>
</table>
</section>
<section numbered="true" toc="default">
<name>ESP with Padding Overhead</name>
<t>The overhead per inner packet for constant-send-rate-padded ESP (i.e., original IPsec TFC) is 36 octets plus any padding, unless fragmentation is required.</t>
<t>When fragmentation of the inner packet is required to fit in the outer IPsec packet, overhead is the number of outer packets required to carry the fragmented inner packet times both the inner IP Overhead (20) and the outer packet overhead (54) minus the initial inner IP Overhead
plus any required tail padding in the last encapsulation packet. The required tail padding is the number of required packets times the difference of the Outer Payload Size and the IP Overhead minus the Inner Payload Size. So:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
    Inner Payload Size = IP Packet Size - IP Overhead
    Outer Payload Size = MTU - IPsec Overhead

                  Inner Payload Size
    NF0 = ----------------------------------
           Outer Payload Size - IP Overhead

    NF = CEILING(NF0)

    OH = NF * (IP Overhead + IPsec Overhead)
         - IP Overhead
         + NF * (Outer Payload Size - IP Overhead)
         - Inner Payload Size

    OH = NF * (IPsec Overhead + Outer Payload Size)
         - (IP Overhead + Inner Payload Size)

    OH = NF * (IPsec Overhead + Outer Payload Size)
         - Inner Packet Size
]]></artwork>
</section>
</section>
<section numbered="true" toc="default">
<name>Overhead Comparison</name>
<t>The following tables collect the overhead values for some common L3 MTU sizes in order to compare them. The first table is the number of octets of overhead for a given L3 MTU-sized packet.
The second table is the percentage of overhead in the same MTU-sized packet.</t>
<table anchor="sec-overhead-comparison-in-octets" align="center">
<name>Overhead Comparison in Octets</name>
<thead>
<tr> <th>Type</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> </tr>
<tr> <th>L3 MTU</th> <th>576</th> <th>1500</th> <th>9000</th> <th>576</th> <th>1500</th> <th>9000</th> </tr>
<tr> <th>PSize</th> <th>522</th> <th>1446</th> <th>8946</th> <th>518</th> <th>1442</th> <th>8942</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>482</td> <td>1406</td> <td>8906</td> <td>4.5</td> <td>1.6</td> <td>0.3</td> </tr>
<tr> <td>128</td> <td>394</td> <td>1318</td> <td>8818</td> <td>14.3</td> <td>5.1</td> <td>0.8</td> </tr>
<tr> <td>256</td> <td>266</td> <td>1190</td> <td>8690</td> <td>28.7</td> <td>10.3</td> <td>1.7</td> </tr>
<tr> <td>518</td> <td>4</td> <td>928</td> <td>8428</td> <td>58.0</td> <td>20.8</td> <td>3.4</td> </tr>
<tr> <td>576</td> <td>576</td> <td>870</td> <td>8370</td> <td>64.5</td> <td>23.2</td> <td>3.7</td> </tr>
<tr> <td>1442</td> <td>286</td> <td>4</td> <td>7504</td> <td>161.5</td> <td>58.0</td> <td>9.4</td> </tr>
<tr> <td>1500</td> <td>228</td> <td>1500</td> <td>7446</td> <td>168.0</td> <td>60.3</td> <td>9.7</td> </tr>
<tr> <td>8942</td> <td>1426</td> <td>1558</td> <td>4</td>
<td>1001.2</td> <td>359.7</td> <td>58.0</td> </tr>
<tr> <td>9000</td> <td>1368</td> <td>1500</td> <td>9000</td> <td>1007.7</td> <td>362.0</td> <td>58.4</td> </tr>
</tbody>
</table>
<table anchor="sec-overhead-as-percentage-of-inner-packet-size" align="center">
<name>Overhead as Percentage of Inner Packet Size</name>
<thead>
<tr> <th>Type</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> </tr>
<tr> <th>MTU</th> <th>576</th> <th>1500</th> <th>9000</th> <th>576</th> <th>1500</th> <th>9000</th> </tr>
<tr> <th>PSize</th> <th>522</th> <th>1446</th> <th>8946</th> <th>518</th> <th>1442</th> <th>8942</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>1205.0%</td> <td>3515.0%</td> <td>22265.0%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>128</td> <td>307.8%</td> <td>1029.7%</td> <td>6889.1%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>256</td> <td>103.9%</td> <td>464.8%</td> <td>3394.5%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>518</td> <td>0.8%</td> <td>179.2%</td> <td>1627.0%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>576</td> <td>100.0%</td> <td>151.0%</td> <td>1453.1%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr>
<td>1442</td> <td>19.8%</td> <td>0.3%</td> <td>520.4%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>1500</td> <td>15.2%</td> <td>100.0%</td> <td>496.4%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>8942</td> <td>15.9%</td> <td>17.4%</td> <td>0.0%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
<tr> <td>9000</td> <td>15.2%</td> <td>16.7%</td> <td>100.0%</td> <td>11.20%</td> <td>4.02%</td> <td>0.65%</td> </tr>
</tbody>
</table>
</section>
<section numbered="true" toc="default">
<name>Comparing Available Bandwidth</name>
<t>Another way to compare the two solutions is to look at the amount of available bandwidth each solution provides. The following sections consider and compare the percentage of available bandwidth. For the sake of providing a well-understood baseline, normal (unencrypted) Ethernet and normal ESP values are included.</t>
<section numbered="true" toc="default">
<name>Ethernet</name>
<t>In order to calculate the available bandwidth, the per-packet overhead is calculated first. The total overhead of Ethernet is 14+4 octets of header and Cyclic Redundancy Check (CRC) plus an additional 20 octets of framing (preamble, start, and inter-packet gap), for a total of 38 octets.
Additionally, the minimum payload is 46 octets.</t>
<table anchor="sec-l2-octets-per-packet" align="center">
<name>L2 Octets Per Packet</name>
<thead>
<tr> <th>Size</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>Enet</th> <th>ESP</th> </tr>
<tr> <th>MTU</th> <th>590</th> <th>1514</th> <th>9014</th> <th>590</th> <th>1514</th> <th>9014</th> <th>any</th> <th>any</th> </tr>
<tr> <th>OH</th> <th>92</th> <th>92</th> <th>92</th> <th>96</th> <th>96</th> <th>96</th> <th>38</th> <th>74</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>614</td> <td>1538</td> <td>9038</td> <td>47</td> <td>42</td> <td>40</td> <td>84</td> <td>114</td> </tr>
<tr> <td>128</td> <td>614</td> <td>1538</td> <td>9038</td> <td>151</td> <td>136</td> <td>129</td> <td>166</td> <td>202</td> </tr>
<tr> <td>256</td> <td>614</td> <td>1538</td> <td>9038</td> <td>303</td> <td>273</td> <td>258</td> <td>294</td> <td>330</td> </tr>
<tr> <td>518</td> <td>614</td> <td>1538</td> <td>9038</td> <td>614</td> <td>552</td> <td>523</td> <td>574</td> <td>610</td> </tr>
<tr> <td>576</td> <td>1228</td> <td>1538</td> <td>9038</td> <td>682</td> <td>614</td> <td>582</td> <td>614</td> <td>650</td> </tr>
<tr> <td>1442</td> <td>1842</td> <td>1538</td> <td>9038</td> <td>1709</td> <td>1538</td> <td>1457</td> <td>1498</td> <td>1534</td>
</tr>
<tr> <td>1500</td> <td>1842</td> <td>3076</td> <td>9038</td> <td>1777</td> <td>1599</td> <td>1516</td> <td>1538</td> <td>1574</td> </tr>
<tr> <td>8942</td> <td>11052</td> <td>10766</td> <td>9038</td> <td>10599</td> <td>9537</td> <td>9038</td> <td>8998</td> <td>9034</td> </tr>
<tr> <td>9000</td> <td>11052</td> <td>10766</td> <td>18076</td> <td>10667</td> <td>9599</td> <td>9096</td> <td>9038</td> <td>9074</td> </tr>
</tbody>
</table>
<table anchor="sec-packets-per-second-on-10g-ethernet">
<name>Packets Per Second on 10G Ethernet</name>
<thead>
<tr> <th>Size</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>Enet</th> <th>ESP</th> </tr>
<tr> <th>MTU</th> <th>590</th> <th>1514</th> <th>9014</th> <th>590</th> <th>1514</th> <th>9014</th> <th>any</th> <th>any</th> </tr>
<tr> <th>OH</th> <th>92</th> <th>92</th> <th>92</th> <th>96</th> <th>96</th> <th>96</th> <th>38</th> <th>74</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>2.0M</td> <td>0.8M</td> <td>0.1M</td> <td>26.4M</td> <td>29.3M</td> <td>30.9M</td> <td>14.9M</td> <td>11.0M</td> </tr>
<tr> <td>128</td> <td>2.0M</td> <td>0.8M</td> <td>0.1M</td> <td>8.2M</td> <td>9.2M</td> <td>9.7M</td> <td>7.5M</td> <td>6.2M</td> </tr>
<tr> <td>256</td> <td>2.0M</td> <td>0.8M</td> <td>0.1M</td>
<td>4.1M</td> <td>4.6M</td> <td>4.8M</td> <td>4.3M</td> <td>3.8M</td> </tr>
<tr> <td>518</td> <td>2.0M</td> <td>0.8M</td> <td>0.1M</td> <td>2.0M</td> <td>2.3M</td> <td>2.4M</td> <td>2.2M</td> <td>2.1M</td> </tr>
<tr> <td>576</td> <td>1.0M</td> <td>0.8M</td> <td>0.1M</td> <td>1.8M</td> <td>2.0M</td> <td>2.1M</td> <td>2.0M</td> <td>1.9M</td> </tr>
<tr> <td>1442</td> <td>678K</td> <td>812K</td> <td>138K</td> <td>731K</td> <td>812K</td> <td>857K</td> <td>844K</td> <td>824K</td> </tr>
<tr> <td>1500</td> <td>678K</td> <td>406K</td> <td>138K</td> <td>703K</td> <td>781K</td> <td>824K</td> <td>812K</td> <td>794K</td> </tr>
<tr> <td>8942</td> <td>113K</td> <td>116K</td> <td>138K</td> <td>117K</td> <td>131K</td> <td>138K</td> <td>139K</td> <td>138K</td> </tr>
<tr> <td>9000</td> <td>113K</td> <td>116K</td> <td>69K</td> <td>117K</td> <td>130K</td> <td>137K</td> <td>138K</td> <td>137K</td> </tr>
</tbody>
</table>
<table anchor="sec-percentage-of-bandwidth-on-10g-ethernet" align="center">
<name>Percentage of Bandwidth on 10G Ethernet</name>
<thead>
<tr> <th>Size</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>IP-TFS</th> <th>Enet</th> <th>ESP</th>
</tr>
<tr> <th>MTU</th> <th>590</th> <th>1514</th> <th>9014</th> <th>590</th> <th>1514</th> <th>9014</th> <th>any</th> <th>any</th> </tr>
<tr> <th>OH</th> <th>92</th> <th>92</th> <th>92</th> <th>96</th> <th>96</th> <th>96</th> <th>38</th> <th>74</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>6.51%</td> <td>2.60%</td> <td>0.44%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>47.62%</td> <td>35.09%</td> </tr>
<tr> <td>128</td> <td>20.85%</td> <td>8.32%</td> <td>1.42%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>77.11%</td> <td>63.37%</td> </tr>
<tr> <td>256</td> <td>41.69%</td> <td>16.64%</td> <td>2.83%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>87.07%</td> <td>77.58%</td> </tr>
<tr> <td>518</td> <td>84.36%</td> <td>33.68%</td> <td>5.73%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>93.17%</td> <td>87.50%</td> </tr>
<tr> <td>576</td> <td>46.91%</td> <td>37.45%</td> <td>6.37%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>93.81%</td> <td>88.62%</td> </tr>
<tr> <td>1442</td> <td>78.28%</td> <td>93.76%</td> <td>15.95%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>97.43%</td> <td>95.12%</td> </tr>
<tr> <td>1500</td> <td>81.43%</td> <td>48.76%</td> <td>16.60%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>97.53%</td> <td>95.30%</td> </tr>
<tr> <td>8942</td> <td>80.91%</td> <td>83.06%</td> <td>98.94%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>99.58%</td> <td>99.18%</td> </tr>
<tr> <td>9000</td> <td>81.43%</td> <td>83.60%</td> <td>49.79%</td> <td>84.36%</td> <td>93.76%</td> <td>98.94%</td> <td>99.58%</td> <td>99.18%</td> </tr>
</tbody>
</table>
<t>A sometimes unexpected result of using an AGGFRAG tunnel (or any packet aggregating tunnel) is that, for small- to medium-sized packets, the available bandwidth is actually greater than plain Ethernet. This is due to the reduction in Ethernet framing overhead. This increased bandwidth is paid for with an increase in latency.
This latency is the time to send the unrelated octets in the outer tunnel frame. The following table illustrates the latency for some common values on a 10G Ethernet link. The table also includes latency introduced by padding if using ESP with padding.</t>
<table anchor="sec-added-latency" align="center">
<name>Added Latency</name>
<thead>
<tr> <th>Size</th> <th>ESP+Pad</th> <th>ESP+Pad</th> <th>IP-TFS</th> <th>IP-TFS</th> </tr>
<tr> <th>MTU</th> <th>1500</th> <th>9000</th> <th>1500</th> <th>9000</th> </tr>
</thead>
<tbody>
<tr> <td>40</td> <td>1.12 us</td> <td>7.12 us</td> <td>1.17 us</td> <td>7.17 us</td> </tr>
<tr> <td>128</td> <td>1.05 us</td> <td>7.05 us</td> <td>1.10 us</td> <td>7.10 us</td> </tr>
<tr> <td>256</td> <td>0.95 us</td> <td>6.95 us</td> <td>1.00 us</td> <td>7.00 us</td> </tr>
<tr> <td>518</td> <td>0.74 us</td> <td>6.74 us</td> <td>0.79 us</td> <td>6.79 us</td> </tr>
<tr> <td>576</td> <td>0.70 us</td> <td>6.70 us</td> <td>0.74 us</td> <td>6.74 us</td> </tr>
<tr> <td>1442</td> <td>0.00 us</td> <td>6.00 us</td> <td>0.05 us</td> <td>6.05 us</td> </tr>
<tr> <td>1500</td> <td>1.20 us</td> <td>5.96 us</td> <td>0.00 us</td> <td>6.00 us</td> </tr>
</tbody>
</table>
<t>Notice that the latency values are very similar between the two solutions; however, whereas IP-TFS provides for constant high bandwidth, in some cases even exceeding plain Ethernet, ESP with padding often greatly reduces available bandwidth.</t>
</section>
</section>
</section>
<section numbered="false" toc="default">
<name>Acknowledgements</name>
<t>We would like to thank <contact fullname="Don Fedyk"/> for help in reviewing and editing this work. We would also like to thank <contact fullname="Michael Richardson"/>, <contact fullname="Sean Turner"/>, <contact fullname="Valery Smyslov"/>, and <contact fullname="Tero Kivinen"/> for reviews and many suggestions for improvements, as well as <contact fullname="Joseph Touch"/> for the transport area review and suggested improvements.</t>
</section>
<section numbered="false" toc="default">
<name>Contributors</name>
<t>The following person made significant contributions to this document.</t>
<contact fullname="Lou Berger">
<organization>LabN Consulting, L.L.C.</organization>
<address> <email>lberger@labn.net</email> </address>
</contact>
</section>
</back>
</rfc>