rfc9330.original | rfc9330.txt | |||
---|---|---|---|---|
Transport Area Working Group B. Briscoe, Ed. | Internet Engineering Task Force (IETF) B. Briscoe, Ed. | |||
Internet-Draft Independent | Request for Comments: 9330 Independent | |||
Intended status: Informational K. De Schepper | Category: Informational K. De Schepper | |||
Expires: 2 March 2023 Nokia Bell Labs | ISSN: 2070-1721 Nokia Bell Labs | |||
M. Bagnulo Braun | M. Bagnulo | |||
Universidad Carlos III de Madrid | Universidad Carlos III de Madrid | |||
G. White | G. White | |||
CableLabs | CableLabs | |||
29 August 2022 | January 2023 | |||
Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service: | Low Latency, Low Loss, and Scalable Throughput (L4S) Internet Service: | |||
Architecture | Architecture | |||
draft-ietf-tsvwg-l4s-arch-20 | ||||
Abstract | Abstract | |||
This document describes the L4S architecture, which enables Internet | This document describes the L4S architecture, which enables Internet | |||
applications to achieve Low queuing Latency, Low Loss, and Scalable | applications to achieve low queuing latency, low congestion loss, and | |||
throughput (L4S). L4S is based on the insight that the root cause of | scalable throughput control. L4S is based on the insight that the | |||
queuing delay is in the capacity-seeking congestion controllers of | root cause of queuing delay is in the capacity-seeking congestion | |||
senders, not in the queue itself. With the L4S architecture all | controllers of senders, not in the queue itself. With the L4S | |||
Internet applications could (but do not have to) transition away from | architecture, all Internet applications could (but do not have to) | |||
congestion control algorithms that cause substantial queuing delay, | transition away from congestion control algorithms that cause | |||
to a new class of congestion controls that can seek capacity with | substantial queuing delay and instead adopt a new class of congestion | |||
very little queuing. These are aided by a modified form of explicit | controls that can seek capacity with very little queuing. These are | |||
congestion notification (ECN) from the network. With this new | aided by a modified form of Explicit Congestion Notification (ECN) | |||
architecture, applications can have both low latency and high | from the network. With this new architecture, applications can have | |||
throughput. | both low latency and high throughput. | |||
The architecture primarily concerns incremental deployment. It | The architecture primarily concerns incremental deployment. It | |||
defines mechanisms that allow the new class of L4S congestion | defines mechanisms that allow the new class of L4S congestion | |||
controls to coexist with 'Classic' congestion controls in a shared | controls to coexist with 'Classic' congestion controls in a shared | |||
network. The aim is for L4S latency and throughput to be usually | network. The aim is for L4S latency and throughput to be usually | |||
much better (and rarely worse), while typically not impacting Classic | much better (and rarely worse) while typically not impacting Classic | |||
performance. | performance. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This document is not an Internet Standards Track specification; it is | |||
provisions of BCP 78 and BCP 79. | published for informational purposes. | |||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Not all documents | |||
approved by the IESG are candidates for any level of Internet | ||||
Standard; see Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 2 March 2023. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9330. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2023 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Document Roadmap | 1.1. Document Roadmap | |||
2. L4S Architecture Overview | 2. L4S Architecture Overview | |||
3. Terminology | 3. Terminology | |||
4. L4S Architecture Components | 4. L4S Architecture Components | |||
4.1. Protocol Mechanisms | 4.1. Protocol Mechanisms | |||
4.2. Network Components | 4.2. Network Components | |||
4.3. Host Mechanisms | 4.3. Host Mechanisms | |||
5. Rationale | 5. Rationale | |||
5.1. Why These Primary Components? | 5.1. Why These Primary Components? | |||
5.2. What L4S adds to Existing Approaches | 5.2. What L4S Adds to Existing Approaches | |||
6. Applicability | 6. Applicability | |||
6.1. Applications | 6.1. Applications | |||
6.2. Use Cases | 6.2. Use Cases | |||
6.3. Applicability with Specific Link Technologies | 6.3. Applicability with Specific Link Technologies | |||
6.4. Deployment Considerations | 6.4. Deployment Considerations | |||
6.4.1. Deployment Topology | 6.4.1. Deployment Topology | |||
6.4.2. Deployment Sequences | 6.4.2. Deployment Sequences | |||
6.4.3. L4S Flow but Non-ECN Bottleneck | 6.4.3. L4S Flow but Non-ECN Bottleneck | |||
6.4.4. L4S Flow but Classic ECN Bottleneck | 6.4.4. L4S Flow but Classic ECN Bottleneck | |||
6.4.5. L4S AQM Deployment within Tunnels | 6.4.5. L4S AQM Deployment within Tunnels | |||
7. IANA Considerations (to be removed by RFC Editor) | 7. IANA Considerations | |||
8. Security Considerations | 8. Security Considerations | |||
8.1. Traffic Rate (Non-)Policing | 8.1. Traffic Rate (Non-)Policing | |||
8.1.1. (Non-)Policing Rate per Flow | 8.1.1. (Non-)Policing Rate per Flow | |||
8.1.2. (Non-)Policing L4S Service Rate | 8.1.2. (Non-)Policing L4S Service Rate | |||
8.2. 'Latency Friendliness' | 8.2. 'Latency Friendliness' | |||
8.3. Interaction between Rate Policing and L4S | 8.3. Interaction between Rate Policing and L4S | |||
8.4. ECN Integrity | 8.4. ECN Integrity | |||
8.5. Privacy Considerations | 8.5. Privacy Considerations | |||
9. Informative References | 9. Informative References | |||
Acknowledgements | Acknowledgements | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
At any one time, it is increasingly common for all of the traffic in | At any one time, it is increasingly common for all of the traffic in | |||
a bottleneck link (e.g. a household's Internet access) to come from | a bottleneck link (e.g., a household's Internet access or Wi-Fi) to | |||
applications that prefer low delay: interactive Web, Web services, | come from applications that prefer low delay: interactive web, web | |||
voice, conversational video, interactive video, interactive remote | services, voice, conversational video, interactive video, interactive | |||
presence, instant messaging, online gaming, remote desktop, cloud- | remote presence, instant messaging, online and cloud-rendered gaming, | |||
based applications and video-assisted remote control of machinery and | remote desktop, cloud-based applications, cloud-rendered virtual | |||
industrial processes. In the last decade or so, much has been done | reality or augmented reality, and video-assisted remote control of | |||
to reduce propagation delay by placing caches or servers closer to | machinery and industrial processes. In the last decade or so, much | |||
users. However, queuing remains a major, albeit intermittent, | has been done to reduce propagation delay by placing caches or | |||
component of latency. For instance spikes of hundreds of | servers closer to users. However, queuing remains a major, albeit | |||
milliseconds are not uncommon, even with state-of-the-art active | intermittent, component of latency. For instance, spikes of hundreds | |||
queue management (AQM) [COBALT], [DOCSIS3AQM]. Queuing in access | of milliseconds are not uncommon, even with state-of-the-art Active | |||
network bottlenecks is typically configured to cause overall network | Queue Management (AQM) [COBALT] [DOCSIS3AQM]. A Classic AQM in an | |||
delay to roughly double during a long-running flow, relative to | access network bottleneck is typically configured to buffer the | |||
expected base (unloaded) path delay [BufferSize]. Low loss is also | sawteeth of lone flows, which can cause peak overall network delay to | |||
important because, for interactive applications, losses translate | roughly double during a long-running flow, relative to expected base | |||
into even longer retransmission delays. | (unloaded) path delay [BufferSize]. Low loss is also important | |||
because, for interactive applications, losses translate into even | ||||
longer retransmission delays. | ||||
It has been demonstrated that, once access network bit rates reach | It has been demonstrated that, once access network bit rates reach | |||
levels now common in the developed world, increasing link capacity | levels now common in the developed world, increasing link capacity | |||
offers diminishing returns if latency (delay) is not addressed | offers diminishing returns if latency (delay) is not addressed | |||
[Dukkipati06], [Rajiullah15]. Therefore, the goal is an Internet | [Dukkipati06] [Rajiullah15]. Therefore, the goal is an Internet | |||
service with very Low queueing Latency, very Low Loss and Scalable | service with very low queuing latency, very low loss, and scalable | |||
throughput (L4S). Very low queuing latency means less than | throughput. Very low queuing latency means less than 1 millisecond | |||
1 millisecond (ms) on average and less than about 2 ms at the 99th | (ms) on average and less than about 2 ms at the 99th percentile. | |||
percentile. End-to-end delay above 50 ms [Raaen14] or even above | End-to-end delay above 50 ms [Raaen14], or even above 20 ms [NASA04], | |||
20 ms [NASA04] starts to feel unnatural for more demanding | starts to feel unnatural for more demanding interactive applications. | |||
interactive applications. So removing unnecessary delay variability | Therefore, removing unnecessary delay variability increases the reach | |||
increases the reach of these applications (the distance over which | of these applications (the distance over which they are comfortable | |||
they are comfortable to use). This document describes the L4S | to use) and/or provides additional latency budget that can be used | |||
for enhanced processing. This document describes the L4S | ||||
architecture for achieving these goals. | architecture for achieving these goals. | |||
Differentiated services (Diffserv) offers Expedited Forwarding | Differentiated services (Diffserv) offers Expedited Forwarding (EF) | |||
(EF [RFC3246]) for some packets at the expense of others, but this | [RFC3246] for some packets at the expense of others, but this makes | |||
makes no difference when all (or most) of the traffic at a bottleneck | no difference when all (or most) of the traffic at a bottleneck at | |||
at any one time requires low latency. In contrast, L4S still works | any one time requires low latency. In contrast, L4S still works well | |||
well when all traffic is L4S - a service that gives without taking | when all traffic is L4S -- a service that gives without taking needs | |||
needs none of the configuration or management baggage (traffic | none of the configuration or management baggage (traffic policing or | |||
policing, traffic contracts) associated with favouring some traffic | traffic contracts) associated with favouring some traffic flows over | |||
flows over others. | others. | |||
Queuing delay degrades performance intermittently [Hohlfeld14]. It | Queuing delay degrades performance intermittently [Hohlfeld14]. It | |||
occurs when a large enough capacity-seeking (e.g. TCP) flow is | occurs i) when a large enough capacity-seeking (e.g., TCP) flow is | |||
running alongside the user's traffic in the bottleneck link, which is | running alongside the user's traffic in the bottleneck link, which is | |||
typically in the access network. Or when the low latency application | typically in the access network, or ii) when the low latency | |||
is itself a large capacity-seeking or adaptive rate (e.g. interactive | application is itself a large capacity-seeking or adaptive rate flow | |||
video) flow. At these times, the performance improvement from L4S | (e.g., interactive video). At these times, the performance | |||
must be sufficient that network operators will be motivated to deploy | improvement from L4S must be sufficient for network operators to be | |||
it. | motivated to deploy it. | |||
Active Queue Management (AQM) is part of the solution to queuing | Active Queue Management (AQM) is part of the solution to queuing | |||
under load. AQM improves performance for all traffic, but there is a | under load. AQM improves performance for all traffic, but there is a | |||
limit to how much queuing delay can be reduced by solely changing the | limit to how much queuing delay can be reduced by solely changing the | |||
network; without addressing the root of the problem. | network without addressing the root of the problem. | |||
The root of the problem is the presence of standard congestion | The root of the problem is the presence of standard congestion | |||
control (Reno [RFC5681]) or compatible variants | control (Reno [RFC5681]) or compatible variants (e.g., CUBIC | |||
(e.g. CUBIC [RFC8312]) that are used in TCP and in other transports | [RFC8312]) that are used in TCP and in other transports, such as QUIC | |||
such as QUIC [RFC9000]. We shall use the term 'Classic' for these | [RFC9000]. We shall use the term 'Classic' for these Reno-friendly | |||
Reno-friendly congestion controls. Classic congestion controls | congestion controls. Classic congestion controls induce relatively | |||
induce relatively large saw-tooth-shaped excursions up the queue and | large sawtooth-shaped excursions of queue occupancy. So if a network | |||
down again, which have been growing as flow rate scales [RFC3649]. | operator naively attempts to reduce queuing delay by configuring an | |||
So if a network operator naively attempts to reduce queuing delay by | AQM to operate at a shallower queue, a Classic congestion control | |||
configuring an AQM to operate at a shallower queue, a Classic | will significantly underutilize the link at the bottom of every | |||
congestion control will significantly underutilize the link at the | sawtooth. These sawteeth have also been growing in duration as flow | |||
bottom of every saw-tooth. | rate scales (see Section 5.1 and [RFC3649]). | |||
It has been demonstrated that if the sending host replaces a Classic | It has been demonstrated that, if the sending host replaces a Classic | |||
congestion control with a 'Scalable' alternative, when a suitable AQM | congestion control with a 'Scalable' alternative, the performance | |||
is deployed in the network the performance under load of all the | under load of all the above interactive applications can be | |||
above interactive applications can be significantly improved. For | significantly improved once a suitable AQM is deployed in the | |||
instance, queuing delay under heavy load with the example DCTCP/DualQ | network. Taking the example solution cited below that uses Data | |||
solution cited below on a DSL or Ethernet link is roughly 1 to 2 | Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332] | |||
milliseconds at the 99th percentile without losing link utilization | on a DSL or Ethernet link, queuing delay under heavy load is roughly | |||
[DualPI2Linux], [DCttH19] (for other link types, see Section 6.3). | 1-2 ms at the 99th percentile without losing link utilization | |||
[L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3). | ||||
This compares with 5-20 ms on _average_ with a Classic congestion | This compares with 5-20 ms on _average_ with a Classic congestion | |||
control and current state-of-the-art AQMs such as FQ-CoDel [RFC8290], | control and current state-of-the-art AQMs, such as Flow Queue CoDel | |||
PIE [RFC8033] or DOCSIS PIE [RFC8034] and about 20-30 ms at the 99th | [RFC8290], Proportional Integral controller Enhanced (PIE) [RFC8033], | |||
percentile [DualPI2Linux]. | or DOCSIS PIE [RFC8034] and about 20-30 ms at the 99th percentile | |||
[DualPI2Linux]. | ||||
L4S is designed for incremental deployment. It is possible to deploy | L4S is designed for incremental deployment. It is possible to deploy | |||
the L4S service at a bottleneck link alongside the existing best | the L4S service at a bottleneck link alongside the existing best | |||
efforts service [DualPI2Linux] so that unmodified applications can | efforts service [DualPI2Linux] so that unmodified applications can | |||
start using it as soon as the sender's stack is updated. Access | start using it as soon as the sender's stack is updated. Access | |||
networks are typically designed with one link as the bottleneck for | networks are typically designed with one link as the bottleneck for | |||
each site (which might be a home, small enterprise or mobile device), | each site (which might be a home, small enterprise, or mobile | |||
so deployment at either or both ends of this link should give nearly | device), so deployment at either or both ends of this link should | |||
all the benefit in the respective direction. With some transport | give nearly all the benefit in the respective direction. With some | |||
protocols, namely TCP and SCTP, the sender has to check that the | transport protocols, namely TCP [ACCECN], the sender has to check | |||
receiver has been suitably updated to give more accurate feedback, | that the receiver has been suitably updated to give more accurate | |||
whereas with more recent transport protocols such as QUIC and DCCP, | feedback, whereas with more recent transport protocols, such as QUIC | |||
[RFC9000] and Datagram Congestion Control Protocol (DCCP) [RFC4340], | ||||
all receivers have always been suitable. | all receivers have always been suitable. | |||
This document presents the L4S architecture. It consists of three | This document presents the L4S architecture. It consists of three | |||
components: network support to isolate L4S traffic from classic | components: network support to isolate L4S traffic from Classic | |||
traffic; protocol features that allow network elements to identify | traffic; protocol features that allow network elements to identify | |||
L4S traffic; and host support for L4S congestion controls. The | L4S traffic; and host support for L4S congestion controls. The | |||
protocol is defined separately [I-D.ietf-tsvwg-ecn-l4s-id] as an | protocol is defined separately in [RFC9331] as an experimental change | |||
experimental change to Explicit Congestion Notification (ECN). This | to Explicit Congestion Notification (ECN). This document describes | |||
document describes and justifies the component parts and how they | and justifies the component parts and how they interact to provide | |||
interact to provide the scalable, low latency, low loss Internet | the low latency, low loss, and scalable Internet service. It also | |||
service. It also details the approach to incremental deployment, as | details the approach to incremental deployment, as briefly summarized | |||
briefly summarized above. | above. | |||
1.1. Document Roadmap | 1.1. Document Roadmap | |||
This document describes the L4S architecture in three passes. First | This document describes the L4S architecture in three passes. First, | |||
this brief overview gives the very high level idea and states the | the brief overview in Section 2 gives the very high-level idea and | |||
main components with minimal rationale. This is only intended to | states the main components with minimal rationale. This is only | |||
give some context for the terminology definitions that follow in | intended to give some context for the terminology definitions that | |||
Section 3, and to explain the structure of the rest of the document. | follow in Section 3 and to explain the structure of the rest of the | |||
Then Section 4 goes into more detail on each component with some | document. Then, Section 4 goes into more detail on each component | |||
rationale, but still mostly stating what the architecture is, rather | with some rationale but still mostly stating what the architecture | |||
than why. Finally, Section 5 justifies why each element of the | is, rather than why. Finally, Section 5 justifies why each element | |||
solution was chosen (Section 5.1) and why these choices were | of the solution was chosen (Section 5.1) and why these choices were | |||
different from other solutions (Section 5.2). | different from other solutions (Section 5.2). | |||
Having described the architecture, Section 6 clarifies its | After the architecture has been described, Section 6 clarifies its | |||
applicability; that is, the applications and use-cases that motivated | applicability by describing the applications and use cases that | |||
the design, the challenges applying the architecture to various link | motivated the design, the challenges applying the architecture to | |||
technologies, and various incremental deployment models: including | various link technologies, and various incremental deployment models | |||
the two main deployment topologies, different sequences for | (including the two main deployment topologies, different sequences | |||
incremental deployment and various interactions with pre-existing | for incremental deployment, and various interactions with preexisting | |||
approaches. The document ends with the usual tailpieces, including | approaches). The document ends with the usual tailpieces, including | |||
extensive discussion of traffic policing and other security | extensive discussion of traffic policing and other security | |||
considerations in Section 8. | considerations in Section 8. | |||
2. L4S Architecture Overview | 2. L4S Architecture Overview | |||
Below we outline the three main components to the L4S architecture; | Below, we outline the three main components to the L4S architecture: | |||
1) the scalable congestion control on the sending host; 2) the AQM at | 1) the Scalable congestion control on the sending host; 2) the AQM at | |||
the network bottleneck; and 3) the protocol between them. | the network bottleneck; and 3) the protocol between them. | |||
But first, the main point to grasp is that low latency is not | But first, the main point to grasp is that low latency is not | |||
provided by the network - low latency results from the careful | provided by the network; low latency results from the careful | |||
behaviour of the scalable congestion controllers used by L4S senders. | behaviour of the Scalable congestion controllers used by L4S senders. | |||
The network does have a role - primarily to isolate the low latency | The network does have a role, primarily to isolate the low latency of | |||
of the carefully behaving L4S traffic from the higher queuing delay | the carefully behaving L4S traffic from the higher queuing delay | |||
needed by traffic with pre-existing Classic behaviour. The network | needed by traffic with preexisting Classic behaviour. The network | |||
also alters the way it signals queue growth to the transport - It | also alters the way it signals queue growth to the transport. It | |||
uses the Explicit Congestion Notification (ECN) protocol, but it | uses the Explicit Congestion Notification (ECN) protocol, but it | |||
signals the very start of queue growth - immediately without the | signals the very start of queue growth immediately, without the | |||
smoothing delay typical of Classic AQMs. Because ECN support is | smoothing delay typical of Classic AQMs. Because ECN support is | |||
essential for L4S, senders use the ECN field as the protocol that | essential for L4S, senders use the ECN field as the protocol that | |||
allows the network to identify which packets are L4S and which are | allows the network to identify which packets are L4S and which are | |||
Classic. | Classic. | |||
1) Host: Scalable congestion controls already exist. They solve the | 1) Host: | |||
scaling problem with Classic congestion controls, such as Reno or | ||||
Cubic. Because flow rate has scaled since TCP congestion control | ||||
was first designed in 1988, assuming the flow lasts long enough, | ||||
it now takes hundreds of round trips (and growing) to recover | ||||
after a congestion signal (whether a loss or an ECN mark) as shown | ||||
in the examples in Section 5.1 and [RFC3649]. Therefore, control | ||||
of queuing and utilization becomes very slack, and the slightest | ||||
disturbances (e.g. from new flows starting) prevent a high rate | ||||
from being attained. | ||||
With a scalable congestion control, the average time from one | Scalable congestion controls already exist. They solve the | |||
congestion signal to the next (the recovery time) remains | scaling problem with Classic congestion controls, such as Reno or | |||
invariant as the flow rate scales, all other factors being equal. | CUBIC. Because flow rate has scaled since TCP congestion control | |||
This maintains the same degree of control over queueing and | was first designed in 1988, assuming the flow lasts long enough, | |||
utilization whatever the flow rate, as well as ensuring that high | it now takes hundreds of round trips (and growing) to recover | |||
throughput is more robust to disturbances. The scalable control | after a congestion signal (whether a loss or an ECN mark), as | |||
used most widely (in controlled environments) is Data Center TCP | shown in the examples in Section 5.1 and [RFC3649]. Therefore, | |||
(DCTCP [RFC8257]), which has been implemented and deployed in | control of queuing and utilization becomes very slack, and the | |||
Windows Server Editions (since 2012), in Linux and in FreeBSD. | slightest disturbances (e.g., from new flows starting) prevent a | |||
Although DCTCP as-is functions well over wide-area round trip | high rate from being attained. | |||
times, most implementations lack certain safety features that | ||||
would be necessary for use outside controlled environments like | ||||
data centres (see Section 6.4.3). So scalable congestion control | ||||
needs to be implemented in TCP and other transport protocols | ||||
(QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed, between the present | ||||
document being drafted and published, the following scalable | ||||
congestion controls were implemented: TCP Prague [PragueLinux], | ||||
QUIC Prague, an L4S variant of the RMCAT SCReAM | ||||
controller [SCReAM] and the L4S ECN part of BBRv2 [BBRv2] intended | ||||
for TCP and QUIC transports. | ||||
2) Network: L4S traffic needs to be isolated from the queuing | With a Scalable congestion control, the average time from one | |||
latency of Classic traffic. One queue per application flow (FQ) | congestion signal to the next (the recovery time) remains | |||
is one way to achieve this, e.g. FQ-CoDel [RFC8290]. However, | invariant as flow rate scales, all other factors being equal. | |||
using just two queues is sufficient and does not require | This maintains the same degree of control over queuing and | |||
inspection of transport layer headers in the network, which is not | utilization, whatever the flow rate, as well as ensuring that | |||
always possible (see Section 5.2). With just two queues, it might | high throughput is more robust to disturbances. The Scalable | |||
seem impossible to know how much capacity to schedule for each | control used most widely (in controlled environments) is DCTCP | |||
queue without inspecting how many flows at any one time are using | [RFC8257], which has been implemented and deployed in Windows | |||
each. And it would be undesirable to arbitrarily divide access | Server Editions (since 2012), in Linux, and in FreeBSD. Although | |||
network capacity into two partitions. The Dual Queue Coupled AQM | DCTCP as-is functions well over wide-area round-trip times | |||
was developed as a minimal complexity solution to this problem. | (RTTs), most implementations lack certain safety features that | |||
It acts like a 'semi-permeable' membrane that partitions latency | would be necessary for use outside controlled environments, like | |||
but not bandwidth. As such, the two queues are for transition | data centres (see Section 6.4.3). Therefore, Scalable congestion | |||
from Classic to L4S behaviour, not bandwidth prioritization. | control needs to be implemented in TCP and other transport | |||
protocols (QUIC, Stream Control Transmission Protocol (SCTP), | ||||
RTP/RTCP, RTP Media Congestion Avoidance Techniques (RMCAT), | ||||
etc.). Indeed, between the present document being drafted and | ||||
published, the following Scalable congestion controls were | ||||
implemented: Prague over TCP and QUIC [PRAGUE-CC] [PragueLinux], | ||||
an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and | ||||
the L4S ECN part of Bottleneck Bandwidth and Round-trip | ||||
propagation time (BBRv2) [BBRv2] intended for TCP and QUIC | ||||
transports. | ||||
Section 4 gives a high level explanation of how the per-flow-queue | 2) Network: | |||
(FQ) and DualQ variants of L4S work, and | ||||
[I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of the | ||||
DualQ Coupled AQM framework. A specific marking algorithm is not | ||||
mandated for L4S AQMs. Appendices of | ||||
[I-D.ietf-tsvwg-aqm-dualq-coupled] give non-normative examples | ||||
that have been implemented and evaluated, and give recommended | ||||
default parameter settings. It is expected that L4S experiments | ||||
will improve knowledge of parameter settings and whether the set | ||||
of marking algorithms needs to be limited. | ||||
3) Protocol: A sending host needs to distinguish L4S and Classic | L4S traffic needs to be isolated from the queuing latency of | |||
packets with an identifier so that the network can classify them | Classic traffic. One queue per application flow (FQ) is one way | |||
into their separate treatments. The L4S identifier | to achieve this, e.g., FQ-CoDel [RFC8290]. However, using just | |||
spec. [I-D.ietf-tsvwg-ecn-l4s-id] concludes that all alternatives | two queues is sufficient and does not require inspection of | |||
involve compromises, but the ECT(1) and CE codepoints of the ECN | transport layer headers in the network, which is not always | |||
field represent a workable solution. As already explained, the | possible (see Section 5.2). With just two queues, it might seem | |||
network also uses ECN to immediately signal the very start of | impossible to know how much capacity to schedule for each queue | |||
queue growth to the transport. | without inspecting how many flows at any one time are using each. | |||
And it would be undesirable to arbitrarily divide access network | ||||
capacity into two partitions. The Dual-Queue Coupled AQM was | ||||
developed as a minimal complexity solution to this problem. It | ||||
acts like a 'semi-permeable' membrane that partitions latency but | ||||
not bandwidth. As such, the two queues are for transitioning | ||||
from Classic to L4S behaviour, not bandwidth prioritization. | ||||
3. Terminology | Section 4 gives a high-level explanation of how the per-flow | |||
queue (FQ) and DualQ variants of L4S work, and [RFC9332] gives a | ||||
full explanation of the DualQ Coupled AQM framework. A specific | ||||
marking algorithm is not mandated for L4S AQMs. Appendices of | ||||
[RFC9332] give non-normative examples that have been implemented | ||||
and evaluated and give recommended default parameter settings. | ||||
It is expected that L4S experiments will improve knowledge of | ||||
parameter settings and whether the set of marking algorithms | ||||
needs to be limited. | ||||
[Note to the RFC Editor (to be removed before publication as an RFC): | 3) Protocol: | |||
The following definitions are copied from the L4S ECN | ||||
spec [I-D.ietf-tsvwg-ecn-l4s-id] for the reader's convenience. | A sending host needs to distinguish L4S and Classic packets with | |||
Except, here, Classic CC and Scalable CC are condensed because they | an identifier so that the network can classify them into their | |||
refer to Section 5.1 later. Also the definition of Traffic Policing | separate treatments. The L4S identifier spec [RFC9331] concludes | |||
is not needed in [I-D.ietf-tsvwg-ecn-l4s-id].] | that all alternatives involve compromises, but the ECT(1) and | |||
Congestion Experienced (CE) codepoints of the ECN field represent | ||||
a workable solution. As already explained, the network also uses | ||||
ECN to immediately signal the very start of queue growth to the | ||||
transport. | ||||
3. Terminology | ||||
Classic Congestion Control: A congestion control behaviour that can | Classic Congestion Control: A congestion control behaviour that can | |||
co-exist with standard Reno [RFC5681] without causing | coexist with standard Reno [RFC5681] without causing significantly | |||
significantly negative impact on its flow rate [RFC5033]. The | negative impact on its flow rate [RFC5033]. The scaling problem | |||
scaling problem with Classic congestion control is explained, with | with Classic congestion control is explained, with examples, in | |||
examples, in Section 5.1 and in [RFC3649]. | Section 5.1 and in [RFC3649]. | |||
Scalable Congestion Control: A congestion control where the average | Scalable Congestion Control: A congestion control where the average | |||
time from one congestion signal to the next (the recovery time) | time from one congestion signal to the next (the recovery time) | |||
remains invariant as the flow rate scales, all other factors being | remains invariant as flow rate scales, all other factors being | |||
equal. For instance, DCTCP averages 2 congestion signals per | equal. For instance, DCTCP averages 2 congestion signals per | |||
round-trip whatever the flow rate, as do other recently developed | round trip, whatever the flow rate, as do other recently developed | |||
scalable congestion controls, e.g. Relentless TCP [Mathis09], TCP | Scalable congestion controls, e.g., Relentless TCP [RELENTLESS], | |||
Prague [I-D.briscoe-iccrg-prague-congestion-control], | Prague for TCP and QUIC [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2] | |||
[PragueLinux], BBRv2 [BBRv2], | [BBR-CC], and the L4S variant of SCReAM for real-time media | |||
[I-D.cardwell-iccrg-bbr-congestion-control] and the L4S variant of | [SCReAM-L4S] [RFC8298]. See Section 4.3 of [RFC9331] for more | |||
SCReAM for real-time media [SCReAM], [RFC8298]). See Section 4.3 | explanation. | |||
of [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation. | ||||
Classic service: The Classic service is intended for all the | Classic Service: The Classic service is intended for all the | |||
congestion control behaviours that co-exist with Reno [RFC5681] | congestion control behaviours that coexist with Reno [RFC5681] | |||
(e.g. Reno itself, Cubic [RFC8312], | (e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC | |||
Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term | [RFC5348]). The term 'Classic queue' means a queue providing the | |||
'Classic queue' means a queue providing the Classic service. | Classic service. | |||
Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' | Low Latency, Low Loss, and Scalable throughput (L4S) service: The | |||
service is intended for traffic from scalable congestion control | 'L4S' service is intended for traffic from Scalable congestion | |||
algorithms, such as the Prague congestion | control algorithms, such as the Prague congestion control | |||
control [I-D.briscoe-iccrg-prague-congestion-control], which was | [PRAGUE-CC], which was derived from DCTCP [RFC8257]. The L4S | |||
derived from DCTCP [RFC8257]. The L4S service is for more | service is for more general traffic than just Prague -- it allows | |||
general traffic than just Prague -- it allows the set of | the set of congestion controls with similar scaling properties to | |||
congestion controls with similar scaling properties to Prague to | Prague to evolve, such as the examples listed above (Relentless, | |||
evolve, such as the examples listed above (Relentless, SCReAM). | SCReAM, etc.). The term 'L4S queue' means a queue providing the | |||
The term 'L4S queue' means a queue providing the L4S service. | L4S service. | |||
The terms Classic or L4S can also qualify other nouns, such as | The terms Classic or L4S can also qualify other nouns, such as | |||
'queue', 'codepoint', 'identifier', 'classification', 'packet', | 'queue', 'codepoint', 'identifier', 'classification', 'packet', | |||
'flow'. For example: an L4S packet means a packet with an L4S | and 'flow'. For example, an L4S packet means a packet with an L4S | |||
identifier sent from an L4S congestion control. | identifier sent from an L4S congestion control. | |||
Both Classic and L4S services can cope with a proportion of | Both Classic and L4S services can cope with a proportion of | |||
unresponsive or less-responsive traffic as well, but in the L4S | unresponsive or less-responsive traffic as well but, in the L4S | |||
case its rate has to be smooth enough or low enough to not build a | case, its rate has to be smooth enough or low enough to not build | |||
queue (e.g. DNS, VoIP, game sync datagrams, etc.). | a queue (e.g., DNS, Voice over IP (VoIP), game sync datagrams, | |||
etc.). | ||||
Reno-friendly: The subset of Classic traffic that is friendly to the | Reno-friendly: The subset of Classic traffic that is friendly to the | |||
standard Reno congestion control defined for TCP in [RFC5681]. | standard Reno congestion control defined for TCP in [RFC5681]. | |||
The TFRC spec. [RFC5348] indirectly implies that 'friendly' is | The TFRC spec [RFC5348] indirectly implies that 'friendly' is | |||
defined as "generally within a factor of two of the sending rate | defined as "generally within a factor of two of the sending rate | |||
of a TCP flow under the same conditions". Reno-friendly is used | of a TCP flow under the same conditions". Reno-friendly is used | |||
here in place of 'TCP-friendly', given the latter has become | here in place of 'TCP-friendly', given the latter has become | |||
imprecise, because the TCP protocol is now used with so many | imprecise, because the TCP protocol is now used with so many | |||
different congestion control behaviours, and Reno is used in non- | different congestion control behaviours, and Reno is used in non- | |||
TCP transports such as QUIC [RFC9000]. | TCP transports, such as QUIC [RFC9000]. | |||
Classic ECN: The original Explicit Congestion Notification (ECN) | Classic ECN: The original Explicit Congestion Notification (ECN) | |||
protocol [RFC3168], which requires ECN signals to be treated as | protocol [RFC3168] that requires ECN signals to be treated as | |||
equivalent to drops, both when generated in the network and when | equivalent to drops, both when generated in the network and when | |||
responded to by the sender. | responded to by the sender. | |||
L4S uses the ECN field as an | For L4S, the names used for the four codepoints of the 2-bit IP- | |||
identifier [I-D.ietf-tsvwg-ecn-l4s-id] with the names for the four | ECN field are unchanged from those defined in the ECN spec | |||
codepoints of the 2-bit IP-ECN field unchanged from those defined | [RFC3168], i.e., Not-ECT, ECT(0), ECT(1), and CE, where ECT stands | |||
in the ECN spec [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where | for ECN-Capable Transport and CE stands for Congestion | |||
ECT stands for ECN-Capable Transport and CE stands for Congestion | ||||
Experienced. A packet marked with the CE codepoint is termed | Experienced. A packet marked with the CE codepoint is termed | |||
'ECN-marked' or sometimes just 'marked' where the context makes | 'ECN-marked' or sometimes just 'marked' where the context makes | |||
ECN obvious. | ECN obvious. | |||
Site: A home, mobile device, small enterprise or campus, where the | Site: A home, mobile device, small enterprise, or campus where the | |||
network bottleneck is typically the access link to the site. Not | network bottleneck is typically the access link to the site. Not | |||
all network arrangements fit this model but it is a useful, widely | all network arrangements fit this model, but it is a useful, | |||
applicable generalization. | widely applicable generalization. | |||
Traffic policing: Limiting traffic by dropping packets or shifting | Traffic Policing: Limiting traffic by dropping packets or shifting | |||
them to lower service class (as opposed to introducing delay, | them to a lower service class (as opposed to introducing delay, | |||
which is termed traffic shaping). Policing can involve limiting | which is termed 'traffic shaping'). Policing can involve limiting | |||
average rate and/or burst size. Policing focused on limiting | the average rate and/or burst size. Policing focused on limiting | |||
queuing but not average flow rate is termed congestion policing, | queuing but not the average flow rate is termed 'congestion | |||
latency policing, burst policing or queue protection in this | policing', 'latency policing', 'burst policing', or 'queue | |||
document. Otherwise, the term rate policing is used. | protection' in this document. Otherwise, the term rate policing | |||
is used. | ||||
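
As a rough illustration of the 'Scalable Congestion Control' definition above, the following sketch compares how many congestion signals per round trip a Reno-like and a DCTCP-like control would see as the window (and hence the flow rate) scales. The steady-state window formulas (W = 1.22/sqrt(p) for Reno and W = 2/p for a DCTCP-like control) are commonly used approximations assumed for this example, not taken from this section, and the function names are hypothetical.

```python
def reno_signals_per_rtt(window_pkts: float) -> float:
    """Congestion signals per round trip for a Reno-like control.

    Assumption: steady-state window W = 1.22 / sqrt(p), so p = (1.22 / W)**2.
    """
    p = (1.22 / window_pkts) ** 2
    return p * window_pkts


def scalable_signals_per_rtt(window_pkts: float) -> float:
    """Congestion signals per round trip for a DCTCP-like Scalable control.

    Assumption: steady-state window W = 2 / p, so p = 2 / W (capped at 1).
    """
    p = min(1.0, 2.0 / window_pkts)
    return p * window_pkts


if __name__ == "__main__":
    # The recovery time (in round trips) is the inverse of signals per round trip.
    for w in (10, 100, 1000, 10000):
        reno = reno_signals_per_rtt(w)
        scal = scalable_signals_per_rtt(w)
        print(f"W={w:>6} pkts  Reno: {1 / reno:10.1f} RTTs between signals  "
              f"Scalable: {1 / scal:4.1f} RTTs between signals")
```

Under these assumptions, the Scalable control keeps the same number of signals per round trip at every window size, whereas the Reno-like control receives signals ever more rarely as the window grows.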
4. L4S Architecture Components | 4. L4S Architecture Components | |||
The L4S architecture is composed of the elements in the following | The L4S architecture is composed of the elements in the following | |||
three subsections. | three subsections. | |||
4.1. Protocol Mechanisms | 4.1. Protocol Mechanisms | |||
The L4S architecture involves: a) unassignment of the previous use of | The L4S architecture involves: a) unassignment of the previous use of | |||
the identifier; b) reassignment of the same identifier; and c) | the identifier; b) reassignment of the same identifier; and c) | |||
optional further identifiers: | optional further identifiers: | |||
a. An essential aspect of a scalable congestion control is the use | a. An essential aspect of a Scalable congestion control is the use | |||
of explicit congestion signals. 'Classic' ECN [RFC3168] requires | of explicit congestion signals. Classic ECN [RFC3168] requires | |||
an ECN signal to be treated as equivalent to drop, both when it | an ECN signal to be treated as equivalent to drop, both when it | |||
is generated in the network and when it is responded to by hosts. | is generated in the network and when it is responded to by hosts. | |||
L4S needs networks and hosts to support a more fine-grained | L4S needs networks and hosts to support a more fine-grained | |||
meaning for each ECN signal that is less severe than a drop, so | meaning for each ECN signal that is less severe than a drop, so | |||
that the L4S signals: | that the L4S signals: | |||
* can be much more frequent; | * can be much more frequent and | |||
* can be signalled immediately, without the significant delay | * can be signalled immediately, without the significant delay | |||
required to smooth out fluctuations in the queue. | required to smooth out fluctuations in the queue. | |||
To enable L4S, the standards track Classic ECN spec. [RFC3168] | To enable L4S, the Standards Track Classic ECN spec [RFC3168] has | |||
has had to be updated to allow L4S packets to depart from the | had to be updated to allow L4S packets to depart from the | |||
'equivalent to drop' constraint. [RFC8311] is a standards track | 'equivalent-to-drop' constraint. [RFC8311] is a Standards Track | |||
update to relax specific requirements in RFC 3168 (and certain | update to relax specific requirements in [RFC3168] (and certain | |||
other standards track RFCs), which clears the way for the | other Standards Track RFCs), which clears the way for the | |||
experimental changes proposed for L4S. Also, the ECT(1) | experimental changes proposed for L4S. Also, the ECT(1) | |||
codepoint was previously assigned as the experimental ECN | codepoint was previously assigned as the experimental ECN nonce | |||
nonce [RFC3540], which RFC 8311 recategorizes as historic to make | [RFC3540], which [RFC8311] recategorizes as historic to make the | |||
the codepoint available again. | codepoint available again. | |||
b. [I-D.ietf-tsvwg-ecn-l4s-id] specifies that ECT(1) is used as the | b. [RFC9331] specifies that ECT(1) is used as the identifier to | |||
identifier to classify L4S packets into a separate treatment from | classify L4S packets into a separate treatment from Classic | |||
Classic packets. This satisfies the requirement for identifying | packets. This satisfies the requirement for identifying an | |||
an alternative ECN treatment in [RFC4774]. | alternative ECN treatment in [RFC4774]. | |||
The CE codepoint is used to indicate Congestion Experienced by | The CE codepoint is used to indicate Congestion Experienced by | |||
both L4S and Classic treatments. This raises the concern that a | both L4S and Classic treatments. This raises the concern that a | |||
Classic AQM earlier on the path might have marked some ECT(0) | Classic AQM earlier on the path might have marked some ECT(0) | |||
packets as CE. Then these packets will be erroneously classified | packets as CE. Then, these packets will be erroneously | |||
into the L4S queue. Appendix B of the L4S ECN | classified into the L4S queue. Appendix B of [RFC9331] explains | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id] explains why five unlikely | why five unlikely eventualities all have to coincide for this to | |||
eventualities all have to coincide for this to have any | have any detrimental effect, which even then would only involve a | |||
detrimental effect, which even then would only involve a | ||||
vanishingly small likelihood of a spurious retransmission. | vanishingly small likelihood of a spurious retransmission. | |||
c. A network operator might wish to include certain unresponsive, | c. A network operator might wish to include certain unresponsive, | |||
non-L4S traffic in the L4S queue if it is deemed to be smoothly | non-L4S traffic in the L4S queue if it is deemed to be paced | |||
enough paced and low enough rate not to build a queue. For | smoothly enough and at a low enough rate not to build a queue, | |||
instance, VoIP, low rate datagrams to sync online games, | for instance, VoIP, low rate datagrams to sync online games, | |||
relatively low rate application-limited traffic, DNS, LDAP, etc. | relatively low rate application-limited traffic, DNS, Lightweight | |||
This traffic would need to be tagged with specific identifiers, | Directory Access Protocol (LDAP), etc. This traffic would need | |||
e.g. a low latency Diffserv Codepoint such as Expedited | to be tagged with specific identifiers, e.g., a low-latency | |||
Forwarding (EF [RFC3246]), Non-Queue-Building | Diffserv codepoint such as Expedited Forwarding (EF) [RFC3246], | |||
(NQB [I-D.ietf-tsvwg-nqb]), or operator-specific identifiers. | Non-Queue-Building (NQB) [NQB-PHB], or operator-specific | |||
identifiers. | ||||
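
The classification described in items b and c above can be sketched as follows. This is a minimal illustration, not an implementation from any of the cited specs: the function name `classify` and the `extra_l4s_dscps` parameter are hypothetical, and the sketch assumes the caller passes the 8-bit IPv4 TOS or IPv6 Traffic Class byte (values 0 to 255). The codepoint values are those of the ECN spec [RFC3168], and 46 is the Expedited Forwarding DSCP [RFC3246].

```python
from enum import IntEnum


class Ecn(IntEnum):
    """The four codepoints of the 2-bit IP-ECN field [RFC3168]."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


EF_DSCP = 46  # Expedited Forwarding [RFC3246]


def classify(tos_byte: int, extra_l4s_dscps: frozenset = frozenset()) -> str:
    """Return "L4S" or "Classic" for a packet's TOS / Traffic Class byte.

    extra_l4s_dscps is a hypothetical operator policy knob: Diffserv
    codepoints whose traffic the operator chooses to admit to the L4S queue.
    """
    ecn = Ecn(tos_byte & 0b11)   # ECN is the low 2 bits
    dscp = tos_byte >> 2         # DSCP is the upper 6 bits
    if ecn in (Ecn.ECT1, Ecn.CE):
        return "L4S"             # item b: ECT(1) identifies L4S; CE is shared
    if dscp in extra_l4s_dscps:
        return "L4S"             # item c: optional further identifiers
    return "Classic"


if __name__ == "__main__":
    print(classify(0b01))                                  # ECT(1) packet
    print(classify(EF_DSCP << 2, frozenset({EF_DSCP})))    # EF admitted by policy
```

Note that a CE-marked packet is classified as L4S even if it started life as ECT(0), which is the case item b discusses.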
4.2. Network Components | 4.2. Network Components | |||
The L4S architecture aims to provide low latency without the _need_ | The L4S architecture aims to provide low latency without the _need_ | |||
for per-flow operations in network components. Nonetheless, the | for per-flow operations in network components. Nonetheless, the | |||
architecture does not preclude per-flow solutions. The following | architecture does not preclude per-flow solutions. The following | |||
bullets describe the known arrangements: a) the DualQ Coupled AQM | bullets describe the known arrangements: a) the DualQ Coupled AQM | |||
with an L4S AQM in one queue coupled from a Classic AQM in the other; | with an L4S AQM in one queue coupled from a Classic AQM in the other; | |||
b) Per-Flow Queues with an instance of a Classic and an L4S AQM in | b) per-flow queues with an instance of a Classic and an L4S AQM in | |||
each queue; c) Dual queues with per-flow AQMs, but no per-flow | each queue; and c) Dual queues with per-flow AQMs but no per-flow | |||
queues: | queues: | |||
a. The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the | a. The Dual-Queue Coupled AQM (illustrated in Figure 1) achieves the | |||
'semi-permeable' membrane property mentioned earlier as follows: | 'semi-permeable' membrane property mentioned earlier as follows: | |||
* Latency isolation: Two separate queues are used to isolate L4S | * Latency isolation: Two separate queues are used to isolate L4S | |||
queuing delay from the larger queue that Classic traffic needs | queuing delay from the larger queue that Classic traffic needs | |||
to maintain full utilization. | to maintain full utilization. | |||
* Bandwidth pooling: The two queues act as if they are a single | * Bandwidth pooling: The two queues act as if they are a single | |||
pool of bandwidth in which flows of either type get roughly | pool of bandwidth in which flows of either type get roughly | |||
equal throughput without the scheduler needing to identify any | equal throughput without the scheduler needing to identify any | |||
flows. This is achieved by having an AQM in each queue, but | flows. This is achieved by having an AQM in each queue, but | |||
skipping to change at page 11, line 28 ¶ | skipping to change at line 510 ¶ | |||
classes of congestion control. Specifically, the Classic AQM | classes of congestion control. Specifically, the Classic AQM | |||
generates a drop/mark probability based on congestion in its | generates a drop/mark probability based on congestion in its | |||
own queue, which it uses both to drop/mark packets in its own | own queue, which it uses both to drop/mark packets in its own | |||
queue and to affect the marking probability in the L4S queue. | queue and to affect the marking probability in the L4S queue. | |||
The strength of the coupling of the congestion signalling | The strength of the coupling of the congestion signalling | |||
between the two queues is enough to make the L4S flows slow | between the two queues is enough to make the L4S flows slow | |||
down to leave the right amount of capacity for the Classic | down to leave the right amount of capacity for the Classic | |||
flows (as they would if they were the same type of traffic | flows (as they would if they were the same type of traffic | |||
sharing the same queue). | sharing the same queue). | |||
Then the scheduler can serve the L4S queue with priority (denoted | Then, the scheduler can serve the L4S queue with priority | |||
by the '1' on the higher priority input), because the L4S traffic | (denoted by the '1' on the higher priority input), because the | |||
isn't offering up enough traffic to use all the priority that it | L4S traffic isn't offering up enough traffic to use all the | |||
is given. Therefore: | priority that it is given. Therefore: | |||
* for latency isolation on short time-scales (sub-round-trip) | * for latency isolation on short timescales (sub-round-trip), | |||
the prioritization of the L4S queue protects its low latency | the prioritization of the L4S queue protects its low latency | |||
by allowing bursts to dissipate quickly; | by allowing bursts to dissipate quickly; | |||
* but for bandwidth pooling on longer time-scales (round-trip | * but for bandwidth pooling on longer timescales (round-trip and | |||
and longer) the Classic queue creates an equal and opposite | longer), the Classic queue creates an equal and opposite | |||
pressure against the L4S traffic to ensure that neither has | pressure against the L4S traffic to ensure that neither has | |||
priority when it comes to bandwidth - the tension between | priority when it comes to bandwidth -- the tension between | |||
prioritizing L4S and coupling the marking from the Classic AQM | prioritizing L4S and coupling the marking from the Classic AQM | |||
results in approximate per-flow fairness. | results in approximate per-flow fairness. | |||
To protect against unresponsive traffic taking advantage of the | To protect against the prioritization of persistent L4S traffic | |||
prioritization of the L4S queue and starving the Classic queue, | deadlocking the Classic queue for a while in some | |||
it is advisable for the priority to be conditional, not strict | implementations, it is advisable for the priority to be | |||
(see Appendix A of the DualQ | conditional, not strict (see Appendix A of the DualQ spec | |||
spec [I-D.ietf-tsvwg-aqm-dualq-coupled]). | [RFC9332]). | |||
When there is no Classic traffic, the L4S queue's own AQM comes | When there is no Classic traffic, the L4S queue's own AQM comes | |||
into play. It starts congestion marking with a very shallow | into play. It starts congestion marking with a very shallow | |||
queue, so L4S traffic maintains very low queuing delay. | queue, so L4S traffic maintains very low queuing delay. | |||
If either queue becomes persistently overloaded, drop of ECN- | If either queue becomes persistently overloaded, drop of some | |||
capable packets is introduced, as recommended in Section 7 of the | ECN-capable packets is introduced, as recommended in Section 7 of | |||
ECN spec [RFC3168] and Section 4.2.1 of the AQM | the ECN spec [RFC3168] and Section 4.2.1 of the AQM | |||
recommendations [RFC7567]. Then both queues introduce the same | recommendations [RFC7567]. The trade-offs with different | |||
level of drop (not shown in the figure). | approaches are discussed in Section 4.2.3 of the DualQ spec | |||
[RFC9332] (not shown in the figure here). | ||||
The Dual Queue Coupled AQM has been specified as generically as | The Dual-Queue Coupled AQM has been specified as generically as | |||
possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying | possible [RFC9332] without specifying the particular AQMs to use | |||
the particular AQMs to use in the two queues so that designers | in the two queues so that designers are free to implement diverse | |||
are free to implement diverse ideas. Informational appendices in | ideas. Informational appendices in that document give pseudocode | |||
that draft give pseudocode examples of two different specific AQM | examples of two different specific AQM approaches: one called | |||
approaches: one called DualPI2 (pronounced Dual PI | DualPI2 (pronounced Dual PI Squared) [DualPI2Linux] that uses the | |||
Squared) [DualPI2Linux] that uses the PI2 variant of PIE, and a | PI2 variant of PIE and a zero-config variant of Random Early | |||
zero-config variant of RED called Curvy RED. A DualQ Coupled AQM | Detection (RED) called Curvy RED. A DualQ Coupled AQM based on | |||
based on PIE has also been specified and implemented for Low | PIE has also been specified and implemented for Low Latency | |||
Latency DOCSIS [DOCSIS3.1]. | DOCSIS [DOCSIS3.1]. | |||
(3) (2) | (3) (2) | |||
.-------^------..------------^------------------. | .-------^------..------------^------------------. | |||
,-(1)-----. _____ | ,-(1)-----. _____ | |||
; ________ : L4S -------. | | | ; ________ : L4S -------. | | | |||
:|Scalable| : _\ ||__\_|mark | | :|Scalable| : _\ ||__\_|mark | | |||
:| sender | : __________ / / || / |_____|\ _________ | :| sender | : __________ / / || / |_____|\ _________ | |||
:|________|\; | |/ -------' ^ \1|condit'nl| | :|________|\; | |/ -------' ^ \1|condit'nl| | |||
`---------'\_| IP-ECN | Coupling : \|priority |_\ | `---------'\_| IP-ECN | Coupling : \|priority |_\ | |||
________ / |Classifier| : /|scheduler| / | ________ / |Classifier| : /|scheduler| / | |||
|Classic |/ |__________|\ -------. __:__ / |_________| | |Classic |/ |__________|\ -------. __:__ / |_________| | |||
| sender | \_\ || | ||__\_|mark/|/ | | sender | \_\ || | ||__\_|mark/|/ | |||
|________| / || | || / |drop | | |________| / || | || / |drop | | |||
Classic -------' |_____| | Classic -------' |_____| | |||
Figure 1: Components of an L4S DualQ Coupled AQM Solution: 1) | (1) Scalable sending host | |||
Scalable Sending Host; 2) Isolation in separate network | (2) Isolation in separate network queues | |||
queues; and 3) Packet Identification Protocol | (3) Packet identification protocol | |||
b. Per-Flow Queues and AQMs: A scheduler with per-flow queues such | Figure 1: Components of an L4S DualQ Coupled AQM Solution | |||
as FQ-CoDel or FQ-PIE can be used for L4S. For instance within | ||||
b. Per-Flow Queues and AQMs: A scheduler with per-flow queues, such | ||||
as FQ-CoDel or FQ-PIE, can be used for L4S. For instance, within | ||||
each queue of an FQ-CoDel system, as well as a CoDel AQM, there | each queue of an FQ-CoDel system, as well as a CoDel AQM, there | |||
is typically also the option of ECN marking at an immediate | is typically also the option of ECN marking at an immediate | |||
(unsmoothed) shallow threshold to support use in data centres | (unsmoothed) shallow threshold to support use in data centres | |||
(see Sec.5.2.7 of the FQ-CoDel spec [RFC8290]). In Linux, this | (see Section 5.2.7 of the FQ-CoDel spec [RFC8290]). In Linux, | |||
has been modified so that the shallow threshold can be solely | this has been modified so that the shallow threshold can be | |||
applied to ECT(1) packets [FQ_CoDel_Thresh]. Then, if there is a | solely applied to ECT(1) packets [FQ_CoDel_Thresh]. Then, if | |||
flow of non-ECN or ECT(0) packets in the per-flow-queue, the | there is a flow of Not-ECT or ECT(0) packets in the per-flow | |||
Classic AQM (e.g. CoDel) is applied; while if there is a flow of | queue, the Classic AQM (e.g., CoDel) is applied; whereas, if | |||
ECT(1) packets in the queue, the shallower (typically sub- | there is a flow of ECT(1) packets in the queue, the shallower | |||
millisecond) threshold is applied. In addition, ECT(0) and not- | (typically sub-millisecond) threshold is applied. In addition, | |||
ECT packets could potentially be classified into a separate flow- | ECT(0) and Not-ECT packets could potentially be classified into a | |||
queue from ECT(1) and CE packets to avoid them mixing if they | separate flow queue from ECT(1) and CE packets to avoid them | |||
share a common flow-identifier (e.g. in a VPN). | mixing if they share a common flow identifier (e.g., in a VPN). | |||
c. Dual-queues, but per-flow AQMs: It should also be possible to use | c. Dual queues but per-flow AQMs: It should also be possible to use | |||
dual queues for isolation, but with per-flow marking to control | dual queues for isolation but with per-flow marking to control | |||
flow-rates (instead of the coupled per-queue marking of the Dual | flow rates (instead of the coupled per-queue marking of the Dual- | |||
Queue Coupled AQM). One of the two queues would be for isolating | Queue Coupled AQM). One of the two queues would be for isolating | |||
L4S packets, which would be classified by the ECN codepoint. | L4S packets, which would be classified by the ECN codepoint. | |||
Flow rates could be controlled by flow-specific marking. The | Flow rates could be controlled by flow-specific marking. The | |||
policy goal of the marking could be to differentiate flow rates | policy goal of the marking could be to differentiate flow rates | |||
(e.g. [Nadas20], which requires additional signalling of a per- | (e.g., [Nadas20], which requires additional signalling of a per- | |||
flow 'value'), or to equalize flow-rates (perhaps in a similar | flow 'value') or to equalize flow rates (perhaps in a similar way | |||
way to Approx Fair CoDel [AFCD], | to Approx Fair CoDel [AFCD] [CODEL-APPROX-FAIR] but with two | |||
[I-D.morton-tsvwg-codel-approx-fair], but with two queues not | queues not one). | |||
one). | ||||
Note that whenever the term 'DualQ' is used loosely without | Note that, whenever the term 'DualQ' is used loosely without | |||
saying whether marking is per-queue or per-flow, it means a dual | saying whether marking is per queue or per flow, it means a dual- | |||
queue AQM with per-queue marking. | queue AQM with per-queue marking. | |||
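
The coupling described in item a above can be sketched numerically. The relations used here (Classic probability as the square of a base probability, and L4S probability as the larger of the L4S queue's own marking and the base probability scaled by a coupling factor k) follow the DualQ spec [RFC9332] as understood for this sketch; the default k = 2 and all names are assumptions made for illustration.

```python
def coupled_probabilities(p_base: float, p_l4s_native: float, k: float = 2.0):
    """Return (p_classic, p_l4s) for a DualQ Coupled AQM sketch.

    p_base:       base probability driven by congestion in the Classic queue
    p_l4s_native: marking probability from the L4S queue's own (shallow) AQM
    k:            coupling factor (assumed default of 2)
    """
    p_classic = p_base ** 2               # drop/mark applied to the Classic queue
    p_coupled = min(1.0, k * p_base)      # signal coupled across to the L4S queue
    p_l4s = max(p_l4s_native, p_coupled)  # L4S marks with whichever is stronger
    return p_classic, p_l4s


if __name__ == "__main__":
    for p_base in (0.01, 0.05, 0.1):
        p_c, p_l = coupled_probabilities(p_base, p_l4s_native=0.0)
        print(f"base={p_base:.2f}  Classic drop/mark={p_c:.4f}  L4S mark={p_l:.2f}")
```

The squaring reflects that a Classic flow's rate depends roughly on the inverse square root of its signal probability, whereas a Scalable flow's rate depends on the inverse of its marking probability, so the two kinds of flow end up with roughly equal throughput. When there is no Classic traffic, the base probability falls to zero and only the L4S queue's own AQM marks packets.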
4.3. Host Mechanisms | 4.3. Host Mechanisms | |||
The L4S architecture includes two main mechanisms in the end host | The L4S architecture includes two main mechanisms in the end host | |||
that we enumerate next: | that we enumerate next: | |||
a. Scalable Congestion Control at the sender: Section 2 defines a | a. Scalable congestion control at the sender: Section 2 defines a | |||
scalable congestion control as one where the average time from | Scalable congestion control as one where the average time from | |||
one congestion signal to the next (the recovery time) remains | one congestion signal to the next (the recovery time) remains | |||
invariant as the flow rate scales, all other factors being equal. | invariant as flow rate scales, all other factors being equal. | |||
Data Center TCP is the most widely used example. It has been | DCTCP is the most widely used example. It has been documented as | |||
documented as an informational record of the protocol currently | an informational record of the protocol currently in use in | |||
in use in controlled environments [RFC8257]. A draft list of | controlled environments [RFC8257]. A list of safety and | |||
safety and performance improvements for a scalable congestion | performance improvements for a Scalable congestion control to be | |||
control to be usable on the public Internet has been drawn up | usable on the public Internet has been drawn up (see the so- | |||
(the so-called 'Prague L4S requirements' in Appendix A of | called 'Prague L4S requirements' in Appendix A of [RFC9331]). | |||
| | The subset that involve risk of harm to others have been captured | |||
[I-D.ietf-tsvwg-ecn-l4s-id]). The subset that involve risk of | as normative requirements in Section 4 of [RFC9331]. TCP Prague | |||
harm to others have been captured as normative requirements in | [PRAGUE-CC] has been implemented in Linux as a reference | |||
Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id]. TCP | implementation to address these requirements [PragueLinux]. | |||
Prague [I-D.briscoe-iccrg-prague-congestion-control] has been | ||||
implemented in Linux as a reference implementation to address | ||||
these requirements [PragueLinux]. | ||||
Transport protocols other than TCP use various congestion | Transport protocols other than TCP use various congestion | |||
controls that are designed to be friendly with Reno. Before they | controls that are designed to be friendly with Reno. Before they | |||
can use the L4S service, they will need to be updated to | can use the L4S service, they will need to be updated to | |||
implement a scalable congestion response, which they will have to | implement a Scalable congestion response, which they will have to | |||
indicate by using the ECT(1) codepoint. Scalable variants are | indicate by using the ECT(1) codepoint. Scalable variants are | |||
under consideration for more recent transport protocols, | under consideration for more recent transport protocols (e.g., | |||
e.g. QUIC, and the L4S ECN part of BBRv2 [BBRv2], | QUIC), and the L4S ECN part of BBRv2 [BBRv2] [BBR-CC] is a | |||
[I-D.cardwell-iccrg-bbr-congestion-control] is a scalable | Scalable congestion control intended for the TCP and QUIC | |||
congestion control intended for the TCP and QUIC transports, | transports, amongst others. Also, an L4S variant of the RMCAT | |||
amongst others. Also, an L4S variant of the RMCAT SCReAM | SCReAM controller [RFC8298] has been implemented [SCReAM-L4S] for | |||
controller [RFC8298] has been implemented [SCReAM] for media | media transported over RTP. | |||
transported over RTP. | ||||
Section 4.3 of the L4S ECN spec [I-D.ietf-tsvwg-ecn-l4s-id] | Section 4.3 of the L4S ECN spec [RFC9331] defines Scalable | |||
defines scalable congestion control in more detail, and specifies | congestion control in more detail and specifies the requirements | |||
the requirements that an L4S scalable congestion control has to | that an L4S Scalable congestion control has to comply with. | |||
comply with. | ||||
b. The ECN feedback in some transport protocols is already | b. The ECN feedback in some transport protocols is already | |||
sufficiently fine-grained for L4S (specifically DCCP [RFC4340] | sufficiently fine-grained for L4S (specifically DCCP [RFC4340] | |||
and QUIC [RFC9000]). But others either require update or are in | and QUIC [RFC9000]). But others either require updates or are in | |||
the process of being updated: | the process of being updated: | |||
* For the case of TCP, the feedback protocol for ECN embeds the | * For the case of TCP, the feedback protocol for ECN embeds the | |||
assumption from Classic ECN [RFC3168] that an ECN mark is | assumption from Classic ECN [RFC3168] that an ECN mark is | |||
equivalent to a drop, making it unusable for a scalable TCP. | equivalent to a drop, making it unusable for a Scalable TCP. | |||
Therefore, the implementation of TCP receivers will have to be | Therefore, the implementation of TCP receivers will have to be | |||
upgraded [RFC7560]. Work to standardize and implement more | upgraded [RFC7560]. Work to standardize and implement more | |||
accurate ECN feedback for TCP (AccECN) is in | accurate ECN feedback for TCP (AccECN) is in progress [ACCECN] | |||
progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux]. | [PragueLinux]. | |||
* ECN feedback was only roughly sketched in an appendix of the | * ECN feedback was only roughly sketched in the appendix of the | |||
now obsoleted second specification of SCTP [RFC4960], while a | now obsoleted second specification of SCTP [RFC4960], while a | |||
fuller specification was proposed in a long-expired | fuller specification was proposed in a long-expired document | |||
draft [I-D.stewart-tsvwg-sctpecn]. A new design would need to | [ECN-SCTP]. A new design would need to be implemented and | |||
be implemented and deployed before SCTP could support L4S. | deployed before SCTP could support L4S. | |||
* For RTP, sufficient ECN feedback was defined in [RFC6679], but | * For RTP, sufficient ECN feedback was defined in [RFC6679], but | |||
[RFC8888] defines the latest standards track improvements. | [RFC8888] defines the latest Standards Track improvements. | |||
5. Rationale | 5. Rationale | |||
5.1. Why These Primary Components? | 5.1. Why These Primary Components? | |||
Explicit congestion signalling (protocol): Explicit congestion | Explicit congestion signalling (protocol): Explicit congestion | |||
signalling is a key part of the L4S approach. In contrast, use of | signalling is a key part of the L4S approach. In contrast, use of | |||
drop as a congestion signal creates a tension because drop is both | drop as a congestion signal creates tension because drop is both | |||
an impairment (less would be better) and a useful signal (more | an impairment (less would be better) and a useful signal (more | |||
would be better): | would be better): | |||
* Explicit congestion signals can be used many times per round | * Explicit congestion signals can be used many times per round | |||
trip, to keep tight control, without any impairment. Under | trip to keep tight control without any impairment. Under heavy | |||
heavy load, even more explicit signals can be applied, so that | load, even more explicit signals can be applied so that the | |||
the queue can be kept short whatever the load. In contrast, | queue can be kept short whatever the load. In contrast, | |||
Classic AQMs have to introduce very high packet drop at high | Classic AQMs have to introduce very high packet drop at high | |||
load to keep the queue short. By using ECN, an L4S congestion | load to keep the queue short. By using ECN, an L4S congestion | |||
control's sawtooth reduction can be smaller and therefore | control's sawtooth reduction can be smaller and therefore | |||
return to the operating point more often, without worrying that | return to the operating point more often, without worrying that | |||
more sawteeth will cause more signals. The consequent smaller | more sawteeth will cause more signals. The consequent smaller | |||
amplitude sawteeth fit between an empty queue and a very | amplitude sawteeth fit between an empty queue and a very | |||
shallow marking threshold (~1 ms in the public Internet), so | shallow marking threshold (~1 ms in the public Internet), so | |||
queue delay variation can be very low, without risk of under- | queue delay variation can be very low, without risk of | |||
utilization. | underutilization. | |||
* Explicit congestion signals can be emitted immediately to track | * Explicit congestion signals can be emitted immediately to track | |||
fluctuations of the queue. L4S shifts smoothing from the | fluctuations of the queue. L4S shifts smoothing from the | |||
network to the host. The network doesn't know the round trip | network to the host. The network doesn't know the round-trip | |||
times of any of the flows. So if the network is responsible | times (RTTs) of any of the flows. So if the network is | |||
for smoothing (as in the Classic approach), it has to assume a | responsible for smoothing (as in the Classic approach), it has | |||
worst case RTT, otherwise long RTT flows would become unstable. | to assume a worst case RTT, otherwise long RTT flows would | |||
This delays Classic congestion signals by 100-200 ms. In | become unstable. This delays Classic congestion signals by | |||
contrast, each host knows its own round trip time. So, in the | 100-200 ms. In contrast, each host knows its own RTT. So, in | |||
L4S approach, the host can smooth each flow over its own RTT, | the L4S approach, the host can smooth each flow over its own | |||
introducing no more smoothing delay than strictly necessary | RTT, introducing no more smoothing delay than strictly | |||
(usually only a few milliseconds). A host can also choose not | necessary (usually only a few milliseconds). A host can also | |||
to introduce any smoothing delay if appropriate, e.g. during | choose not to introduce any smoothing delay if appropriate, | |||
flow start-up. | e.g., during flow start-up. | |||
Neither of the above are feasible if explicit congestion | Neither of the above are feasible if explicit congestion | |||
signalling has to be considered 'equivalent to drop' (as was | signalling has to be considered 'equivalent to drop' (as was | |||
required with Classic ECN [RFC3168]), because drop is an | required with Classic ECN [RFC3168]), because drop is an | |||
impairment as well as a signal. So drop cannot be excessively | impairment as well as a signal. So drop cannot be excessively | |||
frequent, and drop cannot be immediate, otherwise too many drops | frequent, and drop cannot be immediate; otherwise, too many drops | |||
would turn out to have been due to only a transient fluctuation in | would turn out to have been due to only a transient fluctuation in | |||
the queue that would not have warranted dropping a packet in | the queue that would not have warranted dropping a packet in | |||
hindsight. Therefore, in an L4S AQM, the L4S queue uses a new L4S | hindsight. Therefore, in an L4S AQM, the L4S queue uses a new L4S | |||
variant of ECN that is not equivalent to drop (see section 5.2 of | variant of ECN that is not equivalent to drop (see Section 5.2 of | |||
the L4S ECN spec [I-D.ietf-tsvwg-ecn-l4s-id]), while the Classic | the L4S ECN spec [RFC9331]), while the Classic queue uses either | |||
queue uses either Classic ECN [RFC3168] or drop, which are | Classic ECN [RFC3168] or drop, which are still equivalent to each | |||
equivalent to each other. | other. | |||
Before Classic ECN was standardized, there were various proposals | Before Classic ECN was standardized, there were various proposals | |||
to give an ECN mark a different meaning from drop. However, there | to give an ECN mark a different meaning from drop. However, there | |||
was no particular reason to agree on any one of the alternative | was no particular reason to agree on any one of the alternative | |||
meanings, so 'equivalent to drop' was the only compromise that | meanings, so 'equivalent to drop' was the only compromise that | |||
could be reached. RFC 3168 contains a statement that: | could be reached. [RFC3168] contains a statement that: | |||
"An environment where all end nodes were ECN-Capable could | An environment where all end nodes were ECN-Capable could | |||
allow new criteria to be developed for setting the CE | allow new criteria to be developed for setting the CE | |||
codepoint, and new congestion control mechanisms for end-node | codepoint, and new congestion control mechanisms for end-node | |||
reaction to CE packets. However, this is a research issue, and | reaction to CE packets. However, this is a research issue, | |||
as such is not addressed in this document." | and as such is not addressed in this document. | |||
Latency isolation (network): L4S congestion controls keep queue | Latency isolation (network): L4S congestion controls keep queue | |||
delay low whereas Classic congestion controls need a queue of the | delay low, whereas Classic congestion controls need a queue of the | |||
order of the RTT to avoid under-utilization. One queue cannot | order of the RTT to avoid underutilization. One queue cannot have | |||
have two lengths, therefore L4S traffic needs to be isolated in a | two lengths; therefore, L4S traffic needs to be isolated in a | |||
separate queue (e.g. DualQ) or queues (e.g. FQ). | separate queue (e.g., DualQ) or queues (e.g., FQ). | |||
Coupled congestion notification: Coupling the congestion | Coupled congestion notification: Coupling the congestion | |||
notification between two queues as in the DualQ Coupled AQM is not | notification between two queues as in the DualQ Coupled AQM is not | |||
necessarily essential, but it is a simple way to allow senders to | necessarily essential, but it is a simple way to allow senders to | |||
determine their rate, packet by packet, rather than be overridden | determine their rate packet by packet, rather than be overridden | |||
by a network scheduler. An alternative is for a network scheduler | by a network scheduler. An alternative is for a network scheduler | |||
to control the rate of each application flow (see discussion in | to control the rate of each application flow (see the discussion | |||
Section 5.2). | in Section 5.2). | |||
L4S packet identifier (protocol): Once there are at least two | L4S packet identifier (protocol): Once there are at least two | |||
treatments in the network, hosts need an identifier at the IP | treatments in the network, hosts need an identifier at the IP | |||
layer to distinguish which treatment they intend to use. | layer to distinguish which treatment they intend to use. | |||
Scalable congestion notification: A scalable congestion control in | Scalable congestion notification: A Scalable congestion control in | |||
the host keeps the signalling frequency from the network high | the host keeps the signalling frequency from the network high, | |||
whatever the flow rate, so that queue delay variations can be | whatever the flow rate, so that queue delay variations can be | |||
small when conditions are stable, and rate can track variations in | small when conditions are stable, and rate can track variations in | |||
available capacity as rapidly as possible otherwise. | available capacity as rapidly as possible otherwise. | |||
Low loss: Latency is not the only concern of L4S. The 'Low Loss' | Low loss: Latency is not the only concern of L4S. The 'Low Loss' | |||
part of the name denotes that L4S generally achieves zero | part of the name denotes that L4S generally achieves zero | |||
congestion loss due to its use of ECN. Otherwise, loss would | congestion loss due to its use of ECN. Otherwise, loss would | |||
itself cause delay, particularly for short flows, due to | itself cause delay, particularly for short flows, due to | |||
retransmission delay [RFC2884]. | retransmission delay [RFC2884]. | |||
Scalable throughput: The "Scalable throughput" part of the name | Scalable throughput: The 'Scalable throughput' part of the name | |||
denotes that the per-flow throughput of scalable congestion | denotes that the per-flow throughput of Scalable congestion | |||
controls should scale indefinitely, avoiding the imminent scaling | controls should scale indefinitely, avoiding the imminent scaling | |||
problems with Reno-friendly congestion control | problems with Reno-friendly congestion control algorithms | |||
algorithms [RFC3649]. It was known when TCP congestion avoidance | [RFC3649]. It was known when TCP congestion avoidance was first | |||
was first developed in 1988 that it would not scale to high | developed in 1988 that it would not scale to high bandwidth-delay | |||
bandwidth-delay products (see footnote 6 in [TCP-CA]). Today, | products (see footnote 6 in [TCP-CA]). Today, regular broadband | |||
regular broadband flow rates over WAN distances are already beyond | flow rates over WAN distances are already beyond the scaling range | |||
the scaling range of Classic Reno congestion control. So `less | of Classic Reno congestion control. So 'less unscalable' CUBIC | |||
unscalable' Cubic [RFC8312] and Compound [I-D.sridharan-tcpm-ctcp] | [RFC8312] and Compound [CTCP] variants of TCP have been | |||
variants of TCP have been successfully deployed. However, these | successfully deployed. However, these are now approaching their | |||
are now approaching their scaling limits. | scaling limits. | |||
For instance, we will consider a scenario with a maximum RTT of | For instance, we will consider a scenario with a maximum RTT of 30 | |||
30 ms at the peak of each sawtooth. As Reno packet rate scales 8x | ms at the peak of each sawtooth. As Reno packet rate scales 8 | |||
from 1,250 to 10,000 packet/s (from 15 to 120 Mb/s with 1500 B | times from 1,250 to 10,000 packet/s (from 15 to 120 Mb/s with 1500 | |||
packets), the time to recover from a congestion event rises | B packets), the time to recover from a congestion event rises | |||
proportionately by 8x as well, from 422 ms to 3.38 s. It is | proportionately by 8 times as well, from 422 ms to 3.38 s. It is | |||
clearly problematic for a congestion control to take multiple | clearly problematic for a congestion control to take multiple | |||
seconds to recover from each congestion event. Cubic [RFC8312] | seconds to recover from each congestion event. CUBIC [RFC8312] | |||
was developed to be less unscalable, but it is approaching its | was developed to be less unscalable, but it is approaching its | |||
scaling limit; with the same max RTT of 30 ms, at 120 Mb/s Cubic | scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC | |||
is still fully in its Reno-friendly mode, so it takes about 4.3 s | is still fully in its Reno-friendly mode, so it takes about 4.3 s | |||
to recover. However, once the flow rate scales by 8x again to | to recover. However, once flow rate scales by 8 times again to | |||
960 Mb/s it enters true Cubic mode, with a recovery time of | 960 Mb/s it enters true CUBIC mode, with a recovery time of 12.2 | |||
12.2 s. From then on, each further scaling by 8x doubles Cubic's | s. From then on, each further scaling by 8 times doubles CUBIC's | |||
recovery time (because the cube root of 8 is 2), e.g. at 7.68 Gb/s | recovery time (because the cube root of 8 is 2), e.g., at 7.68 Gb/ | |||
the recovery time is 24.3 s. In contrast, a scalable congestion | s, the recovery time is 24.3 s. In contrast, a Scalable | |||
control like DCTCP or TCP Prague induces 2 congestion signals per | congestion control like DCTCP or Prague induces 2 congestion | |||
round trip on average, which remains invariant for any flow rate, | signals per round trip on average, which remains invariant for any | |||
keeping dynamic control very tight. | flow rate, keeping dynamic control very tight. | |||
For a feel of where the global average lone-flow download sits on | For a feel of where the global average lone-flow download sits on | |||
this scale at the time of writing (2021), according to [BDPdata] | this scale at the time of writing (2021), according to [BDPdata], | |||
globally averaged fixed access capacity was 103 Mb/s in 2020 and | the global average fixed access capacity was 103 Mb/s in 2020 and | |||
averaged base RTT to a CDN was 25-34ms in 2019. Averaging of per- | the average base RTT to a CDN was 25 to 34 ms in 2019. Averaging | |||
country data was weighted by Internet user population (data | of per-country data was weighted by Internet user population (data | |||
collected globally is necessarily of variable quality, but the | collected globally is necessarily of variable quality, but the | |||
paper does double-check that the outcome compares well against a | paper does double-check that the outcome compares well against a | |||
second source). So a lone CUBIC flow would at best take about 200 | second source). So a lone CUBIC flow would at best take about 200 | |||
round trips (5 s) to recover from each of its sawtooth reductions, | round trips (5 s) to recover from each of its sawtooth reductions, | |||
if the flow even lasted that long. This is described as 'at best' | if the flow even lasted that long. This is described as 'at best' | |||
because it assumes everyone uses an AQM, whereas in reality most | because it assumes everyone uses an AQM, whereas in reality, most | |||
users still have a (probably bloated) tail-drop buffer. In the | users still have a (probably bloated) tail-drop buffer. In the | |||
tail-drop case, likely average recovery time would be at least 4x | tail-drop case, the likely average recovery time would be at least | |||
5 s, if not more, because RTT under load would be at least double | 4 times 5 s, if not more, because RTT under load would be at least | |||
that of an AQM, and recovery time depends on the square of RTT. | double that of an AQM, and the recovery time of Reno-friendly | |||
| | flows depends on the square of RTT. | |||
Although work on scaling congestion controls tends to start with | Although work on scaling congestion controls tends to start with | |||
TCP as the transport, the above is not intended to exclude other | TCP as the transport, the above is not intended to exclude other | |||
transports (e.g. SCTP, QUIC) or less elastic algorithms | transports (e.g., SCTP and QUIC) or less elastic algorithms (e.g., | |||
(e.g. RMCAT), which all tend to adopt the same or similar | RMCAT), which all tend to adopt the same or similar developments. | |||
developments. | ||||
5.2. What L4S adds to Existing Approaches | 5.2. What L4S Adds to Existing Approaches | |||
All the following approaches address some part of the same problem | All the following approaches address some part of the same problem | |||
space as L4S. In each case, it is shown that L4S complements them or | space as L4S. In each case, it is shown that L4S complements them or | |||
improves on them, rather than being a mutually exclusive alternative: | improves on them, rather than being a mutually exclusive alternative: | |||
Diffserv: Diffserv addresses the problem of bandwidth apportionment | Diffserv: Diffserv addresses the problem of bandwidth apportionment | |||
for important traffic as well as queuing latency for delay- | for important traffic as well as queuing latency for delay- | |||
sensitive traffic. Of these, L4S solely addresses the problem of | sensitive traffic. Of these, L4S solely addresses the problem of | |||
queuing latency. Diffserv will still be necessary where important | queuing latency. Diffserv will still be necessary where important | |||
traffic requires priority (e.g. for commercial reasons, or for | traffic requires priority (e.g., for commercial reasons or for | |||
protection of critical infrastructure traffic) - see | protection of critical infrastructure traffic) -- see | |||
[I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach | [L4S-DIFFSERV]. Nonetheless, the L4S approach can provide low | |||
can provide low latency for all traffic within each Diffserv class | latency for all traffic within each Diffserv class (including the | |||
(including the case where there is only the one default Diffserv | case where there is only the one default Diffserv class). | |||
class). | ||||
Also, Diffserv can only provide a latency benefit if a small | Also, Diffserv can only provide a latency benefit if a small | |||
subset of the traffic on a bottleneck link requests low latency. | subset of the traffic on a bottleneck link requests low latency. | |||
As already explained, it has no effect when all the applications | As already explained, it has no effect when all the applications | |||
in use at one time at a single site (home, small business or | in use at one time at a single site (e.g., a home, small business, | |||
mobile device) require low latency. In contrast, because L4S | or mobile device) require low latency. In contrast, because L4S | |||
works for all traffic, it needs none of the management baggage | works for all traffic, it needs none of the management baggage | |||
(traffic policing, traffic contracts) associated with favouring | (traffic policing or traffic contracts) associated with favouring | |||
some packets over others. This lack of management baggage ought | some packets over others. This lack of management baggage ought | |||
to give L4S a better chance of end-to-end deployment. | to give L4S a better chance of end-to-end deployment. | |||
In particular, because networks tend not to trust end systems to | In particular, if networks do not trust end systems to identify | |||
identify which packets should be favoured over others, where | which packets should be favoured, they assign packets to Diffserv | |||
networks assign packets to Diffserv classes they tend to use | classes themselves. However, the techniques available to such | |||
packet inspection of application flow identifiers or deeper | networks, like inspection of flow identifiers or deeper inspection | |||
inspection of application signatures. Thus, nowadays, Diffserv | of application signatures, do not always sit well with encryption | |||
doesn't always sit well with encryption of the layers above IP | of the layers above IP [RFC8404]. In these cases, users can have | |||
[RFC8404]. So users have to choose between privacy and QoS. | either privacy or Quality of Service (QoS), but not both. | |||
As with Diffserv, the L4S identifier is in the IP header. But, in | As with Diffserv, the L4S identifier is in the IP header. But, in | |||
contrast to Diffserv, the L4S identifier does not convey a want or | contrast to Diffserv, the L4S identifier does not convey a want or | |||
a need for a certain level of quality. Rather, it promises a | a need for a certain level of quality. Rather, it promises a | |||
certain behaviour (scalable congestion response), which networks | certain behaviour (Scalable congestion response), which networks | |||
can objectively verify if they need to. This is because low delay | can objectively verify if they need to. This is because low delay | |||
depends on collective host behaviour, whereas bandwidth priority | depends on collective host behaviour, whereas bandwidth priority | |||
depends on network behaviour. | depends on network behaviour. | |||
State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a | State-of-the-art AQMs: AQMs for Classic traffic, such as PIE and FQ- | |||
significant reduction in queuing delay relative to no AQM at all. | CoDel, give a significant reduction in queuing delay relative to | |||
L4S is intended to complement these AQMs, and should not distract | no AQM at all. L4S is intended to complement these AQMs and | |||
from the need to deploy them as widely as possible. Nonetheless, | should not distract from the need to deploy them as widely as | |||
AQMs alone cannot reduce queuing delay too far without | possible. Nonetheless, AQMs alone cannot reduce queuing delay too | |||
significantly reducing link utilization, because the root cause of | far without significantly reducing link utilization, because the | |||
the problem is on the host - where Classic congestion controls use | root cause of the problem is on the host -- where Classic | |||
large saw-toothing rate variations. The L4S approach resolves | congestion controls use large sawtoothing rate variations. The | |||
this tension between delay and utilization by enabling hosts to | L4S approach resolves this tension between delay and utilization | |||
minimize the amplitude of their sawteeth. A single-queue Classic | by enabling hosts to minimize the amplitude of their sawteeth. A | |||
AQM is not sufficient to allow hosts to use small sawteeth for two | single-queue Classic AQM is not sufficient to allow hosts to use | |||
reasons: i) smaller sawteeth would not get lower delay in an AQM | small sawteeth for two reasons: i) smaller sawteeth would not get | |||
designed for larger amplitude Classic sawteeth, because a queue | lower delay in an AQM designed for larger amplitude Classic | |||
can only have one length at a time; and ii) much smaller sawteeth | sawteeth, because a queue can only have one length at a time and | |||
implies much more frequent sawteeth, so L4S flows would drive a | ii) much smaller sawteeth implies much more frequent sawteeth, so | |||
Classic AQM into a high level of ECN-marking, which would appear | L4S flows would drive a Classic AQM into a high level of ECN- | |||
as heavy congestion to Classic flows, which in turn would greatly | marking, which would appear as heavy congestion to Classic flows, | |||
reduce their rate as a result (see Section 6.4.4). | which in turn would greatly reduce their rate as a result (see | |||
| | Section 6.4.4). | |||
Per-flow queuing or marking: Similarly, per-flow approaches such as | Per-flow queuing or marking: Similarly, per-flow approaches, such as | |||
FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the | FQ-CoDel or Approx Fair CoDel [AFCD], are not incompatible with | |||
L4S approach. However, per-flow queuing alone is not enough - it | the L4S approach. However, per-flow queuing alone is not enough | |||
only isolates the queuing of one flow from others; not from | -- it only isolates the queuing of one flow from others, not from | |||
itself. Per-flow implementations need to have support for | itself. Per-flow implementations need to have support for | |||
scalable congestion control added, which has already been done for | Scalable congestion control added, which has already been done for | |||
FQ-CoDel in Linux (see Sec.5.2.7 of [RFC8290] and | FQ-CoDel in Linux (see Section 5.2.7 of [RFC8290] and | |||
[FQ_CoDel_Thresh]). Without this simple modification, per-flow | [FQ_CoDel_Thresh]). Without this simple modification, per-flow | |||
AQMs like FQ-CoDel would still not be able to support applications | AQMs, like FQ-CoDel, would still not be able to support | |||
that need both very low delay and high bandwidth, e.g. video-based | applications that need both very low delay and high bandwidth, | |||
control of remote procedures, or interactive cloud-based video | e.g., video-based control of remote procedures or interactive | |||
(see Note 1 below). | cloud-based video (see Note 1 below). | |||
Although per-flow techniques are not incompatible with L4S, it is | Although per-flow techniques are not incompatible with L4S, it is | |||
important to have the DualQ alternative. This is because handling | important to have the DualQ alternative. This is because handling | |||
end-to-end (layer 4) flows in the network (layer 3 or 2) precludes | end-to-end (layer 4) flows in the network (layer 3 or 2) precludes | |||
some important end-to-end functions. For instance: | some important end-to-end functions. For instance: | |||
a. Per-flow forms of L4S like FQ-CoDel are incompatible with full | A. Per-flow forms of L4S, like FQ-CoDel, are incompatible with | |||
end-to-end encryption of transport layer identifiers for | full end-to-end encryption of transport layer identifiers for | |||
privacy and confidentiality (e.g. IPSec or encrypted VPN | privacy and confidentiality (e.g., IPsec or encrypted VPN | |||
tunnels, as opposed to DTLS over UDP), because they require | tunnels, as opposed to DTLS over UDP), because they require | |||
packet inspection to access the end-to-end transport flow | packet inspection to access the end-to-end transport flow | |||
identifiers. | identifiers. | |||
In contrast, the DualQ form of L4S requires no deeper | In contrast, the DualQ form of L4S requires no deeper | |||
inspection than the IP layer. So, as long as operators take | inspection than the IP layer. So as long as operators take | |||
the DualQ approach, their users can have both very low queuing | the DualQ approach, their users can have both very low queuing | |||
delay and full end-to-end encryption [RFC8404]. | delay and full end-to-end encryption [RFC8404]. | |||
b. With per-flow forms of L4S, the network takes over control of | B. With per-flow forms of L4S, the network takes over control of | |||
the relative rates of each application flow. Some see it as | the relative rates of each application flow. Some see it as | |||
an advantage that the network will prevent some flows running | an advantage that the network will prevent some flows running | |||
faster than others. Others consider it an inherent part of | faster than others. Others consider it an inherent part of | |||
the Internet's appeal that applications can control their rate | the Internet's appeal that applications can control their rate | |||
while taking account of the needs of others via congestion | while taking account of the needs of others via congestion | |||
signals. They maintain that this has allowed applications | signals. They maintain that this has allowed applications | |||
with interesting rate behaviours to evolve, for instance, | with interesting rate behaviours to evolve, for instance: i) a | |||
variable bit-rate video that varies around an equal share | variable bit-rate video that varies around an equal share, | |||
rather than being forced to remain equal at every instant, or | rather than being forced to remain equal at every instant or | |||
e2e scavenger behaviours [RFC6817] that use less than an equal | ii) end-to-end scavenger behaviours [RFC6817] that use less | |||
share of capacity [LEDBAT_AQM]. | than an equal share of capacity [LEDBAT_AQM]. | |||
The L4S architecture does not require the IETF to commit to | The L4S architecture does not require the IETF to commit to | |||
one approach over the other, because it supports both, so that | one approach over the other, because it supports both so that | |||
the 'market' can decide. Nonetheless, in the spirit of 'Do | the 'market' can decide. Nonetheless, in the spirit of 'Do | |||
one thing and do it well' [McIlroy78], the DualQ option | one thing and do it well' [McIlroy78], the DualQ option | |||
provides low delay without prejudging the issue of flow-rate | provides low delay without prejudging the issue of flow-rate | |||
control. Then, flow rate policing can be added separately if | control. Then, flow rate policing can be added separately if | |||
desired. This allows application control up to a point, but | desired. In contrast to scheduling, a policer would allow | |||
the network can still choose to set the point at which it | application control up to a point, but the network would still | |||
intervenes to prevent one flow completely starving another. | be able to set the point at which it intervened to prevent one | |||
flow completely starving another. | ||||
Note: | Note: | |||
1. It might seem that self-inflicted queuing delay within a per- | 1. It might seem that self-inflicted queuing delay within a per- | |||
flow queue should not be counted, because if the delay wasn't | flow queue should not be counted, because if the delay wasn't | |||
in the network it would just shift to the sender. However, | in the network, it would just shift to the sender. However, | |||
modern adaptive applications, e.g. HTTP/2 [RFC9113] or some | modern adaptive applications, e.g., HTTP/2 [RFC9113] or some | |||
interactive media applications (see Section 6.1), can keep low | interactive media applications (see Section 6.1), can keep low | |||
latency objects at the front of their local send queue by | latency objects at the front of their local send queue by | |||
shuffling priorities of other objects dependent on the | shuffling priorities of other objects dependent on the | |||
progress of other transfers (for example see [lowat]). They | progress of other transfers (for example, see [lowat]). They | |||
cannot shuffle objects once they have released them into the | cannot shuffle objects once they have released them into the | |||
network. | network. | |||
Alternative Back-off ECN (ABE): Here again, L4S is not an | Alternative Back-off ECN (ABE): Here again, L4S is not an | |||
alternative to ABE but a complement that introduces much lower | alternative to ABE but a complement that introduces much lower | |||
queuing delay. ABE [RFC8511] alters the host behaviour in | queuing delay. ABE [RFC8511] alters the host behaviour in | |||
response to ECN marking to utilize a link better and give ECN | response to ECN marking to utilize a link better and give ECN | |||
flows faster throughput. It uses ECT(0) and assumes the network | flows faster throughput. It uses ECT(0) and assumes the network | |||
still treats ECN and drop the same. Therefore, ABE exploits any | still treats ECN and drop the same. Therefore, ABE exploits any | |||
lower queuing delay that AQMs can provide. But, as explained | lower queuing delay that AQMs can provide. But, as explained | |||
above, AQMs still cannot reduce queuing delay too far without | above, AQMs still cannot reduce queuing delay too much without | |||
losing link utilization (to allow for other, non-ABE, flows). | losing link utilization (to allow for other, non-ABE, flows). | |||
BBR: Bottleneck Bandwidth and Round-trip propagation time | BBR: Bottleneck Bandwidth and Round-trip propagation time (BBR) | |||
(BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing | [BBR-CC] controls queuing delay end-to-end without needing any | |||
delay end-to-end without needing any special logic in the network, | special logic in the network, such as an AQM. So it works pretty | |||
such as an AQM. So it works pretty-much on any path. BBR keeps | much on any path. BBR keeps queuing delay reasonably low, but | |||
queuing delay reasonably low, but perhaps not quite as low as with | perhaps not quite as low as with state-of-the-art AQMs, such as | |||
state-of-the-art AQMs such as PIE or FQ-CoDel, and certainly | PIE or FQ-CoDel, and certainly nowhere near as low as with L4S. | |||
nowhere near as low as with L4S. Queuing delay is also not | Queuing delay is also not consistently low, due to BBR's regular | |||
consistently low, due to BBR's regular bandwidth probing spikes | bandwidth probing spikes and its aggressive flow start-up phase. | |||
and its aggressive flow start-up phase. | ||||
L4S complements BBR. Indeed, BBRv2 can use L4S ECN where | L4S complements BBR. Indeed, BBRv2 can use L4S ECN where | |||
available and a scalable L4S congestion control behaviour in | available and a Scalable L4S congestion control behaviour in | |||
response to any ECN signalling from the path [BBRv2]. The L4S ECN | response to any ECN signalling from the path [BBRv2]. The L4S ECN | |||
signal complements the delay based congestion control aspects of | signal complements the delay-based congestion control aspects of | |||
BBR with an explicit indication that hosts can use, both to | BBR with an explicit indication that hosts can use, both to | |||
converge on a fair rate and to keep below a shallow queue target | converge on a fair rate and to keep below a shallow queue target | |||
set by the network. Without L4S ECN, both these aspects need to | set by the network. Without L4S ECN, both these aspects need to | |||
be assumed or estimated. | be assumed or estimated. | |||
6. Applicability | 6. Applicability | |||
6.1. Applications | 6.1. Applications | |||
A transport layer that solves the current latency issues will provide | A transport layer that solves the current latency issues will provide | |||
new service, product and application opportunities. | new service, product, and application opportunities. | |||
With the L4S approach, the following existing applications also | With the L4S approach, the following existing applications also | |||
experience significantly better quality of experience under load: | experience significantly better quality of experience under load: | |||
* Gaming, including cloud based gaming; | * gaming, including cloud-based gaming; | |||
* VoIP; | * VoIP; | |||
* Video conferencing; | * video conferencing; | |||
* Web browsing; | * web browsing; | |||
* (Adaptive) video streaming; | * (adaptive) video streaming; and | |||
* Instant messaging. | * instant messaging. | |||
The significantly lower queuing latency also enables some interactive | The significantly lower queuing latency also enables some interactive | |||
application functions to be offloaded to the cloud that would hardly | application functions to be offloaded to the cloud that would hardly | |||
even be usable today: | even be usable today, including: | |||
* Cloud based interactive video; | * cloud-based interactive video and | |||
* Cloud based virtual and augmented reality. | * cloud-based virtual and augmented reality. | |||
The above two applications have been successfully demonstrated with | The above two applications have been successfully demonstrated with | |||
L4S, both running together over a 40 Mb/s broadband access link | L4S, both running together over a 40 Mb/s broadband access link | |||
loaded up with the numerous other latency sensitive applications in | loaded up with the numerous other latency-sensitive applications in | |||
the previous list as well as numerous downloads - all sharing the | the previous list, as well as numerous downloads, with all sharing | |||
same bottleneck queue simultaneously [L4Sdemo16]. For the former, a | the same bottleneck queue simultaneously [L4Sdemo16] | |||
panoramic video of a football stadium could be swiped and pinched so | [L4Sdemo16-Video]. For the former, a panoramic video of a football | |||
that, on the fly, a proxy in the cloud could generate a sub-window of | stadium could be swiped and pinched so that, on the fly, a proxy in | |||
the match video under the finger-gesture control of each user. For | the cloud could generate a sub-window of the match video under the | |||
the latter, a virtual reality headset displayed a viewport taken from | finger-gesture control of each user. For the latter, a virtual | |||
a 360-degree camera in a racing car. The user's head movements | reality headset displayed a viewport taken from a 360-degree camera | |||
controlled the viewport extracted by a cloud-based proxy. In both | in a racing car. The user's head movements controlled the viewport | |||
cases, with 7 ms end-to-end base delay, the additional queuing delay | extracted by a cloud-based proxy. In both cases, with a 7 ms end-to- | |||
of roughly 1 ms was so low that it seemed the video was generated | end base delay, the additional queuing delay of roughly 1 ms was so | |||
locally. | low that it seemed the video was generated locally. | |||
Using a swiping finger gesture or head movement to pan a video are | Using a swiping finger gesture or head movement to pan a video are | |||
extremely latency-demanding actions -- far more demanding than VoIP. | extremely latency-demanding actions -- far more demanding than VoIP | |||
Because human vision can detect extremely low delays of the order of | -- because human vision can detect extremely low delays of the order | |||
single milliseconds when delay is translated into a visual lag | of single milliseconds when delay is translated into a visual lag | |||
between a video and a reference point (the finger or the orientation | between a video and a reference point (the finger or the orientation | |||
of the head sensed by the balance system in the inner ear -- the | of the head sensed by the balance system in the inner ear, i.e., the | |||
vestibular system). With an alternative AQM, the video noticeably | vestibular system). With an alternative AQM, the video noticeably | |||
lagged behind the finger gestures and head movements. | lagged behind the finger gestures and head movements. | |||
Without the low queuing delay of L4S, cloud-based applications like | Without the low queuing delay of L4S, cloud-based applications like | |||
these would not be credible without significantly more access | these would not be credible without significantly more access-network | |||
bandwidth (to deliver all possible video that might be viewed) and | bandwidth (to deliver all possible areas of the video that might be | |||
more local processing, which would increase the weight and power | viewed) and more local processing, which would increase the weight | |||
consumption of head-mounted displays. When all interactive | and power consumption of head-mounted displays. When all interactive | |||
processing can be done in the cloud, only the data to be rendered for | processing can be done in the cloud, only the data to be rendered for | |||
the end user needs to be sent. | the end user needs to be sent. | |||
Other low latency high bandwidth applications such as: | Other low latency high bandwidth applications, such as: | |||
* Interactive remote presence; | * interactive remote presence and | |||
* Video-assisted remote control of machinery or industrial | * video-assisted remote control of machinery or industrial processes | |||
processes. | ||||
are not credible at all without very low queuing delay. No amount of | are not credible at all without very low queuing delay. No amount of | |||
extra access bandwidth or local processing can make up for lost time. | extra access bandwidth or local processing can make up for lost time. | |||
6.2. Use Cases | 6.2. Use Cases | |||
The following use-cases for L4S are being considered by various | The following use cases for L4S are being considered by various | |||
interested parties: | interested parties: | |||
* Where the bottleneck is one of various types of access network: | * where the bottleneck is one of various types of access network, | |||
e.g. DSL, Passive Optical Networks (PON), DOCSIS cable, mobile, | e.g., DSL, Passive Optical Networks (PONs), DOCSIS cable, mobile, | |||
satellite (see Section 6.3 for some technology-specific details) | satellite; or where it's a Wi-Fi link (see Section 6.3 for some | |||
technology-specific details) | ||||
* Private networks of heterogeneous data centres, where there is no | * private networks of heterogeneous data centres, where there is no | |||
single administrator that can arrange for all the simultaneous | single administrator that can arrange for all the simultaneous | |||
changes to senders, receivers and network needed to deploy DCTCP: | changes to senders, receivers, and networks needed to deploy | |||
DCTCP: | ||||
- a set of private data centres interconnected over a wide area | - a set of private data centres interconnected over a wide area | |||
with separate administrations, but within the same company | with separate administrations but within the same company | |||
- a set of data centres operated by separate companies | - a set of data centres operated by separate companies | |||
interconnected by a community of interest network (e.g. for the | interconnected by a community of interest network (e.g., for | |||
finance sector) | the finance sector) | |||
- multi-tenant (cloud) data centres where tenants choose their | - multi-tenant (cloud) data centres where tenants choose their | |||
operating system stack (Infrastructure as a Service - IaaS) | operating system stack (Infrastructure as a Service (IaaS)) | |||
* Different types of transport (or application) congestion control: | * different types of transport (or application) congestion control: | |||
- elastic (TCP/SCTP); | - elastic (TCP/SCTP); | |||
- real-time (RTP, RMCAT); | - real-time (RTP, RMCAT); and | |||
- query (DNS/LDAP). | - query-response (DNS/LDAP). | |||
* Where low delay quality of service is required, but without | * where low delay QoS is required but without inspecting or | |||
inspecting or intervening above the IP layer [RFC8404]: | intervening above the IP layer [RFC8404]: | |||
- mobile and other networks have tended to inspect higher layers | - Mobile and other networks have tended to inspect higher layers | |||
in order to guess application QoS requirements. However, with | in order to guess application QoS requirements. However, with | |||
growing demand for support of privacy and encryption, L4S | growing demand for support of privacy and encryption, L4S | |||
offers an alternative. There is no need to select which | offers an alternative. There is no need to select which | |||
traffic to favour for queuing, when L4S can give favourable | traffic to favour for queuing when L4S can give favourable | |||
queuing to all traffic. | queuing to all traffic. | |||
* If queuing delay is minimized, applications with a fixed delay | * If queuing delay is minimized, applications with a fixed delay | |||
budget can communicate over longer distances, or via a longer | budget can communicate over longer distances or via more | |||
chain of service functions [RFC7665] or onion routers. | circuitous paths, e.g., longer chains of service functions | |||
[RFC7665] or of onion routers. | ||||
* If delay jitter is minimized, it is possible to reduce the | * If delay jitter is minimized, it is possible to reduce the | |||
dejitter buffers on the receive end of video streaming, which | dejitter buffers on the receiving end of video streaming, which | |||
should improve the interactive experience | should improve the interactive experience. | |||
6.3. Applicability with Specific Link Technologies | 6.3. Applicability with Specific Link Technologies | |||
Certain link technologies aggregate data from multiple packets into | Certain link technologies aggregate data from multiple packets into | |||
bursts, and buffer incoming packets while building each burst. Wi- | bursts and buffer incoming packets while building each burst. Wi-Fi, | |||
Fi, PON and cable all involve such packet aggregation, whereas fixed | PON, and cable all involve such packet aggregation, whereas fixed | |||
Ethernet and DSL do not. No sender, whether L4S or not, can do | Ethernet and DSL do not. No sender, whether L4S or not, can do | |||
anything to reduce the buffering needed for packet aggregation. So | anything to reduce the buffering needed for packet aggregation. So | |||
an AQM should not count this buffering as part of the queue that it | an AQM should not count this buffering as part of the queue that it | |||
controls, given no amount of congestion signals will reduce it. | controls, given no amount of congestion signals will reduce it. | |||
Certain link technologies also add buffering for other reasons, | Certain link technologies also add buffering for other reasons, | |||
specifically: | specifically: | |||
* Radio links (cellular, Wi-Fi, satellite) that are distant from the | * Radio links (cellular, Wi-Fi, or satellite) that are distant from | |||
source are particularly challenging. The radio link capacity can | the source are particularly challenging. The radio link capacity | |||
vary rapidly by orders of magnitude, so it is considered desirable | can vary rapidly by orders of magnitude, so it is considered | |||
to hold a standing queue that can utilize sudden increases of | desirable to hold a standing queue that can utilize sudden | |||
capacity; | increases of capacity. | |||
* Cellular networks are further complicated by a perceived need to | * Cellular networks are further complicated by a perceived need to | |||
buffer in order to make hand-overs imperceptible; | buffer in order to make hand-overs imperceptible. | |||
L4S cannot remove the need for all these different forms of | L4S cannot remove the need for all these different forms of | |||
buffering. However, by removing 'the longest pole in the tent' | buffering. However, by removing 'the longest pole in the tent' | |||
(buffering for the large sawteeth of Classic congestion controls), | (buffering for the large sawteeth of Classic congestion controls), | |||
L4S exposes all these 'shorter poles' to greater scrutiny. | L4S exposes all these 'shorter poles' to greater scrutiny. | |||
Until now, the buffering needed for these additional reasons tended | Until now, the buffering needed for these additional reasons tended | |||
to be over-specified - with the excuse that none were 'the longest | to be over-specified -- with the excuse that none were 'the longest | |||
pole in the tent'. But having removed the 'longest pole', it becomes | pole in the tent'. But having removed the 'longest pole', it becomes | |||
worthwhile to minimize them, for instance reducing packet aggregation | worthwhile to minimize them, for instance, reducing packet | |||
burst sizes and MAC scheduling intervals. | aggregation burst sizes and MAC scheduling intervals. | |||
Also certain link types, particularly radio-based links, are far more | Also, certain link types, particularly radio-based links, are far | |||
prone to transmission losses. Section 6.4.3 explains how an L4S | more prone to transmission losses. Section 6.4.3 explains how an L4S | |||
response to loss has to be as drastic as a Classic response. | response to loss has to be as drastic as a Classic response. | |||
Nonetheless, research referred to in the same section has | Nonetheless, research referred to in the same section has | |||
demonstrated potential for considerably more effective loss repair at | demonstrated potential for considerably more effective loss repair at | |||
the link layer, due to the relaxed ordering constraints of L4S | the link layer, due to the relaxed ordering constraints of L4S | |||
packets. | packets. | |||
6.4. Deployment Considerations | 6.4. Deployment Considerations | |||
L4S AQMs, whether DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] or FQ, | L4S AQMs, whether DualQ [RFC9332] or FQ [RFC8290], are in themselves | |||
e.g. [RFC8290] are, in themselves, an incremental deployment | an incremental deployment mechanism for L4S -- so that L4S traffic | |||
mechanism for L4S - so that L4S traffic can coexist with existing | can coexist with existing Classic (Reno-friendly) traffic. | |||
Classic (Reno-friendly) traffic. Section 6.4.1 explains why only | Section 6.4.1 explains why only deploying an L4S AQM in one node at | |||
deploying an L4S AQM in one node at each end of the access link will | each end of the access link will realize nearly all the benefit of | |||
realize nearly all the benefit of L4S. | L4S. | |||
L4S involves both end systems and the network, so Section 6.4.2 | L4S involves both the network and end systems, so Section 6.4.2 | |||
suggests some typical sequences to deploy each part, and why there | suggests some typical sequences to deploy each part and why there | |||
will be an immediate and significant benefit after deploying just one | will be an immediate and significant benefit after deploying just one | |||
part. | part. | |||
Section 6.4.3 and Section 6.4.4 describe the converse incremental | Sections 6.4.3 and 6.4.4 describe the converse incremental deployment | |||
deployment case where there is no L4S AQM at the network bottleneck, | case where there is no L4S AQM at the network bottleneck, so any L4S | |||
so any L4S flow traversing this bottleneck has to take care in case | flow traversing this bottleneck has to take care in case it is | |||
it is competing with Classic traffic. | competing with Classic traffic. | |||
6.4.1. Deployment Topology | 6.4.1. Deployment Topology | |||
L4S AQMs will not have to be deployed throughout the Internet before | L4S AQMs will not have to be deployed throughout the Internet before | |||
L4S can benefit anyone. Operators of public Internet access networks | L4S can benefit anyone. Operators of public Internet access networks | |||
typically design their networks so that the bottleneck will nearly | typically design their networks so that the bottleneck will nearly | |||
always occur at one known (logical) link. This confines the cost of | always occur at one known (logical) link. This confines the cost of | |||
queue management technology to one place. | queue management technology to one place. | |||
The case of mesh networks is different and will be discussed later in | The case of mesh networks is different and will be discussed later in | |||
this section. But the known bottleneck case is generally true for | this section. However, the known-bottleneck case is generally true | |||
Internet access to all sorts of different 'sites', where the word | for Internet access to all sorts of different 'sites', where the word | |||
'site' includes home networks, small- to medium-sized campus or | 'site' includes home networks, small- to medium-sized campus or | |||
enterprise networks and even cellular devices (Figure 2). Also, this | enterprise networks and even cellular devices (Figure 2). Also, this | |||
known-bottleneck case tends to be applicable whatever the access link | known-bottleneck case tends to be applicable whatever the access link | |||
technology; whether xDSL, cable, PON, cellular, line of sight | technology, whether xDSL, cable, PON, cellular, line of sight | |||
wireless or satellite. | wireless, or satellite. | |||
Therefore, the full benefit of the L4S service should be available in | Therefore, the full benefit of the L4S service should be available in | |||
the downstream direction when an L4S AQM is deployed at the ingress | the downstream direction when an L4S AQM is deployed at the ingress | |||
to this bottleneck link. And similarly, the full upstream service | to this bottleneck link. And similarly, the full upstream service | |||
will be available once an L4S AQM is deployed at the ingress into the | will typically be available once an L4S AQM is deployed at the | |||
upstream link. (Of course, multi-homed sites would only see the full | ingress into the upstream link. (Of course, multihomed sites would | |||
benefit once all their access links were covered.) | only see the full benefit once all their access links were covered.) | |||
[Figure 2 diagram: ASCII art whose layout was lost in extraction. It shows a DC attached to the Core, with DualQ (DQ) nodes at both ends of the access links to an enterprise/campus network, a home network, and a mobile device.] | [Figure 2 diagram: ASCII art whose layout was lost in extraction. It shows a DC attached to the Core, with DualQ (DQ) nodes at both ends of the access links to an enterprise/campus network, a home network, and a mobile device.] | |||
Figure 2: Likely location of DualQ (DQ) Deployments in common | Figure 2: Likely Location of DualQ (DQ) Deployments in Common | |||
access topologies | Access Topologies | |||
Deployment in mesh topologies depends on how overbooked the core is. | Deployment in mesh topologies depends on how overbooked the core is. | |||
If the core is non-blocking, or at least generously provisioned so | If the core is non-blocking, or at least generously provisioned so | |||
that the edges are nearly always the bottlenecks, it would only be | that the edges are nearly always the bottlenecks, it would only be | |||
necessary to deploy an L4S AQM at the edge bottlenecks. For example, | necessary to deploy an L4S AQM at the edge bottlenecks. For example, | |||
some data-centre networks are designed with the bottleneck in the | some data-centre networks are designed with the bottleneck in the | |||
hypervisor or host NICs, while others bottleneck at the top-of-rack | hypervisor or host Network Interface Controllers (NICs), while others | |||
switch (both the output ports facing hosts and those facing the | bottleneck at the top-of-rack switch (both the output ports facing | |||
core). | hosts and those facing the core). | |||
An L4S AQM would often next be needed where the Wi-Fi links in a home | An L4S AQM would often next be needed where the Wi-Fi links in a home | |||
sometimes become the bottleneck. And an L4S AQM would eventually | sometimes become the bottleneck. Also an L4S AQM would eventually | |||
also need to be deployed at any other persistent bottlenecks such as | need to be deployed at any other persistent bottlenecks, such as | |||
network interconnections, e.g. some public Internet exchange points | network interconnections, e.g., some public Internet exchange points | |||
and the ingress and egress to WAN links interconnecting data-centres. | and the ingress and egress to WAN links interconnecting data centres. | |||
6.4.2. Deployment Sequences | 6.4.2. Deployment Sequences | |||
For any one L4S flow to provide benefit, it requires three (or | For any one L4S flow to provide benefit, it requires three (or | |||
sometimes two) parts to have been deployed: i) the congestion control | sometimes two) parts to have been deployed: i) the congestion control | |||
at the sender; ii) the AQM at the bottleneck; and iii) older | at the sender; ii) the AQM at the bottleneck; and iii) older | |||
transports (namely TCP) need upgraded receiver feedback too. This | transports (namely TCP) need upgraded receiver feedback too. This | |||
was the same deployment problem that ECN faced [RFC8170] so we have | was the same deployment problem that ECN faced [RFC8170], so we have | |||
learned from that experience. | learned from that experience. | |||
Firstly, L4S deployment exploits the fact that DCTCP already exists | Firstly, L4S deployment exploits the fact that DCTCP already exists | |||
on many Internet hosts (Windows, FreeBSD and Linux); both servers and | on many Internet hosts (e.g., Windows, FreeBSD, and Linux), both | |||
clients. Therefore, an L4S AQM can be deployed at a network | servers and clients. Therefore, an L4S AQM can be deployed at a | |||
bottleneck to immediately give a working deployment of all the L4S | network bottleneck to immediately give a working deployment of all | |||
parts for testing, as long as the ECT(0) codepoint is switched to | the L4S parts for testing, as long as the ECT(0) codepoint is | |||
ECT(1). DCTCP needs some safety concerns to be fixed for general use | switched to ECT(1). DCTCP needs some safety concerns to be fixed for | |||
over the public Internet (see Section 4.3 of the L4S ECN | general use over the public Internet (see Section 4.3 of the L4S ECN | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so | spec [RFC9331]), but DCTCP is not on by default, so these issues can | |||
these issues can be managed within controlled deployments or | be managed within controlled deployments or controlled trials. | |||
controlled trials. | ||||
Secondly, the performance improvement with L4S is so significant that | Secondly, the performance improvement with L4S is so significant that | |||
it enables new interactive services and products that were not | it enables new interactive services and products that were not | |||
previously possible. It is much easier for companies to initiate new | previously possible. It is much easier for companies to initiate new | |||
work on deployment if there is budget for a new product trial. If, | work on deployment if there is budget for a new product trial. In | |||
in contrast, there were only an incremental performance improvement | contrast, if there were only an incremental performance improvement | |||
(as with Classic ECN), spending on deployment tends to be much harder | (as with Classic ECN), spending on deployment tends to be much harder | |||
to justify. | to justify. | |||
Thirdly, the L4S identifier is defined so that initially network | Thirdly, the L4S identifier is defined so that network operators can | |||
operators can enable L4S exclusively for certain customers or certain | initially enable L4S exclusively for certain customers or certain | |||
applications. But this is carefully defined so that it does not | applications. However, this is carefully defined so that it does not | |||
compromise future evolution towards L4S as an Internet-wide service. | compromise future evolution towards L4S as an Internet-wide service. | |||
This is because the L4S identifier is defined not only as the end-to- | This is because the L4S identifier is defined not only as the end-to- | |||
end ECN field, but it can also optionally be combined with any other | end ECN field, but it can also optionally be combined with any other | |||
packet header or some status of a customer or their access link (see | packet header or some status of a customer or their access link (see | |||
section 5.4 of [I-D.ietf-tsvwg-ecn-l4s-id]). Operators could do this | Section 5.4 of [RFC9331]). Operators could do this anyway, even if | |||
anyway, even if it were not blessed by the IETF. However, it is best | it were not blessed by the IETF. However, it is best for the IETF to | |||
for the IETF to specify that, if they use their own local identifier, | specify that, if they use their own local identifier, it must be in | |||
it must be in combination with the IETF's identifier. Then, if an | combination with the IETF's identifier, ECT(1). Then, if an operator | |||
operator has opted for an exclusive local-use approach, later they | has opted for an exclusive local-use approach, they only have to | |||
only have to remove this extra rule to make the service work | remove this extra rule later to make the service work across the | |||
Internet-wide - it will already traverse middleboxes, peerings, etc. | Internet -- it will already traverse middleboxes, peerings, etc. | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
| | Servers or proxies | Access link | Clients | | | | Servers or proxies | Access link | Clients | | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
|0| DCTCP (existing) | | DCTCP (existing) | | |0| DCTCP (existing) | | DCTCP (existing) | | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
|1| |Add L4S AQM downstream| | | |1| |Add L4S AQM downstream| | | |||
| | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS | | | | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS | | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
|2| Upgrade DCTCP to | |Replace DCTCP feedb'k| | |2| Upgrade DCTCP to | |Replace DCTCP feedb'k| | |||
| | TCP Prague | | with AccECN | | | | TCP Prague | | with AccECN | | |||
| | FULLY WORKS DOWNSTREAM | | | | FULLY WORKS DOWNSTREAM | | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
| | | | Upgrade DCTCP to | | | | | | Upgrade DCTCP to | | |||
|3| | Add L4S AQM upstream | TCP Prague | | |3| | Add L4S AQM upstream | TCP Prague | | |||
| | | | | | | | | | | | |||
| | FULLY WORKS UPSTREAM AND DOWNSTREAM | | | | FULLY WORKS UPSTREAM AND DOWNSTREAM | | |||
+-+--------------------+----------------------+---------------------+ | +-+--------------------+----------------------+---------------------+ | |||
Figure 3: Example L4S Deployment Sequence | Figure 3: Example L4S Deployment Sequence | |||
Figure 3 illustrates some example sequences in which the parts of L4S | Figure 3 illustrates some example sequences in which the parts of L4S | |||
might be deployed. It consists of the following stages, preceded by | might be deployed. It consists of the following stages, preceded by | |||
a presumption that DCTCP is already installed at both ends: | a presumption that DCTCP is already installed at both ends: | |||
1. DCTCP is not applicable for use over the public Internet, so it | 1. DCTCP is not applicable for use over the public Internet, so it | |||
is emphasized here that any DCTCP flow has to be completely | is emphasized here that any DCTCP flow has to be completely | |||
contained within a controlled trial environment. | contained within a controlled trial environment. | |||
Within this trial environment, once an L4S AQM has been deployed, | Within this trial environment, once an L4S AQM has been deployed, | |||
the trial DCTCP flow will experience immediate benefit, without | the trial DCTCP flow will experience immediate benefit, without | |||
any other deployment being needed. In this example downstream | any other deployment being needed. In this example, downstream | |||
deployment is first, but in other scenarios the upstream might be | deployment is first, but in other scenarios, the upstream might | |||
deployed first. If no AQM at all was previously deployed for the | be deployed first. If no AQM at all was previously deployed for | |||
downstream access, an L4S AQM greatly improves the Classic | the downstream access, an L4S AQM greatly improves the Classic | |||
service (as well as adding the L4S service). If an AQM was | service (as well as adding the L4S service). If an AQM was | |||
already deployed, the Classic service will be unchanged (and L4S | already deployed, the Classic service will be unchanged (and L4S | |||
will add an improvement on top). | will add an improvement on top). | |||
2. In this stage, the name 'TCP | 2. In this stage, the name 'TCP Prague' [PRAGUE-CC] is used to | |||
Prague' [I-D.briscoe-iccrg-prague-congestion-control] is used to | ||||
represent a variant of DCTCP that is designed to be used in a | represent a variant of DCTCP that is designed to be used in a | |||
production Internet environment (that is, it has to comply with | production Internet environment (that is, it has to comply with | |||
all the requirements in Section 4 of the L4S ECN | all the requirements in Section 4 of the L4S ECN spec [RFC9331], | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id], which then means it can be used | which then means it can be used over the public Internet). If | |||
over the public Internet). If the application is primarily | the application is primarily unidirectional, 'TCP Prague' at the | |||
unidirectional, 'TCP Prague' at one end will provide all the | sending end will provide all the benefit needed, as long as the | |||
benefit needed. | receiving end supports Accurate ECN (AccECN) feedback [ACCECN]. | |||
For TCP transports, Accurate ECN feedback | For TCP transports, AccECN feedback is needed at the other end, | |||
(AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the other end, | ||||
but it is a generic ECN feedback facility that is already planned | but it is a generic ECN feedback facility that is already planned | |||
to be deployed for other purposes, e.g. DCTCP, BBR. The two ends | to be deployed for other purposes, e.g., DCTCP and BBR. The two | |||
can be deployed in either order, because, in TCP, an L4S | ends can be deployed in either order because, in TCP, an L4S | |||
congestion control only enables itself if it has negotiated the | congestion control only enables itself if it has negotiated the | |||
use of AccECN feedback with the other end during the connection | use of AccECN feedback with the other end during the connection | |||
handshake. Thus, deployment of TCP Prague on a server enables | handshake. Thus, deployment of TCP Prague on a server enables | |||
L4S trials to move to a production service in one direction, | L4S trials to move to a production service in one direction, | |||
wherever AccECN is deployed at the other end. This stage might | wherever AccECN is deployed at the other end. This stage might | |||
be further motivated by the performance improvements of TCP | be further motivated by the performance improvements of TCP | |||
Prague relative to DCTCP (see Appendix A.2 of the L4S ECN | Prague relative to DCTCP (see Appendix A.2 of the L4S ECN spec | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id]). | [RFC9331]). | |||
Unlike TCP, from the outset, QUIC ECN feedback [RFC9000] has | Unlike TCP, from the outset, QUIC ECN feedback [RFC9000] has | |||
supported L4S. Therefore, if the transport is QUIC, one-ended | supported L4S. Therefore, if the transport is QUIC, one-ended | |||
deployment of a Prague congestion control at this stage is simple | deployment of a Prague congestion control at this stage is simple | |||
and sufficient. | and sufficient. | |||
For QUIC, if a proxy sits in the path between multiple origin | For QUIC, if a proxy sits in the path between multiple origin | |||
servers and the access bottlenecks to multiple clients, then | servers and the access bottlenecks to multiple clients, then | |||
upgrading the proxy with a Scalable congestion control would | upgrading the proxy with a Scalable congestion control would | |||
provide the benefits of L4S over all the clients' downstream | provide the benefits of L4S over all the clients' downstream | |||
bottlenecks in one go --- whether or not all the origin servers | bottlenecks in one go -- whether or not all the origin servers | |||
were upgraded. Conversely, where a proxy has not been upgraded, | were upgraded. Conversely, where a proxy has not been upgraded, | |||
the clients served by it will not benefit from L4S at all in the | the clients served by it will not benefit from L4S at all in the | |||
downstream, even when any origin server behind the proxy has been | downstream, even when any origin server behind the proxy has been | |||
upgraded to support L4S. | upgraded to support L4S. | |||
For TCP, a proxy upgraded to support 'TCP Prague' would provide | For TCP, a proxy upgraded to support 'TCP Prague' would provide | |||
the benefits of L4S downstream to all clients that support AccECN | the benefits of L4S downstream to all clients that support AccECN | |||
(whether or not they support L4S as well). And in the upstream, | (whether or not they support L4S as well). And in the upstream, | |||
the proxy would also support AccECN as a receiver, so that any | the proxy would also support AccECN as a receiver, so that any | |||
client deploying its own L4S support would benefit in the | client deploying its own L4S support would benefit in the | |||
upstream direction, irrespective of whether any origin server | upstream direction, irrespective of whether any origin server | |||
beyond the proxy supported AccECN. | beyond the proxy supported AccECN. | |||
3. This is a two-move stage to enable L4S upstream. An L4S AQM or | 3. This is a two-move stage to enable L4S upstream. An L4S AQM or | |||
TCP Prague can be deployed in either order as already explained. | TCP Prague can be deployed in either order as already explained. | |||
To motivate the first of two independent moves, the deferred | To motivate the first of two independent moves, the deferred | |||
benefit of enabling new services after the second move has to be | benefit of enabling new services after the second move has to be | |||
worth it to cover the first mover's investment risk. As | worth it to cover the first mover's investment risk. As | |||
explained already, the potential for new interactive services | explained already, the potential for new interactive services | |||
provides this motivation. An L4S AQM also improves the upstream | provides this motivation. An L4S AQM also improves the upstream | |||
Classic service - significantly if no other AQM has already been | Classic service significantly if no other AQM has already been | |||
deployed. | deployed. | |||
Note that other deployment sequences might occur. For instance: the | Note that other deployment sequences might occur. For instance, the | |||
upstream might be deployed first; a non-TCP protocol might be used | upstream might be deployed first; a non-TCP protocol might be used | |||
end-to-end, e.g. QUIC, RTP; a body such as the 3GPP might require L4S | end to end, e.g., QUIC and RTP; a body, such as the 3GPP, might | |||
to be implemented in 5G user equipment, or other random acts of | require L4S to be implemented in 5G user equipment; or other random | |||
kindness. | acts of kindness might arise. | |||
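The enabling rule described in stage 2 (for TCP, an L4S congestion control only enables itself if AccECN feedback was negotiated during the handshake) can be sketched as below. This is an illustrative sketch only: the names Handshake, accecn_negotiated and select_congestion_control are hypothetical and do not come from any real TCP stack.

```python
from dataclasses import dataclass


@dataclass
class Handshake:
    """Hypothetical summary of what a TCP handshake negotiated."""
    accecn_negotiated: bool  # True if both ends agreed to use AccECN feedback


def select_congestion_control(local_supports_prague: bool, hs: Handshake) -> str:
    """Pick the congestion control for one connection.

    The L4S (Prague) behaviour is only enabled when the peer can feed back
    the extent of ECN marking; otherwise the sender stays Classic.
    """
    if local_supports_prague and hs.accecn_negotiated:
        return "prague"
    return "classic"
```

Because the decision is made per connection, either end can be upgraded first: an upgraded sender simply keeps returning "classic" until it meets a peer that negotiates AccECN.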
6.4.3. L4S Flow but Non-ECN Bottleneck | 6.4.3. L4S Flow but Non-ECN Bottleneck | |||
If L4S is enabled between two hosts, the L4S sender is required to | If L4S is enabled between two hosts, the L4S sender is required to | |||
coexist safely with Reno in response to any drop (see Section 4.3 of | coexist safely with Reno in response to any drop (see Section 4.3 of | |||
the L4S ECN spec [I-D.ietf-tsvwg-ecn-l4s-id]). | the L4S ECN spec [RFC9331]). | |||
Unfortunately, as well as protecting Classic traffic, this rule | Unfortunately, as well as protecting Classic traffic, this rule | |||
degrades the L4S service whenever there is any loss, even if the | degrades the L4S service whenever there is any loss, even if the | |||
cause is not persistent congestion at a bottleneck, e.g.: | cause is not persistent congestion at a bottleneck, for example: | |||
* congestion loss at other transient bottlenecks, e.g. due to bursts | * congestion loss at other transient bottlenecks, e.g., due to | |||
in shallower queues; | bursts in shallower queues; | |||
* transmission errors, e.g., due to electrical interference; and | ||||
* transmission errors, e.g. due to electrical interference; | ||||
* rate policing. | * rate policing. | |||
Three complementary approaches are in progress to address this issue, | Three complementary approaches are in progress to address this issue, | |||
but they are all currently research: | but they are all currently research: | |||
* In Prague congestion control, ignore certain losses deemed | * In Prague congestion control, ignore certain losses deemed | |||
unlikely to be due to congestion (using some ideas from | unlikely to be due to congestion (using some ideas from BBR | |||
BBR [I-D.cardwell-iccrg-bbr-congestion-control] regarding isolated | [BBR-CC] regarding isolated losses). This could mask any of the | |||
losses). This could mask any of the above types of loss while | above types of loss while still coexisting with drop-based | |||
still coexisting with drop-based congestion controls. | congestion controls. | |||
* A combination of RACK, L4S and link retransmission without | * A combination of Recent Acknowledgement (RACK) [RFC8985], L4S, and | |||
resequencing could repair transmission errors without the head of | link retransmission without resequencing could repair transmission | |||
line blocking delay usually associated with link-layer | errors without the head of line blocking delay usually associated | |||
retransmission [UnorderedLTE], [I-D.ietf-tsvwg-ecn-l4s-id]; | with link-layer retransmission [UnorderedLTE] [RFC9331]. | |||
* Hybrid ECN/drop rate policers (see Section 8.3). | * Hybrid ECN/drop rate policers (see Section 8.3). | |||
L4S deployment scenarios that minimize these issues (e.g. over | L4S deployment scenarios that minimize these issues (e.g., over | |||
wireline networks) can proceed in parallel to this research, in the | wireline networks) can proceed in parallel to this research, in the | |||
expectation that research success could continually widen L4S | expectation that research success could continually widen L4S | |||
applicability. | applicability. | |||
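To illustrate why any loss degrades the L4S service, the sketch below contrasts a Reno-style response to a drop with a DCTCP-style response to ECN marks. It is a simplified model under stated assumptions, not the Prague algorithm: the function name, the per-round-trip update, the gain g = 1/16 and the 2-segment floor are all illustrative choices.

```python
MIN_CWND = 2.0  # assumed floor on the congestion window, in segments


def on_round_trip_end(cwnd, alpha, marked_fraction, loss_seen, g=1 / 16):
    """Update (cwnd, alpha) once per round trip.

    alpha is a moving average of the fraction of CE-marked packets.
    A drop triggers a Reno-style halving regardless of alpha; ECN marks
    alone trigger a reduction proportional to alpha.
    """
    alpha = (1 - g) * alpha + g * marked_fraction
    if loss_seen:
        cwnd = max(cwnd / 2, MIN_CWND)  # Classic-compatible response to loss
    elif marked_fraction > 0:
        cwnd = max(cwnd * (1 - alpha / 2), MIN_CWND)  # scalable response to marks
    return cwnd, alpha
```

In this model, a single isolated loss costs half the window, whereas light ECN marking costs only a small fraction of it, which is the gap the research approaches listed above aim to narrow.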
6.4.4. L4S Flow but Classic ECN Bottleneck | 6.4.4. L4S Flow but Classic ECN Bottleneck | |||
Classic ECN support is starting to materialize on the Internet as an | Classic ECN support is starting to materialize on the Internet as an | |||
increased level of CE marking. It is hard to detect whether this is | increased level of CE marking. It is hard to detect whether this is | |||
all due to the addition of support for ECN in implementations of FQ- | all due to the addition of support for ECN in implementations of FQ- | |||
CoDel and/or FQ-COBALT, which is not generally problematic, because | CoDel and/or FQ-COBALT, which is not generally problematic, because | |||
flow-queue (FQ) scheduling inherently prevents a flow from exceeding | flow queue (FQ) scheduling inherently prevents a flow from exceeding | |||
the 'fair' rate irrespective of its aggressiveness. However, some of | the 'fair' rate irrespective of its aggressiveness. However, some of | |||
this Classic ECN marking might be due to single-queue ECN deployment. | this Classic ECN marking might be due to single-queue ECN deployment. | |||
This case is discussed in Section 4.3 of the L4S ECN | This case is discussed in Section 4.3 of the L4S ECN spec [RFC9331]. | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id]. | ||||
6.4.5. L4S AQM Deployment within Tunnels | 6.4.5. L4S AQM Deployment within Tunnels | |||
An L4S AQM uses the ECN field to signal congestion. So, in common | An L4S AQM uses the ECN field to signal congestion. So in common | |||
with Classic ECN, if the AQM is within a tunnel or at a lower layer, | with Classic ECN, if the AQM is within a tunnel or at a lower layer, | |||
correct functioning of ECN signalling requires correct propagation of | correct functioning of ECN signalling requires standards-compliant | |||
the ECN field up the layers [RFC6040], | propagation of the ECN field up the layers [RFC6040] [ECN-SHIM] | |||
[I-D.ietf-tsvwg-rfc6040update-shim], | [ECN-ENCAP]. | |||
[I-D.ietf-tsvwg-ecn-encap-guidelines]. | ||||
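As an illustration of what propagation up the layers involves, the following sketch encodes the default decapsulation behaviour of RFC 6040 for combining the inner and outer ECN fields at tunnel egress. The function name and the use of None to mean "drop the packet" are choices made for this sketch; the codepoint values are those of RFC 3168.

```python
# ECN codepoints (two-bit field, RFC 3168)
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def decapsulate_ecn(inner, outer):
    """Return the ECN codepoint of the forwarded packet, or None to drop.

    Follows the RFC 6040 decapsulation table: congestion marks on the
    outer header are propagated to the inner header, and an outer ECT(1)
    overrides an inner ECT(0).
    """
    if inner == NOT_ECT:
        # A CE mark cannot be delivered to a non-ECN transport, so drop.
        return None if outer == CE else NOT_ECT
    if inner == CE or outer == CE:
        return CE
    if outer == ECT1:
        return ECT1
    return inner
```

For example, decapsulate_ecn(ECT1, CE) yields CE, so a mark applied by an L4S AQM inside the tunnel still reaches the receiver of an L4S flow.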
7. IANA Considerations (to be removed by RFC Editor) | 7. IANA Considerations | |||
This specification contains no IANA considerations. | This document has no IANA actions. | |||
8. Security Considerations | 8. Security Considerations | |||
8.1. Traffic Rate (Non-)Policing | 8.1. Traffic Rate (Non-)Policing | |||
8.1.1. (Non-)Policing Rate per Flow | 8.1.1. (Non-)Policing Rate per Flow | |||
In the current Internet, ISPs usually enforce separation between the | In the current Internet, ISPs usually enforce separation between the | |||
capacity of shared links assigned to different 'sites' | capacity of shared links assigned to different 'sites' (e.g., | |||
(e.g. households, businesses or mobile users - see terminology in | households, businesses, or mobile users -- see terminology in | |||
Section 3) using some form of scheduler [RFC0970]. And they use | Section 3) using some form of scheduler [RFC0970]. And they use | |||
various techniques like redirection to traffic scrubbing facilities | various techniques, like redirection to traffic scrubbing facilities, | |||
to deal with flooding attacks. However, there has never been a | to deal with flooding attacks. However, there has never been a | |||
universal need to police the rate of individual application flows - | universal need to police the rate of individual application flows -- | |||
the Internet has generally always relied on self-restraint of | the Internet has generally always relied on self-restraint of | |||
congestion controls at senders for sharing intra-'site' capacity. | congestion controls at senders for sharing intra-'site' capacity. | |||
L4S has been designed not to upset this status quo. If a DualQ is | L4S has been designed not to upset this status quo. If a DualQ is | |||
used to provide L4S service, section 4.2 of | used to provide L4S service, Section 4.2 of [RFC9332] explains how it | |||
[I-D.ietf-tsvwg-aqm-dualq-coupled] explains how it is designed to | is designed to give no more rate advantage to unresponsive flows than | |||
give no more rate advantage to unresponsive flows than a single-queue | a single-queue AQM would, whether or not there is traffic overload. | |||
AQM would, whether or not there is traffic overload. | ||||
Also, in case per-flow rate policing is ever required, it can be | Also, in case per-flow rate policing is ever required, it can be | |||
added because it is orthogonal to the distinction between L4S and | added because it is orthogonal to the distinction between L4S and | |||
Classic. As explained in Section 5.2, the DualQ variant of L4S | Classic. As explained in Section 5.2, the DualQ variant of L4S | |||
provides low delay without prejudging the issue of flow-rate control. | provides low delay without prejudging the issue of flow-rate control. | |||
So, if flow-rate control is needed, per-flow-queuing (FQ) with L4S | So if flow-rate control is needed, per-flow queuing (FQ) with L4S | |||
support can be used instead, or flow rate policing can be added as a | support can be used instead, or flow rate policing can be added as a | |||
modular addition to a DualQ. However, per-flow rate control is not | modular addition to a DualQ. However, per-flow rate control is not | |||
usually deployed as a security mechanism, because an active attacker | usually deployed as a security mechanism, because an active attacker | |||
can just shard its traffic over more flow IDs if the rate of each is | can just shard its traffic over more flow identifiers if the rate of | |||
restricted. | each is restricted. | |||
8.1.2. (Non-)Policing L4S Service Rate | 8.1.2. (Non-)Policing L4S Service Rate | |||
Section 5.2 explains how Diffserv only makes a difference if some | Section 5.2 explains how Diffserv only makes a difference if some | |||
packets get less favourable treatment than others, which typically | packets get less favourable treatment than others, which typically | |||
requires traffic rate policing for a low latency class. In contrast, | requires traffic rate policing for a low latency class. In contrast, | |||
it should not be necessary to rate-police access to the L4S service | it should not be necessary to rate-police access to the L4S service | |||
to protect the Classic service, because L4S is designed to reduce | to protect the Classic service, because L4S is designed to reduce | |||
delay without harming the delay or rate of any Classic traffic. | delay without harming the delay or rate of any Classic traffic. | |||
During early deployment (and perhaps always), some networks will not | During early deployment (and perhaps always), some networks will not | |||
offer the L4S service. In general, these networks should not need to | offer the L4S service. In general, these networks should not need to | |||
police L4S traffic. They are required (by both the ECN | police L4S traffic. They are required (by both the ECN spec | |||
spec [RFC3168] and the L4S ECN spec [I-D.ietf-tsvwg-ecn-l4s-id]) not | [RFC3168] and the L4S ECN spec [RFC9331]) not to change the L4S | |||
to change the L4S identifier, which would interfere with end-to-end | identifier, which would interfere with end-to-end congestion control. | |||
congestion control. If they already treat ECN traffic as Not-ECT, | If they already treat ECN traffic as Not-ECT, they can merely treat | |||
they can merely treat L4S traffic as Not-ECT too. At a bottleneck, | L4S traffic as Not-ECT too. At a bottleneck, such networks will | |||
such networks will introduce some queuing and dropping. When a | introduce some queuing and dropping. When a Scalable congestion | |||
scalable congestion control detects a drop it will have to respond | control detects a drop, it will have to respond safely with respect | |||
safely with respect to Classic congestion controls (as required in | to Classic congestion controls (as required in Section 4.3 of | |||
Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the | [RFC9331]). This will degrade the L4S service to be no better (but | |||
L4S service to be no better (but never worse) than Classic best | never worse) than Classic best efforts whenever a non-ECN bottleneck | |||
efforts, whenever a non-ECN bottleneck is encountered on a path (see | is encountered on a path (see Section 6.4.3). | |||
Section 6.4.3). | ||||
In cases that are expected to be rare, networks that solely support | In cases that are expected to be rare, networks that solely support | |||
Classic ECN [RFC3168] in a single queue bottleneck might opt to | Classic ECN [RFC3168] in a single queue bottleneck might opt to | |||
police L4S traffic so as to protect competing Classic ECN traffic | police L4S traffic so as to protect competing Classic ECN traffic | |||
(for instance, see Section 6.1.3 of the L4S operational | (for instance, see Section 6.1.3 of the L4S operational guidance | |||
guidance [I-D.ietf-tsvwg-l4sops]). However, Section 4.3 of the L4S | [L4SOPS]). However, Section 4.3 of the L4S ECN spec [RFC9331] | |||
ECN spec [I-D.ietf-tsvwg-ecn-l4s-id] recommends that the sender | recommends that the sender adapts its congestion response to properly | |||
adapts its congestion response to properly coexist with Classic ECN | coexist with Classic ECN flows, i.e., reverting to the self-restraint | |||
flows, i.e. reverting to the self-restraint approach. | approach. | |||
Certain network operators might choose to restrict access to the L4S | Certain network operators might choose to restrict access to the L4S | |||
service, perhaps only to selected premium customers as a value-added | service, perhaps only to selected premium customers as a value-added | |||
service. Their packet classifier (item 2 in Figure 1) could identify | service. Their packet classifier (item 2 in Figure 1) could identify | |||
such customers against some other field (e.g. source address range) | such customers against some other field (e.g., source address range), | |||
as well as classifying on the ECN field. If only the ECN L4S | as well as classifying on the ECN field. If only the ECN L4S | |||
identifier matched, but not the source address (say), the classifier | identifier matched, but not (say) the source address, the classifier | |||
could direct these packets (from non-premium customers) into the | could direct these packets (from non-premium customers) into the | |||
Classic queue. Explaining clearly how operators can use additional | Classic queue. Explaining clearly how operators can use additional | |||
local classifiers (see section 5.4 of the L4S ECN | local classifiers (see Section 5.4 of [RFC9331]) is intended to | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any | remove any motivation to clear the L4S identifier. Then at least the | |||
motivation to clear the L4S identifier. Then at least the L4S ECN | L4S ECN identifier will be more likely to survive end to end, even | |||
identifier will be more likely to survive end-to-end even though the | though the service may not be supported at every hop. Such local | |||
service may not be supported at every hop. Such local arrangements | arrangements would only require simple registered/not-registered | |||
would only require simple registered/not-registered packet | packet classification, rather than the managed, application-specific | |||
classification, rather than the managed, application-specific traffic | traffic policing against customer-specific traffic contracts that | |||
policing against customer-specific traffic contracts that Diffserv | Diffserv uses. | |||
uses. | ||||
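The registered/not-registered classification described above might look like the following sketch. The prefix list, the queue names and the function name are hypothetical; the sketch also assumes that both ECT(1) and CE count as the L4S identifier in the ECN field, and it uses a documentation address range purely as an example.

```python
import ipaddress

# ECN codepoints assumed to select the L4S treatment
ECT1, CE = 0b01, 0b11

# Hypothetical list of source prefixes registered for the L4S service
REGISTERED_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]


def classify(ecn: int, src: str) -> str:
    """Choose a queue without modifying the packet's ECN field."""
    has_l4s_id = ecn in (ECT1, CE)
    addr = ipaddress.ip_address(src)
    registered = any(addr in net for net in REGISTERED_PREFIXES)
    # The local rule is combined with, never substituted for, the ECN identifier.
    return "L4S" if has_l4s_id and registered else "Classic"
```

Packets from unregistered sources are queued as Classic but keep their ECN marking, so the identifier can still survive end to end; removing the prefix check later would open the service to all L4S traffic.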
8.2. 'Latency Friendliness' | 8.2. 'Latency Friendliness' | |||
Like the Classic service, the L4S service relies on self-restraint - | Like the Classic service, the L4S service relies on self-restraint to | |||
limiting rate in response to congestion. In addition, the L4S | limit the rate in response to congestion. In addition, the L4S | |||
service requires self-restraint in terms of limiting latency | service requires self-restraint in terms of limiting latency | |||
(burstiness). It is hoped that self-interest and guidance on dynamic | (burstiness). It is hoped that self-interest and guidance on dynamic | |||
behaviour (especially flow start-up, which might need to be | behaviour (especially flow start-up, which might need to be | |||
standardized) will be sufficient to prevent transports from sending | standardized) will be sufficient to prevent transports from sending | |||
excessive bursts of L4S traffic, given the application's own latency | excessive bursts of L4S traffic, given the application's own latency | |||
will suffer most from such behaviour. | will suffer most from such behaviour. | |||
Because the L4S service can reduce delay without discernibly | Because the L4S service can reduce delay without discernibly | |||
increasing the delay of any Classic traffic, it should not be | increasing the delay of any Classic traffic, it should not be | |||
necessary to police L4S traffic to protect the delay of Classic. | necessary to police L4S traffic to protect the delay of Classic | |||
However, whether burst policing becomes necessary to protect other | traffic. However, whether burst policing becomes necessary to | |||
L4S traffic remains to be seen. Without it, there will be potential | protect other L4S traffic remains to be seen. Without it, there will | |||
for attacks on the low latency of the L4S service. | be potential for attacks on the low latency of the L4S service. | |||
If needed, various arrangements could be used to address this | If needed, various arrangements could be used to address this | |||
concern: | concern: | |||
Local bottleneck queue protection: A per-flow (5-tuple) queue | Local bottleneck queue protection: A per-flow (5-tuple) queue | |||
protection function [I-D.briscoe-docsis-q-protection] has been | protection function [DOCSIS-Q-PROT] has been developed for the low | |||
developed for the low latency queue in DOCSIS, which has adopted | latency queue in DOCSIS, which has adopted the DualQ L4S | |||
the DualQ L4S architecture. It protects the low latency service | architecture. It protects the low latency service from any queue- | |||
from any queue-building flows that accidentally or maliciously | building flows that accidentally or maliciously classify | |||
classify themselves into the low latency queue. It is designed to | themselves into the low latency queue. It is designed to score | |||
score flows based solely on their contribution to queuing (not | flows based solely on their contribution to queuing (not flow rate | |||
flow rate in itself). Then, if the shared low latency queue is at | in itself). Then, if the shared low latency queue is at risk of | |||
risk of exceeding a threshold, the function redirects enough | exceeding a threshold, the function redirects enough packets of | |||
packets of the highest scoring flow(s) into the Classic queue to | the highest scoring flow(s) into the Classic queue to preserve low | |||
preserve low latency. | latency. | |||
Distributed traffic scrubbing: Rather than policing locally at each | Distributed traffic scrubbing: Rather than policing locally at each | |||
bottleneck, it may only be necessary to address problems | bottleneck, it may only be necessary to address problems | |||
reactively, e.g. punitively target any deployments of new bursty | reactively, e.g., punitively target any deployments of new bursty | |||
malware, in a similar way to how traffic from flooding attack | malware, in a similar way to how traffic from flooding attack | |||
sources is rerouted via scrubbing facilities. | sources is rerouted via scrubbing facilities. | |||
Local bottleneck per-flow scheduling: Per-flow scheduling should | Local bottleneck per-flow scheduling: Per-flow scheduling should | |||
inherently isolate non-bursty flows from bursty (see Section 5.2 | inherently isolate non-bursty flows from bursty flows (see | |||
for discussion of the merits of per-flow scheduling relative to | Section 5.2 for discussion of the merits of per-flow scheduling | |||
per-flow policing). | relative to per-flow policing). | |||
Distributed access subnet queue protection: Per-flow queue | Distributed access subnet queue protection: Per-flow queue | |||
protection could be arranged for a queue structure distributed | protection could be arranged for a queue structure distributed | |||
across a subnet intercommunicating using lower layer control | across a subnet intercommunicating using lower layer control | |||
messages (see Section 2.1.4 of [QDyn]). For instance, in a radio | messages (see Section 2.1.4 of [QDyn]). For instance, in a radio | |||
access network, user equipment already sends regular buffer status | access network, user equipment already sends regular buffer status | |||
reports to a radio network controller, which could use this | reports to a radio network controller, which could use this | |||
information to remotely police individual flows. | information to remotely police individual flows. | |||
Distributed Congestion Exposure to Ingress Policers: The Congestion | Distributed Congestion Exposure to ingress policers: The Congestion | |||
Exposure (ConEx) architecture [RFC7713] uses egress audit to | Exposure (ConEx) architecture [RFC7713] uses an egress audit to | |||
motivate senders to truthfully signal path congestion in-band | motivate senders to truthfully signal path congestion in-band, | |||
where it can be used by ingress policers. An edge-to-edge variant | where it can be used by ingress policers. An edge-to-edge variant | |||
of this architecture is also possible. | of this architecture is also possible. | |||
Distributed Domain-edge traffic conditioning: An architecture | Distributed domain-edge traffic conditioning: An architecture | |||
similar to Diffserv [RFC2475] may be preferred, where traffic is | similar to Diffserv [RFC2475] may be preferred, where traffic is | |||
proactively conditioned on entry to a domain, rather than | proactively conditioned on entry to a domain, rather than | |||
reactively policed only if it leads to queuing once combined with | reactively policed only if it leads to queuing once combined with | |||
other traffic at a bottleneck. | other traffic at a bottleneck. | |||
Distributed core network queue protection: The policing function | Distributed core network queue protection: The policing function | |||
could be divided between per-flow mechanisms at the network | could be divided between per-flow mechanisms at the network | |||
ingress that characterize the burstiness of each flow into a | ingress that characterize the burstiness of each flow into a | |||
signal carried with the traffic, and per-class mechanisms at | signal carried with the traffic and per-class mechanisms at | |||
bottlenecks that act on these signals if queuing actually occurs | bottlenecks that act on these signals if queuing actually occurs | |||
once the traffic converges. This would be somewhat similar to | once the traffic converges. This would be somewhat similar to | |||
[Nadas20], which is in turn similar to the idea behind core | [Nadas20], which is in turn similar to the idea behind core | |||
stateless fair queuing. | stateless fair queuing. | |||
No single one of these possible queue protection capabilities is | No single one of these possible queue protection capabilities is | |||
considered an essential part of the L4S architecture, which works | considered an essential part of the L4S architecture, which works | |||
without any of them under non-attack conditions (much as the Internet | without any of them under non-attack conditions (much as the Internet | |||
normally works without per-flow rate policing). Indeed, even where | normally works without per-flow rate policing). Indeed, even where | |||
latency policers are deployed, under normal circumstances they would | latency policers are deployed, under normal circumstances, they would | |||
not intervene, and if operators found they were not necessary they | not intervene, and if operators found they were not necessary, they | |||
could disable them. Part of the L4S experiment will be to see | could disable them. Part of the L4S experiment will be to see | |||
whether such a function is necessary, and which arrangements are most | whether such a function is necessary and which arrangements are most | |||
appropriate to the size of the problem. | appropriate to the size of the problem. | |||
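The local per-flow queue protection option described above (score each flow by its contribution to queuing, then redirect packets of the highest scoring flow into the Classic queue when the low latency queue is at risk of exceeding a threshold) can be illustrated with a minimal sketch. This is not the DOCSIS algorithm or any specified mechanism; the class name, the byte-seconds scoring rule, and both constructor parameters are assumptions made purely for illustration.

```python
from collections import defaultdict


class QueueProtector:
    """Illustrative only: scores flows by contribution to queuing, not by rate."""

    def __init__(self, delay_threshold_s, drain_rate_bps):
        self.delay_threshold_s = delay_threshold_s  # hypothetical parameter
        self.drain_rate_bps = drain_rate_bps        # hypothetical parameter
        self.backlog_bytes = 0                      # bytes waiting in the low latency queue
        self.score = defaultdict(float)             # per-flow queuing score (byte-seconds)

    def queue_delay(self, extra_bytes=0):
        """Time to drain the current backlog plus extra_bytes, in seconds."""
        return (self.backlog_bytes + extra_bytes) * 8 / self.drain_rate_bps

    def enqueue(self, flow_id, size_bytes):
        """Return 'L' to keep the packet in the low latency queue, else 'C'."""
        # Charge the flow for the queuing delay its packet finds on arrival,
        # so a flow that never meets a queue accumulates no score.
        self.score[flow_id] += size_bytes * self.queue_delay()
        at_risk = self.queue_delay(size_bytes) > self.delay_threshold_s
        if at_risk and flow_id == max(self.score, key=self.score.get):
            return "C"  # redirect the highest scoring flow to the Classic queue
        self.backlog_bytes += size_bytes
        return "L"

    def dequeue(self, size_bytes):
        """Account for a packet leaving the low latency queue."""
        self.backlog_bytes = max(0, self.backlog_bytes - size_bytes)
```

In this sketch, a flow that sends a burst into an already-building queue is the first to be redirected, while a sparse flow arriving at the same moment stays in the low latency queue because its score is lower. A real design would also need to age scores and bound per-flow state, which this sketch omits.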
8.3. Interaction between Rate Policing and L4S | 8.3. Interaction between Rate Policing and L4S | |||
As mentioned in Section 5.2, L4S should remove the need for low | As mentioned in Section 5.2, L4S should remove the need for low | |||
latency Diffserv classes. However, those Diffserv classes that give | latency Diffserv classes. However, those Diffserv classes that give | |||
certain applications or users priority over capacity, would still be | certain applications or users priority over capacity would still be | |||
applicable in certain scenarios (e.g. corporate networks). Then, | applicable in certain scenarios (e.g., corporate networks). Then, | |||
within such Diffserv classes, L4S would often be applicable to give | within such Diffserv classes, L4S would often be applicable to give | |||
traffic low latency and low loss as well. Within such a Diffserv | traffic low latency and low loss as well. Within such a Diffserv | |||
class, the bandwidth available to a user or application is often | class, the bandwidth available to a user or application is often | |||
limited by a rate policer. Similarly, in the default Diffserv class, | limited by a rate policer. Similarly, in the default Diffserv class, | |||
rate policers are sometimes used to partition shared capacity. | rate policers are sometimes used to partition shared capacity. | |||
A classic rate policer drops any packets exceeding a set rate, | A Classic rate policer drops any packets exceeding a set rate, | |||
usually also giving a burst allowance (variants exist where the | usually also giving a burst allowance (variants exist where the | |||
policer re-marks non-compliant traffic to a discard-eligible Diffserv | policer re-marks noncompliant traffic to a discard-eligible Diffserv | |||
codepoint, so they can be dropped elsewhere during contention). | codepoint, so they can be dropped elsewhere during contention). | |||
Whenever L4S traffic encounters one of these rate policers, it will | Whenever L4S traffic encounters one of these rate policers, it will | |||
experience drops and the source will have to fall back to a Classic | experience drops and the source will have to fall back to a Classic | |||
congestion control, thus losing the benefits of L4S (Section 6.4.3). | congestion control, thus losing the benefits of L4S (Section 6.4.3). | |||
So, in networks that already use rate policers and plan to deploy | So in networks that already use rate policers and plan to deploy L4S, | |||
L4S, it will be preferable to redesign these rate policers to be more | it will be preferable to redesign these rate policers to be more | |||
friendly to the L4S service. | friendly to the L4S service. | |||
L4S-friendly rate policing is currently a research area (note that | L4S-friendly rate policing is currently a research area (note that | |||
this is not the same as latency policing). It might be achieved by | this is not the same as latency policing). It might be achieved by | |||
setting a threshold where ECN marking is introduced, such that it is | setting a threshold where ECN marking is introduced, such that it is | |||
just under the policed rate or just under the burst allowance where | just under the policed rate or just under the burst allowance where | |||
drop is introduced. For instance the two-rate three-colour | drop is introduced. For instance, the two-rate, three-colour marker | |||
marker [RFC2698] or a PCN threshold and excess-rate marker [RFC5670] | [RFC2698] or a PCN threshold and excess-rate marker [RFC5670] could | |||
could mark ECN at the lower rate and drop at the higher. Or an | mark ECN at the lower rate and drop at the higher. Or an existing | |||
existing rate policer could have congestion-rate policing added, | rate policer could have congestion-rate policing added, e.g., using | |||
e.g. using the 'local' (non-ConEx) variant of the ConEx aggregate | the 'local' (non-ConEx) variant of the ConEx aggregate congestion | |||
congestion policer [I-D.briscoe-conex-policing]. It might also be | policer [CONG-POLICING]. It might also be possible to design | |||
possible to design scalable congestion controls to respond less | Scalable congestion controls to respond less catastrophically to loss | |||
catastrophically to loss that has not been preceded by a period of | that has not been preceded by a period of increasing delay. | |||
increasing delay. | ||||
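As a rough illustration of the first idea above (ECN marking at a lower rate and drop at a higher one), the following sketch uses two token buckets, loosely modelled on the colour-blind mode of a two-rate marker. It is a sketch under stated assumptions, not a design for an L4S-friendly policer: the class name, the parameter names, and the mapping of outcomes to "forward", "mark", and "drop" are all hypothetical.

```python
class TwoRatePolicer:
    """Illustrative only: marks above a lower rate, drops above a higher rate."""

    def __init__(self, mark_rate_Bps, drop_rate_Bps, mark_burst_B, drop_burst_B):
        # All four parameters are hypothetical; rates in bytes/s, bursts in bytes.
        self.mark_rate = mark_rate_Bps
        self.drop_rate = drop_rate_Bps
        self.mark_burst = mark_burst_B
        self.drop_burst = drop_burst_B
        self.mark_tokens = mark_burst_B  # both buckets start full
        self.drop_tokens = drop_burst_B
        self.last = 0.0                  # time of the previous packet, in seconds

    def police(self, now, size):
        """Return 'forward', 'mark' (set ECN CE), or 'drop' for one packet."""
        elapsed = now - self.last
        self.last = now
        # Refill each bucket at its own rate, capped at its burst allowance.
        self.mark_tokens = min(self.mark_burst,
                               self.mark_tokens + elapsed * self.mark_rate)
        self.drop_tokens = min(self.drop_burst,
                               self.drop_tokens + elapsed * self.drop_rate)
        if self.drop_tokens < size:
            return "drop"                # exceeds the policed (higher) rate
        self.drop_tokens -= size
        if self.mark_tokens < size:
            return "mark"                # exceeds only the lower marking rate
        self.mark_tokens -= size
        return "forward"
```

Because the marking bucket empties before the drop bucket, a responsive sender sees ECN marks first and has the opportunity to slow down before any loss is introduced.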
The design of L4S-friendly rate policers will require a separate | The design of L4S-friendly rate policers will require a separate, | |||
dedicated document. For further discussion of the interaction | dedicated document. For further discussion of the interaction | |||
between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv]. | between L4S and Diffserv, see [L4S-DIFFSERV]. | |||
8.4. ECN Integrity | 8.4. ECN Integrity | |||
Various ways have been developed to protect the integrity of the | Various ways have been developed to protect the integrity of the | |||
congestion feedback loop (whether signalled by loss, Classic ECN or | congestion feedback loop (whether signalled by loss, Classic ECN, or | |||
L4S ECN) against misbehaviour by the receiver, sender or network (or | L4S ECN) against misbehaviour by the receiver, sender, or network (or | |||
all three). Brief details of each including applicability, pros and | all three). Brief details of each, including applicability, pros, | |||
cons is given in Appendix C.1 of the L4S ECN | and cons, are given in Appendix C.1 of the L4S ECN spec [RFC9331]. | |||
spec [I-D.ietf-tsvwg-ecn-l4s-id]. | ||||
8.5. Privacy Considerations | 8.5. Privacy Considerations | |||
As discussed in Section 5.2, the L4S architecture does not preclude | As discussed in Section 5.2, the L4S architecture does not preclude | |||
approaches that inspect end-to-end transport layer identifiers. For | approaches that inspect end-to-end transport layer identifiers. For | |||
instance, L4S support has been added to FQ-CoDel, which classifies by | instance, L4S support has been added to FQ-CoDel, which classifies by | |||
application flow ID in the network. However, the main innovation of | application flow identifier in the network. However, the main | |||
L4S is the DualQ AQM framework that does not need to inspect any | innovation of L4S is the DualQ AQM framework that does not need to | |||
deeper than the outermost IP header, because the L4S identifier is in | inspect any deeper than the outermost IP header, because the L4S | |||
the IP-ECN field. | identifier is in the IP-ECN field. | |||
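To make concrete how little of the packet such a classifier needs to read, the following minimal sketch extracts the 2-bit ECN field from the outermost IPv4 or IPv6 header and selects a queue. The function names are hypothetical; the codepoint values and the choice to treat ECT(1) and CE as L4S follow the L4S ECN spec.

```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def ecn_field(packet: bytes) -> int:
    """Return the 2-bit ECN field from the outermost IP header."""
    version = packet[0] >> 4
    if version == 4:
        # IPv4: ECN is the two least significant bits of the second octet.
        return packet[1] & 0x03
    if version == 6:
        # IPv6: the Traffic Class straddles octets 0 and 1; its two least
        # significant bits sit in bits 5..4 of the second octet.
        return (packet[1] >> 4) & 0x03
    raise ValueError("not an IP packet")


def classify(packet: bytes) -> str:
    """Pick a queue using nothing deeper than the IP-ECN field."""
    return "L4S" if ecn_field(packet) in (ECT_1, CE) else "Classic"
```

No transport header, port number, or flow identifier is consulted, so the same classification works whether or not the payload is encrypted.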
Thus, the L4S architecture enables very low queuing delay without | Thus, the L4S architecture enables very low queuing delay without | |||
_requiring_ inspection of information above the IP layer. This means | _requiring_ inspection of information above the IP layer. This means | |||
that users who want to encrypt application flow identifiers, e.g. in | that users who want to encrypt application flow identifiers, e.g., in | |||
IPSec or other encrypted VPN tunnels, don't have to sacrifice low | IPsec or other encrypted VPN tunnels, don't have to sacrifice low | |||
delay [RFC8404]. | delay [RFC8404]. | |||
Because L4S can provide low delay for a broad set of applications | Because L4S can provide low delay for a broad set of applications | |||
that choose to use it, there is no need for individual applications | that choose to use it, there is no need for individual applications | |||
or classes within that broad set to be distinguishable in any way | or classes within that broad set to be distinguishable in any way | |||
while traversing networks. This removes much of the ability to | while traversing networks. This removes much of the ability to | |||
correlate between the delay requirements of traffic and other | correlate between the delay requirements of traffic and other | |||
identifying features [RFC6973]. There may be some types of traffic | identifying features [RFC6973]. There may be some types of traffic | |||
that prefer not to use L4S, but the coarse binary categorization of | that prefer not to use L4S, but the coarse binary categorization of | |||
traffic reveals very little that could be exploited to compromise | traffic reveals very little that could be exploited to compromise | |||
privacy. | privacy. | |||
9. Informative References | 9. Informative References | |||
[ACCECN] Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More | ||||
Accurate ECN Feedback in TCP", Work in Progress, Internet- | ||||
Draft, draft-ietf-tcpm-accurate-ecn-22, 9 November 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-tcpm- | ||||
accurate-ecn-22>. | ||||
[AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H., | [AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H., | |||
and S-J. Park, "Towards fair and low latency next | and S-J. Park, "Towards fair and low latency next | |||
generation high speed networks: AFCD queuing", Journal of | generation high speed networks: AFCD queuing", Journal of | |||
Network and Computer Applications 70:183--193, July 2016, | Network and Computer Applications, Volume 70, pp. 183-193, | |||
DOI 10.1016/j.jnca.2016.03.021, July 2016, | ||||
<https://doi.org/10.1016/j.jnca.2016.03.021>. | <https://doi.org/10.1016/j.jnca.2016.03.021>. | |||
[BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", GitHub | [BBR-CC] Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. | |||
repository; Linux congestion control module, | Jacobson, "BBR Congestion Control", Work in Progress, | |||
<https://github.com/google/bbr/blob/v2alpha/README.md>. | Internet-Draft, draft-cardwell-iccrg-bbr-congestion- | |||
control-02, 7 March 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-cardwell- | ||||
iccrg-bbr-congestion-control-02>. | ||||
[BDPdata] Briscoe, B., "PI2 Parameters", Technical Report TR-BB- | [BBRv2] "TCP BBR v2 Alpha/Preview Release", commit 17700ca, June | |||
2021-001 arXiv:2107.01003 [cs.NI], July 2021, | 2022, <https://github.com/google/bbr>. | |||
<https://arxiv.org/abs/2107.01003>. | ||||
[BDPdata] Briscoe, B., "PI2 Parameters", TR-BB-2021-001, | ||||
arXiv:2107.01003 [cs.NI], DOI 10.48550/arXiv.2107.01003, | ||||
October 2021, <https://arxiv.org/abs/2107.01003>. | ||||
[BufferSize] | [BufferSize] | |||
Appenzeller, G., Keslassy, I., and N. McKeown, "Sizing | Appenzeller, G., Keslassy, I., and N. McKeown, "Sizing | |||
Router Buffers", In Proc. SIGCOMM'04 34(4):281--292, | Router Buffers", SIGCOMM '04: Proceedings of the 2004 | |||
September 2004, <https://doi.org/10.1145/1015467.1015499>. | conference on Applications, technologies, architectures, | |||
and protocols for computer communications, pp. 281-292, | ||||
DOI 10.1145/1015467.1015499, October 2004, | ||||
<https://doi.org/10.1145/1015467.1015499>. | ||||
[COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., | [COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., | |||
Tahiliani, M. P., Avallone, S., and D. Täht, "Design and | Tahiliani, M. P., Avallone, S., and D. Täht, "Design and | |||
Evaluation of COBALT Queue Discipline", In Proc. IEEE | Evaluation of COBALT Queue Discipline", IEEE International | |||
Int'l Symp. Local and Metropolitan Area Networks | Symposium on Local and Metropolitan Area Networks | |||
(LANMAN'19) 2019:1-6, July 2019, | (LANMAN), DOI 10.1109/LANMAN.2019.8847054, July 2019, | |||
<https://ieeexplore.ieee.org/abstract/document/8847054>. | <https://ieeexplore.ieee.org/abstract/document/8847054>. | |||
[DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. | [CODEL-APPROX-FAIR] | |||
Briscoe, "`Data Centre to the Home': Ultra-Low Latency for | Morton, J. and P. G. Heist, "Controlled Delay Approximate | |||
All", Updated RITE project Technical Report , July 2019, | Fairness AQM", Work in Progress, Internet-Draft, draft- | |||
<https://bobbriscoe.net/pubs.html#DCttH_TR>. | morton-tsvwg-codel-approx-fair-01, 9 March 2020, | |||
<https://datatracker.ietf.org/doc/html/draft-morton-tsvwg- | ||||
codel-approx-fair-01>. | ||||
[CONG-POLICING] | ||||
Briscoe, B., "Network Performance Isolation using | ||||
Congestion Policing", Work in Progress, Internet-Draft, | ||||
draft-briscoe-conex-policing-01, 14 February 2014, | ||||
<https://datatracker.ietf.org/doc/html/draft-briscoe- | ||||
conex-policing-01>. | ||||
[CTCP] Sridharan, M., Tan, K., Bansal, D., and D. Thaler, | ||||
"Compound TCP: A New TCP Congestion Control for High-Speed | ||||
and Long Distance Networks", Work in Progress, Internet- | ||||
Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008, | ||||
<https://datatracker.ietf.org/doc/html/draft-sridharan- | ||||
tcpm-ctcp-02>. | ||||
[DOCSIS-Q-PROT] | ||||
Briscoe, B., Ed. and G. White, "The DOCSIS(R) Queue | ||||
Protection Algorithm to Preserve Low Latency", Work in | ||||
Progress, Internet-Draft, draft-briscoe-docsis-q- | ||||
protection-06, 13 May 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-briscoe- | ||||
docsis-q-protection-06>. | ||||
[DOCSIS3.1] | [DOCSIS3.1] | |||
CableLabs, "MAC and Upper Layer Protocols Interface | CableLabs, "MAC and Upper Layer Protocols Interface | |||
(MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable | (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable | |||
Service Interface Specifications DOCSIS® 3.1 Version i17 | Service Interface Specifications DOCSIS 3.1 Version i17 or | |||
or later, 21 January 2019, <https://specification- | later, 21 January 2019, <https://specification- | |||
search.cablelabs.com/CM-SP-MULPIv3.1>. | search.cablelabs.com/CM-SP-MULPIv3.1>. | |||
[DOCSIS3AQM] | [DOCSIS3AQM] | |||
White, G., "Active Queue Management Algorithms for DOCSIS | White, G., "Active Queue Management Algorithms for DOCSIS | |||
3.0; A Simulation Study of CoDel, SFQ-CoDel and PIE in | 3.0: A Simulation Study of CoDel, SFQ-CoDel and PIE in | |||
DOCSIS 3.0 Networks", CableLabs Technical Report , April | DOCSIS 3.0 Networks", CableLabs Technical Report, April | |||
2013, <{https://www.cablelabs.com/wp- | 2013, <https://www.cablelabs.com/wp- | |||
content/uploads/2013/11/ | content/uploads/2013/11/ | |||
Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf>. | Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf>. | |||
[DualPI2Linux] | [DualPI2Linux] | |||
Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., | Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., | |||
and H. Steen, "DUALPI2 - Low Latency, Low Loss and | and H. Steen, "DUALPI2 - Low Latency, Low Loss and | |||
Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019, | Scalable (L4S) AQM", Proceedings of Linux Netdev 0x13, | |||
<https://www.netdevconf.org/0x13/session.html?talk- | March 2019, <https://www.netdevconf.org/0x13/ | |||
DUALPI2-AQM>. | session.html?talk-DUALPI2-AQM>. | |||
[Dukkipati06] | [Dukkipati06] | |||
Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | Dukkipati, N. and N. McKeown, "Why Flow-Completion Time is | |||
the Right Metric for Congestion Control", ACM CCR | the Right Metric for Congestion Control", ACM SIGCOMM | |||
36(1):59--62, January 2006, | Computer Communication Review, Volume 36, Issue 1, pp. | |||
59-62, DOI 10.1145/1111322.1111336, January 2006, | ||||
<https://dl.acm.org/doi/10.1145/1111322.1111336>. | <https://dl.acm.org/doi/10.1145/1111322.1111336>. | |||
[ECN-ENCAP] | ||||
Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding | ||||
Congestion Notification to Protocols that Encapsulate IP", | ||||
Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- | ||||
encap-guidelines-17, 11 July 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- | ||||
ecn-encap-guidelines-17>. | ||||
[ECN-SCTP] Stewart, R. R., Tüxen, M., and X. Dong, "ECN for Stream | ||||
Control Transmission Protocol (SCTP)", Work in Progress, | ||||
Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January | ||||
2014, <https://datatracker.ietf.org/doc/html/draft- | ||||
stewart-tsvwg-sctpecn-05>. | ||||
[ECN-SHIM] Briscoe, B., "Propagating Explicit Congestion Notification | ||||
Across IP Tunnel Headers Separated by a Shim", Work in | ||||
Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- | ||||
shim-15, 11 July 2022, | ||||
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- | ||||
rfc6040update-shim-15>. | ||||
[FQ_CoDel_Thresh] | [FQ_CoDel_Thresh] | |||
Høiland-Jørgensen, T., "fq_codel: generalise ce_threshold | "fq_codel: generalise ce_threshold marking for subset of | |||
marking for subset of traffic", Linux Patch Commit ID: | traffic", commit dfcb63ce1de6b10b, October 2021, | |||
dfcb63ce1de6b10b, 20 October 2021, | ||||
<https://git.kernel.org/pub/scm/linux/kernel/git/netdev/ | <https://git.kernel.org/pub/scm/linux/kernel/git/netdev/ | |||
net-next.git/commit/?id=dfcb63ce1de6b10b>. | net-next.git/commit/?id=dfcb63ce1de6b10b>. | |||
[Hohlfeld14] | [Hohlfeld14] | |||
Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. | Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. | |||
Barford, "A QoE Perspective on Sizing Network Buffers", | Barford, "A QoE Perspective on Sizing Network Buffers", | |||
Proc. ACM Internet Measurement Conf (IMC'14) hmm, November | IMC '14: Proceedings of the 2014 Conference on Internet | |||
2014, <https://doi.acm.org/10.1145/2663716.2663730>. | Measurement, pp. 333-346, DOI 10.1145/2663716.2663730, | |||
November 2014, | ||||
[I-D.briscoe-conex-policing] | <https://doi.acm.org/10.1145/2663716.2663730>. | |||
Briscoe, B., "Network Performance Isolation using | ||||
Congestion Policing", Work in Progress, Internet-Draft, | ||||
draft-briscoe-conex-policing-01, 14 February 2014, | ||||
<https://www.ietf.org/archive/id/draft-briscoe-conex- | ||||
policing-01.txt>. | ||||
[I-D.briscoe-docsis-q-protection] | ||||
Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection | ||||
Algorithm to Preserve Low Latency", Work in Progress, | ||||
Internet-Draft, draft-briscoe-docsis-q-protection-06, 13 | ||||
May 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
briscoe-docsis-q-protection/>. | ||||
[I-D.briscoe-iccrg-prague-congestion-control] | ||||
Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague | ||||
Congestion Control", Work in Progress, Internet-Draft, | ||||
draft-briscoe-iccrg-prague-congestion-control-01, 11 July | ||||
2022, <https://datatracker.ietf.org/api/v1/doc/document/ | ||||
draft-briscoe-iccrg-prague-congestion-control/>. | ||||
[I-D.briscoe-tsvwg-l4s-diffserv] | [L4S-DIFFSERV] | |||
Briscoe, B., "Interactions between Low Latency, Low Loss, | Briscoe, B., "Interactions between Low Latency, Low Loss, | |||
Scalable Throughput (L4S) and Differentiated Services", | Scalable Throughput (L4S) and Differentiated Services", | |||
Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- | Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- | |||
diffserv-02, 2 July 2018, | diffserv-02, 4 November 2018, | |||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | <https://datatracker.ietf.org/doc/html/draft-briscoe- | |||
briscoe-tsvwg-l4s-diffserv/>. | tsvwg-l4s-diffserv-02>. | |||
[I-D.cardwell-iccrg-bbr-congestion-control] | ||||
Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. | ||||
Jacobson, "BBR Congestion Control", Work in Progress, | ||||
Internet-Draft, draft-cardwell-iccrg-bbr-congestion- | ||||
control-02, 7 March 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
cardwell-iccrg-bbr-congestion-control/>. | ||||
[I-D.ietf-tcpm-accurate-ecn] | ||||
Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More | ||||
Accurate ECN Feedback in TCP", Work in Progress, Internet- | ||||
Draft, draft-ietf-tcpm-accurate-ecn-20, 25 July 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tcpm-accurate-ecn/>. | ||||
[I-D.ietf-tsvwg-aqm-dualq-coupled] | ||||
Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled | ||||
AQMs for Low Latency, Low Loss and Scalable Throughput | ||||
(L4S)", Work in Progress, Internet-Draft, draft-ietf- | ||||
tsvwg-aqm-dualq-coupled-24, 7 July 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-aqm-dualq-coupled/>. | ||||
[I-D.ietf-tsvwg-ecn-encap-guidelines] | ||||
Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding | ||||
Congestion Notification to Protocols that Encapsulate IP", | ||||
Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- | ||||
encap-guidelines-17, 11 July 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-ecn-encap-guidelines/>. | ||||
[I-D.ietf-tsvwg-ecn-l4s-id] | ||||
Schepper, K. D. and B. Briscoe, "Explicit Congestion | ||||
Notification (ECN) Protocol for Very Low Queuing Delay | ||||
(L4S)", Work in Progress, Internet-Draft, draft-ietf- | ||||
tsvwg-ecn-l4s-id-28, 8 August 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-ecn-l4s-id/>. | ||||
[I-D.ietf-tsvwg-l4sops] | ||||
White, G., "Operational Guidance for Deployment of L4S in | ||||
the Internet", Work in Progress, Internet-Draft, draft- | ||||
ietf-tsvwg-l4sops-03, 28 April 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-l4sops/>. | ||||
[I-D.ietf-tsvwg-nqb] | ||||
White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | ||||
Behavior (NQB PHB) for Differentiated Services", Work in | ||||
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-10, 4 March | ||||
2022, <https://datatracker.ietf.org/api/v1/doc/document/ | ||||
draft-ietf-tsvwg-nqb/>. | ||||
[I-D.ietf-tsvwg-rfc6040update-shim] | ||||
Briscoe, B., "Propagating Explicit Congestion Notification | ||||
Across IP Tunnel Headers Separated by a Shim", Work in | ||||
Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- | ||||
shim-15, 11 July 2022, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
ietf-tsvwg-rfc6040update-shim/>. | ||||
[I-D.morton-tsvwg-codel-approx-fair] | [L4Sdemo16] | |||
Morton, J. and P. G. Heist, "Controlled Delay Approximate | Bondarenko, O., De Schepper, K., Tsang, I., Briscoe, B., | |||
Fairness AQM", Work in Progress, Internet-Draft, draft- | Petlund, A., and C. Griwodz, "Ultra-Low Delay for All: | |||
morton-tsvwg-codel-approx-fair-01, 9 March 2020, | Live Experience, Live Analysis", Proceedings of the 7th | |||
<https://www.ietf.org/archive/id/draft-morton-tsvwg-codel- | International Conference on Multimedia Systems, Article | |||
approx-fair-01.txt>. | No. 33, pp. 1-4, DOI 10.1145/2910017.2910633, May 2016, | |||
<https://dl.acm.org/citation.cfm?doid=2910017.2910633>. | ||||
[I-D.sridharan-tcpm-ctcp] | [L4Sdemo16-Video] | |||
Sridharan, M., Tan, K., Bansal, D., and D. Thaler, | "Videos used in IETF dispatch WG 'Ultra-Low Queuing Delay | |||
"Compound TCP: A New TCP Congestion Control for High-Speed | for All Apps' slot", | |||
and Long Distance Networks", Work in Progress, Internet- | <https://riteproject.eu/dctth/#1511dispatchwg>. | |||
Draft, draft-sridharan-tcpm-ctcp-02, 29 October 2007, | ||||
<https://datatracker.ietf.org/api/v1/doc/document/draft- | ||||
sridharan-tcpm-ctcp/>. | ||||
[I-D.stewart-tsvwg-sctpecn] | [L4Seval22] | |||
Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream | De Schepper, K., Albisser, O., Tilmans, O., and B. | |||
Control Transmission Protocol (SCTP)", Work in Progress, | Briscoe, "Dual Queue Coupled AQM: Deployable Very Low | |||
Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January | Queuing Delay for All", TR-BB-2022-001, arXiv:2209.01078 | |||
2014, <https://www.ietf.org/archive/id/draft-stewart- | [cs.NI], DOI 10.48550/arXiv.2209.01078, September 2022, | |||
tsvwg-sctpecn-05.txt>. | <https://arxiv.org/abs/2209.01078>. | |||
[L4Sdemo16] | [L4SOPS] White, G., Ed., "Operational Guidance for Deployment of | |||
Bondarenko, O., De Schepper, K., Tsang, I., and B. | L4S in the Internet", Work in Progress, Internet-Draft, | |||
Briscoe, "Ultra-Low Delay for All: Live Experience, Live | draft-ietf-tsvwg-l4sops-03, 28 April 2022, | |||
Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, | <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg- | |||
<https://dl.acm.org/citation.cfm?doid=2910017.2910633 | l4sops-03>. | |||
(videos of demos: | ||||
https://riteproject.eu/dctth/#1511dispatchwg )>. | ||||
[LEDBAT_AQM] | [LEDBAT_AQM] | |||
Al-Saadi, R., Armitage, G., and J. But, "Characterising | Al-Saadi, R., Armitage, G., and J. But, "Characterising | |||
LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel | LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel | |||
and FQ-PIE Active Queue Management", Proc. IEEE 42nd | and FQ-PIE Active Queue Management", IEEE 42nd Conference | |||
Conference on Local Computer Networks (LCN) 278--285, | on Local Computer Networks (LCN), DOI 10.1109/LCN.2017.22, | |||
2017, <https://ieeexplore.ieee.org/document/8109367>. | October 2017, | |||
<https://ieeexplore.ieee.org/document/8109367>. | ||||
[lowat] Meenan, P., "Optimizing HTTP/2 prioritization with BBR and | [lowat] Meenan, P., "Optimizing HTTP/2 prioritization with BBR and | |||
tcp_notsent_lowat", Cloudflare Blog , 12 October 2018, | tcp_notsent_lowat", Cloudflare Blog, October 2018, | |||
<https://blog.cloudflare.com/http-2-prioritization-with- | <https://blog.cloudflare.com/http-2-prioritization-with- | |||
nginx/>. | nginx/>. | |||
[Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , | ||||
May 2009, <https://www.gdt.id.au/~gdt/ | ||||
presentations/2010-07-06-questnet-tcp/reference- | ||||
materials/papers/mathis-relentless-congestion- | ||||
control.pdf>. | ||||
[McIlroy78] | [McIlroy78] | |||
McIlroy, M.D., Pinson, E. N., and B. A. Tague, "UNIX Time- | McIlroy, M.D., Pinson, E. N., and B. A. Tague, "UNIX Time- | |||
Sharing System: Foreword", The Bell System Technical | Sharing System: Foreword", The Bell System Technical | |||
Journal 57:6(1902--1903), July 1978, | Journal 57: 6, pp. 1899-1904, | |||
DOI 10.1002/j.1538-7305.1978.tb02135.x, July 1978, | ||||
<https://archive.org/details/bstj57-6-1899>. | <https://archive.org/details/bstj57-6-1899>. | |||
[Nadas20] Nádas, S., Gombos, G., Fejes, F., and S. Laki, "A | [Nadas20] Nádas, S., Gombos, G., Fejes, F., and S. Laki, "A | |||
Congestion Control Independent L4S Scheduler", Proc. | Congestion Control Independent L4S Scheduler", ANRW '20: | |||
Applied Networking Research Workshop (ANRW '20) 45--51, | Proceedings of the Applied Networking Research Workshop, | |||
July 2020, <https://doi.org/10.1145/3404868.3406669>. | pp. 45-51, DOI 10.1145/3404868.3406669, July 2020, | |||
<https://doi.org/10.1145/3404868.3406669>. | ||||
[NASA04] Bailey, R.R., Trey Arthur III, J.J., and S.P. Williams, | [NASA04] Bailey, R., Trey Arthur III, J., and S. Williams, "Latency | |||
"Latency Requirements for Head-Worn Display S/EVS | Requirements for Head-Worn Display S/EVS Applications", | |||
Applications", SPIE Defense and Security | Proceedings of SPIE 5424, DOI 10.1117/12.554462, April | |||
Symposium LF99-1955, April 2004, | 2004, <https://ntrs.nasa.gov/api/citations/20120009198/ | |||
<https://ntrs.nasa.gov/api/citations/20120009198/ | ||||
downloads/20120009198.pdf?attachment=true>. | downloads/20120009198.pdf?attachment=true>. | |||
[NQB-PHB] White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | ||||
Behavior (NQB PHB) for Differentiated Services", Work in | ||||
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-15, 11 | ||||
January 2023, <https://datatracker.ietf.org/doc/html/ | ||||
draft-ietf-tsvwg-nqb-15>. | ||||
[PRAGUE-CC] | ||||
De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague | ||||
Congestion Control", Work in Progress, Internet-Draft, | ||||
draft-briscoe-iccrg-prague-congestion-control-01, 11 July | ||||
2022, <https://datatracker.ietf.org/doc/html/draft- | ||||
briscoe-iccrg-prague-congestion-control-01>. | ||||
[PragueLinux] | [PragueLinux] | |||
Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | |||
Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing | Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing | |||
the `TCP Prague' Requirements for Low Latency Low Loss | the 'TCP Prague' Requirements for Low Latency Low Loss | |||
Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , | Scalable Throughput (L4S)", Proceedings Linux Netdev 0x13, | |||
March 2019, <https://www.netdevconf.org/0x13/ | March 2019, <https://www.netdevconf.org/0x13/ | |||
session.html?talk-tcp-prague-l4s>. | session.html?talk-tcp-prague-l4s>. | |||
[QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", | [QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", TR-BB- | |||
bobbriscoe.net Technical Report TR-BB-2017-001; | 2017-001, arXiv:1904.07044 [cs.NI], | |||
arXiv:1904.07044 [cs.NI], September 2017, | DOI 10.48550/arXiv.1904.07044, April 2019, | |||
<https://arxiv.org/abs/1904.07044>. | <https://arxiv.org/abs/1904.07044>. | |||
[Raaen14] Raaen, K. and T-M. Grønli, "Latency thresholds for | [Raaen14] Raaen, K. and T-M. Grønli, "Latency Thresholds for | |||
usability in games: A survey", Norsk IKT-konferanse for | Usability in Games: A Survey", Norsk IKT-konferanse for | |||
forskning og utdanning , 2014, | forskning og utdanning (Norwegian ICT conference for | |||
research and education), 2014, | ||||
<http://ojs.bibsys.no/index.php/NIK/article/view/9/6>. | <http://ojs.bibsys.no/index.php/NIK/article/view/9/6>. | |||
[Rajiullah15] | [Rajiullah15] | |||
Rajiullah, M., "Towards a Low Latency Internet: | Rajiullah, M., "Towards a Low Latency Internet: | |||
Understanding and Solutions", Master's Thesis; Karlstad | Understanding and Solutions", Dissertation, Karlstad | |||
Uni, Dept of Maths & CS 2015:41, 2015, <https://www.diva- | University, 2015, <https://www.diva- | |||
portal.org/smash/get/diva2:846109/FULLTEXT01.pdf>. | portal.org/smash/get/diva2:846109/FULLTEXT01.pdf>. | |||
[RELENTLESS] | ||||
Mathis, M., "Relentless Congestion Control", Work in | ||||
Progress, Internet-Draft, draft-mathis-iccrg-relentless- | ||||
tcp-00, 4 March 2009, | ||||
<https://datatracker.ietf.org/doc/html/draft-mathis-iccrg- | ||||
relentless-tcp-00>. | ||||
[RFC0970] Nagle, J., "On Packet Switches With Infinite Storage", | [RFC0970] Nagle, J., "On Packet Switches With Infinite Storage", | |||
RFC 970, DOI 10.17487/RFC0970, December 1985, | RFC 970, DOI 10.17487/RFC0970, December 1985, | |||
<https://www.rfc-editor.org/info/rfc970>. | <https://www.rfc-editor.org/info/rfc970>. | |||
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., | |||
and W. Weiss, "An Architecture for Differentiated | and W. Weiss, "An Architecture for Differentiated | |||
Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, | Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, | |||
<https://www.rfc-editor.org/info/rfc2475>. | <https://www.rfc-editor.org/info/rfc2475>. | |||
[RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color | [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color | |||
skipping to change at page 44, line 6 | skipping to change at line 2016 | |||
DOI 10.17487/RFC7713, December 2015, | DOI 10.17487/RFC7713, December 2015, | |||
<https://www.rfc-editor.org/info/rfc7713>. | <https://www.rfc-editor.org/info/rfc7713>. | |||
[RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, | [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, | |||
"Proportional Integral Controller Enhanced (PIE): A | "Proportional Integral Controller Enhanced (PIE): A | |||
Lightweight Control Scheme to Address the Bufferbloat | Lightweight Control Scheme to Address the Bufferbloat | |||
Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, | Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, | |||
<https://www.rfc-editor.org/info/rfc8033>. | <https://www.rfc-editor.org/info/rfc8033>. | |||
[RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based | [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based | |||
on Proportional Integral Controller Enhanced PIE) for | on Proportional Integral Controller Enhanced (PIE) for | |||
Data-Over-Cable Service Interface Specifications (DOCSIS) | Data-Over-Cable Service Interface Specifications (DOCSIS) | |||
Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February | Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February | |||
2017, <https://www.rfc-editor.org/info/rfc8034>. | 2017, <https://www.rfc-editor.org/info/rfc8034>. | |||
[RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and | [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and | |||
Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, | Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8170>. | May 2017, <https://www.rfc-editor.org/info/rfc8170>. | |||
[RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., | |||
and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | and G. Judd, "Data Center TCP (DCTCP): TCP Congestion | |||
skipping to change at page 45, line 10 | skipping to change at line 2065 | |||
[RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, | [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, | |||
"TCP Alternative Backoff with ECN (ABE)", RFC 8511, | "TCP Alternative Backoff with ECN (ABE)", RFC 8511, | |||
DOI 10.17487/RFC8511, December 2018, | DOI 10.17487/RFC8511, December 2018, | |||
<https://www.rfc-editor.org/info/rfc8511>. | <https://www.rfc-editor.org/info/rfc8511>. | |||
[RFC8888] Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP | [RFC8888] Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP | |||
Control Protocol (RTCP) Feedback for Congestion Control", | Control Protocol (RTCP) Feedback for Congestion Control", | |||
RFC 8888, DOI 10.17487/RFC8888, January 2021, | RFC 8888, DOI 10.17487/RFC8888, January 2021, | |||
<https://www.rfc-editor.org/info/rfc8888>. | <https://www.rfc-editor.org/info/rfc8888>. | |||
[RFC8985] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "The | ||||
RACK-TLP Loss Detection Algorithm for TCP", RFC 8985, | ||||
DOI 10.17487/RFC8985, February 2021, | ||||
<https://www.rfc-editor.org/info/rfc8985>. | ||||
[RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based | |||
Multiplexed and Secure Transport", RFC 9000, | Multiplexed and Secure Transport", RFC 9000, | |||
DOI 10.17487/RFC9000, May 2021, | DOI 10.17487/RFC9000, May 2021, | |||
<https://www.rfc-editor.org/info/rfc9000>. | <https://www.rfc-editor.org/info/rfc9000>. | |||
[RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, | [RFC9113] Thomson, M., Ed. and C. Benfield, Ed., "HTTP/2", RFC 9113, | |||
DOI 10.17487/RFC9113, June 2022, | DOI 10.17487/RFC9113, June 2022, | |||
<https://www.rfc-editor.org/info/rfc9113>. | <https://www.rfc-editor.org/info/rfc9113>. | |||
[SCReAM] Johansson, I., "SCReAM", GitHub repository; , | [RFC9331] De Schepper, K. and B. Briscoe, Ed., "The Explicit | |||
<https://github.com/EricssonResearch/scream/blob/master/ | Congestion Notification (ECN) Protocol for Low Latency, | |||
README.md>. | Low Loss, and Scalable Throughput (L4S)", RFC 9331, | |||
DOI 10.17487/RFC9331, January 2023, | ||||
<https://www.rfc-editor.org/info/rfc9331>. | ||||
[TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and | [RFC9332] De Schepper, K., Briscoe, B., Ed., and G. White, "Dual- | |||
Queue Coupled Active Queue Management (AQM) for Low | ||||
Latency, Low Loss, and Scalable Throughput (L4S)", | ||||
RFC 9332, DOI 10.17487/RFC9332, January 2023, | ||||
<https://www.rfc-editor.org/info/rfc9332>. | ||||
[SCReAM-L4S] | ||||
"SCReAM", commit fda6c53, June 2022, | ||||
<https://github.com/EricssonResearch/scream>. | ||||
[TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and | ||||
Control", Laurence Berkeley Labs Technical Report , | Control", Laurence Berkeley Labs Technical Report , | |||
November 1988, <https://ee.lbl.gov/papers/congavoid.pdf>. | November 1988, <https://ee.lbl.gov/papers/congavoid.pdf>. | |||
[UnorderedLTE] | [UnorderedLTE] | |||
Austrheim, M.V., "Implementing immediate forwarding for 4G | Austrheim, M., "Implementing immediate forwarding for 4G | |||
in a network simulator", Master's Thesis, Uni Oslo , June | in a network simulator", Master's Thesis, University of | |||
2019. | Oslo, 2018. | |||
Acknowledgements | Acknowledgements | |||
Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David | Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David | |||
Black, Jake Holland, Vidhi Goel, Ermin Sakic, Praveen | Black, Jake Holland, Vidhi Goel, Ermin Sakic, Praveen | |||
Balasubramanian, Gorry Fairhurst, Mirja Kuehlewind, Philip Eardley, | Balasubramanian, Gorry Fairhurst, Mirja Kuehlewind, Philip Eardley, | |||
Neal Cardwell, Pete Heist and Martin Duke for their useful review | Neal Cardwell, Pete Heist, and Martin Duke for their useful review | |||
comments. Thanks also to the area reviewers: Marco Tiloca, Lars | comments. Thanks also to the area reviewers: Marco Tiloca, Lars | |||
Eggert, Roman Danyliw and Eric Vyncke. | Eggert, Roman Danyliw, and Éric Vyncke. | |||
Bob Briscoe and Koen De Schepper were part-funded by the European | Bob Briscoe and Koen De Schepper were partly funded by the European | |||
Community under its Seventh Framework Programme through the Reducing | Community under its Seventh Framework Programme through the Reducing | |||
Internet Transport Latency (RITE) project (ICT-317700). The | Internet Transport Latency (RITE) project (ICT-317700). The | |||
contribution of Koen De Schepper was also part-funded by the 5Growth | contribution of Koen De Schepper was also partly funded by the | |||
and DAEMON EU H2020 projects. Bob Briscoe was also part-funded by | 5Growth and DAEMON EU H2020 projects. Bob Briscoe was also partly | |||
the Research Council of Norway through the TimeIn project, partly by | funded by the Research Council of Norway through the TimeIn project, | |||
CableLabs and partly by the Comcast Innovation Fund. The views | partly by CableLabs, and partly by the Comcast Innovation Fund. The | |||
expressed here are solely those of the authors. | views expressed here are solely those of the authors. | |||
Authors' Addresses | Authors' Addresses | |||
Bob Briscoe (editor) | Bob Briscoe (editor) | |||
Independent | Independent | |||
United Kingdom | United Kingdom | |||
Email: ietf@bobbriscoe.net | Email: ietf@bobbriscoe.net | |||
URI: https://bobbriscoe.net/ | URI: https://bobbriscoe.net/ | |||
Koen De Schepper | Koen De Schepper | |||
Nokia Bell Labs | Nokia Bell Labs | |||
Antwerp | Antwerp | |||
Belgium | Belgium | |||
Email: koen.de_schepper@nokia.com | Email: koen.de_schepper@nokia.com | |||
URI: https://www.bell-labs.com/about/researcher-profiles/ | URI: https://www.bell-labs.com/about/researcher-profiles/ | |||
koende_schepper/ | koende_schepper/ | |||
Marcelo Bagnulo | Marcelo Bagnulo | |||
Universidad Carlos III de Madrid | Universidad Carlos III de Madrid | |||
Av. Universidad 30 | Av. Universidad 30 | |||
Leganes, Madrid 28911 | 28911 Madrid | |||
Spain | Spain | |||
Phone: 34 91 6249500 | Phone: 34 91 6249500 | |||
Email: marcelo@it.uc3m.es | Email: marcelo@it.uc3m.es | |||
URI: https://www.it.uc3m.es | URI: https://www.it.uc3m.es | |||
Greg White | Greg White | |||
CableLabs | CableLabs | |||
United States of America | United States of America | |||
Email: G.White@CableLabs.com | Email: G.White@CableLabs.com | |||
End of changes. 277 change blocks. | ||||
1009 lines changed or deleted | 1019 lines changed or added | |||