rfc9330v7.txt | rfc9330.txt | |||
---|---|---|---|---|
skipping to change at line 186 ¶ | skipping to change at line 186 ¶ | |||
It has been demonstrated that, if the sending host replaces a Classic | It has been demonstrated that, if the sending host replaces a Classic | |||
congestion control with a 'Scalable' alternative, the performance | congestion control with a 'Scalable' alternative, the performance | |||
under load of all the above interactive applications can be | under load of all the above interactive applications can be | |||
significantly improved once a suitable AQM is deployed in the | significantly improved once a suitable AQM is deployed in the | |||
network. Taking the example solution cited below that uses Data | network. Taking the example solution cited below that uses Data | |||
Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332] | Center TCP (DCTCP) [RFC8257] and a Dual-Queue Coupled AQM [RFC9332] | |||
on a DSL or Ethernet link, queuing delay under heavy load is roughly | on a DSL or Ethernet link, queuing delay under heavy load is roughly | |||
1-2 ms at the 99th percentile without losing link utilization | 1-2 ms at the 99th percentile without losing link utilization | |||
[L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3). | [L4Seval22] [DualPI2Linux] (for other link types, see Section 6.3). | |||
This compares with 5-20 ms on _average_ with a Classic congestion | This compares with 5-20 ms on _average_ with a Classic congestion | |||
control and current state-of-the-art AQMs, such as FQ-CoDel | control and current state-of-the-art AQMs, such as Flow Queue CoDel | |||
[RFC8290], PIE [RFC8033], or DOCSIS PIE [RFC8034] and about 20-30 ms | [RFC8290], Proportional Integral controller Enhanced (PIE) [RFC8033], | |||
at the 99th percentile [DualPI2Linux]. | or DOCSIS PIE [RFC8034] and about 20-30 ms at the 99th percentile | |||
[DualPI2Linux]. | ||||
L4S is designed for incremental deployment. It is possible to deploy | L4S is designed for incremental deployment. It is possible to deploy | |||
the L4S service at a bottleneck link alongside the existing best | the L4S service at a bottleneck link alongside the existing best | |||
efforts service [DualPI2Linux] so that unmodified applications can | efforts service [DualPI2Linux] so that unmodified applications can | |||
start using it as soon as the sender's stack is updated. Access | start using it as soon as the sender's stack is updated. Access | |||
networks are typically designed with one link as the bottleneck for | networks are typically designed with one link as the bottleneck for | |||
each site (which might be a home, small enterprise, or mobile | each site (which might be a home, small enterprise, or mobile | |||
device), so deployment at either or both ends of this link should | device), so deployment at either or both ends of this link should | |||
give nearly all the benefit in the respective direction. With some | give nearly all the benefit in the respective direction. With some | |||
transport protocols, namely TCP [ACCECN] and SCTP [RFC4960], the | transport protocols, namely TCP [ACCECN], the sender has to check | |||
sender has to check that the receiver has been suitably updated to | that the receiver has been suitably updated to give more accurate | |||
give more accurate feedback, whereas with more recent transport | feedback, whereas with more recent transport protocols, such as QUIC | |||
protocols, such as QUIC [RFC9000] and DCCP [RFC4340], all receivers | [RFC9000] and Datagram Congestion Control Protocol (DCCP) [RFC4340], | |||
have always been suitable. | all receivers have always been suitable. | |||
This document presents the L4S architecture. It consists of three | This document presents the L4S architecture. It consists of three | |||
components: network support to isolate L4S traffic from Classic | components: network support to isolate L4S traffic from Classic | |||
traffic; protocol features that allow network elements to identify | traffic; protocol features that allow network elements to identify | |||
L4S traffic; and host support for L4S congestion controls. The | L4S traffic; and host support for L4S congestion controls. The | |||
protocol is defined separately in [RFC9331] as an experimental change | protocol is defined separately in [RFC9331] as an experimental change | |||
to Explicit Congestion Notification (ECN). This document describes | to Explicit Congestion Notification (ECN). This document describes | |||
and justifies the component parts and how they interact to provide | and justifies the component parts and how they interact to provide | |||
the low latency, low loss, and scalable Internet service. It also | the low latency, low loss, and scalable Internet service. It also | |||
details the approach to incremental deployment, as briefly summarized | details the approach to incremental deployment, as briefly summarized | |||
skipping to change at line 285 ¶ | skipping to change at line 286 ¶ | |||
utilization, whatever the flow rate, as well as ensuring that | utilization, whatever the flow rate, as well as ensuring that | |||
high throughput is more robust to disturbances. The Scalable | high throughput is more robust to disturbances. The Scalable | |||
control used most widely (in controlled environments) is DCTCP | control used most widely (in controlled environments) is DCTCP | |||
[RFC8257], which has been implemented and deployed in Windows | [RFC8257], which has been implemented and deployed in Windows | |||
Server Editions (since 2012), in Linux, and in FreeBSD. Although | Server Editions (since 2012), in Linux, and in FreeBSD. Although | |||
DCTCP as-is functions well over wide-area round-trip times | DCTCP as-is functions well over wide-area round-trip times | |||
(RTTs), most implementations lack certain safety features that | (RTTs), most implementations lack certain safety features that | |||
would be necessary for use outside controlled environments, like | would be necessary for use outside controlled environments, like | |||
data centres (see Section 6.4.3). Therefore, Scalable congestion | data centres (see Section 6.4.3). Therefore, Scalable congestion | |||
control needs to be implemented in TCP and other transport | control needs to be implemented in TCP and other transport | |||
protocols (QUIC, SCTP, RTP/RTCP, RTP Media Congestion Avoidance | protocols (QUIC, Stream Control Transmission Protocol (SCTP), | |||
Techniques (RMCAT), etc.). Indeed, between the present document | RTP/RTCP, RTP Media Congestion Avoidance Techniques (RMCAT), | |||
being drafted and published, the following Scalable congestion | etc.). Indeed, between the present document being drafted and | |||
controls were implemented: TCP Prague [PragueLinux], QUIC Prague, | published, the following Scalable congestion controls were | |||
implemented: Prague over TCP and QUIC [PRAGUE-CC] [PragueLinux], | ||||
an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and | an L4S variant of the RMCAT SCReAM controller [SCReAM-L4S], and | |||
the L4S ECN part of BBRv2 [BBRv2] intended for TCP and QUIC | the L4S ECN part of Bottleneck Bandwidth and Round-trip | |||
propagation time (BBRv2) [BBRv2] intended for TCP and QUIC | ||||
transports. | transports. | |||
2) Network: | 2) Network: | |||
L4S traffic needs to be isolated from the queuing latency of | L4S traffic needs to be isolated from the queuing latency of | |||
Classic traffic. One queue per application flow (FQ) is one way | Classic traffic. One queue per application flow (FQ) is one way | |||
to achieve this, e.g., FQ-CoDel [RFC8290]. However, using just | to achieve this, e.g., FQ-CoDel [RFC8290]. However, using just | |||
two queues is sufficient and does not require inspection of | two queues is sufficient and does not require inspection of | |||
transport layer headers in the network, which is not always | transport layer headers in the network, which is not always | |||
possible (see Section 5.2). With just two queues, it might seem | possible (see Section 5.2). With just two queues, it might seem | |||
skipping to change at line 345 ¶ | skipping to change at line 348 ¶ | |||
negative impact on its flow rate [RFC5033]. The scaling problem | negative impact on its flow rate [RFC5033]. The scaling problem | |||
with Classic congestion control is explained, with examples, in | with Classic congestion control is explained, with examples, in | |||
Section 5.1 and in [RFC3649]. | Section 5.1 and in [RFC3649]. | |||
Scalable Congestion Control: A congestion control where the average | Scalable Congestion Control: A congestion control where the average | |||
time from one congestion signal to the next (the recovery time) | time from one congestion signal to the next (the recovery time) | |||
remains invariant as flow rate scales, all other factors being | remains invariant as flow rate scales, all other factors being | |||
equal. For instance, DCTCP averages 2 congestion signals per | equal. For instance, DCTCP averages 2 congestion signals per | |||
round trip, whatever the flow rate, as do other recently developed | round trip, whatever the flow rate, as do other recently developed | |||
Scalable congestion controls, e.g., Relentless TCP [RELENTLESS], | Scalable congestion controls, e.g., Relentless TCP [RELENTLESS], | |||
TCP Prague [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2] [BBR-CC], and | Prague for TCP and QUIC [PRAGUE-CC] [PragueLinux], BBRv2 [BBRv2] | |||
the L4S variant of SCReAM for real-time media [SCReAM-L4S] | [BBR-CC], and the L4S variant of SCReAM for real-time media | |||
[RFC8298]. See Section 4.3 of [RFC9331] for more explanation. | [SCReAM-L4S] [RFC8298]. See Section 4.3 of [RFC9331] for more | |||
explanation. | ||||
Classic Service: The Classic service is intended for all the | Classic Service: The Classic service is intended for all the | |||
congestion control behaviours that coexist with Reno [RFC5681] | congestion control behaviours that coexist with Reno [RFC5681] | |||
(e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC | (e.g., Reno itself, CUBIC [RFC8312], Compound [CTCP], and TFRC | |||
[RFC5348]). The term 'Classic queue' means a queue providing the | [RFC5348]). The term 'Classic queue' means a queue providing the | |||
Classic service. | Classic service. | |||
Low Latency, Low Loss, and Scalable throughput (L4S) service: The | Low Latency, Low Loss, and Scalable throughput (L4S) service: The | |||
'L4S' service is intended for traffic from Scalable congestion | 'L4S' service is intended for traffic from Scalable congestion | |||
control algorithms, such as the Prague congestion control | control algorithms, such as the Prague congestion control | |||
skipping to change at line 781 ¶ | skipping to change at line 785 ¶ | |||
clearly problematic for a congestion control to take multiple | clearly problematic for a congestion control to take multiple | |||
seconds to recover from each congestion event. CUBIC [RFC8312] | seconds to recover from each congestion event. CUBIC [RFC8312] | |||
was developed to be less unscalable, but it is approaching its | was developed to be less unscalable, but it is approaching its | |||
scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC | scaling limit; with the same max RTT of 30 ms, at 120 Mb/s, CUBIC | |||
is still fully in its Reno-friendly mode, so it takes about 4.3 s | is still fully in its Reno-friendly mode, so it takes about 4.3 s | |||
to recover. However, once flow rate scales by 8 times again to | to recover. However, once flow rate scales by 8 times again to | |||
960 Mb/s it enters true CUBIC mode, with a recovery time of 12.2 | 960 Mb/s it enters true CUBIC mode, with a recovery time of 12.2 | |||
s. From then on, each further scaling by 8 times doubles CUBIC's | s. From then on, each further scaling by 8 times doubles CUBIC's | |||
recovery time (because the cube root of 8 is 2), e.g., at 7.68 Gb/ | recovery time (because the cube root of 8 is 2), e.g., at 7.68 Gb/ | |||
s, the recovery time is 24.3 s. In contrast, a Scalable | s, the recovery time is 24.3 s. In contrast, a Scalable | |||
congestion control like DCTCP or TCP Prague induces 2 congestion | congestion control like DCTCP or Prague induces 2 congestion | |||
signals per round trip on average, which remains invariant for any | signals per round trip on average, which remains invariant for any | |||
flow rate, keeping dynamic control very tight. | flow rate, keeping dynamic control very tight. | |||
For a feel of where the global average lone-flow download sits on | For a feel of where the global average lone-flow download sits on | |||
this scale at the time of writing (2021), according to [BDPdata], | this scale at the time of writing (2021), according to [BDPdata], | |||
the global average fixed access capacity was 103 Mb/s in 2020 and | the global average fixed access capacity was 103 Mb/s in 2020 and | |||
the average base RTT to a CDN was 25 to 34 ms in 2019. Averaging | the average base RTT to a CDN was 25 to 34 ms in 2019. Averaging | |||
of per-country data was weighted by Internet user population (data | of per-country data was weighted by Internet user population (data | |||
collected globally is necessarily of variable quality, but the | collected globally is necessarily of variable quality, but the | |||
paper does double-check that the outcome compares well against a | paper does double-check that the outcome compares well against a | |||
skipping to change at line 1373 ¶ | skipping to change at line 1377 ¶ | |||
Three complementary approaches are in progress to address this issue, | Three complementary approaches are in progress to address this issue, | |||
but they are all currently research: | but they are all currently research: | |||
* In Prague congestion control, ignore certain losses deemed | * In Prague congestion control, ignore certain losses deemed | |||
unlikely to be due to congestion (using some ideas from BBR | unlikely to be due to congestion (using some ideas from BBR | |||
[BBR-CC] regarding isolated losses). This could mask any of the | [BBR-CC] regarding isolated losses). This could mask any of the | |||
above types of loss while still coexisting with drop-based | above types of loss while still coexisting with drop-based | |||
congestion controls. | congestion controls. | |||
* A combination of RACK [RFC8985], L4S, and link retransmission | * A combination of Recent Acknowledgement (RACK) [RFC8985], L4S, and | |||
without resequencing could repair transmission errors without the | link retransmission without resequencing could repair transmission | |||
head of line blocking delay usually associated with link-layer | errors without the head of line blocking delay usually associated | |||
retransmission [UnorderedLTE] [RFC9331]. | with link-layer retransmission [UnorderedLTE] [RFC9331]. | |||
* Hybrid ECN/drop rate policers (see Section 8.3). | * Hybrid ECN/drop rate policers (see Section 8.3). | |||
L4S deployment scenarios that minimize these issues (e.g., over | L4S deployment scenarios that minimize these issues (e.g., over | |||
wireline networks) can proceed in parallel to this research, in the | wireline networks) can proceed in parallel to this research, in the | |||
expectation that research success could continually widen L4S | expectation that research success could continually widen L4S | |||
applicability. | applicability. | |||
6.4.4. L4S Flow but Classic ECN Bottleneck | 6.4.4. L4S Flow but Classic ECN Bottleneck | |||
skipping to change at line 1851 ¶ | skipping to change at line 1855 ¶ | |||
<https://doi.org/10.1145/3404868.3406669>. | <https://doi.org/10.1145/3404868.3406669>. | |||
[NASA04] Bailey, R., Trey Arthur III, J., and S. Williams, "Latency | [NASA04] Bailey, R., Trey Arthur III, J., and S. Williams, "Latency | |||
Requirements for Head-Worn Display S/EVS Applications", | Requirements for Head-Worn Display S/EVS Applications", | |||
Proceedings of SPIE 5424, DOI 10.1117/12.554462, April | Proceedings of SPIE 5424, DOI 10.1117/12.554462, April | |||
2004, <https://ntrs.nasa.gov/api/citations/20120009198/ | 2004, <https://ntrs.nasa.gov/api/citations/20120009198/ | |||
downloads/20120009198.pdf?attachment=true>. | downloads/20120009198.pdf?attachment=true>. | |||
[NQB-PHB] White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | [NQB-PHB] White, G. and T. Fossati, "A Non-Queue-Building Per-Hop | |||
Behavior (NQB PHB) for Differentiated Services", Work in | Behavior (NQB PHB) for Differentiated Services", Work in | |||
Progress, Internet-Draft, draft-ietf-tsvwg-nqb-14, 24 | Progress, Internet-Draft, draft-ietf-tsvwg-nqb-15, 11 | |||
October 2022, <https://datatracker.ietf.org/doc/html/ | January 2023, <https://datatracker.ietf.org/doc/html/ | |||
draft-ietf-tsvwg-nqb-14>. | draft-ietf-tsvwg-nqb-15>. | |||
[PRAGUE-CC] | [PRAGUE-CC] | |||
De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague | De Schepper, K., Tilmans, O., and B. Briscoe, Ed., "Prague | |||
Congestion Control", Work in Progress, Internet-Draft, | Congestion Control", Work in Progress, Internet-Draft, | |||
draft-briscoe-iccrg-prague-congestion-control-01, 11 July | draft-briscoe-iccrg-prague-congestion-control-01, 11 July | |||
2022, <https://datatracker.ietf.org/doc/html/draft- | 2022, <https://datatracker.ietf.org/doc/html/draft- | |||
briscoe-iccrg-prague-congestion-control-01>. | briscoe-iccrg-prague-congestion-control-01>. | |||
[PragueLinux] | [PragueLinux] | |||
Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | Briscoe, B., De Schepper, K., Albisser, O., Misund, J., | |||
End of changes. 8 change blocks. | ||||
24 lines changed or deleted | 28 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |
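
The CUBIC recovery times quoted in the change block at line 781 (12.2 s at 960 Mb/s, 24.3 s at 7.68 Gb/s) can be checked with a short calculation. The sketch below is illustrative and is not part of either version of the RFC text. It assumes 1500-byte packets and the RFC 8312 constants beta_cubic = 0.7 and C = 0.4, and it approximates the recovery time in true CUBIC mode by the period K = cubic_root(W_max * (1 - beta_cubic) / C). The function name and parameters are hypothetical.

```python
def cubic_recovery_time_s(rate_bps, max_rtt_s, mss_bytes=1500,
                          beta_cubic=0.7, c=0.4):
    """Approximate CUBIC recovery time (seconds) in true CUBIC mode.

    Assumptions (not stated in the diff itself): 1500-byte packets and
    the RFC 8312 constants beta_cubic = 0.7 and C = 0.4.
    """
    # Window at the moment of the congestion event, in packets.
    w_max = rate_bps * max_rtt_s / (8 * mss_bytes)
    # K is the time the cubic function takes to grow back to w_max.
    return (w_max * (1 - beta_cubic) / c) ** (1.0 / 3.0)


if __name__ == "__main__":
    for rate_bps in (960e6, 7.68e9):
        k = cubic_recovery_time_s(rate_bps, max_rtt_s=0.030)
        print(f"{rate_bps / 1e6:.0f} Mb/s: {k:.1f} s")
```

Under these assumptions the two rates give about 12.2 s and 24.3 s. Because the rate enters only through a cube root, scaling it by 8 doubles the result, which matches the text's remark that the cube root of 8 is 2. The 4.3 s figure at 120 Mb/s is for the Reno-friendly mode and is not covered by this sketch.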