rfc9318.original   rfc9318.txt 
Network Working Group W. Hardaker Internet Architecture Board (IAB) W. Hardaker
Internet-Draft USC/ISI Request for Comments: 9318
Intended status: Informational O. Shapira Category: Informational O. Shapira
Expires: 11 February 2023 Apple ISSN: 2070-1721 September 2022
10 August 2022
IAB workshop report: Measuring Network Quality for End-Users IAB Workshop Report: Measuring Network Quality for End-Users
draft-iab-mnqeu-report-04
Abstract Abstract
The Measuring Network Quality for End-Users workshop was held The Measuring Network Quality for End-Users workshop was held
virtually by the Internet Architecture Board (IAB) from September virtually by the Internet Architecture Board (IAB) on September
14-16, 2021. This report summarizes the workshop, the topics 14-16, 2021. This report summarizes the workshop, the topics
discussed, and some preliminary conclusions drawn at the end of the discussed, and some preliminary conclusions drawn at the end of the
workshop. workshop.
Status of This Memo Note that this document is a report on the proceedings of the
workshop. The views and positions documented in this report are
those of the workshop participants and do not necessarily reflect IAB
views and positions.
This Internet-Draft is submitted in full conformance with the Status of This Memo
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering This document is not an Internet Standards Track specification; it is
Task Force (IETF). Note that other groups may also distribute published for informational purposes.
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Architecture Board (IAB)
and may be updated, replaced, or obsoleted by other documents at any and represents information that the IAB has deemed valuable to
time. It is inappropriate to use Internet-Drafts as reference provide for permanent record. It represents the consensus of the
material or to cite them other than as "work in progress." Internet Architecture Board (IAB). Documents approved for
publication by the IAB are not candidates for any level of Internet
Standard; see Section 2 of RFC 7841.
This Internet-Draft will expire on 11 February 2023. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9318.
Copyright Notice Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document.
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction
1.1. Problem space . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Problem Space
2. Workshop Agenda . . . . . . . . . . . . . . . . . . . . . . . 4 2. Workshop Agenda
3. Position Papers . . . . . . . . . . . . . . . . . . . . . . . 5 3. Position Papers
4. Workshop Topics and Discussion . . . . . . . . . . . . . . . 7 4. Workshop Topics and Discussion
4.1. Introduction and overviews . . . . . . . . . . . . . . . 8 4.1. Introduction and Overviews
4.1.1. Key points from the keynote by Vint Cerf . . . . . . 8 4.1.1. Key Points from the Keynote by Vint Cerf
4.1.2. Introductory talks . . . . . . . . . . . . . . . . . 9 4.1.2. Introductory Talks
4.1.3. Introductory talks - key points . . . . . . . . . . . 11 4.1.3. Introductory Talks - Key Points
4.2. Metrics considerations . . . . . . . . . . . . . . . . . 11 4.2. Metrics Considerations
4.2.1. Common performance metrics . . . . . . . . . . . . . 11 4.2.1. Common Performance Metrics
4.2.2. Availability metrics . . . . . . . . . . . . . . . . 14 4.2.2. Availability Metrics
4.2.3. Capacity metrics . . . . . . . . . . . . . . . . . . 14 4.2.3. Capacity Metrics
4.2.4. Latency metrics . . . . . . . . . . . . . . . . . . . 15 4.2.4. Latency Metrics
4.2.5. Measurement case studies . . . . . . . . . . . . . . 17 4.2.5. Measurement Case Studies
4.2.6. Metrics Key Points . . . . . . . . . . . . . . . . . 18 4.2.6. Metrics Key Points
4.3. Cross-layer Considerations . . . . . . . . . . . . . . . 19 4.3. Cross-Layer Considerations
4.3.1. Separation of Concerns . . . . . . . . . . . . . . . 20 4.3.1. Separation of Concerns
4.3.2. Security and Privacy Considerations . . . . . . . . . 21 4.3.2. Security and Privacy Considerations
4.3.3. Metric Measurement Considerations . . . . . . . . . . 21 4.3.3. Metric Measurement Considerations
4.3.4. Towards Improving Future Cross-layer Observability . 22 4.3.4. Towards Improving Future Cross-Layer Observability
4.3.5. Efficient Collaboration Between Hardware and Transport 4.3.5. Efficient Collaboration between Hardware and Transport
Protocols . . . . . . . . . . . . . . . . . . . . . . 22 Protocols
4.3.6. Cross-Layer Key Points . . . . . . . . . . . . . . . 23 4.3.6. Cross-Layer Key Points
4.4. Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 23 4.4. Synthesis
4.4.1. Measurement and Metrics Considerations . . . . . . . 23 4.4.1. Measurement and Metrics Considerations
4.4.2. End-User metrics presentation . . . . . . . . . . . . 24 4.4.2. End-User Metrics Presentation
4.4.3. Synthesis Key Points . . . . . . . . . . . . . . . . 25 4.4.3. Synthesis Key Points
5. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 26 5. Conclusions
5.1. General statements . . . . . . . . . . . . . . . . . . . 26 5.1. General Statements
5.2. Specific statements about detailed protocols/ 5.2. Specific Statements about Detailed Protocols/Techniques
techniques . . . . . . . . . . . . . . . . . . . . . . . 27 5.3. Problem Statements and Concerns
5.3. Problem statements and concerns . . . . . . . . . . . . . 27 5.4. No-Consensus-Reached Statements
5.4. No-consensus reached statements . . . . . . . . . . . . . 28 6. Follow-On Work
6. Follow-on work . . . . . . . . . . . . . . . . . . . . . . . 28 7. IANA Considerations
7. Security considerations . . . . . . . . . . . . . . . . . . . 28 8. Security Considerations
8. Informative References . . . . . . . . . . . . . . . . . . . 28 9. Informative References
Appendix A. Participants List . . . . . . . . . . . . . . . . . 33 Appendix A. Program Committee
Appendix B. IAB Members at the Time of Approval . . . . . . . . 35 Appendix B. Workshop Chairs
Appendix C. Acknowledgements . . . . . . . . . . . . . . . . . . 36 Appendix C. Workshop Participants
C.1. Draft contributors . . . . . . . . . . . . . . . . . . . 36 IAB Members at the Time of Approval
C.2. Workshop Chairs . . . . . . . . . . . . . . . . . . . . . 36 Acknowledgments
C.3. Program Committee . . . . . . . . . . . . . . . . . . . . 36 Contributors
Appendix D. Github Version of this document . . . . . . . . . . 37 Authors' Addresses
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 37
1. Introduction 1. Introduction
The Internet Architecture Board (IAB) holds occasional workshops The Internet Architecture Board (IAB) holds occasional workshops
designed to consider long-term issues and strategies for the designed to consider long-term issues and strategies for the
Internet, and to suggest future directions for the Internet Internet, and to suggest future directions for the Internet
architecture. This long-term planning function of the IAB is architecture. This long-term planning function of the IAB is
complementary to the ongoing engineering efforts performed by working complementary to the ongoing engineering efforts performed by working
groups of the Internet Engineering Task Force (IETF). groups of the Internet Engineering Task Force (IETF).
The Measuring Network Quality for End-Users workshop [WORKSHOP] was The Measuring Network Quality for End-Users workshop [WORKSHOP] was
held virtually by the Internet Architecture Board (IAB) in September held virtually by the Internet Architecture Board (IAB) on September
14-16, 2021. This report summarizes the workshop, the topics 14-16, 2021. This report summarizes the workshop, the topics
discussed, and some preliminary conclusions drawn at the end of the discussed, and some preliminary conclusions drawn at the end of the
workshop. workshop.
1.1. Problem space 1.1. Problem Space
The Internet in 2021 is quite different from what it was 10 years The Internet in 2021 is quite different from what it was 10 years
ago. Today, it is a crucial part of everyone's daily life. People ago. Today, it is a crucial part of everyone's daily life. People
use the Internet for their social life, for their daily jobs, for use the Internet for their social life, for their daily jobs, for
routine shopping, and for keeping up with major events. An routine shopping, and for keeping up with major events. An
increasing number of people can access a Gigabit connection, which increasing number of people can access a gigabit connection, which
would be hard to imagine a decade ago. And, thanks to improvements would be hard to imagine a decade ago. Additionally, thanks to
in security, people trust the Internet for financial banking improvements in security, people trust the Internet for financial
transactions, purchasing goods and everyday bill payments. banking transactions, purchasing goods, and everyday bill payments.
At the same time, some aspects of end-user experience have not At the same time, some aspects of the end-user experience have not
improved as much. Many users have typical connection latencies that improved as much. Many users have typical connection latencies that
remain at decade-old levels. Despite significant reliability remain at decade-old levels. Despite significant reliability
improvements in data center environments, end users also still often improvements in data center environments, end users also still often
see interruptions in service. Despite algorithmic advances in the see interruptions in service. Despite algorithmic advances in the
field of control theory, one still finds that the queuing delays in field of control theory, one still finds that the queuing delays in
the last-mile equipment exceed the accumulated transit delays. the last-mile equipment exceed the accumulated transit delays.
Transport improvements, such as QUIC, Multipath TCP, and TCP Fast Transport improvements, such as QUIC, Multipath TCP, and TCP Fast
Open are still not fully supported in some networks. Likewise, Open, are still not fully supported in some networks. Likewise,
various advances in the security and privacy of user data are not various advances in the security and privacy of user data are not
widely supported, such as encrypted DNS to the local resolver. widely supported, such as encrypted DNS to the local resolver.
One of the major factors behind this lack of progress is the popular One of the major factors behind this lack of progress is the popular
perception that throughput is the often sole measure of the quality perception that throughput is often the sole measure of the quality
of Internet connectivity. With such narrow focus, the Measuring of Internet connectivity. With such a narrow focus, the Measuring
Network Quality for End-Users workshop aimed to discuss various Network Quality for End-Users workshop aimed to discuss various
questions: topics:
* What is user latency under typical working conditions? * What is user latency under typical working conditions?
* How reliable is connectivity across longer time periods? * How reliable is connectivity across longer time periods?
* Do networks allow the use of a broad range of protocols? * Do networks allow the use of a broad range of protocols?
* What services can be run by network clients? * What services can be run by network clients?
* What kind of IPv4, NAT, or IPv6 connectivity is offered, and are * What kind of IPv4, NAT, or IPv6 connectivity is offered, and are
there firewalls? there firewalls?
* What security mechanisms are available for local services, such as * What security mechanisms are available for local services, such as
DNS? DNS?
* To what degree are the privacy, confidentiality, integrity, and * To what degree are the privacy, confidentiality, integrity, and
authenticity of user communications guarded? authenticity of user communications guarded?
* Improving these aspects of network quality will likely depend on * Improving these aspects of network quality will likely depend on
measurement and exposing metrics in a meaningful way to all measuring and exposing metrics in a meaningful way to all involved
involved parties, including to end users. Such measurement and parties, including to end users. Such measurement and exposure of
exposure of the right metrics will allow service providers and the right metrics will allow service providers and network
network operators to concentrate focus on their users' experience operators to concentrate focus on their users' experience and will
and will simultaneously empower users to choose the Internet simultaneously empower users to choose the Internet Service
service providers that can deliver the best experience based on Providers (ISPs) that can deliver the best experience based on
their needs. their needs.
* What are the fundamental properties of a network that contribute * What are the fundamental properties of a network that contribute
to a good user experience? to a good user experience?
* What metrics quantify these properties, and how can we collect * What metrics quantify these properties, and how can we collect
such metrics in a practical way? such metrics in a practical way?
* What are the best practices for interpreting those metrics, and * What are the best practices for interpreting those metrics and
incorporating those in a decision making process? incorporating them in a decision-making process?
* What are the best ways to communicate these properties to service * What are the best ways to communicate these properties to service
providers and network operators? providers and network operators?
* How can these metrics be displayed to users in a meaningful way? * How can these metrics be displayed to users in a meaningful way?
2. Workshop Agenda 2. Workshop Agenda
The Measuring Network Quality for End-Users workshop was divided into The Measuring Network Quality for End-Users workshop was divided into
the following main topic areas, further discussion in Section 4: the following main topic areas; see further discussion in Sections 4
and 5:
* Introduction overviews and a keynote by Vint Cerf * Introduction overviews and a keynote by Vint Cerf
* Metrics considerations * Metrics considerations
* Cross-layer considerations * Cross-layer considerations
* Synthesis * Synthesis
* Group conclusions * Group conclusions
3. Position Papers 3. Position Papers
The following position papers were received for consideration by the The following position papers were received for consideration by the
workshop attendees. The workshop's web-page [WORKSHOP] contains workshop attendees. The workshop's web page [WORKSHOP] contains
archives of the papers, presentations and recorded videos. archives of the papers, presentations, and recorded videos.
* Ahmed Aldabbagh. "Regulatory perspective on measuring network * Ahmed Aldabbagh. "Regulatory perspective on measuring network
quality for end users" [Aldabbagh2021] quality for end users" [Aldabbagh2021]
* Al Morton. "Dream-Pipe or Pipe-Dream: What Do Users Want (and how * Al Morton. "Dream-Pipe or Pipe-Dream: What Do Users Want (and how
can we assure it)?" [Morton2021] can we assure it)?" [Morton2021]
* Alexander Kozlov . "The 2021 National Internet Segment Reliability * Alexander Kozlov. "The 2021 National Internet Segment Reliability
Research" Research"
* Anna Brunstrom. "Measuring newtork quality - the MONROE * Anna Brunstrom. "Measuring network quality - the MONROE
experience" experience"
* Bob Briscoe, Greg White, Vidhi Goel and Koen De Schepper. "A * Bob Briscoe, Greg White, Vidhi Goel, and Koen De Schepper. "A
single common metric to characterize varying packet delay" Single Common Metric to Characterize Varying Packet Delay"
[Briscoe2021] [Briscoe2021]
* Brandon Schlinker. "Internet's performance from Facebook's edge" * Brandon Schlinker. "Internet Performance from Facebook's Edge"
[Schlinker2019] [Schlinker2019]
* Christoph Paasch, Kristen McIntyre, Randall Meyer, Stuart * Christoph Paasch, Kristen McIntyre, Randall Meyer, Stuart
Cheshire, Omer Shapira. "An end-user approach to the Internet Cheshire, and Omer Shapira. "An end-user approach to the Internet
Score" [McIntyre2021] Score" [McIntyre2021]
* Christoph Paasch, Randall Meyer, Stuart Cheshire, Omer Shapira. * Christoph Paasch, Randall Meyer, Stuart Cheshire, and Omer
"Responsiveness under Working Conditions" [Paasch2021] Shapira. "Responsiveness under Working Conditions" [Paasch2021]
* Dave Reed, Levi Perigo. "Measuring ISP Performance in Broadband * Dave Reed and Levi Perigo. "Measuring ISP Performance in
America: a Study of Latency Under Load" [Reed2021] Broadband America: A Study of Latency Under Load" [Reed2021]
* Eve M. Schooler, Rick Taylor. "Non-traditional Network Metrics" * Eve M. Schooler and Rick Taylor. "Non-traditional Network
Metrics"
* Gino Dion. "Focusing on latency, not throughput, to provide * Gino Dion. "Focusing on latency, not throughput, to provide
better internet experience and network quality" [Dion2021] better internet experience and network quality" [Dion2021]
* Gregory Mirsky, Xiao Min, Gyan Mishra, Liuyan Han. "Error * Gregory Mirsky, Xiao Min, Gyan Mishra, and Liuyan Han. "The error
Performance Measurement in Packet-Switched Networks" [Mirsky2021] performance metric in a packet-switched network" [Mirsky2021]
* Jana Iyengar. "The Internet Exists In Its Use" [Iyengar2021] * Jana Iyengar. "The Internet Exists In Its Use" [Iyengar2021]
* Jari Arkko, Mirja Kuehlewind. "Observability is needed to improve
network quality" [Arkko2021]
* Joachim Fabini. "Objective and subjective network quality" * Jari Arkko and Mirja Kuehlewind. "Observability is needed to
improve network quality" [Arkko2021]
* Joachim Fabini. "Network Quality from an End User Perspective"
[Fabini2021] [Fabini2021]
* Jonathan Foulkes. "Metrics helpful in assessing Internet Quality" * Jonathan Foulkes. "Metrics helpful in assessing Internet Quality"
[Foulkes2021] [Foulkes2021]
* Kalevi Kilkki, Benjamin Finley. "In Search of Lost QoS" * Kalevi Kilkki and Benjamin Finley. "In Search of Lost QoS"
[Kilkki2021] [Kilkki2021]
* Karthik Sundaresan, Greg White, Steve Glennon . "Latency * Karthik Sundaresan, Greg White, and Steve Glennon. "Latency
Measurement: What is latency and how do we measure it?" Measurement: What is latency and how do we measure it?"
* Keith Winstein. "Five Observations on Measuring Network Quality * Keith Winstein. "Five Observations on Measuring Network Quality
for Users of Real-Time Media Applications" for Users of Real-Time Media Applications"
* Ken Kerpez, Jinous Shafiei, John Cioffi, Pete Chow, Djamel * Ken Kerpez, Jinous Shafiei, John Cioffi, Pete Chow, and Djamel
Bousaber. "State of Wi-Fi Reporting" [Kerpez2021] Bousaber. "Wi-Fi and Broadband Data" [Kerpez2021]
* Kenjiro Cho. "Access Network Quality as Fitness for Purpose" * Kenjiro Cho. "Access Network Quality as Fitness for Purpose"
* Koen De Schepper, Olivier Tilmans, Gino Dion. "Challenges and * Koen De Schepper, Olivier Tilmans, and Gino Dion. "Challenges and
opportunities of hardware support for Low Queuing Latency without opportunities of hardware support for Low Queuing Latency without
Packet Loss" [DeSchepper2021] Packet Loss" [DeSchepper2021]
* Kyle MacMillian, Nick Feamster. "Beyond Speed Test: Measuring * Kyle MacMillian and Nick Feamster. "Beyond Speed Test: Measuring
Latency Under Load Across Different Speed Tiers" [MacMillian2021] Latency Under Load Across Different Speed Tiers" [MacMillian2021]
* Lucas Pardue, Sreeni Tellakula. "Lower layer performance not * Lucas Pardue and Sreeni Tellakula. "Lower-layer performance not
indicative of upper layer success" [Pardue2021] indicative of upper-layer success" [Pardue2021]
* Matt Mathis. "Preliminary Longitudinal Study of Internet * Matt Mathis. "Preliminary Longitudinal Study of Internet
Responsiveness" [Mathis2021] Responsiveness" [Mathis2021]
* Michael Welzl. "A Case for Long-Term Statistics" [Welzl2021] * Michael Welzl. "A Case for Long-Term Statistics" [Welzl2021]
* Mikhail Liubogoshchev. "Cross-layer Cooperation for Better * Mikhail Liubogoshchev. "Cross-layer Cooperation for Better
Network Service" [Liubogoshchev2021] Network Service" [Liubogoshchev2021]
* Mingrui Zhang, Vidhi Goel, Lisong Xu. "User-Perceived Latency to * Mingrui Zhang, Vidhi Goel, and Lisong Xu. "User-Perceived Latency
measure CCAs" [Zhang2021] to Measure CCAs" [Zhang2021]
* Neil Davies, Peter Thompson. "Measuring Network Impact on * Neil Davies and Peter Thompson. "Measuring Network Impact on
Application Outcomes using Quality Attenuation" [Davies2021] Application Outcomes Using Quality Attenuation" [Davies2021]
* Olivier Bonaventure, Francois Michel. "Packet delivery time as a * Olivier Bonaventure and Francois Michel. "Packet delivery time as
tie-breaker for assessing Wi-Fi access points" [Michel2021] a tie-breaker for assessing Wi-Fi access points" [Michel2021]
* Pedro Casas. "10 Years of Internet-QoE Measurements. Video, * Pedro Casas. "10 Years of Internet-QoE Measurements. Video,
Cloud, Conferencing, Web and Apps. What do we need from the Cloud, Conferencing, Web and Apps. What do we Need from the
Network Side?" [Casas2021] Network Side?" [Casas2021]
* Praveen Balasubramanian. "Transport Layer Statistics for Network * Praveen Balasubramanian. "Transport Layer Statistics for Network
Quality" [Balasubramanian2021] Quality" [Balasubramanian2021]
* Rajat Ghai. "Measuring & Improving QoE on the Xfinity Wi-Fi * Rajat Ghai. "Using TCP Connect Latency for measuring CX and
Network" [Ghai2021] Network Optimization" [Ghai2021]
* Robin Marx, Joris Herbots. "Merge Those Metrics: Towards Holistic * Robin Marx and Joris Herbots. "Merge Those Metrics: Towards
(Protocol) Logging" [Marx2021] Holistic (Protocol) Logging" [Marx2021]
* Sandor Laki, Szilveszter Nadas, Balazs Varga, Luis M. Contreras. * Sandor Laki, Szilveszter Nadas, Balazs Varga, and Luis M.
"Incentive-Based Traffic Management and QoS Measurements" Contreras. "Incentive-Based Traffic Management and QoS
[Laki2021] Measurements" [Laki2021]
* Satadal Sengupta, Hyojoon Kim, Jennifer Rexford. "Fine-Grained * Satadal Sengupta, Hyojoon Kim, and Jennifer Rexford. "Fine-
RTT Monitoring Inside the Network" [Sengupta2021] Grained RTT Monitoring Inside the Network" [Sengupta2021]
* Stuart Cheshire. "The Internet is a Shared Network" * Stuart Cheshire. "The Internet is a Shared Network"
[Cheshire2021] [Cheshire2021]
* Toerless Eckert, Alex Clemm. "network-quality-eckert-clemm-00.4" * Toerless Eckert and Alex Clemm. "network-quality-eckert-clemm-
00.4"
* Vijay Sivaraman, Sharat Madanapalli, Himal Kumar. "Measuring * Vijay Sivaraman, Sharat Madanapalli, and Himal Kumar. "Measuring
Network Experience Meaningfully, Accurately, and Scalably" Network Experience Meaningfully, Accurately, and Scalably"
[Sivaraman2021] [Sivaraman2021]
* Yaakov (J) Stein. "The Futility of QoS" [Stein2021] * Yaakov (J) Stein. "The Futility of QoS" [Stein2021]
4. Workshop Topics and Discussion 4. Workshop Topics and Discussion
The agenda for the three day workshop was broken into four separate The agenda for the three-day workshop was broken into four separate
sections that each played a role in framing the discussions. The sections that each played a role in framing the discussions. The
workshop started with a series of Introduction and problem space workshop started with a series of introduction and problem space
presentations {introduction-section}, followed by metrics presentations (Section 4.1), followed by metrics considerations
considerations Section 4.2, cross layer considerations Section 4.3 (Section 4.2), cross-layer considerations (Section 4.3), and a
and a synthesis discussion Section 4.4. After the four subsections synthesis discussion (Section 4.4). After the four subsections
concluded, a follow-on discussion was held to draw conclusions that concluded, a follow-on discussion was held to draw conclusions that
could be agreed upon by workshop participants (Section 5). could be agreed upon by workshop participants (Section 5).
4.1. Introduction and overviews 4.1. Introduction and Overviews
The workshop started with a broad focus on the state of user Quality The workshop started with a broad focus on the state of user Quality
of Service (QoS) and quality of experience (QoE) on the Internet of Service (QoS) and Quality of Experience (QoE) on the Internet
today. The goal of the introductory talks was to set the stage for today. The goal of the introductory talks was to set the stage for
the workshop by describing both the problem space and the current the workshop by describing both the problem space and the current
solutions in place and their limitations. solutions in place and their limitations.
The introduction presentations provided views of existing QoS and QoE The introduction presentations provided views of existing QoS and QoE
measurements and their effectiveness. Also discussed was the measurements and their effectiveness. Also discussed was the
interaction between multiple users within the network, as well as the interaction between multiple users within the network, as well as the
interaction between multiple layers of the OSI stack. Vint Cerf interaction between multiple layers of the OSI stack. Vint Cerf
provided a key note describing the history and importance of the provided a keynote describing the history and importance of the
topic. topic.
4.1.1. Key points from the keynote by Vint Cerf 4.1.1. Key Points from the Keynote by Vint Cerf
We may be operating in a networking space with dramatically different We may be operating in a networking space with dramatically different
parameters compared to 30 years ago. This differentiation justifies parameters compared to 30 years ago. This differentiation justifies
re-considering not only the importance of one metric over the other, reconsidering not only the importance of one metric over the other
but also re-considering the entire metaphor. but also reconsidering the entire metaphor.
It is time for the experts to look at not only at adjusting TCP, but It is time for the experts to look at not only adjusting TCP but also
also at exploring other protocols, such as QUIC has done lately. exploring other protocols, such as QUIC has done lately. It's
It's important that we feel free to consider alternatives to TCP. important that we feel free to consider alternatives to TCP. TCP is
TCP is not a teddy bear, and one should not be afraid to replace it not a teddy bear, and one should not be afraid to replace it with a
with a transport later with better properties that better benefits transport layer with better properties that better benefit its users.
its users.
A suggestion: we should consider exercises to identify desirable A suggestion: we should consider exercises to identify desirable
properties. As we are looking at the parametric spaces, one can properties. As we are looking at the parametric spaces, one can
identify "desirable properties", as opposed to "fundamental identify "desirable properties", as opposed to "fundamental
properties", for example a low-latency property. An example coming properties", for example, a low-latency property. An example coming
from ARPA: you want to know where the missile is now, not where it from the Advanced Research Projects Agency (ARPA): you want to know
was. Understanding drives particular parameter creation and where the missile is now, not where it was. Understanding drives
selection in the design space. particular parameter creation and selection in the design space.
When parameter values are changed to extremes, such as connectivity, When parameter values are changed to extremes, such as connectivity,
alternative designs will emerge. One case study of note is the alternative designs will emerge. One case study of note is the
interplanetary protocol, where "ping" is no long indicative of interplanetary protocol, where "ping" is no longer indicative of
anything useful. While we look at responsiveness, we should not anything useful. While we look at responsiveness, we should not
ignore connectivity. ignore connectivity.
Unfortunately, maintaining backward compatibility is painful. The Unfortunately, maintaining backward compatibility is painful. The
work on designing IPv6 so as to transition from IPv4 could have been work on designing IPv6 so as to transition from IPv4 could have been
done better if backward compatibility had been considered. This is done better if backward compatibility had been considered. It is too
too late for IPv6, but this problem space is not too late for the late for IPv6, but it is not too late to consider this issue for
future laying problems. potential future problems.
IPv6 is still not implemented fully everywhere. It's been a long IPv6 is still not implemented fully everywhere. It's been a long
road to deployment since starting work in 1996, and we are still not road to deployment since starting work in 1996, and we are still not
there. In 1996, the thinking was that it was quite easy to implement there. In 1996, the thinking was that it was quite easy to implement
IPv6, but that failed to hold true. In 1996 the dot-com boom began, IPv6, but that failed to hold true. In 1996, the dot-com boom began,
with lots of money was spent quickly, and the moment was not caught where a lot of money was spent quickly, and the moment was not caught
in time while the market expanded exponentially. This should serve in time while the market expanded exponentially. This should serve
as a cautionary tale. as a cautionary tale.
One last point: consider performance across multiple hops in the One last point: consider performance across multiple hops in the
Internet. We've not seen many end-to-end metrics, as successfully Internet. We've not seen many end-to-end metrics, as successfully
developing end-to-end measurements across different network and developing end-to-end measurements across different network and
business boundaries is quite hard to achieve. A good question to ask business boundaries is quite hard to achieve. A good question to ask
when developing new protocols is "will the new protocol work across when developing new protocols is "will the new protocol work across
multiple network hops?" multiple network hops?"
Multi-hop networks are being gradually replaced by humongous, flat Multi-hop networks are being gradually replaced by humongous, flat
networks with sufficient connectivity between operators so that networks with sufficient connectivity between operators so that
systems become 1 hop or 2 hop at most away from each other (e.g. systems become 1 hop, or 2 hops at most, away from each other (e.g.,
Google, Facebook, Amazon). The fundamental architecture of the Google, Facebook, and Amazon). The fundamental architecture of the
Internet is changing. Internet is changing.
4.1.2. Introductory talks 4.1.2. Introductory Talks
The Internet is a shared network, built on the IP protocols using The Internet is a shared network built on IP protocols using packet
packet-switching to interconnect multiple autonomous networks. The switching to interconnect multiple autonomous networks. The
Internet's departure from circuit-switching technologies allowed it Internet's departure from circuit-switching technologies allowed it
to scale beyond any other known network design. On the other hand, to scale beyond any other known network design. On the other hand,
the lack of in-network regulation made it difficult to ensure the the lack of in-network regulation made it difficult to ensure the
best experience for every user. best experience for every user.
As Internet use cases continue to expand, it becomes increasingly As Internet use cases continue to expand, it becomes increasingly
difficult to predict which network characteristics correlate difficult to predict which network characteristics correlate
with better user experiences. Different application classes, e.g., with better user experiences. Different application classes, e.g.,
video streaming and teleconferencing, can affect user experience in video streaming and teleconferencing, can affect user experience in
complex and difficult to measure ways. Internet utilization shifts ways that are complex and difficult to measure. Internet utilization
rapidly during the course of each day, week and year, which further shifts rapidly during the course of each day, week, and year, which
complicates identifying key metrics capable of predicting a good user further complicates identifying key metrics capable of predicting a
experience. good user experience.
Quality of Service (QoS) initiatives attempted to overcome these QoS initiatives attempted to overcome these difficulties by strictly
difficulties by strictly prioritizing different types of traffic. prioritizing different types of traffic. However, QoS metrics do not
However, QoS metrics do not always correlate with user experience. always correlate with user experience. The utility of the QoS metric
The utility of the QoS metric is further limited by the difficulties is further limited by the difficulties in building solutions with the
in building solutions with the desired QoS characteristics. desired QoS characteristics.
Quality of Experience (QoE) initiatives attempted to integrate the QoE initiatives attempted to integrate the psychological aspects of
psychological aspects of how quality is perceived, and created how quality is perceived and create statistical models designed to
statistical models designed to optimize the user experience. Despite optimize the user experience. Despite these high modeling efforts,
these high modeling efforts, the QoE approach proved beneficial in the QoE approach proved beneficial in certain application classes.
certain application classes. Unfortunately, generalizing the models Unfortunately, generalizing the models proved to be difficult, and
proved to be difficult, and the question of how different the question of how different applications affect each other when
applications affect each other when sharing the same network remains sharing the same network remains an open problem.
an open problem.
The industry's focus on giving the end-user more throughput/bandwidth The industry's focus on giving the end user more throughput/bandwidth
led to remarkable advances. In many places around the world, a home led to remarkable advances. In many places around the world, a home
user enjoys gigabit speeds to their Internet Service Provider. This user enjoys gigabit speeds to their ISP. This is so remarkable that
is so remarkable that it would have been brushed off as science it would have been brushed off as science fiction a decade ago.
fiction a decade ago. However, the focus on increased capacity came However, the focus on increased capacity came at the expense of
at the expense of neglecting another important core metric: latency. neglecting another important core metric: latency. As a result, end
As a result, end-users whose experience is negatively affected by users whose experience is negatively affected by high latency were
high latency were advised to upgrade their equipment to get more advised to upgrade their equipment to get more throughput instead.
throughput instead. [MacMillian2021] showed that sometimes such an [MacMillian2021] showed that sometimes such an upgrade can lead to
upgrade can lead to latency improvements, due to the economic latency improvements, due to the economic reasons of overselling
reasons of overselling the "value-priced" data plans. the "value-priced" data plans.
As the industry continued to give end users more throughput, while As the industry continued to give end users more throughput, while
mostly neglecting latency concerns, application designs started to mostly neglecting latency concerns, application designs started to
employ various latency and short service disruption hiding employ various latency and short service disruption hiding
techniques. For example, a user's experience of web browser techniques. For example, a user's web browser performance experience
performance is closely tired to the content in the browser's local is closely tied to the content in the browser's local cache. While
cache. While such techniques can clearly improve the user experience such techniques can clearly improve the user experience when using
when using stale data is possible, this development further decouples stale data is possible, this development further decouples user
user experience from core metrics. experience from core metrics.
In the most recent 10 years, efforts by Dave Taht and the bufferbloat In the most recent 10 years, efforts by Dave Taht and the bufferbloat
society had led to significant progress updating queuing algorithms society have led to significant progress in updating queuing
to reduce latencies under load compared to simipler FIFO queues. algorithms to reduce latencies under load compared to simpler FIFO
Unfortunately, the home router industry has yet to implement these queues. Unfortunately, the home router industry has yet to implement
algorithms, mostly due to marketing and cost concerns. Most home these algorithms, mostly due to marketing and cost concerns. Most
router manufacturers depend on System on a Chip (SoC) acceleration to home router manufacturers depend on System on a Chip (SoC)
create products with a desired throughput. SoC manufacturers opt for acceleration to create products with a desired throughput. SoC
simpler algorithms and aggressive aggregation, reasoning that a manufacturers opt for simpler algorithms and aggressive aggregation,
higher-throughput chip will have guaranteed demand. Because reasoning that a higher-throughput chip will have guaranteed demand.
consumers are offered choices primarily among different high Because consumers are offered choices primarily among different high-
throughput devices, the perception that a higher throughput leads to throughput devices, the perception that a higher throughput leads to
a higher quality of service continues to strengthen. a higher QoS continues to strengthen.
The home router is not the only place that can benefit from clearer The home router is not the only place that can benefit from clearer
indications of acceptable performance for users. Since users indications of acceptable performance for users. Since users
perceive the Internet via the lens of applications, its important to perceive the Internet via the lens of applications, it is important
appeal to the application vendors that they should adopt solutions that we call upon application vendors to adopt solutions that stress
that stress lower latencies. Unfortunately, while bandwidth is lower latencies. Unfortunately, while bandwidth is straightforward
straightforward to measure, responsiveness is trickier. Many to measure, responsiveness is trickier. Many applications have found
applications have found a set of metrics which are helpful to their a set of metrics that are helpful to their realm but do not
realm, but do not generalize well and cannot become universally generalize well and cannot become universally applicable.
applicable. Furthermore, due to the highly competitive application Furthermore, due to the highly competitive application space, vendors
space, vendors may have economic reasons to avoid sharing their most may have economic reasons to avoid sharing their most useful metrics.
useful metrics.
4.1.3. Introductory talks - key points 4.1.3. Introductory Talks - Key Points
1. Measuring bandwidth is necessary, but is not alone sufficient. 1. Measuring bandwidth is necessary but is not alone sufficient.
2. In many cases, Internet users don't need more bandwidth, but 2. In many cases, Internet users don't need more bandwidth but
rather need "better bandwidth" - i.e., they need other rather need "better bandwidth", i.e., they need other
connectivity improvements. connectivity improvements.
3. Users perceive the quality of their Internet connection based on 3. Users perceive the quality of their Internet connection based on
the applications they use, which are affected by a combination of the applications they use, which are affected by a combination of
factors. There's little value in exposing a typical user to the factors. There's little value in exposing a typical user to the
entire spectrum of possible reasons for the poor performance entire spectrum of possible reasons for the poor performance
perceived in their application-centric view. perceived in their application-centric view.
4. Many factors affecting user experience are outside the users' 4. Many factors affecting user experience are outside the users'
sphere of control. It's unclear whether exposing users to these sphere of control. It's unclear whether exposing users to these
other factors will help users understand the state of their other factors will help them understand the state of their
network performance. In general, users prefer simple, network performance. In general, users prefer simple,
categorical choices (e.g. "good", "better", and "best" options). categorical choices (e.g., "good", "better", and "best" options).
5. The Internet content market is highly competitive, and many 5. The Internet content market is highly competitive, and many
applications develop their own "secret sauce." applications develop their own "secret sauce".
4.2. Metrics considerations 4.2. Metrics Considerations
In the second agenda section, the workshop continued its discussion In the second agenda section, the workshop continued its discussion
about metrics that can be used instead of or in addition to available about metrics that can be used instead of or in addition to available
bandwidth. Several workshop attendees presented deep-dive studies on bandwidth. Several workshop attendees presented deep-dive studies on
measurement methodology. measurement methodology.
4.2.1. Common performance metrics 4.2.1. Common Performance Metrics
Losing Internet access entirely is, of course, the worst user Losing Internet access entirely is, of course, the worst user
experience. Unfortunately, unless rebooting the home router restores experience. Unfortunately, unless rebooting the home router restores
connectivity, there is little a user can do other than contacting connectivity, there is little a user can do other than contacting
their service provider. Nevertheless, there is value in the their service provider. Nevertheless, there is value in the
systematic collection of availability metrics on the client side: systematic collection of availability metrics on the client side;
these can help the user's ISP localize and resolve issues faster, these can help the user's ISP localize and resolve issues faster
while enabling users to better choose between ISPs. One can measure while enabling users to better choose between ISPs. One can measure
availability directly by simply attempting connections from the availability directly by simply attempting connections from the
client-side to distant locations of interest. For example, Ookla's client side to distant locations of interest. For example, Ookla's
([Speedtest]) uses a large number of Android devices to measure [Speedtest] uses a large number of Android devices to measure network
network and cellular availability around the globe. Ookla collects and cellular availability around the globe. Ookla collects hundreds
hundreds of millions of data points per day, and uses these for of millions of data points per day and uses these for accurate
accurate availability reporting. An alternative approach is to availability reporting. An alternative approach is to derive
derive availability from the failure rates of other tests. For availability from the failure rates of other tests. For example,
example, [FCC_MBA] [FCC_MBA_methodology] uses thousands of off-the [FCC_MBA] and [FCC_MBA_methodology] use thousands of off-the-shelf
shelf routers, called "Whiteboxes", with measurement software routers, with measurement software developed by [SamKnows]. These
developed by [SamKnows]. These Whiteboxes perform an array of routers perform an array of network tests and report availability
network tests and report availability based whether test connections based on whether test connections were successful or not.
were successful or not.
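The second approach above, deriving availability from the failure
rates of other tests, can be sketched as follows. This is a minimal
illustration only: the function name and the boolean result format are
assumptions made for this example and are not taken from [FCC_MBA] or
[SamKnows].

```python
def availability_from_results(results):
    """Return the fraction of test connections that succeeded.

    `results` is an iterable of booleans, one per attempted test
    connection (True = succeeded). This input format is hypothetical.
    """
    results = list(results)
    if not results:
        raise ValueError("need at least one test result")
    return sum(1 for ok in results if ok) / len(results)


# Three of four test connections succeeded.
print(availability_from_results([True, True, False, True]))
```

A real deployment would also need to decide how often to sample, since
short outages between tests go unobserved.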
Measuring available capacity can be helpful to end-users, but it is Measuring available capacity can be helpful to end users, but it is
even more valuable for service providers and application developers. even more valuable for service providers and application developers.
High-definition video streaming requires significantly more capacity High-definition video streaming requires significantly more capacity
than any other type of traffic. At the time of the workshop, video than any other type of traffic. At the time of the workshop, video
traffic constituted 90% of overall Internet traffic and contributed traffic constituted 90% of overall Internet traffic and contributed
to 95% of the revenues from monetization (via subscriptions, fees, or to 95% of the revenues from monetization (via subscriptions, fees, or
ads). As a result, video streaming services, such as Netflix, need ads). As a result, video streaming services, such as Netflix, need
to continuously cope with rapid changes in available capacity. The to continuously cope with rapid changes in available capacity. The
ability to measure available capacity in real-time leverages the ability to measure available capacity in real time leverages the
different adaptive bitrate (ABR) compression algorithms to ensure the different adaptive bitrate (ABR) compression algorithms to ensure the
best possible user experience. Measuring aggregated capacity demand best possible user experience. Measuring aggregated capacity demand
allows Internet Service Provider's to be ready for traffic spikes. allows ISPs to be ready for traffic spikes. For example, during the
For example, during the end-of-year holiday season, the global demand end-of-year holiday season, the global demand for capacity has been
for capacity has been shown to be 5-7 times higher than during other shown to be 5-7 times higher than during other seasons. For end
seasons. For end-users, knowledge of their capacity needs can help users, knowledge of their capacity needs can help them select the
them select the best data plan given their intended usage. In many best data plan given their intended usage. In many cases, however,
cases, however, end-users have more than enough capacity and adding end users have more than enough capacity, and adding more bandwidth
more bandwidth will not improve their experience - after a point it will not improve their experience -- after a point, it is no longer
is no longer the limiting factor in user experience. Finally, the the limiting factor in user experience. Finally, the ability to
ability to differentiate between the "throughput" and the "goodput" differentiate between the "throughput" and the "goodput" can be
can be helpful in identifying when the network is saturated. helpful in identifying when the network is saturated.
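As a rough sketch of the throughput/goodput distinction, the example
below computes both rates from two byte counters. The counter names are
hypothetical, and treating "goodput" as application payload delivered
per second (excluding headers and retransmissions) is an assumption of
this example rather than a definition given at the workshop.

```python
def rates_bps(wire_bytes, payload_bytes, seconds):
    """Return (throughput, goodput) in bits per second.

    wire_bytes:    all bytes sent on the link, including protocol
                   headers and retransmissions (assumed counter).
    payload_bytes: application bytes actually delivered (assumed).
    """
    if seconds <= 0:
        raise ValueError("interval must be positive")
    throughput = wire_bytes * 8 / seconds
    goodput = payload_bytes * 8 / seconds
    return throughput, goodput


def goodput_ratio(wire_bytes, payload_bytes, seconds):
    """Share of the transmitted bits that was useful payload."""
    throughput, goodput = rates_bps(wire_bytes, payload_bytes, seconds)
    return goodput / throughput


print(rates_bps(1_250_000, 1_000_000, 1.0))
print(goodput_ratio(1_250_000, 1_000_000, 1.0))
```

A ratio that falls while throughput stays flat is one possible hint
that extra capacity is being spent on retransmissions.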
In measuring network quality, latency is defined as the time it takes In measuring network quality, latency is defined as the time it takes
a packet to traverse a network path from one end to the other. At a packet to traverse a network path from one end to the other. At
the time of this report, users in many places worldwide can enjoy the time of this report, users in many places worldwide can enjoy
Internet access that has adequately high capacity and availability Internet access that has adequately high capacity and availability
for their current needs. For these users, latency improvements for their current needs. For these users, latency improvements,
rather than bandwidth improvements can lead to the most significant rather than bandwidth improvements, can lead to the most significant
improvements in quality of experience. The established latency improvements in QoE. The established latency metric is a round-trip
metric is a round-trip time (RTT), commonly measured in milliseconds. time (RTT), commonly measured in milliseconds. However, users often
However, users often find RTT values unintuitive since, unlike other find RTT values unintuitive since, unlike other performance metrics,
performance metrics, high RTT values indicate poor latency and users high RTT values indicate poor latency and users typically understand
typically understand higher scores to be better. To address this, higher scores to be better. To address this, [Paasch2021] and
[Paasch2021] and [Mathis2021] presented an inverse metric, called [Mathis2021] present an inverse metric, called "Round-trips Per
"Round-trips per minute" (RPM). Minute" (RPM).
There is an important distinction between "idle latency" and "latency There is an important distinction between "idle latency" and "latency
under working conditions." The former is measured when the network under working conditions". The former is measured when the network
is underused and reflects a best-case scenario. The latter is is underused and reflects a best-case scenario. The latter is
measured when the network is under a typical workload. Until measured when the network is under a typical workload. Until
recently, typical tools reported a network's idle latency, which can recently, typical tools reported a network's idle latency, which can
be misleading. For example, data presented at the workshop shows be misleading. For example, data presented at the workshop shows
that idle latencies can be up to 25 times lower than the latency that idle latencies can be up to 25 times lower than the latency
under typical working loads. Because of this, it is essential to under typical working loads. Because of this, it is essential to
make a clear distinction between the two when presenting latency to make a clear distinction between the two when presenting latency to
end-users. end users.
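The relationship between RTT, RPM, and the two kinds of latency can be
illustrated with a short sketch. It shows only the unit conversion
(round trips per minute as the inverse of an RTT in milliseconds) and a
comparison of idle and loaded samples. It is not the measurement
methodology of [Paasch2021], and the use of the median as the summary
statistic is an assumption of this example.

```python
from statistics import median

MS_PER_MINUTE = 60_000


def rtt_ms_to_rpm(rtt_ms):
    """Convert a round-trip time in milliseconds to round trips per minute."""
    if rtt_ms <= 0:
        raise ValueError("RTT must be positive")
    return MS_PER_MINUTE / rtt_ms


def latency_summary(idle_rtts_ms, loaded_rtts_ms):
    """Summarize idle and loaded RTT samples separately."""
    idle = median(idle_rtts_ms)
    loaded = median(loaded_rtts_ms)
    return {
        "idle_rtt_ms": idle,
        "loaded_rtt_ms": loaded,
        "idle_rpm": rtt_ms_to_rpm(idle),
        "loaded_rpm": rtt_ms_to_rpm(loaded),
        # How many times higher the latency is under load.
        "inflation": loaded / idle,
    }


print(rtt_ms_to_rpm(100))
print(latency_summary([10, 12, 11], [250, 300, 275]))
```

Reporting the idle and loaded figures side by side, rather than a
single number, keeps the distinction visible to the reader.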
Data shows that rapid changes in capacity affect latency. Data shows that rapid changes in capacity affect latency.
[Foulkes2021] attempts to quantify how often a rapid change in [Foulkes2021] attempts to quantify how often a rapid change in
capacity can cause network connectivity to become "unstable" (i.e., capacity can cause network connectivity to become "unstable" (i.e.,
having high latency with very little throughput). Such changes in having high latency with very little throughput). Such changes in
capacity can be caused by infrastructure failures, but are much more capacity can be caused by infrastructure failures but are much more
often caused by in-network phenomena, like changing traffic often caused by in-network phenomena, like changing traffic
engineering policies or rapid changes in cross-traffic. engineering policies or rapid changes in cross-traffic.
Data presented at the workshop shows that 36% of measured lines have Data presented at the workshop shows that 36% of measured lines have
capacity metrics that vary by more than 10% throughout the day and capacity metrics that vary by more than 10% throughout the day and
across multiple days. These differences are caused by many across multiple days. These differences are caused by many
variables, including local connectivity methods (WiFi vs. Ethernet), variables, including local connectivity methods (Wi-Fi vs. Ethernet),
competing LAN traffic, device load/configuration, time of day and competing LAN traffic, device load/configuration, time of day, and
local loop/backhaul capacity. These factor variations make measuring local loop/backhaul capacity. These factor variations make measuring
capacity using only an end-user device or other end-network capacity using only an end-user device or other end-network
measurement difficult. A network router seeing aggregated traffic measurement difficult. A network router seeing aggregated traffic
from multiple devices provides a better vantage point for capacity from multiple devices provides a better vantage point for capacity
measurements. Such a test can account for the totality of local measurements. Such a test can account for the totality of local
traffic and perform an independent capacity test. However, various traffic and perform an independent capacity test. However, various
factors might still limit the accuracy of such a test. Accurate factors might still limit the accuracy of such a test. Accurate
capacity measurement requires multiple samples. capacity measurement requires multiple samples.
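One way to flag lines whose capacity varies by more than 10% from
repeated samples is sketched below. The definition of variation used
here (the spread between the largest and smallest sample, relative to
the median) is an assumption of this example; the workshop data may
have used a different definition.

```python
from statistics import median


def relative_variation(samples_mbps):
    """Spread of capacity samples relative to their median."""
    samples = list(samples_mbps)
    if len(samples) < 2:
        raise ValueError("capacity measurement requires multiple samples")
    mid = median(samples)
    if mid <= 0:
        raise ValueError("median capacity must be positive")
    return (max(samples) - min(samples)) / mid


def varying_lines(lines, threshold=0.10):
    """Return the names of lines whose variation exceeds the threshold."""
    return sorted(
        name for name, samples in lines.items()
        if relative_variation(samples) > threshold
    )


# Hypothetical capacity samples in Mbit/s, keyed by line name.
lines = {
    "a": [100, 100, 100],
    "b": [100, 80, 90],
    "c": [50, 51, 50],
}
print(varying_lines(lines))
```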
As users perceive the Internet through the lens of applications, it As users perceive the Internet through the lens of applications, it
may be difficult to correlate changes in capacity and latency with may be difficult to correlate changes in capacity and latency with
the quality of the end-user experience. For example, web browsers the quality of the end-user experience. For example, web browsers
rely on cached page versions to shorten page load times and mitigate rely on cached page versions to shorten page load times and mitigate
connectivity losses. In addition, social networking applications connectivity losses. In addition, social networking applications
often rely on pre-fetching their "feed" items. These techniques make often rely on prefetching their "feed" items. These techniques make
the core in-network metrics less indicative of the users' experience the core in-network metrics less indicative of the users' experience
and necessitate collecting data in-application. and necessitate collecting data from the end-user applications
themselves.
It is helpful to distinguish between applications that operate on a It is helpful to distinguish between applications that operate on a
"fixed latency budget" from those that have more tolerance to latency "fixed latency budget" from those that have more tolerance to latency
variance. Cloud gaming serves as an example application that variance. Cloud gaming serves as an example application that
requires a "fixed latency budget", as a sudden latency spike can requires a "fixed latency budget", as a sudden latency spike can
decide the "win/lose" ratio for a player. Companies that compete in decide the "win/lose" ratio for a player. Companies that compete in
the lucrative cloud gaming market make significant infrastructure the lucrative cloud gaming market make significant infrastructure
investments, such as buiding entire datacenters closer to their investments, such as building entire data centers closer to their
users. These data centers highlight the economic benefits that lower users. These data centers highlight the economic benefit that lower
numbers of latency spikes outweighs the associated deployment costs. numbers of latency spikes outweigh the associated deployment costs.
On the other hand, applications that are more tolerant to latency On the other hand, applications that are more tolerant to latency
spikes can continue to operate reasonably well through short spikes. spikes can continue to operate reasonably well through short spikes.
Yet even those applications can benefit from consistently low latency Yet, even those applications can benefit from consistently low
depending on usage shifts. For example, Video-on-Demand (VOD) apps latency depending on usage shifts. For example, Video-on-Demand
can work reasonably well when the video is consumed linearly, but (VOD) apps can work reasonably well when the video is consumed
once the user tries to "switch a channel", or to "skip ahead", the linearly, but once the user tries to "switch a channel" or to "skip
user experience suffers unless the latency is sufficiently low. ahead", the user experience suffers unless the latency is
sufficiently low.
Finally, as applications continue to evolve, in-application metrics Finally, as applications continue to evolve, in-application metrics
are gaining in importance. For example, VOD applications can assess are gaining in importance. For example, VOD applications can assess
the quality of experience by application-specific metrics such as the QoE by application-specific metrics, such as whether the video
whether the video player is able to use the highest possible player is able to use the highest possible resolution, identifying
resolution, identify when the video is smooth or freezing, or other when the video is smooth or freezing, or other similar metrics.
similar metrics. Application developers can then effectively use Application developers can then effectively use these metrics to
these metrics to prioritize future work. All popular video platforms prioritize future work. All popular video platforms (YouTube,
(Youtube, Instagram, Netflix, and others) have developed frameworks Instagram, Netflix, and others) have developed frameworks to collect
to collect and analyze VOD metrics at scale. One example is the and analyze VOD metrics at scale. One example is the Scuba framework
Scuba framework used by Meta [Scuba]. used by Meta [Scuba].
Unfortunately, the in-application metrics can be challenging to use Unfortunately, in-application metrics can be challenging to use for
for comparative research purposes. Firstly, different applications comparative research purposes. First, different applications often
often use different metrics to measure the same phenomena. For use different metrics to measure the same phenomena. For example,
example, application A may measure the smoothness of video via "mean application A may measure the smoothness of video via "mean time to
time to re-buffer", while application B may rely on the "probability rebuffer", while application B may rely on the "probability of
of re-buffering per second" for the same purpose. A different rebuffering per second" for the same purpose. A different challenge
challenge with in-application metrics is VOD is a significant source with in-application metrics is that VOD is a significant source of
of revenue for companies such as YouTube, Facebook, and Netflix, revenue for companies, such as YouTube, Facebook, and Netflix,
placing a proprietary incentive against exchanging the in-application placing a proprietary incentive against exchanging the in-application
data. A final concern centers on the privacy issues resulting from data. A final concern centers on the privacy issues resulting from
in-application metrics that accurately describe the activities and in-application metrics that accurately describe the activities and
preferences of an individual end-user. preferences of an individual end user.
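The rebuffering example above can be made concrete with a small
sketch that computes both metrics from the same playback trace. Both
definitions are assumptions made for illustration: "mean time to
rebuffer" is taken as playback time divided by the number of rebuffer
events, and "probability of rebuffering per second" as the share of
one-second intervals in which a rebuffer started.

```python
import math


def mean_time_to_rebuffer(playback_seconds, rebuffer_starts):
    """Average playback time per rebuffer event (assumed definition)."""
    if not rebuffer_starts:
        return math.inf
    return playback_seconds / len(rebuffer_starts)


def rebuffer_probability_per_second(playback_seconds, rebuffer_starts):
    """Share of one-second intervals with a rebuffer start (assumed)."""
    if playback_seconds <= 0:
        raise ValueError("playback time must be positive")
    seconds_hit = {int(t) for t in rebuffer_starts}
    return len(seconds_hit) / playback_seconds


# Hypothetical trace: 600 s of playback, three rebuffer events, two of
# which start within the same second.
starts = [30.2, 30.8, 400.0]
print(mean_time_to_rebuffer(600, starts))
print(rebuffer_probability_per_second(600, starts))
```

The two events that fall within the same second count twice in the
first metric but once in the second, which illustrates why figures
from different applications are hard to compare directly.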
4.2.2. Availability metrics 4.2.2. Availability Metrics
Availability is simply defined as whether or not a packet can be sent Availability is simply defined as whether or not a packet can be sent
and then received by its intended recipient. Availability is naively and then received by its intended recipient. Availability is naively
thought to be the simplest to measure, but is more complex when thought to be the simplest to measure, but it is more complex when
considering that continual, instantaneous measurements would be considering that continual, instantaneous measurements would be
needed to detect the smallest of outages. Also difficult is needed to detect the smallest of outages. Also difficult is
determining the root cause of unavailability: was the user's line determining the root cause of unavailability: was the user's line
down, something in the middle of the network or was it the service down, was something in the middle of the network, or was it the
with which the user was attempting to communicate. service with which the user was attempting to communicate?
4.2.3. Capacity metrics 4.2.3. Capacity Metrics
If the network capacity does not meet the user demands, the network If the network capacity does not meet user demands, the network
quality will be impacted. Once the capacity meets the demands, quality will be impacted. Once the capacity meets the demands,
increasing capacity won't lead to further quality improvements. increasing capacity won't lead to further quality improvements.
The actual network connection capacity is determined by the equipment The actual network connection capacity is determined by the equipment
and the lines along the network path, and it varies throughout the and the lines along the network path, and it varies throughout the
day and across multiple days. Studies involving DSL lines in North day and across multiple days. Studies involving DSL lines in North
America indicate that over 30% of the DSL lines have capacity metrics America indicate that over 30% of the DSL lines have capacity metrics
that vary by more than 10% throughout the day and accross multiple that vary by more than 10% throughout the day and across multiple
days. days.
Some factors that affect the actual capacity are: Some factors that affect the actual capacity are:
1. Presence of competing traffic, either in the LAN or in the WAN 1. Presence of competing traffic, either in the LAN or in the WAN
environments. In the LAN setting, the competing traffic reflects environments. In the LAN setting, the competing traffic reflects
the multiple devices that share the Internet connection. In the the multiple devices that share the Internet connection. In the
WAN setting the competing traffic often originates from the WAN setting, the competing traffic often originates from the
unrelated network flows that happen to share the same network unrelated network flows that happen to share the same network
path. path.
2. Capabilities of the equipment along the path of the network 2. Capabilities of the equipment along the path of the network
connection, including the data transfer rate and the amount of connection, including the data transfer rate and the amount of
memory used for buffering. memory used for buffering.
3. Active traffic management measures, such as traffic shapers and 3. Active traffic management measures, such as traffic shapers and
policers that are often used by the network providers. policers that are often used by the network providers.
There are other factors that can negatively affect the actual line There are other factors that can negatively affect the actual line
capacities. capacities.
The user demands of the traffic follow the usage patterns and The user demands of the traffic follow the usage patterns and
preferences of the particular users. For example, large data preferences of the particular users. For example, large data
transfers can use any available capacity, while the media streaming transfers can use any available capacity, while the media streaming
applicaitons require limited capacity to function correclty. Video- applications require limited capacity to function correctly.
conferencing applications typically need less capacity than high- Videoconferencing applications typically need less capacity than
definition video streaming. high-definition video streaming.
4.2.4. Latency metrics 4.2.4. Latency Metrics
End-to-end latency is the time that a particular packet takes to End-to-end latency is the time that a particular packet takes to
traverse the network path from the user to their destination and traverse the network path from the user to their destination and
back. The end-to-end latency comprises several components: back. The end-to-end latency comprises several components:
1. The propagation delay, which reflects the path distance and the 1. The propagation delay, which reflects the path distance and the
individual link technologies (e.g. fibre vs satellite). The individual link technologies (e.g., fiber vs. satellite). The
propagation doesn't depend on the utilization of the network, to propagation doesn't depend on the utilization of the network, to
the extent that the network path remains constant. the extent that the network path remains constant.
2. The buffering delay, which reflects the time segments spend in 2. The buffering delay, which reflects the time segments spent in
the memory of the network equipment that connect the individual the memory of the network equipment that connect the individual
network links, as well as in the memory of the transmitting network links, as well as in the memory of the transmitting
endpoint. The buffering delay depends on the network endpoint. The buffering delay depends on the network
utilization, as well as on the algorithms that govern the queued utilization, as well as on the algorithms that govern the queued
segments. segments.
3. The transport protocol delays, which reflects the time spent in 3. The transport protocol delays, which reflect the time spent in
retransmission and reassembly, as well as the time spent when the retransmission and reassembly, as well as the time spent when the
transport is "head-of-line blocked." transport is "head-of-line blocked".
4. Some of the workshop sumbissions have explicitly called out the 4. Some of the workshop submissions that have explicitly called out
application delay, which reflects the inefficiencies in the the application delay, which reflects the inefficiencies in the
application layer. application layer.
Traditionally, end-to-end latency is measured when the network is Typically, end-to-end latency is measured when the network is idle.
idle. Results of such measurements reflect mostly the propagation Results of such measurements mostly reflect the propagation delay but
delay, but not other kinds of delay. This report uses the term "idle not other kinds of delay. This report uses the term "idle latency"
latency" to refer to results achieved under idle network conditions. to refer to results achieved under idle network conditions.
Alternatively, if the latency is measured when the network is under Alternatively, if the latency is measured when the network is under
its typical working conditions, the results reflect multiple types of its typical working conditions, the results reflect multiple types of
delays. This report uses the term "working latency" to refer to such delays. This report uses the term "working latency" to refer to such
results. Other sources use the term "latency under load" (LUL) as a results. Other sources use the term "latency under load" (LUL) as a
synonym. synonym.
Data presented at the workshop reveals a substantial difference Data presented at the workshop reveals a substantial difference
between the idle latency and the working latency. Depending on the between the idle latency and the working latency. Depending on the
traffic direciton and the technology type, the working latency is traffic direction and the technology type, the working latency is
between 6 and 25 times higher than the idle latency: between 6 and 25 times higher than the idle latency:
+============+============+========+=========+============+=========+ +============+============+========+=========+============+=========+
| Direction | Technology |Working | Idle | Working - |Working /| | Direction | Technology |Working | Idle | Working - |Working /|
| | type |latency | latency | Idle |Idle | | | Type |Latency | Latency | Idle |Idle |
| | | | | difference |ratio | | | | | | Difference |Ratio |
+============+============+========+=========+============+=========+ +============+============+========+=========+============+=========+
| Downstream | FTTH |148 | 10 | 138 |15 | | Downstream | FTTH |148 | 10 | 138 |15 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
| Dowstream | Cable |103 | 13 | 90 |8 | | Downstream | Cable |103 | 13 | 90 |8 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
| Downstream | DSL |194 | 10 | 184 |19 | | Downstream | DSL |194 | 10 | 184 |19 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
| Upstream | FTTH |207 | 12 | 195 |17 | | Upstream | FTTH |207 | 12 | 195 |17 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
| Upstream | Cable |176 | 27 | 149 |6 | | Upstream | Cable |176 | 27 | 149 |6 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
| Upstream | DSL |686 | 27 | 659 |25 | | Upstream | DSL |686 | 27 | 659 |25 |
+------------+------------+--------+---------+------------+---------+ +------------+------------+--------+---------+------------+---------+
Table 1 Table 1
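The difference and ratio columns of Table 1 can be recomputed from the working and idle latency columns. The Python sketch below does so. The tuple layout and the names are illustrative and not part of the report, and because the table does not state its units, the values are treated as plain numbers.

```python
# Rows copied from Table 1: (direction, technology, working, idle).
TABLE_1 = [
    ("Downstream", "FTTH", 148, 10),
    ("Downstream", "Cable", 103, 13),
    ("Downstream", "DSL", 194, 10),
    ("Upstream", "FTTH", 207, 12),
    ("Upstream", "Cable", 176, 27),
    ("Upstream", "DSL", 686, 27),
]


def derive(working, idle):
    """Return (working - idle, working / idle) for one table row."""
    return working - idle, working / idle


for direction, tech, working, idle in TABLE_1:
    diff, ratio = derive(working, idle)
    print(f"{direction:<10} {tech:<5} diff={diff:>3} ratio={ratio:.1f}")
```

The computed ratios run from about 6.5 (upstream cable) to about 25.4 (upstream DSL), consistent with the range of roughly 6 to 25 stated before the table. The table itself lists the ratios as whole numbers.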
While historically the tooling available for measuring latency While historically the tooling available for measuring latency
focused on measuring the idle latency, there is a trend in the focused on measuring the idle latency, there is a trend in the
industry to start measuring the working latency as well, e.g. industry to start measuring the working latency as well, e.g.,
Apple's [NetworkQuality]. Apple's [NetworkQuality].
4.2.5. Measurement case studies 4.2.5. Measurement Case Studies
The participants have proposed several concrete methodologies for The participants have proposed several concrete methodologies for
measuring the onetwork quality for the end users. measuring the network quality for the end users.
[Paasch2021] introduced a methodology for measuring working latency [Paasch2021] introduced a methodology for measuring working latency
from the end-user vantage point. The suggested method incrementally from the end-user vantage point. The suggested method incrementally
adds network flows between the user device and a server endpoint adds network flows between the user device and a server endpoint
until a bottleneck capacity is reached. From these measurements, a until a bottleneck capacity is reached. From these measurements, a
round trip latency is measured and reported to the end-user. The round-trip latency is measured and reported to the end user. The
authors chose to report results with the RPM metric. The methodology authors chose to report results with the RPM metric. The methodology
had been implemented in Apple Monterey OS. had been implemented in Apple's macOS Monterey.
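RPM stands for round trips per minute, so a single round-trip latency converts to RPM by dividing one minute by that latency. The sketch below shows only this unit conversion. It is not the measurement methodology of [Paasch2021], and the function name and sample latencies are hypothetical.

```python
def rpm_from_latency_ms(latency_ms: float) -> float:
    """Convert one round-trip latency in milliseconds to round trips per minute."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    # One minute is 60,000 ms; RPM is how many round trips fit into it.
    return 60_000.0 / latency_ms


# Hypothetical sample latencies, not measurements from the workshop.
for latency_ms in (10, 100, 500):
    print(f"{latency_ms} ms -> {rpm_from_latency_ms(latency_ms):.0f} RPM")
```

Because the conversion is a reciprocal, lower latency yields a higher RPM value, so larger numbers are better.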
[Mathis2021] have applied the RPM metric to the results of more than [Mathis2021] applied the RPM metric to the results of more than 4
4 billion download tests that M-Lab performed in 2010-2021. During billion download tests that M-Lab performed from 2010-2021. During
this time frame, the M-Lab measurement platform underwent several this time frame, the M-Lab measurement platform underwent several
upgrades which allowed the research team to compare the effect of upgrades that allowed the research team to compare the effect of
different TCP congestion control algorithms (CCAs) on the measured different TCP congestion control algorithms (CCAs) on the measured
end-to-end latency. The study showed that the use Cubic CCA leads to end-to-end latency. The study showed that the use of cubic CCA leads
increased working latency, which is attributed to its use of larger to increased working latency, which is attributed to its use of
queues. larger queues.
[Schlinker2019] presented a large-scale study that aimed to establish [Schlinker2019] presented a large-scale study that aimed to establish
a correlation between goodput and quality of experience on a large a correlation between goodput and QoE on a large social network. The
social network. The authors performed the measurements at multiple authors performed the measurements at multiple data centers from
data centers from which video segments of set sizes were streamed to which video segments of set sizes were streamed to a large number of
a large number of end users. The authors used the goodput and end users. The authors used the goodput and throughput metrics to
throughput metrics to determine whether particular paths were determine whether particular paths were congested.
congested.
[Reed2021] presented the analysis of working latency measurements [Reed2021] presented the analysis of working latency measurements
collected as part of the FCC's "Measuring Broadband America" (MBA) collected as part of the Measuring Broadband America (MBA) program by
program. The FCC does not include working latency in its yearly the Federal Communication Commission (FCC). The FCC does not include
report, but does offer it in the raw data files. The authors used a working latency in its yearly report but does offer it in the raw
subset of the raw data to identify important differences in the data files. The authors used a subset of the raw data to identify
working latencies across different ISPs. important differences in the working latencies across different ISPs.
[MacMillian2021] presented analysis of working latency across [MacMillian2021] presented analysis of working latency across
multiple service tiers. They found that, unsurprisingly, "premium" multiple service tiers. They found that, unsurprisingly, "premium"
tier users experienced lower working latency compared to a "value" tier users experienced lower working latency compared to a "value"
tier. The data demonstrated that working latency varies tier. The data demonstrated that working latency varies
significantly within each tier; one possible explanation is the significantly within each tier; one possible explanation is the
difference in equipment deployed in the homes. difference in equipment deployed in the homes.
These studies have stressed the importance of measurement of working These studies have stressed the importance of measurement of working
latency. At the time of this report, many home router manufacturers latency. At the time of this report, many home router manufacturers
rely on hardware-accelerated routing which used FIFO queues. rely on hardware-accelerated routing that uses FIFO queues. Focusing
Focusing on measuring the working latency measurements on these on measuring the working latency measurements on these devices and
devices, and making the consumer aware of the effect of chosing one making the consumer aware of the effect of choosing one manufacturer
manufacturer vs. another, can help improving the home router vs. another can help improve the home router situation. The ideal
situation. The ideal test would be able to identify the working test would be able to identify the working latency and pinpoint the
latency, and to pinpoint to the source of delay (home router, ISP, source of the delay (home router, ISP, server side, or some network
server side, or some network node in between). node in between).
Another source of high working latency comes from network routers Another source of high working latency comes from network routers
exposed to cross-traffic. As [Schlinker2019] indicated, these can exposed to cross-traffic. As [Schlinker2019] indicated, these can
become saturated during the peak hours of the day. Systematic become saturated during the peak hours of the day. Systematic
testing of the working latency in routers under load can help improve testing of the working latency in routers under load can help improve
both our understanding of latency and the impact of deployed both our understanding of latency and the impact of deployed
infrastructure. infrastructure.
4.2.6. Metrics Key Points 4.2.6. Metrics Key Points
The metrics for network quality can be roughly grouped into: The metrics for network quality can be roughly grouped into the
following:
1. Availability metrics, which indicate whether the user can access 1. Availability metrics, which indicate whether the user can access
the network at all. the network at all.
2. Capacity metrics, which indicate whether the actual line capacity 2. Capacity metrics, which indicate whether the actual line capacity
is sufficient to meet the user's demands. is sufficient to meet the user's demands.
3. Latency metrics, indicating if the user gets the data in a timely 3. Latency metrics, which indicate if the user gets the data in a
fashion. timely fashion.
4. Higher-order metrics, which include both the network metrics, 4. Higher-order metrics, which include both the network metrics,
such as inter-packet arrival time, and the applicaiton metrics, such as inter-packet arrival time, and the application metrics,
such as the mean time between rebuffering for video streaming. such as the mean time between rebuffering for video streaming.
The availabiltiy metrics can be seen as derivative of either the The availability metrics can be seen as a derivative of either the
capacity (zero capacity leading to zero availability) or the latency capacity (zero capacity leading to zero availability) or the latency
(infinite latency leading to zero availability). (infinite latency leading to zero availability).
Key points from the presentations and discussions included: Key points from the presentations and discussions included the
following:
1. Availability and capacity are "hygienic factors" - unless an 1. Availability and capacity are "hygienic factors" -- unless an
application is capable of using extra capacity, end-users will application is capable of using extra capacity, end users will
see little benefit from using overprovisioned lines. see little benefit from using over-provisioned lines.
2. Working latency has stronger correlation with user experience 2. Working latency has a stronger correlation with the user
than latency under an idle network load. Working latency can experience than latency under an idle network load. Working
exceed the idle latency by an order of magnitude. latency can exceed the idle latency by an order of magnitude.
3. The RPM metric is a stable metric, with positive values being 3. The RPM metric is a stable metric, with positive values being
better, that may be more effective when communicating latency to better, that may be more effective when communicating latency to
end-users. end users.
4. The relationship between throughput and goodput can be effective 4. The relationship between throughput and goodput can be effective
in finding the saturation points, both in client-side in finding the saturation points, both in client-side
[Paasch2021] and server-side [Schlinker2019] settings. [Paasch2021] and server-side [Schlinker2019] settings.
5. Working latency depends on algorithm choice for addressing 5. Working latency depends on the algorithm choice for addressing
endpoint congestion control and router queuing. endpoint congestion control and router queuing.
Finally, it was commonly agreed that the best metrics are those Finally, it was commonly agreed that the best metrics are those
that are actionable. that are actionable.
4.3. Cross-layer Considerations 4.3. Cross-Layer Considerations
In the Cross-layer segment of the workshop, participants presented In the cross-layer segment of the workshop, participants presented
material on and discussed how to accurately measure exactly where material on and discussed how to accurately measure exactly where
problems occur. Discussion centered especially on the differences problems occur. Discussion centered especially on the differences
between physically wired and wireless connections and the between physically wired and wireless connections and the
difficulties of accurately determining problem spots when multiple difficulties of accurately determining problem spots when multiple
different types of network segments are responsible for the quality. different types of network segments are responsible for the quality.
As an example, [Kerpez2021] showed that limited bandwidth of 2.4GHz As an example, [Kerpez2021] showed that a limited bandwidth of 2.4
wifi is the most frequently the bottleneck. In comparison, the wider GHz Wi-Fi bottlenecks the most frequently. In comparison, the wider
bandwidth of the 5GHz WiFi have only been the bottleneck in 20% of bandwidth of the 5 GHz Wi-Fi has only bottlenecked in 20% of
observations. observations.
The participants agreed that no single component of a network The participants agreed that no single component of a network
connection has all the data required to measure the effects of the connection has all the data required to measure the effects of the
network performance on the quality of the end user experience. network performance on the quality of the end-user experience.
* Applications that are running on the end-user devices have the * Applications that are running on the end-user devices have the
best insight into their respective performance, but have limited best insight into their respective performance but have limited
visibility into the behavior of the network itself, and are unable visibility into the behavior of the network itself and are unable
to act based on their limited perspective. to act based on their limited perspective.
* Internet service providers have good insight into QoS * ISPs have good insight into QoS considerations but are not able to
considerations, but are not able to infer the effect of the QoS infer the effect of the QoS metrics on the quality of end-user
metrics on the quality of end user experiences. experiences.
* Content providers have good insight into the aggregated behavior * Content providers have good insight into the aggregated behavior
of the end users, but lack the insight on what aspects of network of the end users but lack the insight on what aspects of network
performance are leading indicators of user behavior. performance are leading indicators of user behavior.
The workshop had identified the need for a standard and extensible The workshop had identified the need for a standard and extensible
way to exchange network performance characteristics. Such an way to exchange network performance characteristics. Such an
exchange standard should address (at least) the following: exchange standard should address (at least) the following:
* A scalable way to capture the performance of multiple (potentially * A scalable way to capture the performance of multiple (potentially
thousands of) endpoints. thousands of) endpoints.
* The data exchange format should prevent data manipulation, so that * The data exchange format should prevent data manipulation so that
the different participants won't be able to game the mechanisms. the different participants won't be able to game the mechanisms.
* Preservation of end-user privacy. In particular, federated * Preservation of end-user privacy. In particular, federated
learning approaches should be preferred so no centralized entity learning approaches should be preferred so that no centralized
has the access to the whole picture. entity has the access to the whole picture.
* A transparent model for giving the different actors on a network * A transparent model for giving the different actors on a network
connection an incentive to share the performance data they connection an incentive to share the performance data they
collect. collect.
* An accompanying set of tools to analyze the data is needed as * An accompanying set of tools to analyze the data.
well.
4.3.1. Separation of Concerns 4.3.1. Separation of Concerns
Commonly, there's a tight coupling between collecting performance Commonly, there's a tight coupling between collecting performance
metrics, interpreting those metrics, and and acting upon the metrics, interpreting those metrics, and acting upon the
interpretation. Unfortunately, such model is not the best for interpretation. Unfortunately, such a model is not the best for
successfully exchanging cross-layer data as: successfully exchanging cross-layer data, as:
* Actors that are able to collect particular performance metrics * actors that are able to collect particular performance metrics
(e.g. the TCP RTT) do not necessarily have the context necessary (e.g., the TCP RTT) do not necessarily have the context necessary
for a meaningful interpretation. for a meaningful interpretation,
* The actors that have the context and the computational/storage * the actors that have the context and the computational/storage
capacity to interpret metrics do not necessarily have the ability capacity to interpret metrics do not necessarily have the ability
to control the behavior of network / application. to control the behavior of the network/application, and
* The actors that can control the behavior of networks and/or * the actors that can control the behavior of networks and/or
applications typically do not have access to complete measurement applications typically do not have access to complete measurement
data. data.
The participants agreed that it is important to separate the above The participants agreed that it is important to separate the above
three aspects, so that: three aspects, so that:
* The different actors that have the data but not the ability to * the different actors that have the data, but not the ability to
interpret and/or act upon it should publish their measured data. interpret and/or act upon it, should publish their measured data
and
* The actors that have the expertise in interpreting and * the actors that have the expertise in interpreting and
synthesizing performance data should publish the results of their synthesizing performance data should publish the results of their
interpretations. interpretations.
4.3.2. Security and Privacy Considerations 4.3.2. Security and Privacy Considerations
Preserving the privacy of Internet end users is a difficult Preserving the privacy of Internet end users is a difficult
requirement to meet when addressing this problem space. There is an requirement to meet when addressing this problem space. There is an
intrinsic trade-off between collecting more data about user intrinsic trade-off between collecting more data about user
activities, and infringing their privacy while doing so. activities and infringing on their privacy while doing so.
Participants agreed that observability across multiple layers is Participants agreed that observability across multiple layers is
necessary for an accurate measurement of the network quality, but necessary for an accurate measurement of the network quality, but
doing so in a way that minimizes privacy leakage is an open question. doing so in a way that minimizes privacy leakage is an open question.
4.3.3. Metric Measurement Considerations 4.3.3. Metric Measurement Considerations
* The following TCP protocol metrics have been found to be effective * The following TCP protocol metrics have been found to be effective
and are available for passive measurement: and are available for passive measurement:
- TCP connection latency measured using SACK/ACK timing, as well - TCP connection latency measured using selective acknowledgment
as the timing between TCP retransmission events, are good (SACK) or acknowledgment (ACK) timing, as well as the timing
proxies for end-to-end RTT measurements. between TCP retransmission events, are good proxies for end-to-
end RTT measurements.
- On the Linux platform, the tcp_info structure is the de-facto - On the Linux platform, the tcp_info structure is the de facto
standard for an application to inspect the performance of standard for an application to inspect the performance of
kernel-space networking. However, there is no equivalent de- kernel-space networking. However, there is no equivalent de
facto standard for the user-space networking. facto standard for user-space networking.
* The QUIC and MASQUE protocols make passive performance * The QUIC and MASQUE protocols make passive performance
measurements more challenging. measurements more challenging.
- An approach that uses federated measurement / hierarchical - An approach that uses federated measurement/hierarchical
aggregation may be more valuable for these protocols. aggregation may be more valuable for these protocols.
- The QLOG format seems to be the most mature candidate for such - The QLOG format seems to be the most mature candidate for such
an exchange. an exchange.
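As an illustration of the tcp_info point above, the following Python sketch reads the kernel's smoothed RTT estimate for a connected TCP socket on Linux. The byte offset of the tcpi_rtt and tcpi_rttvar fields is an assumption based on the struct tcp_info layout in the Linux header linux/tcp.h and should be checked against the target kernel. The helper names are hypothetical.

```python
import socket
import struct
from typing import Tuple

# Assumed layout of Linux `struct tcp_info`: the first eight bytes hold
# 1-byte fields and bitfields, followed by 32-bit counters. tcpi_rtt is
# taken to be the 16th of those counters and tcpi_rttvar the 17th.
TCPI_RTT_OFFSET = 8 + 15 * 4  # 68 bytes into the structure
TCP_INFO_LEN = 104            # long enough to cover the fields read here


def parse_rtt(tcp_info: bytes) -> Tuple[int, int]:
    """Return (rtt_us, rttvar_us) from a raw tcp_info buffer."""
    # "=" selects native byte order with no alignment padding.
    return struct.unpack_from("=II", tcp_info, TCPI_RTT_OFFSET)


def socket_rtt(sock: socket.socket) -> Tuple[int, int]:
    """Query TCP_INFO on a connected TCP socket (Linux only)."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, TCP_INFO_LEN)
    return parse_rtt(raw)
```

Both values are reported by the kernel in microseconds. A passive monitor could call socket_rtt(sock) periodically on an established connection and export the samples, without injecting any probe traffic of its own.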
4.3.4. Towards Improving Future Cross-layer Observability 4.3.4. Towards Improving Future Cross-Layer Observability
The ownership of the Internet is spread across multiple The ownership of the Internet is spread across multiple
administrative domains, making measurement of end-to-end performance administrative domains, making measurement of end-to-end performance
data difficult. Furthermore, the immense scale of the Internet makes data difficult. Furthermore, the immense scale of the Internet makes
aggregation and analysis of this difficult. [Marx2021] presented a aggregation and analysis of this difficult. [Marx2021] presented a
simple logging format that could potentially be used to collect and simple logging format that could potentially be used to collect and
aggregate data from different layers. aggregate data from different layers.
Another aspect of cross-layer collaboration hampering measurement is Another aspect of the cross-layer collaboration hampering measurement
that the majority of current algorithms do not explicitly provide is that the majority of current algorithms do not explicitly provide
performance data that can be used in cross-layer analysis. The IETF performance data that can be used in cross-layer analysis. The IETF
community could be more diligent in identifying each protocol's key community could be more diligent in identifying each protocol's key
performance indicators, and exposing them as part of the protocol performance indicators and exposing them as part of the protocol
specification. specification.
Despite all these challenges, it should still be possible to perform Despite all these challenges, it should still be possible to perform
limited-scope studies in order to have a better understanding of how limited-scope studies in order to have a better understanding of how
user quality is affected by the interaction of the different user quality is affected by the interaction of the different
components that constitute the Internet. Furthermore, recent components that constitute the Internet. Furthermore, recent
development of federated learning algorithms suggests that it might development of federated learning algorithms suggests that it might
be possible to perform cross-layer performance measurements while be possible to perform cross-layer performance measurements while
preserving user privacy. preserving user privacy.
4.3.5. Efficient Collaboration Between Hardware and Transport Protocols 4.3.5. Efficient Collaboration between Hardware and Transport Protocols
With the advent of the low latency, low loss and scalable throughput With the advent of the low latency, low loss, and scalable throughput
(L4S) congestion notification and control, there is an even higher (L4S) congestion notification and control, there is an even higher
need for the transport protocols and the underlying hardware to work need for the transport protocols and the underlying hardware to work
in unison. in unison.
At the time of the workshop, the typical home router uses a single At the time of the workshop, the typical home router uses a single
FIFO queue, large enough to allow amortizing the lower-layer header FIFO queue that is large enough to allow amortizing the lower-layer
overhead across multiple transport PDUs. These designs worked well header overhead across multiple transport PDUs. These designs worked
with the Cubic congestion control algorithm, yet the newer generation well with the cubic congestion control algorithm, yet the newer
of CCAs can operate on much smaller queues. To fully support generation of algorithms can operate on much smaller queues. To
latencies less than 1ms, the home router needs to work efficiently on fully support latencies less than 1 ms, the home router needs to work
sequential transmissions of just a few segments vs. being optimized efficiently on sequential transmissions of just a few segments vs.
for large packet bursts. being optimized for large packet bursts.
Another design trait common in home routers is the use of packet Another design trait common in home routers is the use of packet
aggregation to further amortize the overhead added by the lower-layer aggregation to further amortize the overhead added by the lower-layer
headers. Specifically, multiple IP datagrams are combined into a headers. Specifically, multiple IP datagrams are combined into a
single, large tranfer frame. However, this aggregation can add up to single, large transfer frame. However, this aggregation can add up
10ms to the packet sojourn delay. to 10 ms to the packet sojourn delay.
Following the famous "you can't improve what you don't measure" Following the famous "you can't improve what you don't measure"
adage, it is important to expose these aggregation delays in a way adage, it is important to expose these aggregation delays in a way
that would allow identifying the source of the bottlenecks, and that would allow identifying the source of the bottlenecks and making
making hardware more suitable for the next generation transport hardware more suitable for the next generation of transport
protocols. protocols.
4.3.6. Cross-Layer Key Points 4.3.6. Cross-Layer Key Points
* Significant differences exist in the characteristics of metrics to * Significant differences exist in the characteristics of metrics to
measured and required optimizations needed in wireless vs wired be measured and the required optimizations needed in wireless vs.
networks. wired networks.
* Identification of an issue's root-cause is hampered by the * Identification of an issue's root cause is hampered by the
challenges in measuring multi-segment network paths. challenges in measuring multi-segment network paths.
* No single component of a network connection has all the data * No single component of a network connection has all the data
required to measure the effects of the complete network required to measure the effects of the complete network
performance on the quality of the end user experience. performance on the quality of the end-user experience.
* Actionable results require both proper collection and * Actionable results require both proper collection and
interpretation. interpretation.
* Coordination among network providers is important to successful * Coordination among network providers is important to successfully
improve measurement of end user experiences. improve the measurement of end-user experiences.
* Simultaneously providing accurate measurements while preserving * Simultaneously providing accurate measurements while preserving
end-user privacy is challenging. end-user privacy is challenging.
* Passive measurements from protocol implementations may provide * Passive measurements from protocol implementations may provide
beneficial data. beneficial data.
4.4. Synthesis 4.4. Synthesis
Finally, in the Synthesis section of the workshop, the presentations Finally, in the synthesis section of the workshop, the presentations
and discussions concentrated on the next steps likely needed to make and discussions concentrated on the next steps likely needed to make
forward progress. Of particular concern is how to bring forward forward progress. Of particular concern is how to bring forward
measurements that can make sense to end users trying to select measurements that can make sense to end users trying to select
between various networking subscription options. between various networking subscription options.
4.4.1. Measurement and Metrics Considerations 4.4.1. Measurement and Metrics Considerations
One important consideration is how decisions can be made and actions One important consideration is how decisions can be made and what
taken based on collected metrics. Measurements must be integrated actions can be taken based on collected metrics. Measurements must
with applications in order to get true application views of be integrated with applications in order to get true application
congestion, as measurements over different infrastructure or via views of congestion, as measurements over different infrastructure or
other applications may return incorrect results. Congestion itself via other applications may return incorrect results. Congestion
can be a temporary problem, and mitigation strategies may need to be itself can be a temporary problem, and mitigation strategies may need
different depending on whether it is expected to be a short-term or to be different depending on whether it is expected to be a short-
long-term phenomenon. A significant challenge exists in measuring term or long-term phenomenon. A significant challenge exists in
short-term problems, driving the need for continuous measurements to measuring short-term problems, driving the need for continuous
ensure capture of critical moments and long-term trends. For short- measurements to ensure critical moments and long-term trends are
term problems, workshop participants debated whether an issue that captured. For short-term problems, workshop participants debated
goes away is indeed a problem or is a sign that a network is properly whether an issue that goes away is indeed a problem or is a sign that
adapting and self-recovering. a network is properly adapting and self-recovering.
Important consideration must be taken when constructing metrics in Important consideration must be taken when constructing metrics in
order to understand the results. Measurements can also affected by order to understand the results. Measurements can also be affected
individual packet characteristics - different sized packets have a by individual packet characteristics -- differently sized packets
typically linear relationship with their delay. With this in mind, typically have a linear relationship with their delay. With this in
measurements can be divided into a delay based on geographical mind, measurements can be divided into a delay based on geographical
distances, a packet-size serialization delay and a variable (noise) distances, a packet-size serialization delay, and a variable (noise)
delay. Each of these three sub-component delays can be different and delay. Each of these three sub-component delays can be different and
individually measured across each segment in a multi-hop path. individually measured across each segment in a multi-hop path.
Variable delay can also be significantly impacted by external Variable delay can also be significantly impacted by external
factors, such as bufferbloat, routing changes, network load sharing, factors, such as bufferbloat, routing changes, network load sharing,
and other local or remote changes in performance. Network and other local or remote changes in performance. Network
measurements, especially load-specific tests, must also be run long measurements, especially load-specific tests, must also be run long
enough to ensure capture of any problems associated with buffering, enough to ensure that any problems associated with buffering,
queuing, etc. Measurement technologies should also distinguish queuing, etc. are captured. Measurement technologies should also
between upsteam and downstream measurements, as well as measure the distinguish between upstream and downstream measurements, as well as
difference between end-to-end paths and sub-path measurements. measure the difference between end-to-end paths and sub-path
measurements.
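The three-way split described above can be sketched in code. This is a minimal illustration, not a method presented at the workshop. It assumes delay grows linearly with packet size and that the smallest delay seen for each size is close to noise-free. The function name `decompose_delay` and the sample format are hypothetical.

```python
from collections import defaultdict


def decompose_delay(samples):
    """Split delays into fixed, per-byte, and variable parts.

    samples: iterable of (packet_size_bytes, delay_seconds) pairs.
    Returns (fixed_delay, per_byte_delay, variable_delays).
    Assumption: the minimum delay per packet size carries no queuing noise.
    """
    samples = list(samples)

    # Keep the lowest delay observed for each packet size.
    floor = defaultdict(lambda: float("inf"))
    for size, delay in samples:
        floor[size] = min(floor[size], delay)
    if len(floor) < 2:
        raise ValueError("need at least two distinct packet sizes")

    # Ordinary least-squares line through the (size, minimum delay) points.
    sizes = list(floor)
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(floor[s] for s in sizes) / n
    sxx = sum((s - mean_x) ** 2 for s in sizes)
    sxy = sum((s - mean_x) * (floor[s] - mean_y) for s in sizes)

    per_byte = sxy / sxx                # serialization delay per byte
    fixed = mean_y - per_byte * mean_x  # size-independent (distance) delay

    # Whatever the line does not explain is treated as variable (noise) delay.
    variable = [d - (fixed + per_byte * s) for s, d in samples]
    return fixed, per_byte, variable
```

The intercept of the fitted line stands in for the distance-based delay, although in practice it also absorbs any other constant per-packet cost. Running the same fit per segment would give the per-hop view the text mentions.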
4.4.2. End-User metrics presentation 4.4.2. End-User Metrics Presentation
Determining end-user needs requires informative measurements and Determining end-user needs requires informative measurements and
metrics. How do we provide the users with the service they need or metrics. How do we provide the users with the service they need or
want? Is it possible for users to even voice their desires want? Is it possible for users to even voice their desires
effectively? Only high-level, simplistic answers like "reliability", effectively? Only high-level, simplistic answers like "reliability",
"capacity", and "service bundling" are typical answers given in end- "capacity", and "service bundling" are typical answers given in end-
user surveys. Technical requirements that operators can consume, user surveys. Technical requirements that operators can consume,
like "low-latency" and "congestion avoidance",are not terms known to like "low-latency" and "congestion avoidance", are not terms known to
and used by end-users. and used by end users.
Example metrics useful to end users might include the number of users Example metrics useful to end users might include the number of users
supported by a service, and the number of applications or streams supported by a service and the number of applications or streams that
that a network can support. An example solution to combat netwokring a network can support. An example solution to combat networking
issues includes incentive-based traffic management strategies (e.g. an issues includes incentive-based traffic management strategies (e.g.,
application requesting lower latency may also mean accepting lower an application requesting lower latency may also mean accepting lower
bandwidth). User perceived latency must be considered, not just bandwidth). User-perceived latency must be considered, not just
network latency - users experience in-application to in-server network latency -- user experience in-application to in-server
latency, and network to network measurements may only be studying the latency and network-to-network measurements may only be studying the
lowest level latency. Thus, picking the right protocol to use in a lowest-level latency. Thus, picking the right protocol to use in a
measurement is critical in order to match user experience (for measurement is critical in order to match user experience (for
example, users do not transmit data over ICMP even though it is a example, users do not transmit data over ICMP, even though it is a
common measurement tool). common measurement tool).
In-application measurements should consider how to measure different In-application measurements should consider how to measure different
types of applications, such as video streaming, file sharing, multi- types of applications, such as video streaming, file sharing, multi-
user gaming, and real-time voice communications. It may be that user gaming, and real-time voice communications. It may be that
asking users for what tradeoffs they are willing to accept would be a asking users for what trade-offs they are willing to accept would be
helpful approach: would they rather have a network with low latency, a helpful approach: would they rather have a network with low latency
or a network with higher bandwidth. Gamers may make different or a network with higher bandwidth? Gamers may make different
decisions than home office users or content producers, for example. decisions than home office users or content producers, for example.
Furthermore, how can users make these trade-offs in a fair manner Furthermore, how can users make these trade-offs in a fair manner
that does not impact other users? There is a tension between that does not impact other users? There is a tension between
solutions in this space vs the cost associated with solving these solutions in this space vs. the cost associated with solving these
solutions, and which customers are willing to front these improvement problems, as well as which customers are willing to front these
costs. improvement costs.
Challenges in providing higher-priority traffic to users center Challenges in providing higher-priority traffic to users center
around the ability for networks to be willing to listen to client around the ability for networks to be willing to listen to client
requests for higher incentives, even though commercial interests may requests for higher incentives, even though commercial interests may
not flow to them without a cost incentive. Shared mediums in general not flow to them without a cost incentive. Shared mediums in general
are subject to oversubscribing such that the number of users a are subject to oversubscribing, such that the number of users a
network can support is either accurate on an underutilized network, network can support is either accurate on an underutilized network or
or may assume an average bandwidth or other usage metric that fails may assume an average bandwidth or other usage metric that fails to
to be accurate during utilization spikes. Individual metrics are be accurate during utilization spikes. Individual metrics are also
also affected by in-home devices from cheap routers to microwaves and affected by in-home devices from cheap routers to microwaves and by
from (multi-)user behaviors during tests. Thus, a single metric (multi-)user behaviors during tests. Thus, a single metric alone or
alone or a single reading without context may not be useful in a single reading without context may not be useful in assisting a
assisting a user or operator to determine where the problem source user or operator to determine where the problem source actually is.
actually is.
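The gap between an average-based and a spike-based "users supported" figure can be shown with simple arithmetic. The sketch below is illustrative only: the function name `users_supported` and the capacity and demand values are assumptions, not figures from the workshop.

```python
def users_supported(link_capacity_mbps, per_user_demand_mbps):
    """Whole number of users a shared link can carry at a given demand."""
    if per_user_demand_mbps <= 0:
        raise ValueError("per-user demand must be positive")
    return int(link_capacity_mbps // per_user_demand_mbps)


# Hypothetical shared link and demand figures, for illustration only.
LINK_MBPS = 1000
AVERAGE_DEMAND_MBPS = 5   # long-term average per user
PEAK_DEMAND_MBPS = 25     # per-user demand during a utilization spike

advertised = users_supported(LINK_MBPS, AVERAGE_DEMAND_MBPS)
during_spike = users_supported(LINK_MBPS, PEAK_DEMAND_MBPS)
```

With these assumed numbers the average-based figure is five times the spike-based one, which is the kind of mismatch the paragraph describes.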
User comprehension of a network remains a challenging problem. User comprehension of a network remains a challenging problem.
Multiple workshop participants argued for a single number Multiple workshop participants argued for a single number
(potentially calculated with weighted aggregation formula), or a (potentially calculated with a weighted aggregation formula) or a
small number of measurements per expected usage (a "gaming" score vs small number of measurements per expected usage (e.g., a "gaming"
a "content producer" score). Many agreed that some users may instead score vs. a "content producer" score). Many agreed that some users
prefer to consume simplified or color-coded ratings (good/better/ may instead prefer to consume simplified or color-coded ratings
best, red/yellow/green, or bronze/gold/platinum). (e.g., good/better/best, red/yellow/green, or bronze/gold/platinum).
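One way to picture a weighted aggregation that yields a per-usage score and a color-coded rating is sketched below. The weights, thresholds, metric names, and function names are all hypothetical. The workshop did not agree on a formula, so this only illustrates the shape such a calculation might take.

```python
def usage_score(metrics, weights):
    """Weighted average of per-metric sub-scores, each already in [0, 1]."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * metrics[name] for name in weights) / total


def color_rating(score, yellow=0.5, green=0.8):
    """Map a score in [0, 1] to a red/yellow/green label (assumed cutoffs)."""
    if score >= green:
        return "green"
    if score >= yellow:
        return "yellow"
    return "red"


# Hypothetical weight profiles for two expected usages.
GAMING = {"latency": 0.6, "loss": 0.3, "bandwidth": 0.1}
CONTENT_PRODUCER = {"latency": 0.1, "loss": 0.2, "bandwidth": 0.7}
```

The same measurements can rate differently per usage. A low-latency, low-bandwidth link might score well under `GAMING` and poorly under `CONTENT_PRODUCER`.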
4.4.3. Synthesis Key Points 4.4.3. Synthesis Key Points
* Some proposed metrics: * Some proposed metrics:
- Round-trips Per Minute (RPMs) - Round-trips Per Minute (RPM)
- Users per network - users per network
- Latency - latency
- 99% latency and bandwidth - 99% latency and bandwidth
* Median and mean measurements are distractions from the real * Median and mean measurements are distractions from the real
problems. problems.
* Shared network usage greatly affect quality. * Shared network usage greatly affects quality.
* Long measurements are needed to capture all facets of potential * Long measurements are needed to capture all facets of potential
network bottlenecks. network bottlenecks.
* Better funded research in all these areas is needed for progress. * Better-funded research in all these areas is needed for progress.
* End-users will best understand a simplified score or ranking * End users will best understand a simplified score or ranking
system. system.
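The metrics listed above can be related with a short sketch. It assumes RPM is simply 60 divided by a round-trip time in seconds and uses a nearest-rank percentile. The helper names are hypothetical, and this is not the measurement procedure of any cited proposal, which would take its round-trip samples under load.

```python
import math
import statistics


def rpm(rtt_seconds):
    """Round-trips Per Minute for one round-trip time, assuming RPM = 60 / RTT."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return 60.0 / rtt_seconds


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty sequence (pct in 0..100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(rtts):
    """Compare central measures with the tail for a list of RTT samples."""
    p99 = percentile(rtts, 99)
    return {
        "mean": statistics.mean(rtts),
        "median": statistics.median(rtts),
        "p99": p99,
        "rpm_at_median": rpm(statistics.median(rtts)),
        "rpm_at_p99": rpm(p99),
    }
```

For a sample set where a few round trips are much slower than the rest, the median stays at the fast value while the 99th percentile exposes the slow ones, which is why the key points treat mean and median as distractions.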
5. Conclusions 5. Conclusions
During the final hour of the workshop we gathered statements that the During the final hour of the three-day workshop, statements that the
group thought were summary statements from the 3 day event. We later group deemed to be summary statements were gathered. Later, any
discarded any that were in contention (listed further below for statements that were in contention were discarded (listed further
completeness). For this document, the editor took the original list below for completeness). For this document, the authors took the
and divided it into rough categories, applied some suggested edits original list and divided it into rough categories, applied some
discussed on the mailing list and further edited for clarity and to suggested edits discussed on the mailing list, and further edited for
provide context. clarity and to provide context.
5.1. General statements 5.1. General Statements
1. Bandwidth is necessary but not alone sufficient. 1. Bandwidth is necessary but not alone sufficient.
2. In many cases, Internet users don't need more bandwidth, but 2. In many cases, Internet users don't need more bandwidth but
rather need "better bandwidth" - i.e., they need other rather need "better bandwidth", i.e., they need other
improvements to their connectivity. improvements to their connectivity.
3. We need both active and passive measurements - passive 3. We need both active and passive measurements -- passive
measurements can provide historical debugging. measurements can provide historical debugging.
4. We need passive measurements to be continuous and archivable and 4. We need passive measurements to be continuous, archivable, and
queriable - include reliability/connectivity measurements. queriable, including reliability/connectivity measurements.
5. A really meaningful metric for users is whether their application 5. A really meaningful metric for users is whether their application
will work properly or fail because of a lack of a network with will work properly or fail because of a lack of a network with
sufficient characteristics. sufficient characteristics.
6. A useful metric for goodness must actually incentive goodness - 6. A useful metric for goodness must actually incentivize goodness
good metrics should be actionable to help drive industries toward -- good metrics should be actionable to help drive industries
improvement. towards improvement.
7. A lower latency Internet, however achieved would benefit all end 7. A lower-latency Internet, however achieved, would benefit all end
users. users.
5.2. Specific statements about detailed protocols/techniques 5.2. Specific Statements about Detailed Protocols/Techniques
1. Round trips Per Minute (RPM) is a useful, consumable metric. 1. Round-trips Per Minute (RPM) is a useful, consumable metric.
2. We need a usable tool that fills the current gap between network 2. We need a usable tool that fills the current gap between network
reachability, latency, and speed tests. reachability, latency, and speed tests.
3. End-users that want to be involved in QoS decisions should be 3. End users that want to be involved in QoS decisions should be
able to voice their needs and desires. able to voice their needs and desires.
4. Applications are needed that can perform and report good quality 4. Applications are needed that can perform and report good quality
measurements in order to identify insufficient points in network measurements in order to identify insufficient points in network
access. access.
5. Research done by regulators indicates that users/consumers prefer 5. Research done by regulators indicates that users/consumers prefer
a simple metric per application, which frequently resolves to a simple metric per application, which frequently resolves to
whether the application will work properly or not. whether the application will work properly or not.
6. New measurements and QoS or QoE techniques should not rely only 6. New measurements and QoS or QoE techniques should not rely only
or depend on reading TCP headers. or depend on reading TCP headers.
7. It is clear from developers of interactive applications and from 7. It is clear from developers of interactive applications and from
network operators that lower latency is a strong factor in user network operators that lower latency is a strong factor in user
QoE. However, metrics are lacking to support this statement QoE. However, metrics are lacking to support this statement
directly. directly.
5.3. Problem statements and concerns 5.3. Problem Statements and Concerns
1. Latency means and medians are distractions from better 1. Latency means and medians are distractions from better
measurements. measurements.
2. It is frustrating to only measure network services without 2. It is frustrating to only measure network services without
simultaneously improving those services. simultaneously improving those services.
3. Stakeholder incentives aren't aligned for easy wins in this 3. Stakeholder incentives aren't aligned for easy wins in this
space. Incentives are needed to motivate improvements in public space. Incentives are needed to motivate improvements in public
network access. Measurements may be one step toward driving network access. Measurements may be one step towards driving
competitive market incentive. competitive market incentives.
4. For future-proof networking, it is important to measure the 4. For future-proof networking, it is important to measure the
ecological impact of material and energy usage. ecological impact of material and energy usage.
5. We do not have incontrovertible evidence that any one metric 5. We do not have incontrovertible evidence that any one metric
(e.g., latency or speed) is more important than others to (e.g., latency or speed) is more important than others to
persuade device vendors to concentrate on any one optimization. persuade device vendors to concentrate on any one optimization.
5.4. No-consensus reached statements 5.4. No-Consensus-Reached Statements
Additional statements were recorded that did not have consensus of Additional statements were discussed and recorded that did not have
the group at the time, but we list them here for completeness about consensus of the group at the time, but they are listed here for
the fact they were discussed: completeness:
1. We do not have incontrovertible evidence that buffer bloat is a 1. We do not have incontrovertible evidence that bufferbloat is a
prevalent problem. prevalent problem.
2. The measurement needs to support reporting localization in order 2. The measurement needs to support reporting localization in order
to find problems. Specifically: to find problems. Specifically:
* Detecting a problem is not sufficient if you can't find the * Detecting a problem is not sufficient if you can't find the
location. location.
* Need more than just English - different localization concerns. * Need more than just English -- different localization
concerns.
3. Stakeholder incentives aren't aligned for easy wins in this 3. Stakeholder incentives aren't aligned for easy wins in this
space. space.
6. Follow-on work 6. Follow-On Work
There was discussion during the workshop about where future work There was discussion during the workshop about where future work
should be performed. The group agreed that some work could be done should be performed. The group agreed that some work could be done
more immediately within existing IETF working groups (e.g. IPPM, more immediately within existing IETF working groups (e.g., IPPM,
DetNet and RAW), while other longer-term research may be needed in DetNet, and RAW), while other longer-term research may be needed in
IRTF groups. IRTF groups.
7. Security considerations 7. IANA Considerations
A few security relevant topics were discussed at the workshop, This document has no IANA actions.
8. Security Considerations
A few security-relevant topics were discussed at the workshop,
including but not limited to: including but not limited to:
* What prioritization techniques can work without invading the * what prioritization techniques can work without invading the
privacy of the communicating parties. privacy of the communicating parties and
* How oversubscribed networks can essentially be viewed as a DDoS * how oversubscribed networks can essentially be viewed as a DDoS
attack. attack.
8. Informative References 9. Informative References
[Aldabbagh2021] [Aldabbagh2021]
Aldabbagh, A., "Regulatory perspective on measuring Aldabbagh, A., "Regulatory perspective on measuring
network quality for end users", https://www.iab.org/wp- network quality for end-users", September 2021,
content/IAB-uploads/2021/09/2021-09-07-Aldabbagh-Ofcom- <https://www.iab.org/wp-content/IAB-
presentationt-to-IAB-1v00-1.pdf , September 2021. uploads/2021/09/2021-09-07-Aldabbagh-Ofcom-presentationt-
to-IAB-1v00-1.pdf>.
[Arkko2021] [Arkko2021]
Arkko, J. and M. Kühlewind, "Observability is needed to Arkko, J. and M. Kühlewind, "Observability is needed to
improve network quality", https://www.iab.org/wp-content/ improve network quality", August 2021,
IAB-uploads/2021/09/iab-position-paper-observability.pdf , <https://www.iab.org/wp-content/IAB-uploads/2021/09/iab-
August 2021. position-paper-observability.pdf>.
[Balasubramanian2021] [Balasubramanian2021]
Balasubramanian, P., "Transport Layer Statistics for Balasubramanian, P., "Transport Layer Statistics for
Network Quality", https://www.iab.org/wp-content/IAB- Network Quality", February 2021, <https://www.iab.org/wp-
uploads/2021/09/transportstatsquality.pdf , February 2021. content/IAB-uploads/2021/09/transportstatsquality.pdf>.
[Briscoe2021] [Briscoe2021]
Briscoe, B., White, G., Goel, V., and K. De Schepper, "A Briscoe, B., White, G., Goel, V., and K. De Schepper, "A
Single Common Metric to Characterize Varying Packet Single Common Metric to Characterize Varying Packet
Delay", https://www.iab.org/wp-content/IAB- Delay", September 2021, <https://www.iab.org/wp-content/
uploads/2021/09/single-delay-metric-1.pdf , September IAB-uploads/2021/09/single-delay-metric-1.pdf>.
2021.
[Casas2021] [Casas2021]
Casas, P., "10 Years of Internet-QoE Measurements. Video, Casas, P., "10 Years of Internet-QoE Measurements Video,
Cloud, Conferencing, Web and Apps. What do we need from Cloud, Conferencing, Web and Apps. What do we need from
the Network Side?", https://www.iab.org/wp-content/IAB- the Network Side?", August 2021, <https://www.iab.org/wp-
uploads/2021/09/net_quality_internet_qoe_CASAS.pdf , content/IAB-uploads/2021/09/
August 2021. net_quality_internet_qoe_CASAS.pdf>.
[Cheshire2021] [Cheshire2021]
Cheshire, S., "The Internet is a Shared Network", Cheshire, S., "The Internet is a Shared Network", August
https://www.iab.org/wp-content/IAB-uploads/2021/09/draft- 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
cheshire-internet-is-shared-00b.pdf , February 2021. draft-cheshire-internet-is-shared-00b.pdf>.
[Davies2021] [Davies2021]
Davies, N. and P. Thompson, "Measuring Network Impact on Davies, N. and P. Thompson, "Measuring Network Impact on
Application Outcomes using Quality Attenuation", Application Outcomes Using Quality Attenuation", September
https://www.iab.org/wp-content/IAB-uploads/2021/09/PNSol- 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
et-al-Submission-to-Measuring-Network-Quality-for-End- PNSol-et-al-Submission-to-Measuring-Network-Quality-for-
Users-1.pdf , September 2021. End-Users-1.pdf>.
[DeSchepper2021] [DeSchepper2021]
De Schepper, K., Tilmans, O., and G. Dion, "Challenges and De Schepper, K., Tilmans, O., and G. Dion, "Challenges and
opportunities of hardware support for Low Queuing Latency opportunities of hardware support for Low Queuing Latency
without Packet Loss", https://www.iab.org/wp-content/IAB- without Packet Loss", February 2021, <https://www.iab.org/
uploads/2021/09/Nokia-IAB-Measuring-Network-Quality-Low- wp-content/IAB-uploads/2021/09/Nokia-IAB-Measuring-
Latency-measurement-workshop-20210802.pdf , February 2021. Network-Quality-Low-Latency-measurement-workshop-
20210802.pdf>.
[Dion2021] Dion, G., "Focusing on latency, not throughput, to provide [Dion2021] Dion, G., De Schepper, K., and O. Tilmans, "Focusing on
a better internet experience and network quality", latency, not throughput, to provide a better internet
https://www.iab.org/wp-content/IAB-uploads/2021/09/Nokia- experience and network quality", August 2021,
<https://www.iab.org/wp-content/IAB-uploads/2021/09/Nokia-
IAB-Measuring-Network-Quality-Improving-and-focusing-on- IAB-Measuring-Network-Quality-Improving-and-focusing-on-
latency-.pdf , August 2021. latency-.pdf>.
[Fabini2021] [Fabini2021]
Fabini, J., "Network Quality from an End User Fabini, J., "Network Quality from an End User
Perspective", https://www.iab.org/wp-content/IAB- Perspective", February 2021, <https://www.iab.org/wp-
uploads/2021/09/Fabini-IAB-NetworkQuality.txt , February content/IAB-uploads/2021/09/Fabini-IAB-
2021. NetworkQuality.txt>.
[FCC_MBA] "Measuring Broadband America", [FCC_MBA] FCC, "Measuring Broadband America",
https://www.fcc.gov/general/measuring-broadband-america , <https://www.fcc.gov/general/measuring-broadband-america>.
n.d..
[FCC_MBA_methodology] [FCC_MBA_methodology]
"Measuring Broadband America - Open Methodology", FCC, "Measuring Broadband America - Open Methodology",
https://www.fcc.gov/general/measuring-broadband-america- <https://www.fcc.gov/general/measuring-broadband-america-
open-methodology , n.d.. open-methodology>.
[Foulkes2021] [Foulkes2021]
Foulkes, J., "Metrics helpful in assessing Internet Foulkes, J., "Metrics helpful in assessing Internet
Quality", https://www.iab.org/wp-content/IAB- Quality", September 2021, <https://www.iab.org/wp-content/
uploads/2021/09/ IAB-uploads/2021/09/
IAB_Metrics_helpful_in_assessing_Internet_Quality.pdf , IAB_Metrics_helpful_in_assessing_Internet_Quality.pdf>.
September 2021.
[Ghai2021] Ghai, R., "Using TCP Connect Latency for Measuring CX and [Ghai2021] Ghai, R., "Using TCP Connect Latency for measuring CX and
Network Optimization", https://www.iab.org/wp-content/IAB- Network Optimization", February 2021,
uploads/2021/09/xfinity-wifi-ietf-iab-v2-1.pdf , February <https://www.iab.org/wp-content/IAB-uploads/2021/09/
2021. xfinity-wifi-ietf-iab-v2-1.pdf>.
[Iyengar2021] [Iyengar2021]
Iyengar, J., "The Internet Exists In Its Use", Iyengar, J., "The Internet Exists In Its Use", August
https://www.iab.org/wp-content/IAB-uploads/2021/09/The- 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
Internet-Exists-In-Its-Use.pdf , August 2021. The-Internet-Exists-In-Its-Use.pdf>.
[Kerpez2021] [Kerpez2021]
Shafiei, J., Kerpez, K., Cioffi, J., Chow, P., and D. Shafiei, J., Kerpez, K., Cioffi, J., Chow, P., and D.
Bousaber, "Wi-Fi and Broadband Data", https://www.iab.org/ Bousaber, "Wi-Fi and Broadband Data", September 2021,
wp-content/IAB-uploads/2021/09/Wi-Fi-Report-ASSIA.pdf , <https://www.iab.org/wp-content/IAB-uploads/2021/09/Wi-Fi-
September 2021. Report-ASSIA.pdf>.
[Kilkki2021] [Kilkki2021]
Kilkki, K. and B. Finley, "In Search of Lost QoS", Kilkki, K. and B. Finley, "In Search of Lost QoS",
https://www.iab.org/wp-content/IAB-uploads/2021/09/Kilkki- February 2021, <https://www.iab.org/wp-content/IAB-
In-Search-of-Lost-QoS.pdf , February 2021. uploads/2021/09/Kilkki-In-Search-of-Lost-QoS.pdf>.
[Laki2021] Nadas, S., Varga, B., Contreras, L.M., and S. Laki, [Laki2021] Nadas, S., Varga, B., Contreras, L.M., and S. Laki,
"Incentive-Based Traffic Management and QoS Measurements", "Incentive-Based Traffic Management and QoS Measurements",
https://www.iab.org/wp-content/IAB-uploads/2021/11/CamRdy- February 2021, <https://www.iab.org/wp-content/IAB-
IAB_user_meas_WS_Nadas_et_al_IncentiveBasedTMwQoS.pdf , uploads/2021/11/CamRdy-
February 2021. IAB_user_meas_WS_Nadas_et_al_IncentiveBasedTMwQoS.pdf>.
[Liubogoshchev2021] [Liubogoshchev2021]
Liubogoshchev, M., "Cross-layer cooperation for Better Liubogoshchev, M., "Cross-layer Cooperation for Better
Network Service", https://www.iab.org/wp-content/IAB- Network Service", February 2021, <https://www.iab.org/wp-
uploads/2021/09/Cross-layer-Cooperation-for-Better- content/IAB-uploads/2021/09/Cross-layer-Cooperation-for-
Network-Service-2.pdf , February 2021. Better-Network-Service-2.pdf>.
[MacMillian2021] [MacMillian2021]
MacMillian, K. and N. Feamster, "Beyond Speed Test: MacMillian, K. and N. Feamster, "Beyond Speed Test:
Measuring Latency Under Load Across Different Speed Measuring Latency Under Load Across Different Speed
Tiers", https://www.iab.org/wp-content/IAB- Tiers", February 2021, <https://www.iab.org/wp-content/
uploads/2021/09/2021_nqw_lul.pdf , February 2021. IAB-uploads/2021/09/2021_nqw_lul.pdf>.
[Marx2021] Marx, R. and J. Herbots, "Merge Those Metrics: Towards [Marx2021] Marx, R. and J. Herbots, "Merge Those Metrics: Towards
Holistic (Protocol) Logging", https://www.iab.org/wp- Holistic (Protocol) Logging", February 2021,
content/IAB-uploads/2021/09/ <https://www.iab.org/wp-content/IAB-uploads/2021/09/
MergeThoseMetrics_Marx_Jul2021.pdf , February 2021. MergeThoseMetrics_Marx_Jul2021.pdf>.
[Mathis2021] [Mathis2021]
Mathis, M., "Preliminary Longitudinal Study of Internet Mathis, M., "Preliminary Longitudinal Study of Internet
Responsiveness", https://www.iab.org/wp-content/IAB- Responsiveness", August 2021, <https://www.iab.org/wp-
uploads/2021/09/Preliminary-Longitudinal-Study-of- content/IAB-uploads/2021/09/Preliminary-Longitudinal-
Internet-Responsiveness-1.pdf , August 2021. Study-of-Internet-Responsiveness-1.pdf>.
[McIntyre2021] [McIntyre2021]
Paasch, C., McIntyre, K., Shapira, O., Meyer, R., and S. Paasch, C., McIntyre, K., Shapira, O., Meyer, R., and S.
Cheshire, "An end-user approach to an Internet Score", Cheshire, "An end-user approach to an Internet Score",
https://www.iab.org/wp-content/IAB-uploads/2021/09/ September 2021, <https://www.iab.org/wp-content/IAB-
Internet-Score-2.pdf , September 2021. uploads/2021/09/Internet-Score-2.pdf>.
[Michel2021] [Michel2021]
Michel, F. and O. Bonaventure, "Packet delivery time as a Michel, F. and O. Bonaventure, "Packet delivery time as a
tie-breaker for assessing Wi-Fi access points", tie-breaker for assessing Wi-Fi access points", February
https://www.iab.org/wp-content/IAB-uploads/2021/09/camera_ 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
ready_Packet_delivery_time_as_a_tie_breaker_for_assessing_ camera_ready_Packet_delivery_time_as_a_tie_breaker_for_ass
Wi_Fi_access_points.pdf , February 2021. essing_Wi_Fi_access_points.pdf>.
[Mirsky2021] [Mirsky2021]
Mirsky, G., Min, X., Mishra, G., and L. Han, "The Error Mirsky, G., Min, X., Mishra, G., and L. Han, "The error
Performance Metric in a Packet-Switched Network", performance metric in a packet-switched network", February
https://www.iab.org/wp-content/IAB-uploads/2021/09/IAB- 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
worshop-Error-performance-measurement-in-packet-switched- IAB-worshop-Error-performance-measurement-in-packet-
networks.pdf , February 2021. switched-networks.pdf>.
[Morton2021] [Morton2021]
Morton, A., "Dream-Pipe or Pipe-Dream: What Do Users Want Morton, A. C., "Dream-Pipe or Pipe-Dream: What Do Users
(and how can we assure it)?", https://www.iab.org/wp- Want (and how can we assure it)?", Work in Progress,
content/IAB-uploads/2021/09/draft-morton-ippm-pipe-dream- Internet-Draft, draft-morton-ippm-pipe-dream-01, 6
01.pdf , September 2021. September 2021, <https://datatracker.ietf.org/doc/html/
draft-morton-ippm-pipe-dream-01>.
[NetworkQuality] [NetworkQuality]
"Apple Network Quality", n.d.. Apple, "Network Quality",
<https://support.apple.com/en-gb/HT212313>.
[Paasch2021] [Paasch2021]
Paasch, C., Meyer, R., Cheshire, S., and O. Shapira, Paasch, C., Meyer, R., Cheshire, S., and O. Shapira,
"Responsiveness under Working Conditions", "Responsiveness under Working Conditions", Work in
https://www.iab.org/wp-content/IAB-uploads/2021/09/draft- Progress, Internet-Draft, draft-cpaasch-ippm-
cpaasch-ippm-responsiveness-1-1.pdf , February 2021. responsiveness-01, 25 October 2021,
<https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-
responsiveness-01>.
[Pardue2021] [Pardue2021]
Pardue, L. and S. Tellakula, "Lower-layer performance is Pardue, L. and S. Tellakula, "Lower-layer performance is
not indicative of upper-layer success", not indicative of upper-layer success", February 2021,
https://www.iab.org/wp-content/IAB-uploads/2021/09/Lower- <https://www.iab.org/wp-content/IAB-uploads/2021/09/Lower-
layer-performance-is-not-indicative-of-upper-layer- layer-performance-is-not-indicative-of-upper-layer-
success-20210906-00-1.pdf , February 2021. success-20210906-00-1.pdf>.
[Reed2021] Reed, D.P. and L. Perigo, "Measuring IKSP Performance in [Reed2021] Reed, D.P. and L. Perigo, "Measuring ISP Performance in
Broadband America: A Study of Latency Under Load", Broadband America: A Study of Latency Under Load",
https://www.iab.org/wp-content/IAB-uploads/2021/09/ February 2021, <https://www.iab.org/wp-content/IAB-
Camera_Ready_-Measuring-ISP-Performance-in-Broadband- uploads/2021/09/Camera_Ready_-Measuring-ISP-Performance-
America.pdf , February 2021. in-Broadband-America.pdf>.
[SamKnows] "SamKnows", n.d., <https://www.samknows.com/>. [SamKnows] "SamKnows", <https://www.samknows.com/>.
[Schlinker2019] [Schlinker2019]
Schlinker, B., Cunha, I., Chiu, Y., Sundaresan, S., and E. Schlinker, B., Cunha, I., Chiu, Y., Sundaresan, S., and E.
Katz-Basset, "Internet's performance from Facebook's Katz-Basset, "Internet Performance from Facebook's Edge",
edge", https://www.iab.org/wp-content/IAB-uploads/2021/09/ February 2019, <https://www.iab.org/wp-content/IAB-
Internet-Performance-from-Facebooks-Edge.pdf , February uploads/2021/09/Internet-Performance-from-Facebooks-
2019. Edge.pdf>.
[Scuba] "Facebook Scuba", n.d., [Scuba] Abraham, L. et al., "Scuba: Diving into Data at Facebook",
<https://research.facebook.com/publications/scuba-diving- <https://research.facebook.com/publications/scuba-diving-
into-data-at-facebook/>. into-data-at-facebook/>.
[Sengupta2021] [Sengupta2021]
Sengupta, S., Kim, H., and J. Rexford, "Fine-Grained RTT Sengupta, S., Kim, H., and J. Rexford, "Fine-Grained RTT
Monitoring Inside the Network", https://www.iab.org/wp- Monitoring Inside the Network", February 2021,
content/IAB-uploads/2021/09/Camera_Ready__Fine- <https://www.iab.org/wp-content/IAB-uploads/2021/09/
Grained_RTT_Monitoring_Inside_the_Network.pdf , February Camera_Ready__Fine-
2021. Grained_RTT_Monitoring_Inside_the_Network.pdf>.
[Sivaraman2021] [Sivaraman2021]
Sivaraman, V., Madanapalli, S., and H. Kumar, "Measuring Sivaraman, V., Madanapalli, S., and H. Kumar, "Measuring
Network Experience Meaningfully, Accurately, and Network Experience Meaningfully, Accurately, and
Scalably", https://www.iab.org/wp-content/IAB- Scalably", February 2021, <https://www.iab.org/wp-content/
uploads/2021/09/CanopusPositionPaperCameraReady.pdf , IAB-uploads/2021/09/CanopusPositionPaperCameraReady.pdf>.
February 2021.
[Speedtest] [Speedtest]
"Speedtest by Ookla", n.d., <https://www.speedtest.net>. Ookla, "Speedtest", <https://www.speedtest.net>.
[Stein2021] [Stein2021]
Stein, J., "The Futility of QoS", https://www.iab.org/wp- Stein, Y., "The Futility of QoS", August 2021,
content/IAB-uploads/2021/09/QoS-futility.pdf , August <https://www.iab.org/wp-content/IAB-uploads/2021/09/QoS-
2021. futility.pdf>.
[Welzl2021] [Welzl2021]
Welzl, M., "A Case for Long-Term Statistics", Welzl, M., "A Case for Long-Term Statistics", February
https://www.iab.org/wp-content/IAB-uploads/2021/09/iab- 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/
longtermstats_cameraready.docx-1.pdf , February 2021. iab-longtermstats_cameraready.docx-1.pdf>.
[WORKSHOP] IAB, ., "IAB Workshop: Measuring Network Quality for End- [WORKSHOP] IAB, "IAB Workshop: Measuring Network Quality for End-
Users, 2021", September 2021. Users, 2021", September 2021,
<https://www.iab.org/activities/workshops/network-
quality>.
[Zhang2021] [Zhang2021]
Zhang, M., Goel, V., and L. Xu, "User-Perceived Latency to Zhang, M., Goel, V., and L. Xu, "User-Perceived Latency to
measure CCAs", https://www.iab.org/wp-content/IAB- Measure CCAs", September 2021, <https://www.iab.org/wp-
uploads/2021/09/User_Perceived_Latency-1.pdf , September content/IAB-uploads/2021/09/User_Perceived_Latency-1.pdf>.
2021.
Appendix A. Participants List Appendix A. Program Committee
The program committee consisted of:
Jari Arkko
Olivier Bonaventure
Vint Cerf
Stuart Cheshire
Sam Crowford
Nick Feamster
Jim Gettys
Toke Hoiland-Jorgensen
Geoff Huston
Cullen Jennings
Katarzyna Kosek-Szott
Mirja Kühlewind
Jason Livingood
Matt Mathis
Randall Meyer
Kathleen Nichols
Christoph Paasch
Tommy Pauly
Greg White
Keith Winstein
Appendix B. Workshop Chairs
The workshop chairs consisted of:
Wes Hardaker
Evgeny Khorov
Omer Shapira
Appendix C. Workshop Participants
The following is a list of participants who attended the workshop The following is a list of participants who attended the workshop
over a remote connection: over a remote connection:
Ahmed Aldabbagh Ahmed Aldabbagh
Jari Arkko Jari Arkko
Praveen Balasubramanian Praveen Balasubramanian
Olivier Bonaventure Olivier Bonaventure
Djamel Bousaber Djamel Bousaber
Bob Briscoe Bob Briscoe
Rich Brown Rich Brown
Anna Brunstrom Anna Brunstrom
Pedro Casas Pedro Casas
Vint Cerf Vint Cerf
Stuart Cheshire Stuart Cheshire
Kenjiro Cho Kenjiro Cho
Steve Christianson Steve Christianson
John Cioffi John Cioffi
Alexander Clemm Alexander Clemm
Luis M. Contreras Luis M. Contreras
Sam Crawford Sam Crawford
Neil Davies Neil Davies
Gino Dion Gino Dion
Toerless Eckert Toerless Eckert
Lars Eggert Lars Eggert
Joachim Fabini Joachim Fabini
Gorry Fairhurst Gorry Fairhurst
Nick Feamster Nick Feamster
Mat Ford Mat Ford
Jonathan Foulkes Jonathan Foulkes
Jim Gettys Jim Gettys
Rajat Ghai Rajat Ghai
Vidhi Goel Vidhi Goel
Wes Hardaker Wes Hardaker
Joris Herbots Joris Herbots
Geoff Huston Geoff Huston
Toke Høiland-Jørgensen Toke Høiland-Jørgensen
Jana Iyengar Jana Iyengar
Cullen Jennings Cullen Jennings
Ken Kerpez Ken Kerpez
Evgeny Khorov Evgeny Khorov
Kalevi Kilkki Kalevi Kilkki
Joon Kim Joon Kim
Zhenbin Li Zhenbin Li
Mikhail Liubogoshchev Mikhail Liubogoshchev
Jason Livingood Jason Livingood
Kyle MacMillan Kyle MacMillan
Sharat Madanapalli Sharat Madanapalli
Vesna Manojlovic Vesna Manojlovic
Robin Marx Robin Marx
Matt Mathis Matt Mathis
Jared Mauch Jared Mauch
Kristen McIntyre Kristen McIntyre
Randall Meyer Randall Meyer
François Michel François Michel
Greg Mirsky Greg Mirsky
Cindy Morgan Cindy Morgan
Al Morton Al Morton
Szilveszter Nadas Szilveszter Nadas
Kathleen Nichols Kathleen Nichols
Lai Yi Ohlsen Lai Yi Ohlsen
Christoph Paasch Christoph Paasch
Lucas Pardue Lucas Pardue
Tommy Pauly Tommy Pauly
Levi Perigo Levi Perigo
David Reed David Reed
Alvaro Retana Alvaro Retana
Roberto Roberto
Koen De Schepper Koen De Schepper
David Schinazi David Schinazi
Brandon Schlinker Brandon Schlinker
Eve Schooler Eve Schooler
Satadal Sengupta Satadal Sengupta
Jinous Shafiei Jinous Shafiei
Shapelez Shapelez
Omer Shapira Omer Shapira
Dan Siemon Dan Siemon
Vijay Sivaraman Vijay Sivaraman
Karthik Sundaresan Karthik Sundaresan
Dave Taht Dave Taht
Rick Taylor Rick Taylor
Bjørn Ivar Teigen Bjørn Ivar Teigen
Nicolas Tessares Nicolas Tessares
Peter Thompson Peter Thompson
Balazs Varga Balazs Varga
Bren Tully Walsh Bren Tully Walsh
Michael Welzl Michael Welzl
Greg White Greg White
Russ White Russ White
Keith Winstein Keith Winstein
Lisong Xu Lisong Xu
Jiankang Yao Jiankang Yao
Gavin Young Gavin Young
Mingrui Zhang Mingrui Zhang
Appendix B. IAB Members at the Time of Approval IAB Members at the Time of Approval
Internet Architecture Board members at the time this document was Internet Architecture Board members at the time this document was
approved for publication were: approved for publication were:
Jari Arkko Jari Arkko
Deborah Brungard Deborah Brungard
Ben Campbell Lars Eggert
Lars Eggert Wes Hardaker
Wes Hardaker Cullen Jennings
Cullen Jennings Mallory Knodel
Mirja Kühlewind Mirja Kühlewind
Zhenbin Li Zhenbin Li
Jared Mauch Tommy Pauly
Tommy Pauly David Schinazi
Colin Perkins Russ White
David Schinazi Qin Wu
Russ White Jiankang Yao
Jiankang Yao
Appendix C. Acknowledgements Acknowledgments
The authors would like to thank the workshop participants, the The authors would like to thank the workshop participants, the
members of the IAB, and the program committee for creating and members of the IAB, and the program committee for creating and
participating in many interesting discussions. participating in many interesting discussions.
C.1. Draft contributors Contributors
Thank you to the people that contributed edits to this draft:
Erik Auerswald
Simon Leinen
Brian Trammell
C.2. Workshop Chairs
The workshop chairs consisted of:
Wes Hardaker
Evgeny Khorov
Omer Shapira
C.3. Program Committee
The program committee consisted of:
Jari Arkko
Olivier Bonaventure
Vint Cerf
Stuart Cheshire
Sam Crowford
Nick Feamster
Jim Gettys
Toke Hoiland-Jorgensen
Geoff Huston
Cullen Jennings
Katarzyna Kosek-Szott
Mirja Kuehlewind
Jason Livingood
Matt Mathis
Randall Meyer
Kathleen Nichols
Christoph Paasch
Tommy Pauly
Greg White
Keith Winstein
Appendix D. Github Version of this document
While this document is under development, it can be viewed and Thank you to the people that contributed edits to this document:
tracked here:
https://github.com/intarchboard/network-quality-workshop-report Erik Auerswald
Simon Leinen
Brian Trammell
Authors' Addresses Authors' Addresses
Wes Hardaker Wes Hardaker
USC/ISI
Email: ietf@hardakers.net Email: ietf@hardakers.net
Omer Shapira Omer Shapira
Apple
Email: omer_shapira@apple.com Email: omer_shapira@apple.com
 End of changes. 266 change blocks. 
796 lines changed or deleted 800 lines changed or added
