rfc9544.original.xml   rfc9544.xml 
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="UTF-8"?>
<!-- xml2rfc v2v3 conversion 3.6.0 -->
<!DOCTYPE rfc [ <!DOCTYPE rfc [
<!ENTITY nbsp "&#160;"> <!ENTITY nbsp "&#160;">
<!ENTITY zwsp "&#8203;"> <!ENTITY zwsp "&#8203;">
<!ENTITY nbhy "&#8209;"> <!ENTITY nbhy "&#8209;">
<!ENTITY wj "&#8288;"> <!ENTITY wj "&#8288;">
]> ]>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-i
etf-ippm-pam-09" ipr="trust200902" obsoletes="" updates="" submissionType="IETF" <rfc xmlns:xi="http://www.w3.org/2001/XInclude"
xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="true" ver category="info"
sion="3"> docName="draft-ietf-ippm-pam-09"
<!-- xml2rfc v2v3 conversion 3.6.0 --> number="9544"
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?> ipr="trust200902"
obsoletes=""
updates=""
submissionType="IETF"
consensus="true"
xml:lang="en"
tocInclude="true"
tocDepth="3"
symRefs="true"
sortRefs="true"
version="3">
<front> <front>
<title abbrev="Framework of PAM">Precision Availability Metrics for Services <title abbrev="PAMs for Services Governed by SLOs">Precision Availability Me
Governed by Service Level Objectives (SLOs)</title> trics (PAMs) for Services Governed by Service Level Objectives (SLOs)</title>
<seriesInfo name="Internet-Draft" value="draft-ietf-ippm-pam-09"/> <seriesInfo name="RFC" value="9544"/>
<author fullname="Greg Mirsky" initials="G." surname="Mirsky"> <author fullname="Greg Mirsky" initials="G." surname="Mirsky">
<organization>Ericsson</organization> <organization>Ericsson</organization>
<address> <address>
<postal> <postal>
<street/> <street/>
<city/> <city/>
<code/> <code/>
<country/> <country/>
</postal> </postal>
<email>gregimirsky@gmail.com</email> <email>gregimirsky@gmail.com</email>
skipping to change at line 52 skipping to change at line 68
<postal> <postal>
<street/> <street/>
<city/> <city/>
<code/> <code/>
<country/> <country/>
</postal> </postal>
<email>xiao.min2@zte.com.cn</email> <email>xiao.min2@zte.com.cn</email>
</address> </address>
</author> </author>
<author fullname="Alexander Clemm" initials="A." surname="Clemm"> <author fullname="Alexander Clemm" initials="A." surname="Clemm">
<organization>Futurewei</organization> <organization></organization>
<address> <address>
<postal> <postal>
<street>2330 Central Expressway</street> <street/>
<city>Santa Clara</city> <city/>
<code>CA 95050</code> <code/>
<country>USA</country> <country/>
</postal> </postal>
<email>ludwig@clemm.org</email> <email>ludwig@clemm.org</email>
</address> </address>
</author> </author>
<author fullname="John Strassner" initials="J." surname="Strassner"> <author fullname="John Strassner" initials="J." surname="Strassner">
<organization>Futurewei</organization> <organization>Futurewei</organization>
<address> <address>
<postal> <postal>
<street>2330 Central Expressway</street> <street>2330 Central Expressway</street>
<city>Santa Clara</city> <city>Santa Clara</city>
<code>CA 95050</code> <region>CA</region> <code>95050</code>
<country>USA</country> <country>United States of America</country>
</postal> </postal>
<email>strazpdj@gmail.com</email> <email>strazpdj@gmail.com</email>
</address> </address>
</author> </author>
<author fullname="Jerome Francois" initials="J." surname="Francois"> <author fullname="Jerome Francois" initials="J." surname="Francois">
<organization>Inria and University of Luxembourg</organization> <organization>Inria and University of Luxembourg</organization>
<address> <address>
<postal> <postal>
<street>615 Rue du Jardin Botanique</street> <street>615 Rue du Jardin Botanique</street>
<city>Villers-les-Nancy</city> <city>Villers-les-Nancy</city>
<code>54600</code> <code>54600</code>
<country>France</country> <country>France</country>
</postal> </postal>
<email>jerome.francois@inria.fr</email> <email>jerome.francois@inria.fr</email>
</address> </address>
</author> </author>
<date year="2023"/> <date year="2024" month="February"/>
<area>Transport</area>
<workgroup>Network Working Group</workgroup> <area>TSV</area>
<keyword>Internet-Draft</keyword> <workgroup>ippm</workgroup>
<keyword>IPPM</keyword> <keyword>IPPM</keyword>
<keyword>Performance Measurement </keyword> <keyword>Performance Measurement</keyword>
<abstract> <abstract>
<t>
<t>
This document defines a set of metrics for networking services with This document defines a set of metrics for networking services with
performance requirements expressed as Service Level Objectives (SLO). performance requirements expressed as Service Level Objectives (SLOs).
These metrics, referred to as Precision Availability Metrics (PAM), These metrics, referred to as "Precision Availability Metrics (PAMs)",
are useful for defining and monitoring SLOs. are useful for defining and monitoring SLOs.
For example, PAM can be used by providers and/or customers of an RFC XXXX Network Slice Service For example, PAMs can be used by providers and/or customers of an RFC 9543 Network Slice Service
to assess whether the service is provided in compliance with its defined SLOs. to assess whether the service is provided in compliance with its defined SLOs.
</t> </t>
<t>Note to the RFC Editor: Please update "RFC XXXX Network Slice"
with the RFC number assigned to draft-ietf-teas-ietf-network-slices.</t>
</abstract> </abstract>
</front> </front>
<middle> <middle>
<section anchor="intro" numbered="true" toc="default"> <section anchor="intro" numbered="true" toc="default">
<name>Introduction</name> <name>Introduction</name>
<t> <t>
Service providers and users often need to assess the quality with which network services are being delivered. Service providers and users often need to assess the quality with which network services are being delivered.
In particular, in cases where service level guarantees are documented (including their companion metrology) as part of a In particular, in cases where service-level guarantees are documented (including their companion metrology) as part of a
contract established between the customer and the service provider, and Service Level Objectives (SLOs) are defined, contract established between the customer and the service provider, and Service Level Objectives (SLOs) are defined,
it is essential to provide means to verify that what has been delivered complies with what has been possibly negotiated it is essential to provide means to verify that what has been delivered complies with what has been possibly negotiated
and (contractually) defined between the customer and the service provider. and (contractually) defined between the customer and the service provider.
<!-- Examples of Service Level Indicators (SLIs) include packet latency and pa Examples of SLOs would be target values for the maximum packet delay
cket loss ratio. -->
Examples of SLOs <!-- associated with such SLIs -->would be target values for
the maximum packet delay
(one-way and/or round-trip) or maximum packet loss ratio that would be deemed acceptable. (one-way and/or round-trip) or maximum packet loss ratio that would be deemed acceptable.
</t> </t>
<t> <t>
More generally, SLOs can be used to characterize the ability of a particular set of nodes to communicate More generally, SLOs can be used to characterize the ability of a particular set of nodes to communicate
according to certain measurable expectations. Those expectations can include but are not limited to aspects according to certain measurable expectations. Those expectations can include but are not limited to aspects
such as latency, delay variation, loss, capacity/throughput, ordering, and fragmentation. such as latency, delay variation, loss, capacity/throughput, ordering, and fragmentation.
Whatever SLO parameters are chosen and whichever way service level parameters ar Whatever SLO parameters are chosen and whichever way service-level parameters ar
e being measured, e being measured,
precision availability metrics indicate whether or not a given service has been Precision Availability Metrics indicate whether or not a given service has been
available according to expectations at all times. available according to expectations at all times.
</t> </t>
<t> <t>
Several metrics (often documented in the IANA Registry of Performance Metrics < Several metrics (often documented in the IANA "Performance Metrics" registry <x
xref target="IANA-PM-Registry"/> ref target="IANA-PM-Registry"/>
according to <xref target="RFC8911"/> and <xref target="RFC8912"/>), can be use according to <xref target="RFC8911"/> and <xref target="RFC8912"/>) can be used
d to characterize the service quality, expressing to characterize the service quality, expressing
the perceived quality of delivered networking services versus their SLOs. the perceived quality of delivered networking services versus their SLOs.
Of concern is not so much the absolute service level (for example, actual latency experienced) Of concern is not so much the absolute service level (for example, actual latency experienced)
but whether the service is provided in compliance with the negotiated and eventually contracted service levels. but whether the service is provided in compliance with the negotiated and eventually contracted service levels.
For instance, this may include whether the experienced packet delay falls within For instance, this may include whether the experienced packet delay falls within
an acceptable range that has been contracted for the service. an acceptable range that has been contracted for the service.
The specific quality of service depends on the SLO or a set thereof for a given service that is in effect. The specific quality of service depends on the SLO or a set thereof for a given service that is in effect.
<!-- Different groups of applications set forth requirements for varying sets Non-compliance to an SLO might result in the degradation of the quality of exp
of service levels with different target values. erience for gamers
Such applications range from Augmented Reality/Virtual Reality to mission-crit
ical controlling industrial processes. -->
A non-compliance to an SLO might result in the degradation of the quality of e
xperience for gamers
or even jeopardize the safety of a large geographical area. or even jeopardize the safety of a large geographical area.
<!-- However, as those applications represent clear business opportunities, th ey demand dependable technical solutions. -->
</t> </t>
<t> <t>
The same service level may be deemed acceptable for one application, while una The same service level may be deemed acceptable for one application, while
cceptable for another, unacceptable for another, depending on the needs of the application. Hence,
depending on the needs of the application. Hence it is not sufficient to measu it is not sufficient to measure service levels per se over time; the quality
re of the service being contextually provided (e.g., with the applicable SLO in
service levels per se over time, but to assess the quality of the mind) must be also assessed. However, at this point, there are no standard
service being contextually provided (e.g., with the applicable SLO in mind). metrics that can be used to account for the quality with which services are
However, at this point, there are no standard metrics that can be used to accoun delivered relative to their SLOs or to determine whether their SLOs are
t for the quality with which services being met at all times. Such metrics and the instrumentation to support
are delivered relative to their SLOs, and whether their SLOs are being met at al them are essential for various purposes, including monitoring (to ensure
l times. that networking services are performing according to their objectives) as
Such metrics and the instrumentation to support them are essential well as accounting (to maintain a record of service levels delivered, which
for various purposes, including monitoring (to ensure that networking services is important for the monetization of such services as well as for the
are performing according to their objectives) as well as accounting (to maintain triaging of problems).
a record of service levels delivered, which is important
for the monetization of such services as well as for the triaging of problems).
</t> </t>
<t>
The current state-of-the-art of metrics includes, for example, <t>
interface metrics, useful to obtain statistical data on traffic volume and The current state-of-the-art of metrics include, for example,
interface metrics that can be used to obtain statistical data on traffic volume
and
behavior that can be observed at an interface <xref target="RFC2863"/> behavior that can be observed at an interface <xref target="RFC2863"/>
and <xref target="RFC8343"/>. However, they are agnostic of actual service leve <xref target="RFC8343"/>. However, they are agnostic of actual service levels a
ls and not specific to nd not specific to
distinct flows. Flow records <xref target="RFC7011"/> and <xref target="RFC70 distinct flows. Flow records <xref target="RFC7011"/> <xref target="RFC7012"/
12"/> maintain statistics > maintain statistics
about flows, including flow volume and flow duration, but again, about flows, including flow volume and flow duration, but again, they
contain very little information about service levels, let contain very little information about service levels, let
alone whether the service levels delivered meet their respective targets, i.e., their associated SLOs. alone whether the service levels delivered meet their respective targets, i.e., their associated SLOs.
</t> </t>
<t> <t>
This specification introduces a new set of metrics, Precision Availability Metrics (PAM), aimed at capturing This specification introduces a new set of metrics, Precision Availability Metrics (PAMs), aimed at capturing
service levels for a flow, specifically the degree to service levels for a flow, specifically the degree to
which the flow complies with the SLOs that are in effect. which the flow complies with the SLOs that are in effect.
PAM can be used to assess whether a service is provided in compliance with its defined SLOs. PAMs can be used to assess whether a service is provided in compliance with its defined SLOs.
This information can be used in multiple ways, for example, This information can be used in multiple ways, for example,
to optimize service delivery, take timely counteractions in the event of service degradation, to optimize service delivery, take timely counteractions in the event of service degradation,
or account for the quality of services being delivered. or account for the quality of services being delivered.
</t> </t>
<t> <t>
Availability is discussed in Section 3.4 of <xref target="RFC7297"/>. Availability is discussed in <xref target="RFC7297" sectionFormat="of" section="3.4"/>.
In this document, the term "availability" reflects that In this document, the term "availability" reflects that
a service that is characterized by its SLOs is considered unavailable whenever those SLOs are violated, a service that is characterized by its SLOs is considered unavailable whenever those SLOs are violated,
even if basic connectivity is still working. "Precision" refers to services even if basic connectivity is still working. "Precision" refers to services
whose service levels are governed by SLOs and must be delivered precisely whose service levels are governed by SLOs and must be delivered precisely
according to the associated quality and performance requirements. It should be noted that precision according to the associated quality and performance requirements. It should be noted that precision
refers to what is being assessed, not the mechanism used to measure it. In other words, refers to what is being assessed, not the mechanism used to measure it. In other words,
it does not refer to the precision of the mechanism with which actual service levels are measured. it does not refer to the precision of the mechanism with which actual service levels are measured.
Furthermore, the precision, with respect to the delivery of an SLO, particularly applies when a metric value Furthermore, the precision, with respect to the delivery of an SLO, particularly applies when a metric value
approaches the specified threshold levels in the SLO. approaches the specified threshold levels in the SLO.
</t> </t>
<t> <t>
The specification and implementation of methods The specification and implementation of methods
that provide for accurate measurements are separate topics independent of the definition of that provide for accurate measurements are separate topics independent of the definition of
the metrics in which the results of such measurements would be expressed. the metrics in which the results of such measurements would be expressed.
Likewise, Service Level Expectations (SLEs), as defined in Section 5.1 of <xref target="I-D.ietf-teas-ietf-network-slices"/>, Likewise, Service Level Expectations (SLEs), as defined in <xref target="RFC9543" sectionFormat="of" section="5.1"/>,
are outside the scope of this document. are outside the scope of this document.
<!--, because it is in the nature of SLEs that they define parts of the SL A that are not easily measured.-->
</t> </t>
<!--
<t>
[Ed.note: It should be noted that at this point, the set of metrics propos
ed
here is intended as a "starter set" that is intended to spark further
discussion. Other metrics are certainly conceivable; we expect that
the list of metrics will evolve as part of the Working Group discussions.]
</t>
</section> </section>
<section numbered="true" toc="default"> <section numbered="true" toc="default">
<name>Conventions and Terminology</name> <name>Conventions</name>
<section numbered="true" toc="default"> <section numbered="true" toc="default">
<name>Terminology</name> <name>Terminology</name>
<t> <t>
In this document, SLA and SLO are used as defined in <xref target="RFC3198"/>. In this document, SLA and SLO are used as defined in <xref target="RFC3198"/>.
The reader may refer to Section 5.1 of <xref target="I-D.ietf-teas-ietf- The reader may refer to <xref target="RFC9543" sectionFormat="of" sectio
network-slices"/> n="5.1"/>
for an applicability example of these concepts in the context of RFC XXX for an applicability example of these concepts in the context of RFC 954
X Network Slice Services. 3 Network Slice Services.
</t> </t>
<t>Note to the RFC Editor: Please update "RFC XXXX Network Slice"
with the RFC number assigned to <xref target="I-D.ietf-teas-ietf-ne
twork-slices"/>.</t>
</section> </section>
<section numbered="true" toc="default"> <section numbered="true" toc="default">
<name>Acronyms</name> <name>Acronyms</name>
<dl indent="7" newline="false" spacing="normal">
<t>PAM Precision Availability Metric</t> <dt>IPFIX</dt><dd>IP Flow Information Export</dd>
<t>OAM Operations, Administration, and Maintenance</t> <dt>PAM </dt><dd>Precision Availability Metric</dd>
<t>SLA Service Level Agreement</t> <dt>SLA </dt><dd>Service Level Agreement</dd>
<t>SLE Service Level Expectations</t> <dt>SLE </dt><dd>Service Level Expectation</dd>
<!-- <t>SLI Service Level Indicator</t> --> <dt>SLO </dt><dd>Service Level Objective</dd>
<t>SLO Service Level Objective</t> <dt>SVI </dt><dd>Severely Violated Interval</dd>
<t>VI Violated Interval</t> <dt>SVIR </dt><dd>Severely Violated Interval Ratio</dd>
<t>VIR Violated Interval Ratio</t> <dt>SVPC </dt><dd>Severely Violated Packets Count </dd>
<t>VPC Violated Packets Count </t> <dt>VFI </dt><dd>Violation-Free Interval</dd>
<t>SVI Severely Violated Interval</t> <dt>VI </dt><dd>Violated Interval</dd>
<t>SVIR Severely Violated Interval Ratio</t> <dt>VIR </dt><dd>Violated Interval Ratio</dd>
<t>SVPC Severely Violated Packets Count </t> <dt>VPC </dt><dd>Violated Packets Count </dd>
<t>VFI Violation-Free Interval</t> </dl>
</section>
<!--
<section numbered="true" toc="default">
<name>Requirements Language</name>
<t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119" format="default"/> <xref target="R
FC8174" format="default"/>
when, and only when, they appear in all capitals, as shown here.
</t>
</section> </section>
-->
</section> </section>
<section anchor="ep-metrics-section" numbered="true" toc="default"> <section anchor="ep-metrics-section" numbered="true" toc="default">
<name>Precision Availability Metrics</name> <name>Precision Availability Metrics</name>
<section anchor="preliminaries" numbered="true" toc="default"> <section anchor="preliminaries" numbered="true" toc="default">
<name>Introducing Violated Intervals</name> <name>Introducing Violated Intervals</name>
<t> <t>
When analyzing the availability metrics of a service between two measurement points, When analyzing the availability metrics of a service between two measurement points,
a time interval as the unit of PAM needs to be selected. In <xref target="ITU.G.826" format="default"/>, a time interval as the unit of PAMs needs to be selected. In <xref target="ITU.G.826" format="default"/>,
a time interval of one second is used. That is reasonable, but some services may require different granularity (e.g., decamillisecond). a time interval of one second is used. That is reasonable, but some services may require different granularity (e.g., decamillisecond).
For that reason, the time interval in PAM is viewed as a variable parameter thou For that reason, the time interval in PAMs is viewed as a variable parameter, th
gh constant for a particular measurement session. ough constant for a particular measurement session.
Furthermore, for the purpose of PAM, each time interval is classified either as Furthermore, for the purpose of PAMs, each time interval is classified as either
Violated Interval (VI), Violated Interval (VI),
Severely Violated Interval (SVI), or Violation-Free Interval (VFI). These are defined as follows: Severely Violated Interval (SVI), or Violation-Free Interval (VFI). These are defined as follows:
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li>VI is a time interval during which at least one of the performance <li>VI is a time interval during which at least one of the performance
parameters degraded below its configurable optimal level threshold.</li> parameters degraded below its configurable optimal threshold.</li>
<li>SVI is a time interval during which at least one of the performance <li>SVI is a time interval during which at least one of the performance
parameters degraded below its configurable critical threshold.</li> parameters degraded below its configurable critical threshold.</li>
<li>Consequently, VFI is a time interval during which all performance parameters are <li>Consequently, VFI is a time interval during which all performance parameters are
at or better than their respective pre-defined optimal levels. at or better than their respective pre-defined optimal levels.</li>
<!-- In such a case, the service is in compliance with its specification
. --></li>
</ul> </ul>
<t> <t>
The monitoring of performance parameters to determine the quality of an interval The monitoring of performance parameters to determine the quality of an interval
is performed between the elements of the network that are referred to for is performed between the elements of the network that are identified in th
the SLO corresponding e SLO corresponding to the performance parameter.
to the performance parameter. Mechanisms for setting levels of a threshold of an SLO are outside the sco
Mechanisms of setting levels of a threshold of an SLO are outside the scop pe of this document.
e of this document.
</t> </t>
<t> <t>
From these definitions, a set of basic metrics can be defined that count the num From the definitions above, a set of basic metrics can be defined that count the
bers of time intervals that fall into each category: number of time intervals that fall into each category:
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li>VI count. </li> <li>VI count </li>
<li>SVI count. </li> <li>SVI count </li>
<li>VFI count. </li> <li>VFI count </li>
</ul> </ul>
<t> <t>
These count metrics are essential in calculating respective ratios (see <xref target="derived-ep-metrics-section"/>) These count metrics are essential in calculating respective ratios (see <xref target="derived-ep-metrics-section"/>)
that can be used to assess the instability of a service. that can be used to assess the instability of a service.
</t> </t>
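<t>
The interval classification and the VI, SVI, and VFI counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the names Threshold, classify_interval, and count_intervals, the parameter names delay_ms and loss_ratio, and all threshold values are hypothetical. The sketch further assumes that larger measured values are worse and that an interval crossing a critical threshold is counted as an SVI only, not additionally as a VI.
</t>

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical thresholds for one parameter; larger values are assumed worse."""
    optimal: float
    critical: float


def classify_interval(measured, thresholds):
    """Classify one time interval as 'SVI', 'VI', or 'VFI'."""
    # Assumption: an SVI takes precedence over a VI for the same interval.
    if any(measured[name] > t.critical for name, t in thresholds.items()):
        return "SVI"
    if any(measured[name] > t.optimal for name, t in thresholds.items()):
        return "VI"
    # All parameters are at or better than their optimal levels.
    return "VFI"


def count_intervals(intervals, thresholds):
    """Return the VI, SVI, and VFI counts for a measurement session."""
    counts = Counter({"VI": 0, "SVI": 0, "VFI": 0})
    for measured in intervals:
        counts[classify_interval(measured, thresholds)] += 1
    return counts


# Illustrative values only.
thresholds = {
    "delay_ms": Threshold(optimal=10.0, critical=50.0),
    "loss_ratio": Threshold(optimal=0.001, critical=0.01),
}
intervals = [
    {"delay_ms": 4.0, "loss_ratio": 0.0},     # within all optimal levels
    {"delay_ms": 12.0, "loss_ratio": 0.0},    # delay beyond optimal
    {"delay_ms": 8.0, "loss_ratio": 0.02},    # loss beyond critical
    {"delay_ms": 10.0, "loss_ratio": 0.001},  # exactly at optimal levels
]
counts = count_intervals(intervals, thresholds)
print(dict(counts))
```

<t>
Treating a value exactly at the optimal threshold as violation-free follows the "at or better than" wording of the VFI definition; a deployment could reasonably choose a different comparison for parameters where smaller values are worse.
</t>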
<t> Beyond accounting for violated intervals, it is sometimes beneficial to main
tain counts <t> Beyond accounting for violated intervals, it is sometimes beneficial to
of packets for which a performance threshold is violated. For example, this all maintain counts of packets for which a performance threshold is violated. For
ows distinguishing between cases example, this allows for distinguishing between cases in which violated
in which violated intervals are caused by isolated violation occurrences (such a intervals are caused by isolated violation occurrences (such as a sporadic
s, a sporadic issue issue that may be caused by a temporary spike in a queue depth along the
that may be caused by a temporary spike in a queue depth along the packet's path packet's path) or by broad violations across multiple packets (such as a
) or by broad violations problem with slow route convergence across the network or more foundational
across multiple packets (such as a problem with slow route convergence across th issues such as insufficient network resources). Maintaining such counts and
e network or more comparing them with the overall amount of traffic also facilitate assessing
foundational issues such as insufficient network resources). Maintaining such c compliance with statistical SLOs (see <xref
ounts and comparing them with target="statistical-slo-section"/>). For these reasons, the following
the overall amount of traffic also facilitates assessing compliance with statist additional metrics are defined:
ical SLOs (see <xref target="statistical-slo-section"/>).
For these reasons, the following additional metrics are defined:
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li>VPC: Violated packets count </li> <li>VPC (Violated Packets Count) </li>
<li>SVPC: Severely violated packets count </li> <li>SVPC (Severely Violated Packets Count) </li>
</ul> </ul>
</section> </section>
<section anchor="derived-ep-metrics-section" numbered="true" toc="default"> <section anchor="derived-ep-metrics-section" numbered="true" toc="default">
<name>Derived Precision Availability Metrics</name> <name>Derived Precision Availability Metrics</name>
<t> <t>
A set of metrics can be created based on PAM introduced in <xref target="e A set of metrics can be created based on PAMs as introduced in this docume
p-metrics-section"/>. nt.
In this document, these metrics are referred to as "derived PAM". In this document, these metrics are referred to as "derived PAMs".
Some of these metrics are modeled after Mean Time Between Failure (MTBF) m Some of these metrics are modeled after Mean Time Between Failure (MTBF) m
etrics - a etrics; a
"failure" in this context referring to a failure to deliver a service accordi "failure" in this context refers to a failure to deliver a service according
ng to its SLO. to its SLO.
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li> <li>
Time since the last violated interval (e.g., since last violated ms, Time since the last violated interval (e.g., since last violated ms or
since last violated second). since last violated second).
(This parameter is suitable for monitoring the current compliance status of the service, e.g., for trending analysis.) This parameter is suitable for monitoring the current compliance status of the service, e.g., for trending analysis.
</li> </li>
<li> <li>
Number of packets since the last violated packet. (This parameter is Number of packets since the last violated packet. This parameter is
suitable for the monitoring of the current compliance status of the service suitable for the monitoring of the current compliance status of the service
.) .
</li> </li>
<li> <li>
Mean time between VIs (e.g., between violated milliseconds, violated seconds) is the Mean time between VIs (e.g., between violated milliseconds or between violated seconds). This parameter is the
arithmetic mean of time between consecutive VIs. arithmetic mean of time between consecutive VIs.
</li> </li>
<li> <li>
Mean packets between VIs is the arithmetic Mean packets between VIs. This parameter is the arithmetic
mean of the number of SLO-compliant packets between consecutive VIs. mean of the number of SLO-compliant packets between consecutive VIs.
(Another variation of "MTBF" in a service setting.) It is another variation of MTBF in a service setting.
</li> </li>
</ul> </ul>
<t>An analogous set of metrics can be produced for SVI:</t> <t>An analogous set of metrics can be produced for SVI:</t>
<ul spacing="normal"> <ul spacing="normal">
<li> <li>
Time since the last SVI (e.g., since last violated ms, since last violated Time since the last SVI (e.g., since last violated ms or since last violat
second). (This parameter is suitable ed second). This parameter is suitable
for the monitoring of the current compliance status of the service.) for the monitoring of the current compliance status of the service.
</li> </li>
<li> <li>
Number of packets since the last severely violated packet. (This paramete Number of packets since the last severely violated packet. This parameter
r is is
suitable for the monitoring of the current compliance status of the servic suitable for the monitoring of the current compliance status of the servic
e.) e.
</li> </li>
<li> <li>
Mean time between SVIs (e.g., between severely violated Mean time between SVIs (e.g., between severely violated
milliseconds, severely violated seconds) is the milliseconds or between severely violated seconds). This parameter is the
arithmetic mean of time between consecutive SVIs. arithmetic mean of time between consecutive SVIs.
</li> </li>
<li> <li>
Mean packets between SVIs is the arithmetic Mean packets between SVIs. This parameter is the arithmetic
mean of the number of SLO-compliant packets between consecutive SVIs. mean of the number of SLO-compliant packets between consecutive SVIs.
(Another variation of "MTBF" in a service setting.) It is another variation of "MTBF" in a service setting.
</li> </li>
</ul> </ul>
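<t>
As a rough illustration of the time-based derived metrics listed above, the following Python sketch computes "time since the last" and "mean time between" values from a sequence of per-interval labels. The function names, the label strings, and the choice to measure the time between consecutive violated intervals from start to start are assumptions made for this example; the text above does not prescribe them. The kinds argument selects whether VIs or SVIs are evaluated.
</t>

```python
def time_since_last(labels, interval_s, kinds):
    """Seconds from the end of the most recent matching interval to the end
    of the session, or None if no interval matches."""
    for age, label in enumerate(reversed(labels)):
        if label in kinds:
            return age * interval_s
    return None


def mean_time_between(labels, interval_s, kinds):
    """Arithmetic mean of the time between consecutive matching intervals.

    Assumption: spacing is measured start to start, so two adjacent
    matching intervals are one interval length apart.
    """
    positions = [i for i, label in enumerate(labels) if label in kinds]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if not gaps:
        return None  # fewer than two matching intervals
    return sum(gaps) / len(gaps) * interval_s


# Illustrative session with one-second intervals.
labels = ["VFI", "VI", "VFI", "VFI", "SVI", "VFI", "VI", "VFI", "VFI"]
since_vi = time_since_last(labels, 1.0, {"VI"})
since_svi = time_since_last(labels, 1.0, {"SVI"})
mean_vi = mean_time_between(labels, 1.0, {"VI"})
mean_svi = mean_time_between(labels, 1.0, {"SVI"})
print(since_vi, since_svi, mean_vi, mean_svi)
```

<t>
The packet-based variants (packets since the last violated packet, mean packets between VIs) could be derived the same way from per-packet compliance flags instead of per-interval labels.
</t>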
<t> <t>
To indicate a historic degree of precision availability, additional derived PAMs can be defined as follows: To indicate a historic degree of precision availability, additional derived PAMs can be defined as follows:
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li> <li>
Violated Interval Ratio (VIR) is the ratio of the summed numbers of VIs and SVIs to the total number of time unit intervals in a
time of the availability periods during a fixed measurement session. time of the availability periods during a fixed measurement session.
</li> </li>
<li> <li>
Severely Violated Interval Ratio (SVIR) is the ratio of SVIs to the total number of time unit intervals in a time of the availability periods Severely Violated Interval Ratio (SVIR) is the ratio of SVIs to the total number of time unit intervals in a time of the availability periods
during a fixed measurement session. during a fixed measurement session.
</li> </li>
</ul> </ul>
</section> </section>
<section anchor="policy-section" numbered="true" toc="default"> <section anchor="policy-section" numbered="true" toc="default">
<name>PAM Configuration Settings and Service Availability</name> <name>PAM Configuration Settings and Service Availability</name>
<t> <t>
It might be useful for a service provider to determine the current condition of the service for which It might be useful for a service provider to determine the current condition of the service for which
 PAMs are maintained. To facilitate this, it is conceivable to complement PAM with a state model.   PAMs are maintained. To facilitate this, it is conceivable to complement PAMs with a state model.
 Such a state model can be used to indicate whether a service is currently considered as available or unavailable   Such a state model can be used to indicate whether a service is currently considered as available or unavailable
 depending on the network's recent ability to provide service without incurring intervals during which violations occur.   depending on the network's recent ability to provide service without incurring intervals during which violations occur.
 It is conceivable to define such a state model in which transitions occur per some predefined PAM settings.   It is conceivable to define such a state model in which transitions occur per some predefined PAM settings.
</t> </t>
<t> <t>
 While the definition of a service state model is outside the scope of this document, the following section provides   While the definition of a service state model is outside the scope of this document, this section provides
 some considerations for how such a state model and accompanying configuration settings could be defined.   some considerations for how such a state model and accompanying configuration settings could be defined.
</t> </t>
 <t>For example, a state model could be defined by a Finite State Machine featuring two states,   <t>For example, a state model could be defined by a Finite State Machine featuring two states:
 "available" and "unavailable". The initial state could be "available". A service could subsequently be deemed as "unavailable"   "available" and "unavailable". The initial state could be "available". A service could subsequently be deemed as "unavailable"
 based on the number of successive interval violations that have been experienced up to the particular observation time moment.   based on the number of successive interval violations that have been experienced up to the particular observation time moment.
 To return to a state of "available", a number of intervals without violations would need to be observed.   To return to a state of "available", a number of intervals without violations would need to be observed.
</t> </t>
<t> <t>
The number of successive intervals with violations, as well as the The number of successive intervals with violations, as well as the
number of successive intervals that are free of violations, required number of successive intervals that are free of violations, required
 for a state to transition to another state is defined by a configuration setting.   for a state to transition to another state is defined by a configuration setting.
Specifically, the following configuration parameters are defined: Specifically, the following configuration parameters are defined:
</t> </t>
<ul spacing="normal"> <dl newline="false" spacing="normal">
 <li>Unavailability threshold: The number of successive intervals during which a violation occurs to transition to an unavailable state. </li>   <dt>Unavailability threshold:</dt><dd>The number of successive intervals during which a violation occurs to transition to an unavailable state. </dd>
 <li>Availability threshold: The number of successive intervals during which no violations must occur to allow transition to an available state from a previously unavailable state. </li>   <dt>Availability threshold:</dt><dd>The number of successive intervals during which no violations must occur to allow transition to an available state from a previously unavailable state. </dd>
 </ul>
</dl>
<t> <t>
Additional configuration parameters could be defined to account for the severity of violations. Likewise, it is conceivable to define Additional configuration parameters could be defined to account for the severity of violations. Likewise, it is conceivable to define
configuration settings that also take VIR and SVIR into account. configuration settings that also take VIR and SVIR into account.
</t> </t>
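The two-state model described in this section could be sketched as follows. This is only one possible reading, since the definition of a state model is outside the scope of the document. The class and method names are hypothetical, and the two constructor arguments correspond to the unavailability threshold and availability threshold parameters. Severity of violations, VIR, and SVIR are not taken into account.

```python
class AvailabilityStateMachine:
    """Two-state FSM driven by successive violated or clean intervals."""

    def __init__(self, unavailability_threshold, availability_threshold):
        self.unavailability_threshold = unavailability_threshold
        self.availability_threshold = availability_threshold
        self.state = "available"  # initial state, as suggested in the text
        self._violated_run = 0    # successive intervals with a violation
        self._clean_run = 0       # successive intervals without violations

    def observe(self, violated):
        """Feed one interval (True if a violation occurred); return the state."""
        if violated:
            self._violated_run += 1
            self._clean_run = 0
            if (self.state == "available"
                    and self._violated_run >= self.unavailability_threshold):
                self.state = "unavailable"
        else:
            self._clean_run += 1
            self._violated_run = 0
            if (self.state == "unavailable"
                    and self._clean_run >= self.availability_threshold):
                self.state = "available"
        return self.state


fsm = AvailabilityStateMachine(unavailability_threshold=2, availability_threshold=3)
observed = [True, True, False, False, True, False, False, False]
states = [fsm.observe(v) for v in observed]
```

In this run, the service becomes "unavailable" after the second successive violated interval. The violation in the fifth interval resets the count of clean intervals, so the service returns to "available" only after the three clean intervals at the end.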
</section> </section>
</section> </section>
<section anchor="statistical-slo-section" numbered="true" toc="default"> <section anchor="statistical-slo-section" numbered="true" toc="default">
<name>Statistical SLO</name> <name>Statistical SLO</name>
<t> <t>
skipping to change at line 423 skipping to change at line 422
SLO violation. However, it is still useful to maintain those SLO violation. However, it is still useful to maintain those
statistics, as the number of out-of-SLO packets still matters when statistics, as the number of out-of-SLO packets still matters when
looked at in proportion to the total number of packets. looked at in proportion to the total number of packets.
</t> </t>
<t> <t>
 Along that vein, an SLA might establish a multi-tiered SLO of, say, end-to-end   Along that vein, an SLA might establish a multi-tiered SLO of, say, end-to-end
latency (from the lowest to highest tier) as follows: latency (from the lowest to highest tier) as follows:
</t> </t>
<ul spacing="normal"> <ul spacing="normal">
<li>not to exceed 30 ms for any packet;</li> <li>not to exceed 30 ms for any packet;</li>
<li>to not exceed 25 ms for 99.999% of packets;</li> <li>not to exceed 25 ms for 99.999% of packets; and</li>
<li>to not exceed 20 ms for 99% of packets.</li> <li>not to exceed 20 ms for 99% of packets.</li>
</ul> </ul>
<t> <t>
 In that case, any individual packet with a latency greater than 20 ms   In that case, any individual packet with a latency greater than 20 ms
 and lower than 30 ms cannot be considered an SLO violation in itself, but compliance with   and lower than 30 ms cannot be considered an SLO violation in itself, but compliance with
the SLO may need to be assessed after the fact. the SLO may need to be assessed after the fact.
</t> </t>
<t> <t>
To support statistical SLOs more directly requires To support statistical SLOs more directly requires
additional metrics, for example, metrics that represent histograms for additional metrics, for example, metrics that represent histograms for
service level parameters with buckets corresponding to individual service-level parameters with buckets corresponding to individual
 service level objectives. Although the definition of histogram metrics is outside the scope of this document   SLOs. Although the definition of histogram metrics is outside the scope of this document
 and could be considered for future work <xref target="for-discussion"/>, for the example just given, a histogram   and could be considered for future work (see <xref target="for-discussion"/>), for the example just given, a histogram
for a particular flow could be maintained with four buckets: one for a particular flow could be maintained with four buckets: one
containing the count of packets within 20 ms, a second with a count of containing the count of packets within 20 ms, a second with a count of
packets between 20 and 25 ms (or simply all within 25 ms), a third with packets between 20 and 25 ms (or simply all within 25 ms), a third with
a count of packets between 25 and 30 ms (or merely all packets within a count of packets between 25 and 30 ms (or merely all packets within
30 ms, and a fourth with a count of anything beyond (or simply a total 30 ms), and a fourth with a count of anything beyond (or simply a total
count). Of course, the number of buckets and the boundaries between count). Of course, the number of buckets and the boundaries between
 those buckets should correspond to the needs of the SLA associated with the application,   those buckets should correspond to the needs of the SLA associated with the application,
i.e., to the specific guarantees and SLOs that were i.e., to the specific guarantees and SLOs that were
provided. provided.
</t> </t>
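The four-bucket histogram from the example, and an after-the-fact check of the three latency tiers, might be sketched like this. Histogram metrics are not defined by the document, so everything here is an assumption: the names, the choice of non-cumulative buckets with inclusive upper bounds, and the treatment of an empty histogram as non-compliant.

```python
from bisect import bisect_left

# Upper bucket bounds in ms; a fourth bucket holds anything beyond 30 ms.
BOUNDS_MS = [20.0, 25.0, 30.0]
# (latency bound in ms, required fraction of packets within that bound)
TIERS = [(20.0, 0.99), (25.0, 0.99999), (30.0, 1.0)]


def build_histogram(latencies_ms):
    """Count packets per bucket: up to 20, 20 to 25, 25 to 30, beyond 30 ms."""
    counts = [0] * (len(BOUNDS_MS) + 1)
    for latency in latencies_ms:
        # bisect_left places a latency equal to a bound in the lower bucket.
        counts[bisect_left(BOUNDS_MS, latency)] += 1
    return counts


def tiers_met(counts):
    """Map each tier bound to True if enough packets fall within it."""
    total = sum(counts)
    results = {}
    within = 0
    for (bound, required), count in zip(TIERS, counts):
        within += count  # cumulative count of packets within this bound
        results[bound] = total > 0 and within / total >= required
    return results


counts = build_histogram([10, 20, 20.5, 25, 30, 31])
good_flow = tiers_met(build_histogram([10] * 99 + [22]))
bad_flow = tiers_met(build_histogram([10] * 98 + [22, 31]))
```

For the first flow, one packet at 22 ms out of 100 still satisfies all three tiers. For the second, a single packet at 31 ms breaks the 30 ms tier, and the lower tiers fail as well because only 98% and 99% of packets fall within 20 ms and 25 ms, respectively.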
</section> </section>
<section anchor="other" numbered="true" toc="default"> <section anchor="other" numbered="true" toc="default">
<name>Other Expected PAM Benefits <name>Other Expected PAM Benefits
</name> </name>
<t> <t>
 PAM provides several benefits with other, more conventional performance metrics.   PAMs provide several benefits with other, more conventional performance metrics.
 Without PAM, it would be possible to conduct ongoing measurements of service levels   Without PAMs, it would be possible to conduct ongoing measurements of service levels,
 and maintain a time-series of service level records, then assess compliance with specific   maintain a time series of service-level records, and then assess compliance with specific
 SLOs after the fact. However, doing so would require the collection of vast amounts of data   SLOs after the fact. However, doing so would require the collection of vast amounts of data
 that would need to be generated, exported, transmitted, collected, and stored.   that would need to be generated, exported, transmitted, collected, and stored.
 In addition, extensive postprocessing would be required to compare that data against SLOs   In addition, extensive post-processing would be required to compare that data against SLOs
 and analyze its compliance. Being able to perform these tasks at scale   and analyze its compliance. Being able to perform these tasks at scale
 and in real-time would present significant additional challenges.   and in real time would present significant additional challenges.
</t> </t>
<t> <t>
 Adding PAM allows for a more compact expression of service level compliance.   Adding PAMs allows for a more compact expression of service-level compliance.
 In that sense, PAM does not simply represent raw data but expresses actionable information.   In that sense, PAMs do not simply represent raw data but express actionable information.
 In conjunction with proper instrumentation, PAM can thus help avoid expensive postprocessing.   In conjunction with proper instrumentation, PAMs can thus help avoid expensive post-processing.
</t> </t>
</section> </section>
<section anchor="for-discussion" numbered="true" toc="default"> <section anchor="for-discussion" numbered="true" toc="default">
<name>Extensions and Future Work</name> <name>Extensions and Future Work</name>
<!--
<li>Terminology - "Errored" vs. "Violated". The key metrics defined in
this draft refer to intervals during which violations of
objectives for service level parameters occur as "violated". The term "err
ored" was chosen in continuity with the
concept of "errored seconds", often used in transmission systems.
However, "violated" may be a more accurate term, as the metrics
defined here are not "errors" in an absolute sense, but relative
to a set of defined objectives. </li>
-->
<t> <t>
 The following is a list of items that are outside the scope of this specification, but which will be useful extensions and opportunities for future work:   The following is a list of items that are outside the scope of this specification but will be useful extensions and opportunities for future work:
</t> </t>
 <ul spacing="normal">   <ul spacing="normal">
 <li>A YANG data model will allow PAM to be incorporated into monitoring applications based on the YANG/NETCONF/RESTCONF framework.   <li>A YANG data model will allow PAMs to be incorporated into monitoring applications based on the YANG, NETCONF, and RESTCONF frameworks.
 In addition, a YANG data model will enable the configuration and retrieval of PAM-related settings. </li>   In addition, a YANG data model will enable the configuration and retrieval of PAM-related settings. </li>
 <li>A set of IPFIX Information Elements will allow PAM to be associated with flow records and exported as part of flow data,   <li>A set of IPFIX Information Elements will allow PAMs to be associated with flow records and exported as part of flow data,
 for example, for processing by accounting applications that assess compliance of delivered services with quality guarantees. </li>   for example, for processing by accounting applications that assess compliance of delivered services with quality guarantees. </li>
<li>Additional second-order metrics, such as "longest disruption of service time" (measuring consecutive time units with SVIs), <li>Additional second-order metrics, such as "longest disruption of service time" (measuring consecutive time units with SVIs),
can be defined and would be deemed useful by some users. At the same time, such metrics can be computed in a straightforward manner can be defined and would be deemed useful by some users. At the same time, such metrics can be computed in a straightforward manner
 and will in many cases be application-specific. For this reason, further such metrics are omitted here in order to not overburden this specification. </li>   and will be application specific in many cases. For this reason, such metrics are omitted here in order to not overburden this specification. </li>
 <li>The definition of the metrics that represent histograms for service level parameters with buckets corresponding to individual service level objectives,</li>   <li>Metrics can be defined to represent histograms for service-level parameters with buckets corresponding to individual SLOs.</li>
</ul> </ul>
</section> </section>
<section anchor="iana-consider" numbered="true" toc="default"> <section anchor="iana-consider" numbered="true" toc="default">
<name>IANA Considerations</name> <name>IANA Considerations</name>
<t>This document has no IANA actions.</t> <t>This document has no IANA actions.</t>
</section> </section>
<section anchor="security" numbered="true" toc="default"> <section anchor="security" numbered="true" toc="default">
<name>Security Considerations</name> <name>Security Considerations</name>
<t> <t>
Instrumentation for metrics that are used to assess compliance with Instrumentation for metrics that are used to assess compliance with
SLOs constitute an attractive target for an attacker. By interfering SLOs constitutes an attractive target for an attacker. By interfering
with the maintenance of such metrics, services could be falsely with the maintenance of such metrics, services could be falsely
identified as complying (when they are not) or vice-versa identified as complying (when they are not) or vice versa
(i.e., flagged as being non-compliant when indeed they are). While this (i.e., flagged as being non-compliant when indeed they are). While this
document does not specify how networks should be instrumented to document does not specify how networks should be instrumented to
maintain the identified metrics, such instrumentation needs to be maintain the identified metrics, such instrumentation needs to be
adequately secured to ensure accurate measurements and prohibit adequately secured to ensure accurate measurements and prohibit
tampering with metrics being kept. tampering with metrics being kept.
</t> </t>
<t> <t>
Where metrics are being defined relative to an SLO, the configuration Where metrics are being defined relative to an SLO, the configuration
of those SLOs needs to be adequately secured. Likewise, where of those SLOs needs to be adequately secured. Likewise, where
SLOs can be adjusted, the correlation between any metric instance SLOs can be adjusted, the correlation between any metric instance
 and a particular SLO must be unambiguous. The same service levels that constitute   and a particular SLO must be unambiguous. The same service levels that constitute
SLO violations for one flow that should be maintained as part of SLO violations for one flow and should be maintained as part of
the "violated time units" and related metrics, the "violated time units" and related metrics
may be compliant for another flow. In cases when it is may be compliant for another flow. In cases when it is
impossible to tie together SLOs and PAM, it will impossible to tie together SLOs and PAMs, it is
be preferable to merely maintain statistics about service levels preferable to merely maintain statistics about service levels
delivered (for example, overall histograms of end-to-end delivered (for example, overall histograms of end-to-end
latency) without assessing which constitutes violations. latency) without assessing which constitute violations.
</t> </t>
<t> <t>
By the same token, where the definition of what constitutes a By the same token, the definition of what constitutes a
"severe" or a "significant" violation depends on configuration settings or "severe" or a "significant" violation depends on configuration settings or
context. The configuration of such settings or context needs to be context. The configuration of such settings or context needs to be
specially secured. Also, the configuration must be bound to specially secured. Also, the configuration must be bound to
 the metrics being maintained. Thus, it will be clear which configuration setting   the metrics being maintained. Thus, it will be clear which configuration setting
was in effect when those metrics were being assessed. An attacker was in effect when those metrics were being assessed. An attacker
that can tamper with such configuration settings will render the that can tamper with such configuration settings will render the
corresponding metrics useless (in the best case) or misleading (in corresponding metrics useless (in the best case) or misleading (in
the worst case). the worst case).
</t> </t>
</section> </section>
<section numbered="true" toc="default">
<name>Acknowledgments</name>
<t>
The authors greatly appreciate review and comments by Bjørn Ivar Teigen
and Christian Jacquenet.
</t>
</section>
</middle> </middle>
<back> <back>
<references>
<name>References</name>
<!--
<references>
<name>Normative References</name>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<?rfc include="reference.RFC.8126"?>
<?rfc include="reference.RFC.4656"?>
<?rfc include="reference.RFC.6038"?>
</references>
-->
<references> <references>
<name>Informative References</name> <name>Informative References</name>
<!--
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7799.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5880.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8762.xml"/>
-->
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2863.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8343.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7011.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7012.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7297.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3198.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8911.xml"/>
 <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8912.xml"/>
 <xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-ietf-teas-ietf-network-slices.xml"/>
 <!-- <xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-mmm-rtgwg-integrated-oam.xml"/> -->
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2863.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8343.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7011.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7012.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7297.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3198.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8911.xml"/>
 <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8912.xml"/>
 <!-- [I-D.ietf-teas-ietf-network-slices] in EDIT state as of 12/18/23; companion document RFC9543 -->
<reference anchor="RFC9543" target="https://www.rfc-editor.org/info/rfc9543">
<front>
 <title>A Framework for Network Slices in Networks Built from IETF Technologies</title>
<author initials="A." surname="Farrel" fullname="Adrian Farrel" role="editor">
<organization>Old Dog Consulting</organization>
</author>
<author initials="J." surname="Drake" fullname="John Drake" role="editor">
<organization>Juniper Networks</organization>
</author>
<author initials="R." surname="Rokui" fullname="Reza Rokui">
<organization>Ciena</organization>
</author>
<author initials="S." surname="Homma" fullname="Shunsuke Homma">
<organization>NTT</organization>
</author>
<author initials="K." surname="Makhijani" fullname="Kiran Makhijani">
<organization>Futurewei</organization>
</author>
<author initials="L." surname="Contreras" fullname="Luis M. Contreras">
<organization>Telefonica</organization>
</author>
<author initials="J." surname="Tantsura" fullname="Jeff Tantsura">
<organization>Nvidia</organization>
</author>
<date month="February" year="2024"/>
</front>
<seriesInfo name="RFC" value="9543"/>
<seriesInfo name="DOI" value="10.17487/RFC9543"/>
</reference>
<reference anchor="ITU.G.826"> <reference anchor="ITU.G.826">
<front> <front>
 <title>End-to-end error performance parameters and objectives for international, constant bit-rate digital paths and connections</title>   <title>End-to-end error performance parameters and objectives for international, constant bit-rate digital paths and connections</title>
<author> <author>
<organization>ITU-T</organization> <organization>ITU-T</organization>
</author> </author>
<date month="December" year="2002"/> <date month="December" year="2002"/>
</front> </front>
<seriesInfo name="ITU-T" value="G.826"/> <seriesInfo name="ITU-T" value="G.826"/>
</reference> </reference>
 <reference anchor="IANA-PM-Registry" target="https://www.iana.org/assignments/performance-metrics/performance-metrics.xhtml">   <reference anchor="IANA-PM-Registry" target="https://www.iana.org/assignments/performance-metrics">
<front> <front>
<title>IANA Registry of Performance Metrics</title> <title>Performance Metrics</title>
<author> <author>
<organization>IANA</organization> <organization>IANA</organization>
</author> </author>
<date month="March" year="2020"/>
</front> </front>
</reference> </reference>
</references>
</references> </references>
<section numbered="false" toc="default">
<name>Acknowledgments</name>
<t>
 The authors greatly appreciate review and comments by <contact fullname="Bjørn Ivar Teigen"/> and <contact fullname="Christian Jacquenet"/>.
</t>
</section>
<section anchor="contr-sec" numbered="false" toc="default"> <section anchor="contr-sec" numbered="false" toc="default">
<name>Contributors' Addresses</name> <name>Contributors</name>
<contact fullname="Liuyan Han" initials="L." surname="Han"> <contact fullname="Liuyan Han" initials="L." surname="Han">
<organization>China Mobile</organization> <organization>China Mobile</organization>
<address> <address>
<postal> <postal>
<street>32 XuanWuMenXi Street</street> <street>32 XuanWuMenXi Street</street>
<city>Beijing</city> <city>Beijing</city>
<code>100053</code> <code>100053</code>
<country>China</country> <country>China</country>
</postal> </postal>
 End of changes. 94 change blocks. 
309 lines changed or deleted 281 lines changed or added

This html diff was produced by rfcdiff 1.48.