rfc9584.original | rfc9584.txt | |||
---|---|---|---|---|
avtcore S. Zhao | Internet Engineering Task Force (IETF) S. Zhao | |||
Internet-Draft Intel | Request for Comments: 9584 Intel | |||
Intended status: Standards Track S. Wenger | Category: Standards Track S. Wenger | |||
Expires: 21 June 2024 Tencent | ISSN: 2070-1721 Tencent | |||
Y. Lim | Y. Lim | |||
Samsung Electronics | Samsung Electronics | |||
19 December 2023 | June 2024 | |||
RTP Payload Format for Essential Video Coding (EVC) | RTP Payload Format for Essential Video Coding (EVC) | |||
draft-ietf-avtcore-rtp-evc-07 | ||||
Abstract | Abstract | |||
This document describes an RTP payload format for the Essential Video | This document describes an RTP payload format for the Essential Video | |||
Coding (EVC) standard, published as ISO/IEC International Standard | Coding (EVC) standard, published as ISO/IEC International Standard | |||
23094-1. EVC was developed by the Moving Picture Experts Group | 23094-1. EVC was developed by the MPEG. The RTP payload format | |||
(MPEG). The RTP payload format allows for the packetization of one | allows for the packetization of one or more Network Abstraction Layer | |||
or more Network Abstraction Layer (NAL) units in each RTP packet | (NAL) units in each RTP packet payload and the fragmentation of a NAL | |||
payload and the fragmentation of a NAL unit into multiple RTP | unit into multiple RTP packets. The payload format has broad | |||
packets. The payload format has broad applicability in | applicability in videoconferencing, Internet video streaming, and | |||
videoconferencing, Internet video streaming, and high-bitrate | high-bitrate entertainment-quality video, among other applications. | |||
entertainment-quality video, among other applications. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 21 June 2024. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9584. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2023 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Overview of the EVC Codec | 1.1. Overview of the EVC Codec | |||
1.1.1. Coding-Tool Features (informative) | 1.1.1. Coding-Tool Features (Informative) | |||
1.1.2. Systems and Transport Interfaces | 1.1.2. Systems and Transport Interfaces | |||
1.1.3. Parallel Processing Support (informative) | 1.1.3. Parallel Processing Support (Informative) | |||
1.1.4. NAL Unit Header | 1.1.4. NAL Unit Header | |||
1.2. Overview of the Payload Format | 1.2. Overview of the Payload Format | |||
2. Conventions | 2. Conventions | |||
3. Definitions and Abbreviations | 3. Definitions and Abbreviations | |||
3.1. Definitions | 3.1. Definitions | |||
3.1.1. Definitions from the EVC Standard | 3.1.1. Definitions from the EVC Standard | |||
3.1.2. Definitions Specific to This Document | 3.1.2. Definitions Specific to This Document | |||
3.2. Abbreviations | 3.2. Abbreviations | |||
4. RTP Payload Format | 4. RTP Payload Format | |||
4.1. RTP Header Usage | 4.1. RTP Header Usage | |||
4.2. Payload Header Usage | 4.2. Payload Header Usage | |||
4.3. Payload Structures | 4.3. Payload Structures | |||
4.3.1. Single NAL Unit Packets | 4.3.1. Single NAL Unit Packets | |||
4.3.2. Aggregation Packets (APs) | 4.3.2. Aggregation Packets (APs) | |||
4.3.3. Fragmentation Units | 4.3.3. Fragmentation Units (FUs) | |||
4.4. Decoding Order Number | 4.4. Decoding Order Number | |||
5. Packetization Rules | 5. Packetization Rules | |||
6. De-packetization Process | 6. De-packetization Process | |||
7. Payload Format Parameters | 7. Payload Format Parameters | |||
7.1. Media Type Registration | 7.1. Media Type Registration | |||
7.2. Optional Parameters Definition | 7.2. Optional Parameters Definition | |||
7.3. SDP Parameters | 7.3. SDP Parameters | |||
7.3.1. Mapping of Payload Type Parameters to SDP | 7.3.1. Mapping of Payload Type Parameters to SDP | |||
7.3.2. Usage with SDP Offer/Answer Model | 7.3.2. Usage with SDP Offer/Answer Model | |||
7.3.3. Multicast | 7.3.3. Multicast | |||
7.3.4. Usage in Declarative Session Descriptions | 7.3.4. Usage in Declarative Session Descriptions | |||
7.3.5. Considerations for Parameter Sets | 7.3.5. Considerations for Parameter Sets | |||
8. Use with Feedback Messages | 8. Use with Feedback Messages | |||
8.1. Picture Loss Indication (PLI) | 8.1. Picture Loss Indication (PLI) | |||
8.2. Full Intra Request (FIR) | 8.2. Full Intra Request (FIR) | |||
9. Security Considerations | 9. Security Considerations | |||
10. Congestion Control | 10. Congestion Control | |||
11. IANA Considerations | 11. IANA Considerations | |||
12. Acknowledgements | 12. References | |||
13. References | 12.1. Normative References | |||
13.1. Normative References | 12.2. Informative References | |||
13.2. Informative References | Acknowledgements | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
The Essential Video Coding [EVC] standard, which is formally | The Essential Video Coding [EVC] standard, which is formally | |||
designated as ISO/IEC International Standard 23094-1 [ISO23094-1] has | designated as ISO/IEC International Standard 23094-1 [EVC], was | |||
been published in 2020. One goal of MPEG is to keep [EVC]'s Baseline | published in 2020. One of MPEG's goals is to keep EVC's Baseline | |||
profile essentially royalty-free by using technologies published more | profile essentially royalty-free by using technologies published more | |||
than 20 years ago or otherwise known to be available for use without | than 20 years ago or otherwise known to be available for use without | |||
a requirement for paying royalties, whereas more advanced profiles | a requirement for paying royalties, whereas more advanced profiles | |||
follow a reasonable and non-discriminatory licensing terms policy. | follow a reasonable and non-discriminatory licensing terms policy. | |||
Both the Baseline profile and higher profiles of [EVC] are reported | Both the Baseline profile and higher profiles of EVC [EVC] are | |||
to provide coding efficiency gains over High Efficiency Video Coding | reported to provide coding efficiency gains over High Efficiency | |||
[HEVC] and Advanced Video Coding [AVC] under certain configurations. | Video Coding [HEVC] and Advanced Video Coding [AVC] under certain | |||
configurations. | ||||
This document describes an RTP payload format for EVC. It shares its | This document describes an RTP payload format for EVC. It shares its | |||
basic design with the NAL unit-based RTP payload formats of H.264 | basic design with the NAL unit-based RTP payload formats of H.264 | |||
Video Coding [RFC6184], Scalable Video Coding (SVC) [RFC6190], High | Video Coding [RFC6184], Scalable Video Coding (SVC) [RFC6190], High | |||
Efficiency Video Coding (HEVC) [RFC7798], and Versatile Video Coding | Efficiency Video Coding (HEVC) [RFC7798], and Versatile Video Coding | |||
(VVC)[RFC9328]. With respect to design philosophy, security, | (VVC) [RFC9328]. With respect to design philosophy, security, | |||
congestion control, and overall implementation complexity, it has | congestion control, and overall implementation complexity, it has | |||
similar properties to those earlier payload format specifications. | similar properties to those earlier payload format specifications. | |||
This is a conscious choice, as at least [RFC6184] is widely deployed | This is a conscious choice, as at least the RTP Payload Format for | |||
and generally known in the relevant implementer communities. Certain | H.264 video as described in [RFC6184] is widely deployed and | |||
mechanisms known from [RFC6190] were incorporated as EVC supports | generally known in the relevant implementer communities. Certain | |||
mechanisms described in [RFC6190] were incorporated, as EVC supports | ||||
temporal scalability. EVC currently does not offer higher forms of | temporal scalability. EVC currently does not offer higher forms of | |||
scalability. | scalability. | |||
1.1. Overview of the EVC Codec | 1.1. Overview of the EVC Codec | |||
[EVC], [AVC], [HEVC] and [VVC] share a similar hybrid video codec | The codings described in [EVC], [AVC], [HEVC], and [VVC] share a | |||
design. In this document, we provide a very brief overview of those | similar hybrid video codec design. In this document, we provide a | |||
features of EVC that are, in some form, addressed by the payload | very brief overview of those features of EVC that are, in some form, | |||
format specified herein. Implementers have to read, understand, and | addressed by the payload format specified herein. Implementers have | |||
apply the ISO/IEC standard pertaining to EVC to arrive at | to read, understand, and apply the ISO/IEC standard pertaining to EVC | |||
interoperable, well-performing implementations. The EVC standard has | [EVC] to arrive at interoperable, well-performing implementations. | |||
a Baseline profile and a Main profile, the latter being a superset of | The EVC standard has a Baseline profile and a Main profile, the | |||
the Baseline profile but including more advanced features. EVC also | latter being a superset of the Baseline profile but including more | |||
includes still image variants of both Baseline and Main profiles, in | advanced features. EVC also includes still image variants of both | |||
each of which the bitstream is restricted to a single IDR picture. | Baseline and Main profiles, in each of which the bitstream is | |||
EVC facilitates certain walled-garden implementations under | restricted to a single IDR picture. EVC facilitates certain walled | |||
commercial constraints imposed by intellectual property rights by | garden implementations under commercial constraints imposed by | |||
including syntax elements that allow encoders to mark a bitstream as | intellectual property rights by including syntax elements that allow | |||
to which of the many independent coding tools are exercised in the | encoders to mark a bitstream as to which of the many independent | |||
bitstream, in a spirit similar to the general_constraint_flags of | coding tools are exercised in the bitstream, in a spirit similar to | |||
[VVC]. | the general_constraint_info of [VVC]. | |||
Conceptually, all EVC, AVC, HEVC and VVC include a Video Coding Layer | Conceptually, all EVC, AVC, HEVC, and VVC include a Video Coding | |||
(VCL); a term that is often used to refer to the coding-tool | Layer (VCL), a term that is often used to refer to the coding-tool | |||
features, and a Network Abstraction Layer (NAL), which usually refers | features, and a Network Abstraction Layer (NAL), which usually refers | |||
to the systems and transport interface aspects of the codecs. | to the systems and transport interface aspects of the codecs. | |||
1.1.1. Coding-Tool Features (informative) | 1.1.1. Coding-Tool Features (Informative) | |||
Coding blocks and transform structure | Coding blocks and transform structure | |||
EVC uses a traditional block-based coding structure, which divides | ||||
the encoded image into blocks of up to 64x64 luma samples for the | ||||
Baseline profile and 128x128 luma samples for the Main profile | ||||
that can be recursively divided into smaller blocks. The Baseline | ||||
profiles utilize HEVC-like quad-tree-blocks partitioning that | ||||
allows a block to be divided horizontally and vertically into four | ||||
smaller square blocks. The Main profile adds two advanced coding | ||||
structure tools: 1) Binary Ternary Tree (BTT) partitioning that | ||||
allows non-square coding units and 2) Split Unit Coding Order | ||||
segmentation that changes the processing order of the blocks from | ||||
traditional left-to-right and top-to-bottom scanning order | ||||
processing to an alternative right-to-left and bottom-to-top | ||||
scanning order. In the Main profile, the picture can be divided | ||||
into slices and tiles, which can be independently encoded and/or | ||||
decoded in parallel. | ||||
EVC uses a traditional block-based coding structure, which divides | EVC also uses a traditional video codecs prediction model assuming | |||
the encoded image into blocks of up to 64x64 luma samples for the | two general types of predictions: Intra (spatial) and Inter | |||
Baseline profile and 128x128 luma samples for the Main profile that | (temporal) predictions. A residue block is calculated by | |||
can be recursively divided into smaller blocks. The baseline | subtracting predicted data from the original (encoded) one. The | |||
profiles utilize an HEVC-like quad-tree blocks partitioning that | Baseline profile allows only discrete cosine transform (DCT-2) and | |||
allows to divide a block horizontally and vertically onto four | scalar quantization to transform and quantize residue data, | |||
smaller square blocks. The Main profile adds two advanced coding | whereas the Main profile additionally has options to use discrete | |||
structure tools: 1) Binary Ternary Tree (BTT) partitioning that | sine transform (DST-7) and another type of discrete cosine | |||
allows non-square coding units; and 2) Split Unit Coding Order | transform (DCT-8). In addition, for the Main profile, Improved | |||
segmentation that changes the processing order of the blocks from | Quantization and Transform (IQT) uses a different mapping or | |||
traditional left-to-right and top-to-bottom scanning order processing | clipping function for quantization. An inverse zig-zag scanning | |||
to an alternative right-to-left and bottom-to-top scanning order. In | order is used for coefficient coding. Advanced Coefficient Coding | |||
the Main profile, the picture can be divided into slices and tiles, | (ADCC) in the Main profile can code coefficient values more | |||
which can be independently encoded and/or decoded in parallel. | efficiently, for example, indicated by the last non-zero | |||
coefficient. The Baseline profile uses a straightforward RLE- | ||||
EVC also uses a traditional video codecs prediction model assuming | based approach to encode the quantized coefficients. | |||
two general types of predictions: Intra (spatial) and Inter | ||||
(temporal) predictions. A residue block is calculated by subtracting | ||||
predicted data from the original (encoded) one. The Baseline profile | ||||
allows only discrete cosine transform (DCT-2) and scalar quantization | ||||
to transform and quantize residue data, whereas the Main profile | ||||
additionally has options to use discrete sine transform (DST-7) and | ||||
another type of discrete cosine transform (DCT-8). In addition, for | ||||
the Main profile, Improved Quantization and Transform (IQT) uses a | ||||
different mapping/clipping function for quantization. An inverse | ||||
zig-zag scanning order is used for coefficient coding. Advanced | ||||
Coefficient Coding (ADCC) in the Main profile can code coefficient | ||||
values more efficiently, for example, indicated by the last non-zero | ||||
coefficient. The Baseline profile uses a straightforward run-length | ||||
encoding (RLE) based approach to encode the quantized coefficients. | ||||
Entropy coding | Entropy coding | |||
EVC uses a similar binary arithmetic coding mechanism as HEVC CABAC | EVC uses a similar binary arithmetic coding mechanism as HEVC | |||
and VVC. The mechanism includes a binarization step and a | CABAC (context adaptive binary arithmetic coding) and VVC. The | |||
probability update defined by a lookup table. In the Main profile, | mechanism includes a binarization step and a probability update | |||
the derivation process of syntax elements based on adjacent blocks | defined by a lookup table. In the Main profile, the derivation | |||
makes the context modeling and initialization process more efficient. | process of syntax elements based on adjacent blocks makes the | |||
context modeling and initialization process more efficient. | ||||
In-loop filtering | In-loop filtering | |||
The Baseline profile of EVC uses the deblocking filter defined in | ||||
H.263 Annex J [VIDEO-CODING]. In the Main profile, an Advanced | ||||
Deblocking Filter (ADDB) can be used as an alternative, which can | ||||
further reduce undesirable compression artifacts. The Main | ||||
profile also defines two additional in-loop filters that can be | ||||
used to improve the quality of decoded pictures before output and/ | ||||
or for Inter prediction. A Hadamard Transform Domain Filter | ||||
(HTDF) is applied to the luma samples before deblocking, and a | ||||
lookup table is used to determine four adjacent samples for | ||||
filtering. An Adaptive Loop Filter (ALF) allows signals of up to | ||||
25 different filters to be sent for the luma components; the best | ||||
filter can be selected through the classification process for each | ||||
4x4 block. Similarly to VVC, the filter parameters of ALF are | ||||
signaled in the Adaptation Parameter Set (APS). | ||||
The Baseline profile of EVC uses the deblocking filter defined in | Inter prediction | |||
H.263 Annex J. In the Main profile, an Advanced Deblocking Filter | The basis of EVC's Inter prediction is motion compensation using | |||
(ADDB) can be used as an alternative, which can further reduce | interpolation filters with a quarter sample resolution. In the | |||
undesirable compression artifacts. The Main profile also defines two | Baseline profile, a motion vector is transmitted using one of | |||
additional in-loop filters that can be used to improve the quality of | three spatially neighboring motion vectors and a temporally | |||
decoded pictures before output and/or for inter-prediction. A | collocated motion vector as a predictor. A motion vector | |||
Hadamard Transform Domain Filter (HTDF) is applied to the luma | difference may be signaled relative to the selected predictor, but | |||
samples before deblocking, and a lookup table is used to determine | there is a case where no motion vector difference is signaled, and | |||
four adjacent samples for filtering. An Adaptive Loop Filter (ALF) | there is no remaining data in the block. This mode is called a | |||
allows to send signals of up to 25 different filters for the luma | "skip" mode. The Main profile includes six additional tools to | |||
components, and the best filter can be selected through the | provide improved Inter prediction. With Advanced Motion Vectors | |||
classification process for each 4x4 block. Similarly to VVC, the | Prediction (ADMVP), adjacent blocks can be conceptually merged to | |||
filter parameters of ALF are signaled in the Adaptation Parameter Set | indicate that they use the same motion, but more advanced schemes | |||
(APS). | can also be used to create predictions from the basic model list | |||
of candidate predictors. The Merge with Motion Vector Difference | ||||
Inter-prediction | (MMVD) tool uses a process similar to the concept of merging | |||
neighboring blocks but also allows the use of expressions that | ||||
The basis of EVC's inter-prediction is motion compensation using | include a starting point, motion amplitude, and direction of | |||
interpolation filters with a quarter sample resolution. In the | motion to send a motion vector signal. Using Advanced Motion | |||
Baseline profile, a motion vector is transmitted using one of three | Vector Prediction (AMVP), candidate motion vector predictions for | |||
spatially neighboring motion vectors and a temporally collocated | the block can be derived from its neighboring blocks in the same | |||
motion vector as a predictor. A motion vector difference may be | picture and collocated blocks in the reference picture. The | |||
signaled relative to the selected predictor, but there is a case | Adaptive Motion Vector Resolution (AMVR) tool provides a way to | |||
where no motion vector difference is signaled, and there is no | reduce the accuracy of a motion vector from a quarter sample to | |||
remaining data in the block. This mode is called a skip mode. The | half sample, full sample, double sample, or quad sample, which | |||
Main profile includes six additional tools to provide improved inter- | provides an efficiency advantage, such as when sending large | |||
prediction. With Advanced Motion Vectors Prediction (ADMVP), | motion vector differences. The Main profile also includes the | |||
adjacent blocks can be conceptually merged to indicate that they use | Decoder-side Motion Vector Refinement (DMVR), which uses a | |||
the same motion, but more advanced schemes can also be used to create | bilateral template matching process to refine the motion vectors | |||
predictions from the basic model list of candidate predictors. The | without additional signaling. | |||
Merge with Motion Vector Difference (MMVD) tool uses a process | ||||
similar to the concept of merging neighboring blocks but also allows | ||||
the use of expressions that include a starting point, motion | ||||
amplitude, and direction of motion to send a motion vector signal. | ||||
Using Advanced Motion Vector Prediction (AMVP), candidate motion | ||||
vector predictions for the block can be derived from its neighboring | ||||
blocks in the same picture and collocated blocks in the reference | ||||
picture. The Adaptive Motion Vector Resolution (AMVR) tool provides | ||||
a way to reduce the accuracy of a motion vector from a quarter sample | ||||
to half sample, full sample, double sample, or quad sample, which | ||||
provides an efficiency advantage, such as when sending large motion | ||||
vector differences. The Main profile also includes the Decoder-side | ||||
Motion Vector Refinement (DMVR), which uses a bilateral template | ||||
matching process to refine the motion vectors without additional | ||||
signaling. | ||||
Intra prediction and intra-coding | ||||
Intra prediction in EVC is performed on adjacent samples of coding | Intra prediction and intra coding | |||
units in a partitioned structure. For the Baseline profile, when all | Intra prediction in EVC is performed on adjacent samples of coding | |||
coding units are square, there are five different prediction modes: | units in a partitioned structure. For the Baseline profile, when | |||
DC (mean value of the neighborhood), horizontal, vertical, and two | all coding units are square, there are five different prediction | |||
different diagonal directions. In the Main profile, intra prediction | modes: DC (mean value of the neighborhood), horizontal, vertical, | |||
can be applied to any rectangular coding unit, and 28 additional | and two different diagonal directions. In the Main profile, intra | |||
direction modes are available in the so-called Enhanced Intra | prediction can be applied to any rectangular coding unit, and 28 | |||
Prediction Directions (EIPD). In the Main profile, an encoder can | additional direction modes are available in the Enhanced Intra | |||
also use Intra Block Copy (IBC), where previously decoded sample | Prediction Directions (EIPDs). In the Main profile, an encoder | |||
blocks of the same picture are used as a predictor. A displacement | can also use Intra Block Copy (IBC), where previously decoded | |||
vector in integer sample precision is signaled to indicate where the | sample blocks of the same picture are used as a predictor. A | |||
prediction block in the current picture is used for this mode. | displacement vector in integer sample precision is signaled to | |||
indicate where the prediction block in the current picture is used | ||||
for this mode. | ||||
Reference frames management | Reference frames management | |||
In EVC, decoded pictures can be stored in a decoded picture buffer | ||||
In EVC, decoded pictures can be stored in a decoded picture buffer | (DPB) for predicting pictures that follow them in the decoding | |||
(DPB) for predicting pictures that follow them in the decoding order. | order. In the Baseline profile, the management of the DPB (i.e., | |||
In the Baseline profile, the management of the DPB (i.e., the process | the process of adding and deleting reference pictures) is | |||
of adding and deleting reference pictures) is controlled by a | controlled by a straightforward AVC-like sliding window approach | |||
straightforward AVC-like sliding window approach with very few | with very few parameters from the sequence parameter set (SPS). | |||
parameters from the SPS. For the Main profile, DPB management can be | For the Main profile, DPB management can be handled much more | |||
handled much more flexibly using explicitly signaled reference | flexibly using explicitly signaled Reference Picture Lists (RPLs) | |||
Picture Lists (RPL) in the SPS or slice level. | in the SPS or slice level. | |||
1.1.2. Systems and Transport Interfaces | 1.1.2. Systems and Transport Interfaces | |||
EVC inherits the basic systems and transport interface designs from | EVC inherits the basic systems and transport interface designs from | |||
AVC and HEVC. These include the NAL-unit-based syntax, hierarchical | AVC and HEVC. These include the NAL-unit-based syntax, hierarchical | |||
syntax and data unit structure, and Supplemental Enhancement | syntax and data unit structure, and Supplemental Enhancement | |||
Information (SEI) message mechanism. The hierarchical syntax and | Information (SEI) message mechanism. The hierarchical syntax and | |||
data unit structure consists of a sequence-level parameter set (SPS), | data unit structure consists of a sequence-level parameter set (i.e., | |||
two picture-level parameter sets (PPS and APS, each of which can | SPS), two picture-level parameter sets (i.e., PPS and APS, each of | |||
apply to one or more pictures), slice-level header parameters, and | which can apply to one or more pictures), slice-level header | |||
lower-level parameters. | parameters, and lower-level parameters. | |||
A number of key components that influenced the Network Abstraction | A number of key components that influenced the NAL design of EVC as | |||
Layer design of EVC as well as this document, are described below: | well as this document are described below: | |||
Sequence parameter set | Sequence parameter set | |||
The Sequence Parameter Set (SPS) contains syntax elements | The Sequence Parameter Set (SPS) contains syntax elements | |||
pertaining to a Coded Video Sequence (CVS), which is a group of | pertaining to a Coded Video Sequence (CVS), which is a group of | |||
pictures, starting with a random access point picture and followed | pictures, starting with a random access point picture and followed | |||
by zero or more pictures that may depend on each other and the | by zero or more pictures that may depend on each other and the | |||
random access point picture. In MPEG-2, the equivalent of a CVS | random access point picture. In MPEG-2, the equivalent of a CVS | |||
is a Group of Pictures (GOP), which generally started with an I | is a Group of Pictures (GOP), which generally starts with an I | |||
frame and is followed by P and B frames. While more complex in | frame and is followed by P and B frames. While more complex in | |||
its options of random access points, EVC retains this basic | its options of random access points, EVC retains this basic | |||
concept. In many TV-like applications, a CVS contains a few | concept. In many TV-like applications, a CVS contains a few | |||
hundred milliseconds to a few seconds of video. In video | hundred milliseconds to a few seconds of video. In video | |||
conferencing (without switching MCUs involved), a CVS can be as | conferencing (without switching Multipoint Control Units (MCUs) | |||
long in duration as the whole session. | involved), a CVS can be as long in duration as the whole session. | |||
Picture and adaptation parameter set | Picture and adaptation parameter set | |||
The Picture Parameter Set (PPS) and the Adaptation Parameter Set | ||||
The Picture Parameter Set and the Adaptation Parameter Set (PPS | (APS) carry information pertaining to a single picture. The PPS | |||
and APS, respectively) carry information pertaining to a single | contains information that is likely to stay constant from picture | |||
picture. The PPS contains information that is likely to stay | to picture, at least for pictures of a certain type; whereas the | |||
constant from picture to picture, at least for pictures of a | APS contains information, such as adaptive loop filter | |||
certain type whereas the APS contains information, such as | coefficients, that are likely to change from picture to picture. | |||
adaptive loop filter coefficients, that are likely to change from | ||||
picture to picture. | ||||
Profile, level, and toolsets | Profile, level, and toolsets | |||
Profiles and levels follow the same design considerations known | Profiles and levels follow the same design considerations known | |||
from AVC, HEVC, and video codecs as old as MPEG-1 Video. The | from AVC, HEVC, and video codecs as old as MPEG-1 Video. The | |||
profile defines a set of tools (not to confuse with the "toolset" | profile defines a set of tools (not to be confused with the | |||
discussed below) that a decoder compliant with this profile has to | "toolset" discussed below) that a decoder compliant with this | |||
support. In EVC, profiles are defined in Annex A. Formally, they | profile has to support. In EVC, profiles are defined in Annex A | |||
are defined as a set of constraints that a bitstream needs to | of [EVC]. Formally, they are defined as a set of constraints that | |||
conform to. In EVC, the Baseline profile is much more severely | a bitstream needs to conform to. In EVC, the Baseline profile is | |||
constraint than the Main profile, reducing implementation | much more severely constrained than the Main profile, reducing | |||
complexity. Levels relate to bitstream complexity in dimensions | implementation complexity. Levels relate to bitstream complexity | |||
such as maximum sample decoding rate, maximum picture size, and | in dimensions such as maximum sample decoding rate, maximum | |||
similar parameters directly related to computational complexity | picture size, and similar parameters directly related to | |||
and/or memory demands. | computational complexity and/or memory demands. | |||
Profiles and levels are signaled in the highest parameter set | Profiles and levels are signaled in the highest parameter set | |||
available, the SPS. | available, the SPS. | |||
EVC contains another mechanism related to the use of coding tools, | EVC contains another mechanism related to the use of coding tools, | |||
known as the toolset syntax element. This syntax element, | known as the toolset syntax elements. These syntax elements, | |||
toolset_idc_h and toolset_idc_l located in the SPS, is a bitmask | toolset_idc_h and toolset_idc_l (located in the SPS), are bitmasks | |||
that allows encoders to indicate which coding tools they are using | that allow encoders to indicate which coding tools they are using | |||
within the menu of profiles offered by the profile that is also | within the menu of profiles offered by the profile that is also | |||
signaled. No decoder conformance point is associated with the | signaled. No decoder conformance point is associated with the | |||
toolset, but a bitstream that was using a coding tool that is | toolset, but a bitstream that was using a coding tool that is | |||
indicated as not being used in the toolset syntax element would be | indicated as not being used in the toolset syntax element would be | |||
non-compliant. While MPEG specifically rules out the use of the | non-compliant. While MPEG specifically rules out the use of the | |||
toolset syntax element as a conformance point, walled garden | toolset syntax element as a conformance point, walled garden | |||
implementations could do so without incurring the interoperability | implementations could do so without incurring the interoperability | |||
problems MPEG fears and create bitstreams and decoders that do not | problems MPEG fears and create bitstreams and decoders that do not | |||
support one or more given tools. That, in turn, may be useful to | support one or more given tools. That, in turn, may be useful to | |||
mitigate certain intellectual property-related risks. | mitigate certain intellectual property-related risks. | |||
Bitstream and elementary stream | Bitstream and elementary stream | |||
Above the Coded Video Sequence (CVS), EVC defines a video | Above the Coded Video Sequence (CVS), EVC defines a video | |||
bitstream that can be used as an elementary stream in the MPEG | bitstream that can be used as an elementary stream in the MPEG | |||
systems context. For this document, the video bitstream syntax | systems context. For this document, the video bitstream syntax | |||
level is not relevant. | level is not relevant. | |||
Random access support | Random access support | |||
EVC supports random access mechanisms based on IDR and clean | ||||
EVC supports random access mechanisms based on IDR and CRA access | random access (CRA) access units. | |||
units. | ||||
Temporal scalability support | Temporal scalability support | |||
EVC supports temporal scalability through the generalized | EVC supports temporal scalability through the generalized | |||
reference picture selection approach known since AVC/SVC. Up to | reference picture selection approach known since AVC/SVC. Up to | |||
six temporal layers are supported. The temporal layer is signaled | six temporal layers are supported. The temporal layer is signaled | |||
in the NAL unit header (which co-serves as the payload header in | in the NAL unit header (which co-serves as the payload header in | |||
this document), in the nuh_temporal_id field. | this document), in the nuh_temporal_id field. | |||
Reference picture management | Reference picture management | |||
EVC's reference picture management is POC-based, similar to HEVC. | ||||
EVC's reference picture management is POC-based (Picture Order | In the Main profile, substantially all reference picture list | |||
Count), similar to HEVC. In the Main profile, substantially all | manipulations available in HEVC are specified, including explicit | |||
reference picture list manipulations available in HEVC are | transmissions or updates of reference picture lists. Although for | |||
available, including explicit transmissions/updates of reference | reference pictures management purposes, EVC uses a modern VVC-like | |||
picture lists, although for reference pictures management | RPL approach, which is conceptually simpler than the HEVC one. In | |||
purposes, EVC uses a modern VVC-like RPL approach, which is | the Baseline profile, reference picture management is more | |||
conceptually simpler than the HEVC one. In the Baseline profile, | restricted, allowing for a comparatively simple group of picture | |||
reference picture management is more restricted, allowing for a | structures only. | |||
comparatively simple group of picture structures only. | ||||
SEI Message | SEI Message | |||
EVC inherits many of HEVC's SEI messages, occasionally with syntax | ||||
EVC inherits many of HEVC's SEI Messages, occasionally with syntax | ||||
and/or semantics changes, making them applicable to EVC. In | and/or semantics changes, making them applicable to EVC. In | |||
addition, some of the codec-agnostic SEI Messages of the VSEI | addition, some of the codec-agnostic SEI messages of the VSEI | |||
specification are also mapped. | specification [VSEI] are also mapped. | |||
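The toolset rule described under "Profile, level, and toolsets" above (a bitstream that uses a coding tool not indicated in the toolset would be non-compliant) amounts to a subset test on bitmasks. The following is a minimal sketch of that test only. The function names and the bit positions used in the usage example are hypothetical; the actual bit assignments of toolset_idc_h and toolset_idc_l are defined in [EVC] and are not reproduced here. A single integer mask stands in for either syntax element.

```python
def unsignaled_tools(signaled_toolset: int, tools_in_use: int) -> int:
    """Return a bitmask of tools that are in use but not indicated in the toolset.

    Both arguments are non-negative integers used as bitmasks. The meaning of
    each bit is an assumption of the caller; it is not taken from [EVC].
    """
    if signaled_toolset < 0 or tools_in_use < 0:
        raise ValueError("bitmasks must be non-negative")
    return tools_in_use & ~signaled_toolset


def toolset_consistent(signaled_toolset: int, tools_in_use: int) -> bool:
    """True if every tool in use is also indicated in the signaled toolset."""
    return unsignaled_tools(signaled_toolset, tools_in_use) == 0
```

For example, with hypothetical bits 0 and 2 signaled (mask 0b0101), a stream using only bit 2 passes the check, while a stream that also uses bit 1 does not. A walled garden deployment of the kind mentioned above could run such a check at session setup, whereas a generic EVC decoder has no obligation to do so.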
1.1.3. Parallel Processing Support (informative) | 1.1.3. Parallel Processing Support (Informative) | |||
EVC's Baseline profile includes no tools specifically addressing | EVC's Baseline profile includes no tools specifically addressing | |||
parallel processing support. The Main profile includes | parallel-processing support. The Main profile includes independently | |||
independently decodable slices for parallel processing. The | decodable slices for parallel processing. The slices are defined as | |||
slices are defined as any rectangular region within a picture and | any rectangular region within a picture. They can be encoded to have | |||
can be encoded to have no coding dependencies with other slices in | coding dependencies with other slices from the previous picture but | |||
the same picture but with other slices from the previous picture. | not with other slices in the same picture. No specific support for | |||
No specific support for parallel processing is specified in this | parallel processing is specified in this RTP payload format. | |||
RTP payload format. | ||||
1.1.4. NAL Unit Header | 1.1.4. NAL Unit Header | |||
EVC maintains the NAL unit concept of [VVC] with different parameter | EVC maintains the NAL unit concept of [VVC] with different parameter | |||
options. EVC also uses a two-byte NAL unit header, as shown in | options. EVC also uses a two-byte NAL unit header, as shown in | |||
Figure 1. The payload of a NAL unit refers to the NAL unit excluding | Figure 1. The payload of a NAL unit refers to the NAL unit excluding | |||
the NAL unit header. | the NAL unit header. | |||
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
|F| Type | TID | Reserve |E| | |F| Type | TID | Reserve |E| | |||
+-------------+-----------------+ | +-------------+-----------------+ | |||
The Structure of the EVC NAL Unit Header | Figure 1: The Structure of the EVC NAL Unit Header | |||
Figure 1 | ||||
The semantics of the fields in the NAL unit header are as specified | The semantics of the fields in the NAL unit header are as specified | |||
in EVC and described briefly below for convenience. In addition to | in EVC and described briefly below for convenience. In addition to | |||
the name and size of each field, the corresponding syntax element | the name and size of each field, the corresponding syntax element | |||
name in EVC is also provided. | name in EVC is also provided. | |||
F: 1 bit | F: 1 bit | |||
forbidden_zero_bit. Required to be zero in EVC. Note that the | forbidden_zero_bit: Required to be zero in EVC. Note that the | |||
inclusion of this bit in the NAL unit header was included to | inclusion of this bit in the NAL unit header was included to | |||
enable transport of EVC video over MPEG-2 transport systems | enable transport of EVC video over MPEG-2 transport systems | |||
(avoidance of start code emulations) [MPEG2S]. In this document, | (avoidance of start code emulations) [MPEG2S]. In this | |||
the value 1 may be used to indicate a syntax violation, e.g., for | document, the value 1 may be used to indicate a syntax | |||
a NAL unit resulting from aggregating a number of fragmented units | violation, e.g., for a NAL unit resulting from aggregating a | |||
of a NAL unit but missing the last fragment, as described in | number of fragmented units of a NAL unit but missing the last | |||
Section 4.3.3. | fragment, as described in Section 4.3.3. | |||
Type: 6 bits | Type: 6 bits | |||
nal_unit_type_plus1. This field allows the NAL Unit Type to be | nal_unit_type_plus1: This field allows the NAL Unit Type to be | |||
computed. The NAL Unit Type (NalUnitType) is equal to the value | computed. The NAL Unit Type (NalUnitType) is equal to the | |||
found in this field, minus 1; in other words: | value found in this field, minus 1; in other words: | |||
NalUnitType = nal_unit_type_plus1 - 1. | NalUnitType = nal_unit_type_plus1 - 1. | |||
The NAL unit type is detailed in Table 4 of [EVC]. If the value | The NAL unit type is detailed in Table 4 of [EVC]. If the | |||
of NalUnitType is less than or equal to 23, the NAL unit is a VCL | value of NalUnitType is less than or equal to 23, the NAL unit | |||
NAL unit. Otherwise, the NAL unit is a non-VCL NAL unit. For a | is a VCL NAL unit. Otherwise, the NAL unit is a non-VCL NAL | |||
reference of all currently defined NAL unit types and their | unit. For a reference of all currently defined NAL unit types | |||
semantics, please refer to Section 7.4.2.2 in [EVC]. Note that | and their semantics, please refer to Section 7.4.2.2 of [EVC]. | |||
nal_unit_type_plus1 MUST NOT be zero. | Note that nal_unit_type_plus1 MUST NOT be zero. | |||
TID: 3 bits | TID: 3 bits | |||
nuh_temporal_id. This field specifies the temporal identifier of | nuh_temporal_id: This field specifies the temporal identifier of | |||
the NAL unit. The value of TemporalId is equal to TID. | the NAL unit. The value of TemporalId is equal to TID. | |||
TemporalId shall be equal to 0 if it is an IDR NAL unit type (NAL | TemporalId shall be equal to 0 if it is an IDR NAL unit type | |||
unit type 1). | (NAL unit type 1). | |||
Reserve: 5 bits | Reserve: 5 bits | |||
nuh_reserved_zero_5bits. This field shall be equal to the version | nuh_reserved_zero_5bits: This field shall be equal to the version | |||
of the EVC standard. Values of nuh_reserved_zero_5bits greater | of the EVC standard. Values of nuh_reserved_zero_5bits greater | |||
than 0 are reserved for future use by ISO/IEC. Decoders | than 0 are reserved for future use by ISO/IEC. Decoders | |||
conforming to a profile specified in [EVC]'s Annex A shall ignore | conforming to a profile specified in Annex A of [EVC] shall | |||
(i.e., remove from the bitstream and discard) all NAL units with | ignore (i.e., remove from the bitstream and discard) all NAL | |||
values of nuh_reserved_zero_5bits greater than 0. | units with values of nuh_reserved_zero_5bits greater than 0. | |||
E: 1 bit | E: 1 bit | |||
nuh_extension_flag. This field shall be equal to the version of | nuh_extension_flag: This field shall be equal to the version of | |||
the EVC standard. The value of nuh_extension_flag equal to 1 is | the EVC standard. The value of nuh_extension_flag equal to 1 | |||
reserved for future use by ISO/IEC. Decoders conforming to a | is reserved for future use by ISO/IEC. Decoders conforming to | |||
profile specified in [EVC]'s Annex A shall ignore (i.e., remove | a profile specified in Annex A of [EVC] shall ignore (i.e., | |||
from the bitstream and discard) all NAL units with values of | remove from the bitstream and discard) all NAL units with | |||
nuh_extension_flag equal to 1. | values of nuh_extension_flag equal to 1. | |||
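The field layout in Figure 1 can be illustrated with a short parsing sketch. It assumes that the leftmost bit in the figure is the most significant bit of the first byte, as is conventional for such diagrams. The class and function names are illustrative only and are not defined by EVC or by this payload format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvcNalHeader:
    forbidden_zero_bit: int        # F, 1 bit
    nal_unit_type_plus1: int       # Type, 6 bits
    nuh_temporal_id: int           # TID, 3 bits
    nuh_reserved_zero_5bits: int   # Reserve, 5 bits
    nuh_extension_flag: int        # E, 1 bit

    @property
    def nal_unit_type(self) -> int:
        # NalUnitType = nal_unit_type_plus1 - 1
        return self.nal_unit_type_plus1 - 1

    @property
    def is_vcl(self) -> bool:
        # NalUnitType values up to and including 23 denote VCL NAL units.
        return self.nal_unit_type <= 23

    @property
    def reserved_for_future_use(self) -> bool:
        # NAL units with either field set are to be ignored by decoders
        # conforming to a profile in Annex A of [EVC].
        return self.nuh_reserved_zero_5bits > 0 or self.nuh_extension_flag == 1


def parse_evc_nal_header(data: bytes) -> EvcNalHeader:
    """Parse the two-byte NAL unit header at the start of `data`."""
    if len(data) < 2:
        raise ValueError("an EVC NAL unit header is two bytes long")
    word = (data[0] << 8) | data[1]
    header = EvcNalHeader(
        forbidden_zero_bit=(word >> 15) & 0x01,
        nal_unit_type_plus1=(word >> 9) & 0x3F,
        nuh_temporal_id=(word >> 6) & 0x07,
        nuh_reserved_zero_5bits=(word >> 1) & 0x1F,
        nuh_extension_flag=word & 0x01,
    )
    if header.nal_unit_type_plus1 == 0:
        raise ValueError("nal_unit_type_plus1 MUST NOT be zero")
    return header
```

For instance, the bytes 0x04 0x00 carry nal_unit_type_plus1 equal to 2, that is, NalUnitType 1 (IDR) with TemporalId 0. The sketch does not reject a header whose F bit is 1, because this document permits that value to flag a syntax violation.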
1.2. Overview of the Payload Format | 1.2. Overview of the Payload Format | |||
This payload format defines the following processes required for | This payload format defines the following processes required for | |||
transport of EVC-coded data over RTP [RFC3550]: | transport of EVC-coded data over RTP [RFC3550]: | |||
* usage of RTP header with this payload format | * usage of RTP header with this payload format | |||
* packetization of EVC-coded NAL units into RTP packets using three | * packetization of EVC-coded NAL units into RTP packets using three | |||
types of payload structures: a single NAL unit, aggregation, and | types of payload structures: a single NAL unit, aggregation, and | |||
fragment unit | fragment unit | |||
* transmission of EVC NAL units of the same bitstream within a | * transmission of EVC NAL units of the same bitstream within a | |||
single RTP stream. | single RTP stream | |||
* media type parameters to be used with the Session Description | * usage of media type parameters to be used with the Session | |||
Protocol (SDP) [RFC8866] | Description Protocol (SDP) [RFC8866] | |||
* usage of RTCP feedback messages | * usage of RTCP feedback messages | |||
2. Conventions | 2. Conventions | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown above. | capitals, as shown here. | |||
3. Definitions and Abbreviations | 3. Definitions and Abbreviations | |||
3.1. Definitions | 3.1. Definitions | |||
This document uses the terms and definitions of EVC. Section 3.1.1 | This document uses the terms and definitions of EVC. Section 3.1.1 | |||
lists relevant definitions from [EVC] for convenience. Section 3.1.2 | lists relevant definitions from [EVC] for convenience. Section 3.1.2 | |||
provides definitions specific to this document. | provides definitions specific to this document. | |||
3.1.1. Definitions from the EVC Standard | 3.1.1. Definitions from the EVC Standard | |||
Access Unit: A set of NAL units that are associated with each other | Access Unit (AU): | |||
according to a specified classification rule, are consecutive in | A set of NAL units that are associated with each other according | |||
decoding order, and contain exactly one coded picture. | to a specified classification rule, are consecutive in decoding | |||
order, and contain exactly one coded picture. | ||||
Adaptation parameter set (APS): A syntax structure containing syntax | Adaptation Parameter Set (APS): | |||
elements that apply to zero or more slices as determined by zero or | A syntax structure containing syntax elements that apply to zero | |||
more syntax elements found in slice headers. | or more slices as determined by zero or more syntax elements found | |||
in slice headers. | ||||
Bitstream: A sequence of bits, in the form of a NAL unit stream or a | Bitstream: | |||
byte stream, that forms the representation of coded pictures and | A sequence of bits, in the form of a NAL unit stream or a byte | |||
associated data forming one or more coded video sequences (CVSs). | stream, that forms the representation of coded pictures and | |||
associated data forming one or more CVSs. | ||||
Coded Picture: A coded representation of a picture containing all | Coded Picture: | |||
CTUs of the picture. | A coded representation of a picture containing all CTUs of the | |||
picture. | ||||
Coded Video Sequence (CVS): A sequence of access units that consists, | Coded Video Sequence (CVS): | |||
in decoding order, of an IDR access unit, followed by zero or more | A sequence of access units that consists, in decoding order, of an | |||
access units that are not IDR access units, including all subsequent | IDR access unit, followed by zero or more access units that are | |||
access units up to but not including any subsequent access unit that | not IDR access units, including all subsequent access units up to | |||
is an IDR access unit. | but not including any subsequent access unit that is an IDR access | |||
unit. | ||||
Coding Tree Block (CTB): An NxN block of samples for some value of N | Coding Tree Block (CTB): | |||
such that the division of a component into CTBs is a partitioning. | An NxN block of samples for some value of N such that the division | |||
of a component into CTBs is a partitioning. | ||||
Coding Tree Unit (CTU): A CTB of luma samples, two corresponding CTBs | Coding Tree Unit (CTU): | |||
of chroma samples of a picture that has three sample arrays, or a CTB | A CTB of luma samples, two corresponding CTBs of chroma samples of | |||
of samples of a monochrome picture or a picture that is coded using | a picture that has three sample arrays, or a CTB of samples of a | |||
three separate colour planes and syntax structures used to code the | monochrome picture or a picture that is coded using three separate | |||
samples. | color planes and syntax structures used to code the samples. | |||
Decoded Picture: A decoded picture is derived by decoding a coded | Decoded Picture: | |||
picture. | A decoded picture is derived by decoding a coded picture. | |||
Decoded Picture Buffer (DPB): A buffer holding decoded pictures for | Decoded Picture Buffer (DPB): | |||
reference, output reordering, or output delay specified for the | A buffer holding decoded pictures for reference, output | |||
hypothetical reference decoder in Annex C of [EVC] standard. | reordering, or output delay specified for the hypothetical | |||
reference decoder in Annex C of the [EVC] standard. | ||||
Dynamic Range Adjustment (DRA): A mapping process that is applied to | Dynamic Range Adjustment (DRA): | |||
decoded picture prior to cropping and output as part of the decoding | A mapping process that is applied to the decoded picture prior to | |||
process and is controlled by parameters conveyed in an Adaptation | cropping and output as part of the decoding process; it is | |||
Parameter Set (APS). | controlled by parameters conveyed in an Adaptation Parameter Set | |||
(APS). | ||||
Hypothetical Reference Decoder (HRD): A hypothetical decoder model | Hypothetical Reference Decoder (HRD): | |||
that specifies constraints on the variability of conforming NAL unit | A hypothetical decoder model that specifies constraints on the | |||
streams or conforming byte streams that an encoding process may | variability of conforming NAL unit streams or conforming byte | |||
produce. | streams that an encoding process may produce. | |||
IDR access unit: access unit in which the coded picture is an IDR | IDR Access Unit: | |||
picture. | An access unit in which the coded picture is an IDR picture. | |||
IDR picture: coded picture for which each VCL NAL unit has | IDR Picture: | |||
NalUnitType equal to IDR_NUT. | The coded picture for which each VCL NAL unit has NalUnitType | |||
equal to IDR_NUT. | ||||
Level: A defined set of constraints on the values that may be taken | Level: | |||
by the syntax elements and variables of this document, or the value | A defined set of constraints on the values that may be taken by | |||
of a transform coefficient prior to scaling. | the syntax elements and variables of this document, or the value | |||
of a transform coefficient prior to scaling. | ||||
Network Abstraction Layer (NAL) unit: A syntax structure containing | Network Abstraction Layer (NAL) Unit: | |||
an indication of the type of data to follow and bytes containing that | A syntax structure containing an indication of the type of data to | |||
data in the form of an RBSP interspersed as necessary. | follow and bytes containing that data in the form of an RBSP | |||
interspersed as necessary. | ||||
Network Abstraction Layer (NAL) Unit Stream: A sequence of NAL units. | Network Abstraction Layer (NAL) Unit Stream: | |||
A sequence of NAL units. | ||||
Non-IDR Picture: A coded picture that is not an IDR picture. | Non-IDR Picture: | |||
A coded picture that is not an IDR picture. | ||||
Non-VCL NAL Unit: A NAL unit that is not a VCL NAL unit. | Non-VCL NAL Unit: | |||
A NAL unit that is not a VCL NAL unit. | ||||
Picture Parameter Set (PPS): A syntax structure containing syntax | Picture Parameter Set (PPS): | |||
elements that apply to zero or more entire coded pictures as | A syntax structure containing syntax elements that apply to zero | |||
determined by a syntax element found in each slice header. | or more entire coded pictures as determined by a syntax element | |||
found in each slice header. | ||||
Picture Order Count (POC): A variable that is associated with each | Picture Order Count (POC): | |||
picture, uniquely identifies the associated picture among all | A variable that is associated with each picture, uniquely | |||
pictures in the CVS, and, when the associated picture is to be output | identifies the associated picture among all pictures in the CVS, | |||
from the decoded picture buffer, indicates the position of the | and (when the associated picture is to be output from the DPB) | |||
associated picture in output order relative to the output order | indicates the position of the associated picture in output order | |||
positions of the other pictures in the same CVS that are to be output | relative to the output order positions of the other pictures in | |||
from the decoded picture buffer. | the same CVS that are to be output from the DPB. | |||
Raw Byte Sequence Payload (RBSP): A syntax structure containing an | Raw Byte Sequence Payload (RBSP): | |||
integer number of bytes that is encapsulated in a NAL unit and that | A syntax structure containing an integer number of bytes that is | |||
is either empty or has the form of a string of data bits containing | encapsulated in a NAL unit and that is either empty or has the | |||
syntax elements followed by an RBSP stop bit and zero or more | form of a string of data bits containing syntax elements followed | |||
subsequent bits equal to 0. | by an RBSP stop bit and zero or more subsequent bits equal to 0. | |||
Sequence Parameter Set (SPS): A syntax structure containing syntax | Sequence Parameter Set (SPS): | |||
elements that apply to zero or more entire CVSs as determined by the | A syntax structure containing syntax elements that apply to zero | |||
content of a syntax element found in the PPS referred to by a syntax | or more entire CVSs as determined by the content of a syntax | |||
element found in each slice header. | element found in the PPS referred to by a syntax element found in | |||
each slice header. | ||||
Slice: integer number of tiles of a picture in the tile scan of the | Slice: | |||
picture and that are exclusively contained in a single NAL unit. | An integer number of tiles of a picture in the tile scan of the | |||
picture, exclusively contained in a single NAL unit. | ||||
Tile: rectangular region of CTUs within a particular tile column and | Tile: | |||
a particular tile row in a picture. | A rectangular region of CTUs within a particular tile column and a | |||
particular tile row in a picture. | ||||
Tile column: rectangular region of CTUs having a height equal to the | Tile Column: | |||
height of the picture and width specified by syntax elements in the | A rectangular region of CTUs having a height equal to the height | |||
PPS. | of the picture and width specified by syntax elements in the PPS. | |||
Tile row: A rectangular region of CTUs having a height specified by | Tile Row: | |||
syntax elements in the PPS and a width equal to the width of the | A rectangular region of CTUs having a height specified by syntax | |||
picture. | elements in the PPS and a width equal to the width of the picture. | |||
Tile scan: A specific sequential ordering of CTUs partitioning a | Tile Scan: | |||
picture in which the CTUs are ordered consecutively in CTU raster | A specific sequential ordering of CTUs partitioning a picture in | |||
scan in a tile whereas tiles in a picture are ordered consecutively | which the CTUs are ordered consecutively in CTU raster scan in a | |||
in a raster scan of the tiles of the picture. | tile, whereas tiles in a picture are ordered consecutively in a | |||
raster scan of the tiles of the picture. | ||||
Video coding layer (VCL) NAL unit: A collective term for coded slice | Video Coding Layer (VCL) NAL Unit: | |||
NAL units and the subset of NAL units that have reserved values of | A collective term for coded slice NAL units and the subset of NAL | |||
NalUnitType that are classified as VCL NAL units in this document. | units that have reserved values of NalUnitType that are classified | |||
as VCL NAL units in this document. | ||||
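As an illustration of the tile scan definition above, the following sketch lists CTU raster-scan addresses in tile scan order. It assumes that the tile column widths and tile row heights are already known in units of CTUs; in a real bitstream, they are derived from syntax elements in the PPS, which this sketch does not parse. The function and parameter names are hypothetical.

```python
from typing import List, Sequence


def tile_scan_order(col_widths: Sequence[int],
                    row_heights: Sequence[int]) -> List[int]:
    """Return CTU raster-scan addresses of a picture, listed in tile scan order.

    col_widths:  width of each tile column, in CTUs, left to right.
    row_heights: height of each tile row, in CTUs, top to bottom.
    """
    pic_width = sum(col_widths)
    order: List[int] = []
    y0 = 0
    for height in row_heights:              # tiles in raster scan: row by row
        x0 = 0
        for width in col_widths:            # then left to right within a tile row
            for y in range(y0, y0 + height):        # CTUs in raster scan
                for x in range(x0, x0 + width):     # inside the current tile
                    order.append(y * pic_width + x)
            x0 += width
        y0 += height
    return order
```

With two tile columns of widths 2 and 1 and a single tile row of height 2, the picture is 3x2 CTUs and the call `tile_scan_order([2, 1], [2])` yields `[0, 1, 3, 4, 2, 5]`: all four CTUs of the left tile come first, followed by the two CTUs of the right tile.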
3.1.2. Definitions Specific to This Document | 3.1.2. Definitions Specific to This Document | |||
Media-Aware Network Element (MANE): A network element, such as a | Media-Aware Network Element (MANE): | |||
middlebox, selective forwarding unit, or application-layer gateway | A network element, such as a middlebox, selective forwarding unit, | |||
that is capable of parsing certain aspects of the RTP payload headers | or application-layer gateway, that is capable of parsing certain | |||
or the RTP payload and reacting to their contents. | aspects of the RTP payload headers or the RTP payload and reacting | |||
to their contents. | ||||
Informative note: The concept of a MANE goes beyond normal routers | | Informative note: The concept of a MANE goes beyond normal | |||
or gateways in that a MANE has to be aware of the signaling (e.g., | | routers or gateways in that a MANE has to be aware of the | |||
to learn about the payload type mappings of the media streams), | | signaling (e.g., to learn about the payload type mappings of | |||
and in that it has to be trusted when working with Secure RTP | | the media streams), and in that it has to be trusted when | |||
(SRTP). The advantage of using MANEs is that they allow packets | | working with Secure RTP (SRTP). The advantage of using | |||
to be dropped according to the needs of the media coding. For | | MANEs is that they allow packets to be dropped according to | |||
example, if a MANE has to drop packets due to congestion on a | | the needs of the media coding. For example, if a MANE has | |||
certain link, it can identify and remove those packets whose | | to drop packets due to congestion on a certain link, it can | |||
elimination produces the least adverse effect on the user | | identify and remove those packets whose elimination produces | |||
experience. After dropping packets, MANEs must rewrite RTCP | | the least adverse effect on the user experience. After | |||
packets to match the changes to the RTP stream, as specified in | | dropping packets, MANEs must rewrite RTCP packets to match | |||
Section 7 of [RFC3550]. | | the changes to the RTP stream, as specified in Section 7 of | |||
| [RFC3550]. | ||||
NAL unit decoding order: A NAL unit order that conforms to the | NAL unit decoding order: | |||
constraints on NAL unit order given in Section 7.4.2.3 in [EVC], | A NAL unit order that conforms to the constraints on NAL unit | |||
follow the order of NAL units in the bitstream. | order given in Section 7.4.2.3 of [EVC] and follows the order of | |||
NAL units in the bitstream. | ||||
NALU-time: The value that the RTP timestamp would have if the NAL | NALU-time: | |||
unit would be transported in its own RTP packet. | The value that the RTP timestamp would have if the NAL unit would | |||
be transported in its own RTP packet. | ||||
NAL unit output order: A NAL unit order in which NAL units of | NAL unit output order: | |||
different access units are in the output order of the decoded | A NAL unit order in which NAL units of different access units are | |||
pictures corresponding to the access units, as specified in [EVC], | in the output order of the decoded pictures corresponding to the | |||
and in which NAL units within an access unit are in their decoding | access units, as specified in [EVC], and in which NAL units within | |||
order. | an access unit are in their decoding order. | |||
RTP stream: See [RFC7656]. Within the scope of this document, one | RTP stream: | |||
RTP stream is utilized to transport a EVC bitstream, which may | See [RFC7656]. Within the scope of this document, one RTP stream | |||
contain one or more temporal sub-layers. | is utilized to transport an EVC bitstream, which may contain one | |||
or more temporal sub-layers. | ||||
Transmission order: The order of packets in ascending RTP sequence | Transmission order: | |||
number order (in modulo arithmetic). Within an aggregation packet, | The order of packets in ascending RTP sequence number order (in | |||
the NAL unit transmission order is the same as the order of | modulo arithmetic). Within an Aggregation Packet (AP), the NAL | |||
appearance of NAL units in the packet. | unit transmission order is the same as the order of appearance of | |||
NAL units in the packet. | ||||
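As an illustration of "ascending RTP sequence number order (in modulo arithmetic)", the following sketch compares two 16-bit sequence numbers while allowing for wraparound. The function name seq_is_after and the half-range convention (a forward distance below 32768 counts as "later") are assumptions made for this example; the document itself does not prescribe a comparison routine.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide


def seq_is_after(a: int, b: int) -> bool:
    """Return True if sequence number a follows b, allowing for wraparound.

    Assumption for this sketch: a forward distance of less than half the
    sequence space (32768) means "a was sent after b".
    """
    diff = (a - b) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2
```

With this convention, sequence number 1 is treated as following 65535, which matches the wraparound case a receiver sees in practice.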
3.2. Abbreviations | 3.2. Abbreviations | |||
AU Access Unit | AU Access Unit | |||
AP Aggregation Packet | AP Aggregation Packet | |||
APS Adaptation Parameter Set | APS Adaptation Parameter Set | |||
ATS Adaptive Transform Selection | ATS Adaptive Transform Selection | |||
B Bi-predictive | B Bi-predictive | |||
CBR Constant Bit Rate | CBR Constant Bit Rate | |||
CPB Coded Picture Buffer | ||||
CTB Coding Tree Block | CPB Coded Picture Buffer | |||
CTU Coding Tree Unit | CTB Coding Tree Block | |||
CVS Coded Video Sequence | CTU Coding Tree Unit | |||
DPB Decoded Picture Buffer | CVS Coded Video Sequence | |||
HRD Hypothetical Reference Decoder | DPB Decoded Picture Buffer | |||
HSS Hypothetical Stream Scheduler | HRD Hypothetical Reference Decoder | |||
I Intra | HSS Hypothetical Stream Scheduler | |||
IDR Instantaneous Decoding Refresh | I Intra | |||
LSB Least Significant Bit | IDR Instantaneous Decoding Refresh | |||
LTRP Long-Term Reference Picture | LSB Least Significant Bit | |||
MMVD Merge with Motion Vector Difference | LTRP Long-Term Reference Picture | |||
MSB Most Significant Bit | MMVD Merge with Motion Vector Difference | |||
NAL Network Abstraction Layer | MSB Most Significant Bit | |||
P Predictive | NAL Network Abstraction Layer | |||
POC Picture Order Count | P Predictive | |||
PPS Picture Parameter Set | POC Picture Order Count | |||
QP Quantization Parameter | PPS Picture Parameter Set | |||
RBSP Raw Byte Sequence Payload | QP Quantization Parameter | |||
RGB Same as GBR | RBSP Raw Byte Sequence Payload | |||
SAR Sample Aspect Ratio | RGB Red, Green, and Blue | |||
SEI Supplemental Enhancement Information | SAR Sample Aspect Ratio | |||
SODB String Of Data Bits | SEI Supplemental Enhancement Information | |||
SPS Sequence Parameter Set | SODB String Of Data Bits | |||
STRP Short-Term Reference Picture | ||||
VBR Variable Bit Rate | SPS Sequence Parameter Set | |||
VCL Video Coding Layer | STRP Short-Term Reference Picture | |||
VBR Variable Bit Rate | ||||
VCL Video Coding Layer | ||||
4. RTP Payload Format | 4. RTP Payload Format | |||
4.1. RTP Header Usage | 4.1. RTP Header Usage | |||
The format of the RTP header is specified in [RFC3550] (reprinted as | The format of the RTP header is specified in [RFC3550] (included as | |||
Figure 2 for convenience). This payload format uses the fields of | Figure 2 for convenience). This payload format uses the fields of | |||
the header in a manner consistent with that specification. | the header in a manner consistent with that specification. | |||
The RTP payload (and the settings for some RTP header bits) for | The RTP payload (and the settings for some RTP header bits) for APs | |||
aggregation packets and fragmentation units are specified in | and Fragmentation Units (FUs) are specified in Sections 4.3.2 and | |||
Section 4.3.2 and Section 4.3.3, respectively. | 4.3.3, respectively. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
|V=2|P|X| CC |M| PT | sequence number | | |V=2|P|X| CC |M| PT | sequence number | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| timestamp | | | timestamp | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| synchronization source (SSRC) identifier | | | synchronization source (SSRC) identifier | | |||
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ | |||
| contributing source (CSRC) identifiers | | | contributing source (CSRC) identifiers | | |||
| .... | | | .... | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
RTP Header According to [RFC3550] | Figure 2: RTP Header According to RFC 3550 | |||
Figure 2 | ||||
The RTP header information to be set according to this RTP payload | The RTP header information to be set according to this RTP payload | |||
format is set as follows: | format is set as follows: | |||
Marker bit (M): 1 bit | Marker bit (M): 1 bit | |||
Set for the last packet of the access unit, carried in the current | Set for the last packet of the access unit and carried in the | |||
RTP stream. This is in line with the normal use of the M bit in | current RTP stream. This is in line with the normal use of the M | |||
video formats to allow an efficient playout buffer handling. | bit in video formats to allow an efficient playout buffer | |||
handling. | ||||
Payload Type (PT): 7 bits | ||||
Payload Type (PT): 7 bits | ||||
The assignment of an RTP payload type for this new payload format | The assignment of an RTP payload type for this new payload format | |||
is outside the scope of this document and will not be specified | is outside the scope of this document and will not be specified | |||
here. The assignment of a payload type has to be performed either | here. The assignment of a payload type has to be performed either | |||
through the profile used or in a dynamic way. | through the profile used or in a dynamic way. | |||
Sequence Number (SN): 16 bits | Sequence Number (SN): 16 bits | |||
Set and used in accordance with [RFC3550]. | Set and used in accordance with [RFC3550]. | |||
Timestamp: 32 bits | Timestamp: 32 bits | |||
The RTP timestamp is set to the sampling timestamp of the content. | The RTP timestamp is set to the sampling timestamp of the content. | |||
A 90 kHz clock rate MUST be used. If the NAL unit has no timing | A 90 kHz clock rate MUST be used. If the NAL unit has no timing | |||
properties of its own (e.g., parameter sets or certain SEI NAL | properties of its own (e.g., parameter sets or certain SEI NAL | |||
units), the RTP timestamp MUST be set to the RTP timestamp of the | units), the RTP timestamp MUST be set to the RTP timestamp of the | |||
coded picture of the access unit in which the NAL unit is | coded picture of the access unit in which the NAL unit is | |||
included. For SEI messages, this information is specified in | included. For SEI messages, this information is specified in | |||
Annex D of [EVC]. Receivers MUST use the RTP timestamp for the | Annex D of [EVC]. Receivers MUST use the RTP timestamp for the | |||
display process, even when the bitstream contains picture timing | display process, even when the bitstream contains picture timing | |||
SEI messages or decoding unit information SEI messages as | SEI messages or decoding unit information SEI messages as | |||
specified in [EVC]. | specified in [EVC]. | |||
Synchronization source (SSRC): 32 bits | Synchronization source (SSRC): 32 bits | |||
Used to identify the source of the RTP packets. According to this | Used to identify the source of the RTP packets. According to this | |||
document, a single SSRC is used for all parts of a single | document, a single SSRC is used for all parts of a single | |||
bitstream. | bitstream. | |||
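The 90 kHz clock requirement above can be sketched as a small conversion from a sampling instant in seconds to a 32-bit RTP timestamp. The function name rtp_timestamp and the initial_offset parameter are hypothetical; the offset stands in for the random initial timestamp value that RFC 3550 calls for, which a real sender would pick once per stream.

```python
RTP_CLOCK_HZ = 90_000  # clock rate required by this payload format
TS_MOD = 1 << 32       # the RTP timestamp field is 32 bits wide


def rtp_timestamp(sample_time_s: float, initial_offset: int = 0) -> int:
    """Map a sampling instant (seconds) to a 32-bit RTP timestamp.

    initial_offset is an assumed parameter representing the random
    starting value of the stream's timestamp.
    """
    ticks = round(sample_time_s * RTP_CLOCK_HZ)
    return (initial_offset + ticks) % TS_MOD
```

All NAL units of one access unit, including parameter sets and SEI NAL units without timing of their own, would share the value computed for that access unit's coded picture.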
4.2. Payload Header Usage | 4.2. Payload Header Usage | |||
The first two bytes of the payload of an RTP packet are referred to | The first two bytes of the payload of an RTP packet are referred to | |||
as the payload header. The payload header consists of the same | as the payload header. The payload header consists of the same | |||
fields (F, TID, Reserve and E) as the NAL unit header as shown in | fields (F, TID, Reserve, and E) as the NAL unit header, as shown in | |||
Section 1.1.4, irrespective of the type of the payload structure. | Section 1.1.4, irrespective of the type of the payload structure. | |||
The TID value indicates (among other things) the relative importance | The TID value indicates (among other things) the relative importance | |||
of an RTP packet, for example, because NAL units with larger TID | of an RTP packet, for example, because NAL units with larger TID | |||
value are not used for the decoding of the ones with smaller TID | values are not used to decode the ones with smaller TID values. A | |||
value. A lower value of TID indicates a higher importance. More- | lower value of TID indicates a higher importance. More important NAL | |||
important NAL units MAY be better protected against transmission | units MAY be better protected against transmission losses than less | |||
losses than less-important NAL units. | important NAL units. | |||
4.3. Payload Structures | 4.3. Payload Structures | |||
Three different types of RTP packet payload structures are specified. | Three different types of RTP packet payload structures are specified. | |||
A receiver can identify the type of an RTP packet payload through the | A receiver can identify the type of an RTP packet payload through the | |||
Type field in the payload header. | Type field in the payload header. | |||
The three different payload structures are as follows: | The three different payload structures are as follows: | |||
* Single NAL unit packet: Contains a single NAL unit in the payload, | * Single NAL unit packet: Contains a single NAL unit in the payload, | |||
skipping to change at page 18, line 20 | skipping to change at line 837 | |||
* Aggregation Packet (AP): Contains more than one NAL unit within | * Aggregation Packet (AP): Contains more than one NAL unit within | |||
one access unit. This payload structure is specified in | one access unit. This payload structure is specified in | |||
Section 4.3.2. | Section 4.3.2. | |||
* Fragmentation Unit (FU): Contains a subset of a single NAL unit. | * Fragmentation Unit (FU): Contains a subset of a single NAL unit. | |||
This payload structure is specified in Section 4.3.3. | This payload structure is specified in Section 4.3.3. | |||
4.3.1. Single NAL Unit Packets | 4.3.1. Single NAL Unit Packets | |||
A single NAL unit packet contains exactly one NAL unit, and consists | A single NAL unit packet contains exactly one NAL unit and consists | |||
of a payload header as defined in Table 4 of [EVC] (denoted as | of a payload header as defined in Table 4 of [EVC] (denoted as | |||
PayloadHdr), followed by a conditional 16-bit DONL field (in network | PayloadHdr), followed by a conditional 16-bit DONL field (in network | |||
byte order), and the NAL unit payload data (the NAL unit excluding | byte order), and the NAL unit payload data (the NAL unit excluding | |||
its NAL unit header) of the contained NAL unit, as shown in Figure 3. | its NAL unit header) of the contained NAL unit, as shown in Figure 3. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| PayloadHdr | DONL (conditional) | | | PayloadHdr | DONL (conditional) | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
| NAL unit payload data | | | NAL unit payload data | | |||
| | | | | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| :...OPTIONAL RTP padding | | | :...OPTIONAL RTP padding | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The Structure of a Single NAL Unit Packet | Figure 3: The Structure of a Single NAL Unit Packet | |||
Figure 3 | ||||
The DONL field, when present, specifies the value of the 16 least | The DONL field, when present, specifies the value of the 16 least | |||
significant bits of the decoding order number of the contained NAL | significant bits of the decoding order number of the contained NAL | |||
unit. If sprop-max-don-diff (defined in Section 7.2 is greater than | unit. If sprop-max-don-diff (defined in Section 7.2) is greater than | |||
0, the DONL field MUST be present, and the variable DON for the | 0, the DONL field MUST be present, and the variable DON for the | |||
contained NAL unit is derived as equal to the value of the DONL | contained NAL unit is derived as equal to the value of the DONL | |||
field. Otherwise (sprop-max-don-diff is equal to 0), the DONL field | field. Otherwise (where sprop-max-don-diff is equal to 0), the DONL | |||
MUST NOT be present. | field MUST NOT be present. | |||
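The layout of a single NAL unit packet and the DONL presence rule can be sketched as follows. This is a minimal illustration, not a reference depacketizer: the function name parse_single_nal_packet is hypothetical, the payload is assumed to have had any RTP padding removed already, and the value of sprop-max-don-diff is assumed to be known from signaling.

```python
import struct


def parse_single_nal_packet(payload: bytes, sprop_max_don_diff: int):
    """Split a single NAL unit packet payload into (payload_hdr, don, nal_data).

    don is None when the DONL field is absent (sprop-max-don-diff == 0).
    """
    if len(payload) < 2:
        raise ValueError("payload shorter than the 2-byte payload header")
    payload_hdr = payload[:2]
    offset = 2
    don = None
    if sprop_max_don_diff > 0:
        # DONL is a 16-bit field in network byte order.
        if len(payload) < offset + 2:
            raise ValueError("DONL field expected but payload is too short")
        (don,) = struct.unpack_from("!H", payload, offset)
        offset += 2
    # Everything that remains is the NAL unit payload data.
    return payload_hdr, don, payload[offset:]
```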
4.3.2. Aggregation Packets (APs) | 4.3.2. Aggregation Packets (APs) | |||
Aggregation Packets (APs) enable the reduction of packetization | Aggregation Packets (APs) enable the reduction of packetization | |||
overhead for small NAL units, such as most of the non-VCL NAL units, | overhead for small NAL units, such as most of the non-VCL NAL units, | |||
which are often only a few octets in size. | which are often only a few octets in size. | |||
An AP aggregates NAL units of one access unit, and it MUST NOT | An AP aggregates NAL units of one access unit, and it MUST NOT | |||
contain NAL units from more than one AU. Each NAL unit to be carried | contain NAL units from more than one AU. Each NAL unit to be carried | |||
in an AP is encapsulated in an aggregation unit. NAL units | in an AP is encapsulated in an aggregation unit. NAL units | |||
aggregated in one AP are included in NAL-unit-decoding order. | aggregated in one AP are included in NAL-unit-decoding order. | |||
An AP consists of a payload header, as defined in Table 4 of [EVC] | An AP consists of a payload header, as defined in Table 4 of [EVC] | |||
(denoted here as PayloadHdr with Type=56) followed by two or more | (denoted here as PayloadHdr with Type=56), followed by two or more | |||
aggregation units, as shown in Figure 4. | aggregation units, as shown in Figure 4. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| PayloadHdr (Type=56) | | | | PayloadHdr (Type=56) | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | |||
| | | | | | |||
| two or more aggregation units | | | two or more aggregation units | | |||
| | | | | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| :...OPTIONAL RTP padding | | | :...OPTIONAL RTP padding | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The Structure of an Aggregation Packet | Figure 4: The Structure of an Aggregation Packet | |||
Figure 4 | ||||
The fields in the payload header of an AP are set as follows. The F | The fields in the payload header of an AP are set as follows. The F | |||
bit MUST be equal to 0 if the F bit of each aggregated NAL unit is | bit MUST be equal to 0 if the F bit of each aggregated NAL unit is | |||
equal to zero; otherwise, it MUST be equal to 1. The Type field MUST | equal to zero; otherwise, it MUST be equal to 1. The Type field MUST | |||
be equal to 56. | be equal to 56. | |||
The value of TID MUST be the smallest value of TID of all the | The value of TID MUST be the smallest value of TID of all the | |||
aggregated NAL units. The value of Reserve and E MUST be equal to 0 | aggregated NAL units. The value of Reserve and E MUST be equal to 0 | |||
for this specification. | for this specification. | |||
Informative note: All VCL NAL units in an AP have the same TID | | Informative note: All VCL NAL units in an AP have the same TID | |||
value since they belong to the same access unit. However, an AP | | value since they belong to the same access unit. However, an | |||
may contain non-VCL NAL units for which the TID value in the NAL | | AP may contain non-VCL NAL units for which the TID value in the | |||
unit header may be different from the TID value of the VCL NAL | | NAL unit header may be different from the TID value of the VCL | |||
units in the same AP. | | NAL units in the same AP. | |||
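The rules for the AP payload header fields can be summarized in a short sketch. The helper name ap_header_fields and the representation of each aggregated NAL unit as an (F, TID) pair are assumptions for illustration; the sketch returns field values only and does not pack them into the two-byte header.

```python
AP_TYPE = 56  # Type value that identifies an Aggregation Packet


def ap_header_fields(nal_headers):
    """Derive AP payload header field values.

    nal_headers: iterable of (f_bit, tid) pairs, one per aggregated NAL unit
    (a hypothetical representation chosen for this sketch).
    """
    nal_headers = list(nal_headers)
    if len(nal_headers) < 2:
        raise ValueError("an AP carries at least two aggregation units")
    # F is 0 only if every aggregated NAL unit has F equal to 0.
    f_bit = 1 if any(f for f, _ in nal_headers) else 0
    # TID is the smallest TID among the aggregated NAL units.
    tid = min(t for _, t in nal_headers)
    return {"F": f_bit, "Type": AP_TYPE, "TID": tid, "Reserve": 0, "E": 0}
```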
An AP MUST carry at least two aggregation units and can carry as many | An AP MUST carry at least two aggregation units and can carry as many | |||
aggregation units as necessary; however, the total amount of data in | aggregation units as necessary; however, the total amount of data in | |||
an AP obviously MUST fit into an IP packet, and the size SHOULD be | an AP obviously MUST fit into an IP packet, and the size SHOULD be | |||
chosen so that the resulting IP packet is smaller than the path MTU | chosen so that the resulting IP packet is smaller than the path MTU | |||
size so to avoid IP layer fragmentation. An AP MUST NOT contain FUs | size so to avoid IP layer fragmentation. An AP MUST NOT contain FUs | |||
specified in Section 4.3.3. APs MUST NOT be nested; i.e., an AP can | specified in Section 4.3.3. APs MUST NOT be nested; i.e., an AP | |||
not contain another AP. | cannot contain another AP. | |||
Informative note: If a receiver encounters nested Aggregation | | Informative note: If a receiver encounters nested APs, which is | |||
Packets, which is against the aforementioned requirement, it has | | against the aforementioned requirement, it has several options, | |||
several options, listed in order of ease of implementation: 1) | | listed in order of ease of implementation: 1) ignore the nested | |||
Ignore the nested AP; 2) Ignore the nested AP and report a "packet | | AP; 2) ignore the nested AP and report a "packet loss" to the | |||
loss" to the decoder, if such functionality exists in the API, 3) | | decoder, if such functionality exists in the API; and 3) | |||
Implement support for nested APs and extract the Network | | implement support for nested APs and extract the NAL units from | |||
Abstraction Layer (NAL) units from these nested APs. | | these nested APs. | |||
The first aggregation unit in an AP consists of a conditional 16-bit | The first aggregation unit in an AP consists of a conditional 16-bit | |||
DONL field (in network byte order) followed by a 16-bit unsigned size | DONL field (in network byte order) followed by a 16-bit unsigned size | |||
information (in network byte order) that indicates the size of the | information (in network byte order) that indicates the size of the | |||
NAL unit in bytes (excluding these two octets but including the NAL | NAL unit in bytes (excluding these two octets but including the NAL | |||
unit header), followed by the NAL unit itself, including its NAL unit | unit header), followed by the NAL unit itself, including its NAL unit | |||
header, as shown in Figure 5. | header, as shown in Figure 5. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| : DONL (conditional) | NALU size | | | : DONL (conditional) | NALU size | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| NALU size | | | | NALU size | | | |||
+-+-+-+-+-+-+-+-+ NAL unit | | +-+-+-+-+-+-+-+-+ NAL unit | | |||
| | | | | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| : | | : | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The Structure of the First Aggregation Unit in an AP | Figure 5: The Structure of the First Aggregation Unit in an AP | |||
Figure 5 | ||||
Informative note: The first octet of Figure 5 (indicated by the | | Informative note: The first octet of Figure 5 (indicated by the | |||
first colon) belongs to a previous aggregation unit. It is | | first colon) belongs to a previous aggregation unit. It is | |||
depicted to emphasize that aggregation units are octet aligned | | depicted to emphasize that aggregation units are octet aligned | |||
only. Similarly, the NAL unit carried in the aggregation unit can | | only. Similarly, the NAL unit carried in the aggregation unit | |||
terminate at the octet boundary. | | can terminate at the octet boundary. | |||
The DONL field, when present, specifies the value of the 16 least | The DONL field, when present, specifies the value of the 16 least | |||
significant bits of the decoding order number of the aggregated NAL | significant bits of the decoding order number of the aggregated NAL | |||
unit. | unit. | |||
If sprop-max-don-diff is greater than 0, the DONL field MUST be | If sprop-max-don-diff is greater than 0, the DONL field MUST be | |||
present in an aggregation unit that is the first aggregation unit in | present in an aggregation unit that is the first aggregation unit in | |||
an AP. The variable DON for the aggregated NAL unit is derived as | an AP. The variable DON for the aggregated NAL unit is derived as | |||
equal to the value of the DONL field, and the variable DON for an | equal to the value of the DONL field, and the variable Decoding Order | |||
aggregation unit that is not the first aggregation unit in an AP- | Number (DON) for an aggregation unit that is not the first | |||
aggregated NAL unit is derived as equal to the DON of the preceding | aggregation unit in an AP-aggregated NAL unit is derived as equal to | |||
aggregated NAL unit in the same AP plus 1 modulo 65536. Otherwise | the DON of the preceding aggregated NAL unit in the same AP plus 1 | |||
(sprop-max-don-diff is equal to 0), the DONL field MUST NOT be | modulo 65536. Otherwise (where sprop-max-don-diff is equal to 0), | |||
present in an aggregation unit that is the first aggregation unit in | the DONL field MUST NOT be present in an aggregation unit that is the | |||
an AP | first aggregation unit in an AP. | |||
An aggregation unit that is not the first aggregation unit in an AP | An aggregation unit that is not the first aggregation unit in an AP | |||
will be followed immediately by a 16-bit unsigned size information | will be followed immediately by a 16-bit unsigned size information | |||
(in network byte order) that indicates the size of the NAL unit in | (in network byte order) that indicates the size of the NAL unit in | |||
bytes (excluding these two octets but including the NAL unit header), | bytes (excluding these two octets but including the NAL unit header), | |||
followed by the NAL unit itself, including its NAL unit header, as | followed by the NAL unit itself, including its NAL unit header, as | |||
shown in Figure 6. | shown in Figure 6. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| : NALU size | NAL unit | | | : NALU size | NAL unit | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | |||
| | | | | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| : | | : | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The Structure of an Aggregation Unit That Is Not the First | Figure 6: The Structure of an Aggregation Unit That Is Not the First | |||
Aggregation Unit in an AP | Aggregation Unit in an AP | |||
Figure 6 | ||||
Informative note: The first octet of Figure 6 (indicated by the | | Informative note: The first octet of Figure 6 (indicated by the | |||
first colon) belongs to a previous aggregation unit. It is | | first colon) belongs to a previous aggregation unit. It is | |||
depicted to emphasize that aggregation units are octet aligned | | depicted to emphasize that aggregation units are octet aligned | |||
only. Similarly, the NAL unit carried in the aggregation unit can | | only. Similarly, the NAL unit carried in the aggregation unit | |||
terminate at the octet boundary. | | can terminate at the octet boundary. | |||
Figure 7 presents an example of an AP that contains two aggregation | Figure 7 presents an example of an AP that contains two aggregation | |||
units, labeled as NALU 1 and NALU 2 in the figure, without the DONL | units, labeled "NALU 1" and "NALU 2", without the DONL field being | |||
field being present. | present. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RTP Header | | | RTP Header | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| PayloadHdr (Type=56) | NALU 1 Size | | | PayloadHdr (Type=56) | NALU 1 Size | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| NALU 1 HDR | | | | NALU 1 HDR | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ NALU 1 Data | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ NALU 1 Data | | |||
skipping to change at page 22, line 26 | skipping to change at line 1018 | |||
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| . . . | NALU 2 Size | NALU 2 HDR | | | . . . | NALU 2 Size | NALU 2 HDR | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| NALU 2 HDR | | | | NALU 2 HDR | | | |||
+-+-+-+-+-+-+-+-+ NALU 2 Data | | +-+-+-+-+-+-+-+-+ NALU 2 Data | | |||
| . . . | | | . . . | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| :...OPTIONAL RTP padding | | | :...OPTIONAL RTP padding | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
An Example of an AP Packet Containing | Figure 7: An Example of an AP Packet Containing Two Aggregation | |||
Two Aggregation Units without the DONL Field | Units without the DONL Field | |||
Figure 7 | ||||
Figure 8 presents an example of an AP that contains two aggregation | Figure 8 presents an example of an AP that contains two aggregation | |||
units, labeled as NALU 1 and NALU 2 in the figure, with the DONL | units, labeled "NALU 1" and "NALU 2", with the DONL field being | |||
field being present. | present. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| RTP Header | | | RTP Header | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| PayloadHdr (Type=56) | NALU 1 DONL | | | PayloadHdr (Type=56) | NALU 1 DONL | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| NALU 1 Size | NALU 1 HDR | | | NALU 1 Size | NALU 1 HDR | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
skipping to change at page 23, line 27 | skipping to change at line 1047 | |||
+ . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| : NALU 2 Size | | | : NALU 2 Size | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| NALU 2 HDR | | | | NALU 2 HDR | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ NALU 2 Data | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ NALU 2 Data | | |||
| | | | | | |||
| . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | . . . +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| :...OPTIONAL RTP padding | | | :...OPTIONAL RTP padding | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
An Example of an AP Containing | Figure 8: An Example of an AP Containing Two Aggregation Units | |||
Two Aggregation Units with the DONL Field | with the DONL Field | |||
Figure 8 | ||||
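The aggregation unit layout and the DON derivation described in this section can be sketched as a small depacketizer. This is an illustrative sketch under stated assumptions: the function name parse_ap is hypothetical, RTP padding is assumed to have been stripped before the call, and sprop-max-don-diff is assumed to be known from signaling. Only the first aggregation unit carries a DONL field; each later unit's DON is the previous DON plus 1 modulo 65536.

```python
import struct


def parse_ap(payload: bytes, sprop_max_don_diff: int):
    """Return (payload_hdr, [(don, nal_unit), ...]) for an AP payload.

    don is None for every unit when sprop-max-don-diff is 0.
    """
    if len(payload) < 2:
        raise ValueError("payload shorter than the 2-byte payload header")
    payload_hdr, offset = payload[:2], 2
    units = []
    don = None
    while offset < len(payload):
        if not units and sprop_max_don_diff > 0:
            # Only the first aggregation unit carries a DONL field.
            if offset + 2 > len(payload):
                raise ValueError("truncated DONL field")
            (don,) = struct.unpack_from("!H", payload, offset)
            offset += 2
        elif units and don is not None:
            # Later units: previous DON plus 1, modulo 65536.
            don = (don + 1) % 65536
        if offset + 2 > len(payload):
            raise ValueError("truncated NALU size field")
        (size,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        if offset + size > len(payload):
            raise ValueError("NALU size exceeds remaining payload")
        # The NAL unit includes its own NAL unit header.
        units.append((don, payload[offset:offset + size]))
        offset += size
    if len(units) < 2:
        raise ValueError("an AP must carry at least two aggregation units")
    return payload_hdr, units
```

The sketch rejects an AP with fewer than two aggregation units, in line with the requirement above; it makes no attempt to detect nested APs.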
4.3.3. Fragmentation Units | 4.3.3. Fragmentation Units (FUs) | |||
Fragmentation Units (FUs) are introduced to enable fragmenting a | FUs are introduced to enable fragmenting a single NAL unit into | |||
single NAL unit into multiple RTP packets, possibly without | multiple RTP packets, possibly without cooperation or knowledge of | |||
cooperation or knowledge of the EVC encoder. A fragment of a NAL | the EVC encoder. A fragment of a NAL unit consists of an integer | |||
unit consists of an integer number of consecutive octets of that NAL | number of consecutive octets of that NAL unit. Fragments of the same | |||
unit. Fragments of the same NAL unit MUST be sent in consecutive | NAL unit MUST be sent in consecutive order with ascending RTP | |||
order with ascending RTP sequence numbers (with no other RTP packets | sequence numbers (with no other RTP packets within the same RTP | |||
within the same RTP stream being sent between the first and last | stream being sent between the first and last fragment). | |||
fragment). | ||||
When a NAL unit is fragmented and conveyed within FUs, it is referred | When a NAL unit is fragmented and conveyed within FUs, it is referred | |||
to as a fragmented NAL unit. APs MUST NOT be fragmented. FUs MUST | to as a fragmented NAL unit. APs MUST NOT be fragmented. FUs MUST | |||
NOT be nested; i.e., an FU must not contain a subset of another FU. | NOT be nested; i.e., an FU must not contain a subset of another FU. | |||
The RTP timestamp of an RTP packet carrying an FU is set to the NALU- | The RTP timestamp of an RTP packet carrying an FU is set to the NALU- | |||
time of the fragmented NAL unit. | time of the fragmented NAL unit. | |||
An FU consists of a payload header as defined in Table 4 of [EVC] | An FU consists of a payload header as defined in Table 4 of [EVC] | |||
(denoted as PayloadHdr with type=57), an FU header of one octet, a | (denoted as PayloadHdr with Type=57), an FU header of one octet, a | |||
conditional 16-bit DONL field (in network byte order), and an FU | conditional 16-bit DONL field (in network byte order), and an FU | |||
payload, as shown in Figure 9. | payload, as shown in Figure 9. | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| PayloadHdr (Type=57) | FU header | DONL (cond) | | | PayloadHdr (Type=57) | FU header | DONL (cond) | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-| | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-| | |||
| DONL (cond) | | | | DONL (cond) | | | |||
|-+-+-+-+-+-+-+-+ | | |-+-+-+-+-+-+-+-+ | | |||
| FU payload | | | FU payload | | |||
| | | | | | |||
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| :...OPTIONAL RTP padding | | | :...OPTIONAL RTP padding | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The Structure of an FU | Figure 9: The Structure of an FU | |||
Figure 9 | ||||
The fields in the payload header are set as follows. The Type field | The fields in the payload header are set as follows. The Type field | |||
MUST be equal to 57. The fields F, TID, Reserve and E MUST be equal | MUST be equal to 57. The fields F, TID, Reserve, and E MUST be equal | |||
to the fields F, TID, Reserve and E, respectively, of the fragmented | to the fields F, TID, Reserve, and E, respectively, of the fragmented | |||
NAL unit. | NAL unit. | |||
The FU header consists of an S bit, an E bit, and a 6-bit FuType | The FU header consists of an S bit, an E bit, and a 6-bit FuType | |||
field, as shown in Figure 10. | field, as shown in Figure 10. | |||
0 1 2 3 4 5 6 7 | 0 1 2 3 4 5 6 7 | |||
+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+ | |||
|S|E| FuType | | |S|E| FuType | | |||
+---------------+ | +---------------+ | |||
The Structure of FU Header | Figure 10: The Structure of FU Header | |||
Figure 10 | ||||
The semantics of the FU header fields are as follows: | The semantics of the FU header fields are as follows: | |||
S: 1 bit | S: 1 bit | |||
When set to 1, the S bit indicates the start of a fragmented NAL | When set to 1, the S bit indicates the start of a fragmented NAL | |||
unit, i.e., the first byte of the FU payload is also the first | unit, i.e., the first byte of the FU payload is also the first | |||
byte of the payload of the fragmented NAL unit. When the FU | byte of the payload of the fragmented NAL unit. When the FU | |||
payload is not the start of the fragmented NAL unit payload, the S | payload is not the start of the fragmented NAL unit payload, the S | |||
bit MUST be set to 0. | bit MUST be set to 0. | |||
E: 1 bit | E: 1 bit | |||
When set to 1, the E bit indicates the end of a fragmented NAL | When set to 1, the E bit indicates the end of a fragmented NAL | |||
unit, i.e., the last byte of the payload is also the last byte of | unit, i.e., the last byte of the payload is also the last byte of | |||
the fragmented NAL unit. When the FU payload is not the last | the fragmented NAL unit. When the FU payload is not the last | |||
fragment of a fragmented NAL unit, the E bit MUST be set to 0. | fragment of a fragmented NAL unit, the E bit MUST be set to 0. | |||
FuType: 6 bits | FuType: 6 bits | |||
The field FuType MUST be equal to the field Type of the fragmented | The field FuType MUST be equal to the field Type of the fragmented | |||
NAL unit. | NAL unit. | |||
The DONL field, when present, specifies the value of the 16 least | The DONL field, when present, specifies the value of the 16 least | |||
significant bits of the decoding order number of the fragmented NAL | significant bits of the decoding order number of the fragmented NAL | |||
unit. | unit. | |||
If sprop-max-don-diff is greater than 0, and the S bit is equal to 1, | If sprop-max-don-diff is greater than 0 and the S bit is equal to 1, | |||
the DONL field MUST be present in the FU, and the variable DON for | the DONL field MUST be present in the FU, and the variable DON for | |||
the fragmented NAL unit is derived as equal to the value of the DONL | the fragmented NAL unit is derived as equal to the value of the DONL | |||
field. Otherwise (sprop-max-don-diff is equal to 0, or the S bit is | field. Otherwise (where sprop-max-don-diff is equal to 0, or where | |||
equal to 0), the DONL field MUST NOT be present in the FU. | the S bit is equal to 0), the DONL field MUST NOT be present in the | |||
FU. | ||||
A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e., | A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e., | |||
the Start bit and End bit MUST NOT both be set to 1 in the same FU | the S-bit and E-bit MUST NOT both be set to 1 in the same FU header. | |||
header. | ||||
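As an illustration of the FU header layout in Figure 10 and the DONL presence rule above, the following Python sketch packs and parses the header octet. The names FuHeader, build_fu_header, parse_fu_header, and donl_present are hypothetical and are not defined by this document. Bit 0 in the figure is assumed to be the most significant bit of the octet, and rejecting a received header with both S and E set is a design choice of this sketch rather than a stated receiver requirement.

```python
from typing import NamedTuple


class FuHeader(NamedTuple):
    start: bool   # S bit: this FU carries the start of the fragmented NAL unit
    end: bool     # E bit: this FU carries the end of the fragmented NAL unit
    fu_type: int  # 6-bit FuType, equal to the Type of the fragmented NAL unit


def build_fu_header(start: bool, end: bool, fu_type: int) -> int:
    """Pack S, E, and FuType into one octet (S is the most significant bit)."""
    if start and end:
        raise ValueError("S and E must not both be set in the same FU header")
    if not 0 <= fu_type <= 0x3F:
        raise ValueError("FuType is a 6-bit field")
    return (int(start) << 7) | (int(end) << 6) | fu_type


def parse_fu_header(octet: int) -> FuHeader:
    """Unpack one FU header octet; this sketch rejects S and E both set."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("FU header is a single octet")
    header = FuHeader(bool(octet & 0x80), bool(octet & 0x40), octet & 0x3F)
    if header.start and header.end:
        raise ValueError("S and E must not both be set in the same FU header")
    return header


def donl_present(sprop_max_don_diff: int, header: FuHeader) -> bool:
    """DONL is present only in the first FU and only when sprop-max-don-diff > 0."""
    return sprop_max_don_diff > 0 and header.start
```

For example, build_fu_header(True, False, 1) yields the octet 0x81 for the first fragment of a NAL unit whose Type is 1.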
The FU payload consists of fragments of the payload of the fragmented | The FU payload consists of fragments of the payload of the fragmented | |||
NAL unit so that if the FU payloads of consecutive FUs, starting with | NAL unit so that if the FU payloads of consecutive FUs, starting with | |||
an FU with the S bit equal to 1 and ending with an FU with the E bit | an FU with the S bit equal to 1 and ending with an FU with the E bit | |||
equal to 1, are sequentially concatenated, the payload of the | equal to 1, are sequentially concatenated, the payload of the | |||
fragmented NAL unit can be reconstructed. The NAL unit header of the | fragmented NAL unit can be reconstructed. The NAL unit header of the | |||
fragmented NAL unit is not included as such in the FU payload, but | fragmented NAL unit is not included as such in the FU payload. | |||
rather the information of the NAL unit header of the fragmented NAL | Instead, the information of the NAL unit header of the fragmented NAL | |||
unit is conveyed in F, TID, Reserve and E fields of the FU payload | unit is conveyed in F, TID, Reserve, and E fields of the FU payload | |||
headers of the FUs and the FuType field of the FU header of the FUs. | headers of the FUs and the FuType field of the FU header of the FUs. | |||
An FU payload MUST NOT be empty. | An FU payload MUST NOT be empty. | |||
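The sender-side rules above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name fragment_nal_unit and its parameters are hypothetical, and the code assumes the two-octet EVC NAL unit header layout F (1 bit), Type (6 bits), TID (3 bits), Reserve (5 bits), E (1 bit) in network byte order. The returned byte strings are RTP payloads only; RTP headers, sequence numbers, and timestamps are out of scope.

```python
from typing import List, Optional

FU_TYPE = 57  # Type value carried in the payload header of every FU


def fragment_nal_unit(nal: bytes, max_chunk: int,
                      donl: Optional[int] = None) -> List[bytes]:
    """Split one NAL unit into FU payloads.

    max_chunk is the largest FU payload in octets (a caller-chosen budget;
    it does not account for the 2-octet DONL in the first FU).
    donl is given only when sprop-max-don-diff is greater than 0.
    """
    if max_chunk < 1:
        raise ValueError("max_chunk must be at least 1")
    if len(nal) < 3:
        raise ValueError("NAL unit needs a 2-octet header and a payload")
    nal_header = int.from_bytes(nal[:2], "big")
    body = nal[2:]  # the NAL unit header itself is not copied into FU payloads
    if len(body) <= max_chunk:
        # A non-fragmented NAL unit must not be sent in one FU.
        raise ValueError("fits in one chunk; use a single NAL unit packet")
    nal_type = (nal_header >> 9) & 0x3F
    # Keep F, TID, Reserve, and E; replace Type with 57.
    payload_hdr = ((nal_header & 0x81FF) | (FU_TYPE << 9)).to_bytes(2, "big")
    chunks = [body[i:i + max_chunk] for i in range(0, len(body), max_chunk)]
    fus = []
    for i, chunk in enumerate(chunks):
        start = i == 0
        end = i == len(chunks) - 1
        fu_header = bytes([(int(start) << 7) | (int(end) << 6) | nal_type])
        # DONL (16 least significant bits of the DON) appears only with S = 1.
        donl_field = b""
        if start and donl is not None:
            donl_field = (donl & 0xFFFF).to_bytes(2, "big")
        fus.append(payload_hdr + fu_header + donl_field + chunk)
    return fus
```

Because the body is always longer than max_chunk at this point, the split yields at least two non-empty chunks, so no FU has both S and E set and no FU payload is empty.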
If an FU is lost, the receiver SHOULD discard all following | If an FU is lost, the receiver SHOULD discard all following | |||
fragmentation units in transmission order corresponding to the same | fragmentation units in transmission order corresponding to the same | |||
fragmented NAL unit unless the decoder in the receiver is known to | fragmented NAL unit unless the decoder in the receiver is known to | |||
gracefully handle incomplete NAL units. | gracefully handle incomplete NAL units. | |||
A receiver in an endpoint or a MANE MAY aggregate the first n-1 | A receiver in an endpoint or a MANE MAY aggregate the first n-1 | |||
fragments of a NAL unit to an (incomplete) NAL unit, even if fragment | fragments of a NAL unit to an (incomplete) NAL unit, even if fragment | |||
n of that NAL unit is not received. In this case, the | n of that NAL unit is not received. In this case, the | |||
forbidden_zero_bit of the NAL unit MUST be set to 1 to indicate a | forbidden_zero_bit of the NAL unit MUST be set to 1 to indicate a | |||
syntax violation. | syntax violation. | |||
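A receiver-side counterpart might look like the sketch below. The function name reassemble_fus is hypothetical, and the code assumes the two-octet EVC NAL unit header layout F (1 bit), Type (6 bits), TID (3 bits), Reserve (5 bits), E (1 bit). The input is assumed to be the FU payloads of one fragmented NAL unit, already in RTP sequence number order with no gaps before the last element. When the final fragment is missing, the sketch builds the incomplete NAL unit and sets forbidden_zero_bit (F) to 1, which is the optional behavior described above; a receiver may instead discard the fragments.

```python
from typing import List


def reassemble_fus(fus: List[bytes], donl_in_first: bool) -> bytes:
    """Rebuild a NAL unit from consecutive FU payloads.

    donl_in_first is True when sprop-max-don-diff is greater than 0, so the
    first FU carries a 2-octet DONL field after the FU header.
    """
    if not fus:
        raise ValueError("no fragments")
    for index, fu in enumerate(fus):
        minimum = 4 + (2 if donl_in_first and index == 0 else 0)
        if len(fu) < minimum:
            raise ValueError("FU too short: payload must not be empty")
    payload_hdr = int.from_bytes(fus[0][:2], "big")
    first_fu_header = fus[0][2]
    if not first_fu_header & 0x80:
        raise ValueError("first fragment does not have the S bit set")
    fu_type = first_fu_header & 0x3F
    # Restore the NAL unit header: F, TID, Reserve, E from the payload
    # header, Type from FuType.
    nal_header = (payload_hdr & 0x81FF) | (fu_type << 9)
    complete = bool(fus[-1][2] & 0x40)
    if not complete:
        nal_header |= 0x8000  # forbidden_zero_bit = 1 signals a syntax violation
    body = bytearray()
    for index, fu in enumerate(fus):
        offset = 3 + (2 if donl_in_first and index == 0 else 0)
        body += fu[offset:]
    return nal_header.to_bytes(2, "big") + bytes(body)
```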
4.4. Decoding Order Number | 4.4. Decoding Order Number | |||
For each NAL unit, the variable AbsDon is derived, representing the | For each NAL unit, the variable AbsDon is derived; it represents the | |||
decoding order number that is indicative of the NAL unit decoding | decoding order number that is indicative of the NAL unit decoding | |||
order. | order. | |||
Let NAL unit n be the n-th NAL unit in transmission order within an | Let NAL unit n be the n-th NAL unit in transmission order within an | |||
RTP stream. | RTP stream. | |||
If sprop-max-don-diff is equal to 0, AbsDon[n], the value of AbsDon | If sprop-max-don-diff is equal to 0, then AbsDon[n] (the value of | |||
for NAL unit n, is derived as equal to n. | AbsDon for NAL unit n) is derived as equal to n. | |||
Otherwise (sprop-max-don-diff is greater than 0), AbsDon[n] is | Otherwise (where sprop-max-don-diff is greater than 0), AbsDon[n] is | |||
derived as follows, where DON[n] is the value of the variable DON for | derived as follows, where DON[n] is the value of the variable DON for | |||
NAL unit n: | NAL unit n: | |||
* If n is equal to 0 (i.e., NAL unit n is the very first NAL unit in | * If n is equal to 0 (i.e., NAL unit n is the very first NAL unit in | |||
transmission order), AbsDon[0] is set equal to DON[0]. | transmission order), AbsDon[0] is set equal to DON[0]. | |||
* Otherwise (n is greater than 0), the following applies for | * Otherwise (where n is greater than 0), the following applies for | |||
derivation of AbsDon[n]: | derivation of AbsDon[n]: | |||
If DON[n] == DON[n-1], | If DON[n] == DON[n-1], | |||
AbsDon[n] = AbsDon[n-1] | AbsDon[n] = AbsDon[n-1] | |||
If (DON[n] > DON[n-1] and DON[n] - DON[n-1] < 32768), | If (DON[n] > DON[n-1] and DON[n] - DON[n-1] < 32768), | |||
AbsDon[n] = AbsDon[n-1] + DON[n] - DON[n-1] | AbsDon[n] = AbsDon[n-1] + DON[n] - DON[n-1] | |||
If (DON[n] < DON[n-1] and DON[n-1] - DON[n] >= 32768), | If (DON[n] < DON[n-1] and DON[n-1] - DON[n] >= 32768), | |||
AbsDon[n] = AbsDon[n-1] + 65536 - DON[n-1] + DON[n] | AbsDon[n] = AbsDon[n-1] + 65536 - DON[n-1] + DON[n] | |||
If (DON[n] > DON[n-1] and DON[n] - DON[n-1] >= 32768), | If (DON[n] > DON[n-1] and DON[n] - DON[n-1] >= 32768), | |||
AbsDon[n] = AbsDon[n-1] - (DON[n-1] + 65536 - DON[n]) | AbsDon[n] = AbsDon[n-1] - (DON[n-1] + 65536 - DON[n]) | |||
If (DON[n] < DON[n-1] and DON[n-1] - DON[n] < 32768), | If (DON[n] < DON[n-1] and DON[n-1] - DON[n] < 32768), | |||
AbsDon[n] = AbsDon[n-1] - (DON[n-1] - DON[n]) | AbsDon[n] = AbsDon[n-1] - (DON[n-1] - DON[n]) | |||
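The five cases above translate directly into code. The sketch below is illustrative only; the function name derive_abs_don is hypothetical, and the input is assumed to be the 16-bit DON values of the NAL units in transmission order for a stream with sprop-max-don-diff greater than 0.

```python
from typing import List


def derive_abs_don(dons: List[int]) -> List[int]:
    """Derive AbsDon[n] from 16-bit DON[n] values given in transmission order."""
    abs_dons: List[int] = []
    for n, don in enumerate(dons):
        if not 0 <= don <= 0xFFFF:
            raise ValueError("DON is a 16-bit value")
        if n == 0:
            abs_dons.append(don)  # AbsDon[0] = DON[0]
            continue
        prev_don = dons[n - 1]
        prev_abs = abs_dons[n - 1]
        if don == prev_don:
            value = prev_abs
        elif don > prev_don and don - prev_don < 32768:
            # Forward step without wraparound.
            value = prev_abs + don - prev_don
        elif don < prev_don and prev_don - don >= 32768:
            # Forward step across the 65535 -> 0 wraparound.
            value = prev_abs + 65536 - prev_don + don
        elif don > prev_don and don - prev_don >= 32768:
            # Backward step across the wraparound.
            value = prev_abs - (prev_don + 65536 - don)
        else:
            # don < prev_don and prev_don - don < 32768: backward step.
            value = prev_abs - (prev_don - don)
        abs_dons.append(value)
    return abs_dons
```

For instance, the DON sequence 65534, 65535, 0, 1 yields the AbsDon sequence 65534, 65535, 65536, 65537, so the wraparound of the 16-bit field does not disturb the derived order.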
For any two NAL units m and n, the following applies: | For any two NAL units (m and n), the following applies: | |||
* AbsDon[n] greater than AbsDon[m] indicates that NAL unit n follows | * When AbsDon[n] is greater than AbsDon[m], the NAL unit n follows | |||
NAL unit m in NAL unit decoding order. | NAL unit m in NAL unit decoding order. | |||
* When AbsDon[n] is equal to AbsDon[m], the NAL unit decoding order | * When AbsDon[n] is equal to AbsDon[m], the NAL unit decoding order | |||
of the two NAL units can be in either order. | of the two NAL units can be in either order. | |||
* AbsDon[n] less than AbsDon[m] indicates that NAL unit n precedes | * When AbsDon[n] is less than AbsDon[m], the NAL unit n precedes NAL | |||
NAL unit m in decoding order. | unit m in decoding order. | |||
Informative note: When two consecutive NAL units in the NAL | | Informative note: When two consecutive NAL units in the NAL | |||
unit decoding order have different values of AbsDon, the | | unit decoding order have different values of AbsDon, the | |||
absolute difference between the two AbsDon values may be | | absolute difference between the two AbsDon values may be | |||
greater than or equal to 1. | | greater than or equal to 1. | |||
Informative note: There are multiple reasons to allow for | | Informative note: There are multiple reasons to allow the | |||
the absolute difference of the values of AbsDon for two | | absolute difference of the values of AbsDon for two consecutive | |||
consecutive NAL units in the NAL unit decoding order to be | | NAL units in the NAL unit decoding order to be greater than | |||
greater than one. An increment by one is not required, as | | one. An increment by one is not required as at the time of | |||
at the time of associating values of AbsDon to NAL units, it | | associating values of AbsDon to NAL units, it may not be known | |||
may not be known whether all NAL units are to be delivered | | whether all NAL units are to be delivered to the receiver. For | |||
to the receiver. For example, a gateway might not forward | | example, a gateway might not forward VCL NAL units of higher | |||
VCL NAL units of higher sub-layers or some SEI NAL units | | sub-layers or some SEI NAL units when there is congestion in | |||
when there is congestion in the network. In another | | the network. In another example, the first intra-coded picture | |||
example, the first intra-coded picture of a pre-encoded clip | | of a pre-encoded clip is transmitted in advance to ensure that | |||
is transmitted in advance to ensure that it is readily | | it is readily available in the receiver. When transmitting the | |||
available in the receiver. When transmitting the first | | first intra-coded picture, the originator still determines how | |||
intra-coded picture, the originator still determines how | | many NAL units will be encoded before the first intra-coded | |||
many NAL units will be encoded before the first intra-coded | | picture of the pre-encoded clip follows in decoding order. | |||
picture of the pre-encoded clip follows in decoding order. | | Thus, the values of AbsDon for the NAL units of the first | |||
Thus, the values of AbsDon for the NAL units of the first | | intra-coded picture of the pre-encoded clip have to be | |||
intra-coded picture of the pre-encoded clip have to be | | estimated when they are transmitted and gaps in the values of | |||
estimated when they are transmitted, and gaps in the values | | AbsDon may occur. | |||
of AbsDon may occur. | ||||
5. Packetization Rules | 5. Packetization Rules | |||
The following packetization rules apply: | The following packetization rules apply: | |||
* If sprop-max-don-diff is greater than 0, the transmission order of | * If sprop-max-don-diff is greater than 0, the transmission order of | |||
NAL units carried in the RTP stream MAY be different from the NAL | NAL units carried in the RTP stream MAY be different from the NAL | |||
unit decoding order. Otherwise (sprop-max-don-diff equals 0), the | unit decoding order. Otherwise (where sprop-max-don-diff equals | |||
transmission order of NAL units carried in the RTP stream MUST be | 0), the transmission order of NAL units carried in the RTP stream | |||
the same as the NAL unit decoding order. | MUST be the same as the NAL unit decoding order. | |||
* A NAL unit of small size SHOULD be encapsulated in an aggregation | * A NAL unit of small size SHOULD be encapsulated in an AP together | |||
packet together with one or more other NAL units to avoid the | with one or more other NAL units to avoid the unnecessary | |||
unnecessary packetization overhead for small NAL units. For | packetization overhead for small NAL units. For example, non-VCL | |||
example, non-VCL NAL units, such as access unit delimiters, | NAL units, such as access unit delimiters, parameter sets, or SEI | |||
parameter sets, or SEI NAL units, are typically small and can | NAL units, are typically small and can often be aggregated with | |||
often be aggregated with VCL NAL units without violating MTU size | VCL NAL units without violating MTU size constraints. | |||
constraints. | ||||
* Each non-VCL NAL unit SHOULD, when possible from an MTU size match | * Each non-VCL NAL unit SHOULD, when possible from an MTU size match | |||
viewpoint, be encapsulated in an aggregation packet with its | viewpoint, be encapsulated in an AP with its associated VCL NAL | |||
associated VCL NAL unit, as typically, a non-VCL NAL unit would be | unit as, typically, a non-VCL NAL unit would be meaningless | |||
meaningless without the associated VCL NAL unit being available. | without the associated VCL NAL unit being available. | |||
* For carrying precisely one NAL unit in an RTP packet, a single NAL | * A single NAL unit packet MUST be used for carrying precisely one | |||
unit packet MUST be used. | NAL unit in an RTP packet. | |||
6. De-packetization Process | 6. De-packetization Process | |||
The general concept behind de-packetization is to get the NAL units | The general concept behind de-packetization is to get the NAL units | |||
out of the RTP packets in an RTP stream and pass them to the decoder | out of the RTP packets in an RTP stream and pass them to the decoder | |||
in the NAL unit decoding order. | in the NAL unit decoding order. | |||
The de-packetization process is implementation dependent. Therefore, | The de-packetization process is implementation dependent. Therefore, | |||
the following description should be seen as an example of a suitable | the following description should be seen as an example of a suitable | |||
implementation. Other schemes may also be used as long as the output | implementation. Other schemes may also be used as long as the output | |||
for the same input is the same as the process described below. The | for the same input is the same as the process described below. The | |||
output is the same when the set of output NAL units and their order | output is the same when the set of output NAL units and their order | |||
are both identical. Optimizations relative to the described | are both identical. Optimizations relative to the described | |||
algorithms are possible. | algorithms are possible. | |||
All normal RTP mechanisms related to buffer management apply. In | All normal RTP mechanisms related to buffer management apply. In | |||
particular, duplicated or outdated RTP packets (as indicated by the | particular, duplicated or outdated RTP packets (as indicated by the | |||
RTP sequence number and the RTP timestamp) are removed. To determine | RTP sequence number and the RTP timestamp) are removed. To determine | |||
the exact time for decoding, factors such as a possible intentional | the exact time for decoding, factors such as a possible intentional | |||
delay to allow for proper inter-stream synchronization, MUST be | delay to allow for proper inter-stream synchronization must be | |||
factored in. | considered. | |||
NAL units with NAL unit type values in the range of 0 to 55, | NAL units with NAL unit type values in the range of 0 to 55, | |||
inclusive, may be passed to the decoder. NAL-unit-like structures | inclusive, may be passed to the decoder. NAL-unit-like structures | |||
with NAL unit type values in the range of 56 to 62, inclusive, MUST | with NAL unit type values in the range of 56 to 62, inclusive, MUST | |||
NOT be passed to the decoder. | NOT be passed to the decoder. | |||
The receiver includes a receiver buffer, which is used to compensate | The receiver includes a receiver buffer, which is used to compensate | |||
for transmission delay jitter within individual RTP streams and to | for transmission delay jitter within individual RTP streams and to | |||
reorder NAL units from transmission order to the NAL unit decoding | reorder NAL units from transmission order to the NAL unit decoding | |||
order. In this section, the receiver operation is described under | order. In this section, the receiver operation is described under | |||
the assumption that there is no transmission delay jitter within an | the assumption that there is no transmission delay jitter within an | |||
RTP stream. To clarify the distinction from a practical receiver | RTP stream. To clarify the distinction from a practical receiver | |||
buffer, which is also used to compensate for transmission delay | buffer, which is also used to compensate for transmission delay | |||
jitter, the buffer in this section will henceforth be referred to as | jitter, the buffer in this section will henceforth be referred to as | |||
the de-packetization buffer. Receivers should also prepare for | the "de-packetization" buffer. Receivers should also prepare for | |||
transmission delay jitter; that is, either reserve separate buffers | transmission delay jitter; that is, either reserve separate buffers | |||
for transmission delay jitter buffering and de-packetization | for transmission delay jitter buffering and de-packetization | |||
buffering or use a receiver buffer for both transmission delay jitter | buffering, or use a receiver buffer for both transmission delay | |||
and de-packetization. Moreover, receivers should take transmission | jitter and de-packetization. Moreover, receivers should take | |||
delay jitter into account in the buffering operation, e.g., by | transmission delay jitter into account in the buffering operation, | |||
additional initial buffering before starting of decoding and | e.g., by additional initial buffering before starting decoding and | |||
playback. | playback. | |||
The de-packetization process extracts the NAL units from the RTP | The de-packetization process extracts the NAL units from the RTP | |||
packets in an RTP stream as follows. When an RTP packet carries a | packets in an RTP stream as follows. When an RTP packet carries a | |||
single NAL unit packet, the payload of the RTP packet is extracted as | single NAL unit packet, the payload of the RTP packet is extracted as | |||
a single NAL unit, excluding the DONL field, i.e., third and fourth | a single NAL unit, excluding the DONL field, i.e., third and fourth | |||
bytes, when sprop-max-don-diff is greater than 0. When an RTP packet | bytes, when sprop-max-don-diff is greater than 0. When an RTP packet | |||
carries an aggregation packet, several NAL units are extracted from | carries an AP, several NAL units are extracted from the payload of | |||
the payload of the RTP packet. In this case, each NAL unit | the RTP packet. In this case, each NAL unit corresponds to the part | |||
corresponds to the part of the payload of each aggregation unit that | of the payload of each aggregation unit that follows the NALU size | |||
follows the NALU size field, as described in Section 4.3.2. When an | field, as described in Section 4.3.2. When an RTP packet carries a | |||
RTP packet carries a Fragmentation Unit (FU), all RTP packets from | Fragmentation Unit (FU), all RTP packets from the first FU (with the | |||
the first FU (with the S field equal to 1) of the fragmented NAL unit | S field equal to 1) of the fragmented NAL unit up to the last FU | |||
up to the last FU (with the E field equal to 1) of the fragmented NAL | (with the E field equal to 1) of the fragmented NAL unit are | |||
unit are collected. The NAL unit is extracted from these RTP packets | collected. The NAL unit is extracted from these RTP packets by | |||
by concatenating all FU payloads in the same order as the | concatenating all FU payloads in the same order as the corresponding | |||
corresponding RTP packets and appending the NAL unit header with the | RTP packets and appending the NAL unit header with the fields F and | |||
fields F and TID set to equal the values of the fields F and TID in | TID set to equal the values of the fields F and TID in the payload | |||
the payload header of the FUs, respectively, and with the NAL unit | header of the FUs, respectively, and with the NAL unit type set equal | |||
type set equal to the value of the field FuType in the FU header of | to the value of the field FuType in the FU header of the FUs, as | |||
the FUs, as described in Section 4.3.3. | described in Section 4.3.3. | |||
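For the single NAL unit packet case described above, removing the DONL field amounts to dropping the third and fourth octets of the RTP payload. The following sketch shows this under the assumption that the caller has already identified the payload as a single NAL unit packet; the function name extract_single_nal is hypothetical.

```python
from typing import Optional, Tuple


def extract_single_nal(payload: bytes,
                       sprop_max_don_diff: int) -> Tuple[bytes, Optional[int]]:
    """Return (NAL unit, DONL or None) from a single NAL unit packet payload."""
    if sprop_max_don_diff > 0:
        if len(payload) < 4:
            raise ValueError("payload too short to hold header and DONL")
        # DONL occupies the third and fourth octets, in network byte order.
        donl = int.from_bytes(payload[2:4], "big")
        return payload[:2] + payload[4:], donl
    # Without DONL, the payload is the NAL unit as is.
    return payload, None
```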
When sprop-max-don-diff is equal to 0, the de-packetization buffer | When sprop-max-don-diff is equal to 0, the de-packetization buffer | |||
size is zero bytes, and the NAL units carried in the single RTP | size is zero bytes, and the NAL units carried in the single RTP | |||
stream are directly passed to the decoder in their transmission | stream are directly passed to the decoder in their transmission | |||
order, which is identical to their decoding order. | order, which is identical to their decoding order. | |||
When sprop-max-don-diff is greater than 0, the process described in | When sprop-max-don-diff is greater than 0, the process described in | |||
the remainder of this section applies. | the remainder of this section applies. | |||
The receiver has two buffering states: initial buffering and | The receiver has two buffering states: initial buffering and | |||
skipping to change at page 30, line 5 ¶ | skipping to change at line 1349 ¶ | |||
Initial buffering lasts until the difference between the greatest and | Initial buffering lasts until the difference between the greatest and | |||
smallest AbsDon values of the NAL units in the de-packetization | smallest AbsDon values of the NAL units in the de-packetization | |||
buffer is greater than or equal to the value of sprop-max-don-diff. | buffer is greater than or equal to the value of sprop-max-don-diff. | |||
After initial buffering, whenever the difference between the greatest | After initial buffering, whenever the difference between the greatest | |||
and smallest AbsDon values of the NAL units in the de-packetization | and smallest AbsDon values of the NAL units in the de-packetization | |||
buffer is greater than or equal to the value of sprop-max-don-diff, | buffer is greater than or equal to the value of sprop-max-don-diff, | |||
the following operation is repeatedly applied until this difference | the following operation is repeatedly applied until this difference | |||
is smaller than sprop-max-don-diff: | is smaller than sprop-max-don-diff: | |||
* The NAL unit in the de-packetization buffer with the smallest | The NAL unit in the de-packetization buffer with the smallest | |||
value of AbsDon is removed from the de-packetization buffer and | value of AbsDon is removed from the de-packetization buffer and | |||
passed to the decoder. | passed to the decoder. | |||
When no more NAL units are flowing into the de-packetization buffer, | When no more NAL units are flowing into the de-packetization buffer, | |||
all NAL units remaining in the de-packetization buffer are removed | all NAL units remaining in the de-packetization buffer are removed | |||
from the buffer and passed to the decoder in the order of increasing | from the buffer and passed to the decoder in the order of increasing | |||
AbsDon values. | AbsDon values. | |||
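The buffering rule stated above can be sketched with a min-heap keyed on AbsDon. This is a simplified illustration covering only the sprop-max-don-diff condition visible in this section; any additional conditions in the text elided from this comparison are not modeled. The class name DepacketizationBuffer and its methods are hypothetical, and releasing NAL units with equal AbsDon in arrival order is a choice made here (the document allows either order).

```python
import heapq
from typing import List, Tuple


class DepacketizationBuffer:
    """Reorders NAL units from transmission order to decoding order."""

    def __init__(self, sprop_max_don_diff: int) -> None:
        if sprop_max_don_diff <= 0:
            raise ValueError("buffering applies only when sprop-max-don-diff > 0")
        self._max_diff = sprop_max_don_diff
        # Entries are (AbsDon, arrival index, NAL unit); the arrival index
        # breaks ties so that equal AbsDon values leave in arrival order.
        self._heap: List[Tuple[int, int, bytes]] = []
        self._arrivals = 0

    def _spread(self) -> int:
        """Difference between the greatest and smallest buffered AbsDon."""
        if not self._heap:
            return 0
        greatest = max(entry[0] for entry in self._heap)
        return greatest - self._heap[0][0]

    def push(self, abs_don: int, nal: bytes) -> List[bytes]:
        """Insert one NAL unit; return the NAL units released to the decoder."""
        heapq.heappush(self._heap, (abs_don, self._arrivals, nal))
        self._arrivals += 1
        released = []
        while self._spread() >= self._max_diff:
            released.append(heapq.heappop(self._heap)[2])
        return released

    def flush(self) -> List[bytes]:
        """No more input: release everything in increasing AbsDon order."""
        released = []
        while self._heap:
            released.append(heapq.heappop(self._heap)[2])
        return released
```

The linear scan in _spread keeps the sketch short; an implementation concerned with large buffers could track the greatest AbsDon incrementally instead.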
7. Payload Format Parameters | 7. Payload Format Parameters | |||
This section specifies the optional parameters. A mapping of the | This section specifies the optional parameters. A mapping of the | |||
parameters with Session Description Protocol (SDP) [RFC8866] is also | parameters with the Session Description Protocol (SDP) [RFC8866] is | |||
provided for applications that use SDP. | also provided for applications that use SDP. | |||
Parameters starting with the string "sprop" for stream properties can | Parameters starting with the string "sprop" for stream properties can | |||
be used by a sender to provide a receiver with the properties of the | be used by a sender to provide a receiver with the properties of the | |||
stream that is or will be sent. The media sender (and not the | stream that is or will be sent. The media sender (and not the | |||
receiver) selects whether, and with what values, "sprop" parameters | receiver) selects whether, and with what values, "sprop" parameters | |||
are being sent. This uncommon characteristic of the "sprop" | are being sent. This uncommon characteristic of the "sprop" | |||
parameters may not be intuitive in the context of some signaling | parameters may not be intuitive in the context of some signaling | |||
protocol concepts, especially with offer/answer. Please see | protocol concepts, especially with Offer/Answer. Please see | |||
Section 7.3.2 for guidance specific to the use of sprop parameters in | Section 7.3.2 for guidance specific to the use of sprop parameters in | |||
the Offer/Answer case. | the Offer/Answer case. | |||
7.1. Media Type Registration | 7.1. Media Type Registration | |||
The receiver MUST ignore any parameter unspecified in this document. | The receiver MUST ignore any parameter unspecified in this document. | |||
Type name: video | Type name: video | |||
Subtype name: evc | ||||
Required parameters: N/A | ||||
Optional parameters: profile-id, level-id, toolset-id, max-recv- | ||||
level-id, sprop-sps, sprop-pps, sprop-sei, sprop-max-don-diff, sprop- | ||||
depack-buf-bytes, depack-buf-cap (refer to Section 7.2 for | ||||
definitions) | ||||
Encoding considerations: | ||||
This type is only defined for transfer via RTP (RFC 3550). | Subtype name: evc | |||
Security considerations: | Required parameters: N/A | |||
See Section 9 of RFC XXXX. | Optional parameters: profile-id, level-id, toolset-id, max-recv- | |||
level-id, sprop-sps, sprop-pps, sprop-sei, sprop-max-don-diff, | ||||
sprop-depack-buf-bytes, depack-buf-cap (refer to Section 7.2 for | ||||
definitions) | ||||
Interoperability considerations: N/A | Encoding considerations: This type is only defined for transfer via | |||
RTP [RFC3550]. | ||||
Published specification: | Security considerations: See Section 9 of RFC 9584. | |||
Please refer to RFC XXXX and EVC standard [EVC]. | Interoperability considerations: N/A | |||
Applications that use this media type: | Published specification: Please refer to RFC 9584 and EVC standard | |||
[EVC]. | ||||
Any application that relies on EVC-based video services over RTP | Applications that use this media type: Any application that relies | |||
on EVC-based video services over RTP | ||||
Fragment identifier considerations: N/A | Fragment identifier considerations: N/A | |||
Additional information: N/A | Additional information: N/A | |||
Person & email address to contact for further information: | Person & email address to contact for further information: | |||
Stephan Wenger (stewe@stewe.org) | Stephan Wenger (stewe@stewe.org) | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Restrictions on usage: N/A | ||||
Author: See Authors' Addresses section of RFC XXXX. | Restrictions on usage: N/A | |||
Change controller: | Author: See Authors' Addresses section of RFC 9584. | |||
IETF <avtcore@ietf.org> | Change controller: IETF <avtcore@ietf.org> | |||
7.2. Optional Parameters Definition | 7.2. Optional Parameters Definition | |||
profile-id, level-id, toolset-id: | profile-id, level-id, toolset-id: | |||
These parameters indicate the profile, the level, and constraints | These parameters indicate the profile, the level, and constraints | |||
of the bitstream carried by the RTP stream, or a specific set of | of the bitstream carried by the RTP stream or a specific set of | |||
the profile, the level, and constraints the receiver supports. | the profile, the level, and constraints the receiver supports. | |||
More specifications of these parameters, including how they relate | More specifications of these parameters, including how they relate | |||
to syntax elements specified in [EVC] are provided below. | to syntax elements specified in [EVC] are provided below. | |||
profile-id: | profile-id: | |||
When profile-id is not present, a value of 0 (i.e., the Baseline | When profile-id is not present, a value of 0 (i.e., the Baseline | |||
profile) MUST be inferred. | profile) MUST be inferred. | |||
When used to indicate properties of a bitstream, profile-id MUST | When used to indicate properties of a bitstream, profile-id MUST | |||
be derived from the profile_idc in the SPS. | be derived from the profile_idc in the SPS. | |||
EVC bitstreams transported over RTP using the technologies of this | EVC bitstreams transported over RTP using the technologies of this | |||
document SHOULD refer only to SPSs that have the same value in | document SHOULD refer only to SPSs that have the same value in | |||
profile_idc, unless the sender has a priori knowledge that a | profile_idc, unless the sender has a priori knowledge that a | |||
receiver can correctly decode the EVC bitstream with different | receiver can correctly decode the EVC bitstream with different | |||
profile_idc values (for example in walled garden scenarios). As | profile_idc values (for example, in walled garden scenarios). As | |||
exceptions to this rule, if the receiver is known to support | exceptions to this rule, if the receiver is known to support a | |||
Baseline profile, a bitstream could safely end with CVS referring | Baseline profile, a bitstream could safely end with CVS referring | |||
to an SPS wherein profile_idc indicates the Baseline Still Picture | to an SPS wherein profile_idc indicates the Baseline Still picture | |||
profile. A similar exception can be made for Main profile and | profile. A similar exception can be made for Main profile and | |||
Main Still picture profile. | Main Still picture profile. | |||
level-id: | level-id: | |||
When level-id is not present, a value of 90 (corresponding to | When level-id is not present, a value of 90 (corresponding to | |||
level 3, which allows for approximately SD TV resolution and frame | level 3, which allows for approximately standard-definition | |||
rates; for details please see Annex A of EVC) MUST be inferred. | television (SD TV) resolution and frame rates; see Annex A of | |||
[EVC]) MUST be inferred. | ||||
When used to indicate properties of a bitstream, level-id MUST be | When used to indicate properties of a bitstream, level-id MUST be | |||
derived from the level_idc in the SPS. | derived from the level_idc in the SPS. | |||
If the level-id parameter is used for capability exchange, the | If the level-id parameter is used for capability exchange, the | |||
following applies. If max-recv-level-id is not present, the | following applies. If max-recv-level-id is not present, the | |||
default level defined by level-id indicates the highest level the | default level defined by level-id indicates the highest level the | |||
codec wishes to support. Otherwise, max-recv-level-id indicates | codec wishes to support. Otherwise, max-recv-level-id indicates | |||
the highest level the codec supports for receiving. For either | the highest level the codec supports for receiving. For either | |||
receiving or sending, all levels that are lower than the highest | receiving or sending, all levels that are lower than the highest | |||
level supported MUST also be supported. | level supported MUST also be supported. | |||
toolset-id: | toolset-id: | |||
This parameter is a base64-encoding representation (Section 4 of | ||||
This parameter is a base64 encoding (Section 4 of [RFC4648]) | [RFC4648]) of a 64-bit unsigned integer bit mask derived from the | |||
representation of a 64 bit unsigned integer bit mask derived from | concatenation, in network byte order, of the syntax elements | |||
the concatenation, in network byte order, of the syntax elements | ||||
toolset_idc_h and toolset_idc_l. When used to indicate properties | toolset_idc_h and toolset_idc_l. When used to indicate properties | |||
of a bitstream, its value MUST be derived from toolset_idc_h and | of a bitstream, its value MUST be derived from toolset_idc_h and | |||
toolset_idc_l in the sequence parameter set. | toolset_idc_l in the sequence parameter set. | |||
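The toolset-id derivation described above can be sketched in Python as follows. This is an illustration only: it assumes that toolset_idc_h and toolset_idc_l are each 32-bit unsigned values and that toolset_idc_h forms the most significant half of the 64-bit mask. The function names are hypothetical and do not come from this document.

```python
import base64
import struct
from typing import Tuple


def encode_toolset_id(toolset_idc_h: int, toolset_idc_l: int) -> str:
    """Concatenate the two 32-bit fields in network byte order, then base64-encode."""
    # ">II" packs two unsigned 32-bit integers big-endian (assumed field widths).
    raw = struct.pack(">II", toolset_idc_h, toolset_idc_l)
    return base64.b64encode(raw).decode("ascii")


def decode_toolset_id(value: str) -> Tuple[int, int]:
    """Inverse of encode_toolset_id; returns (toolset_idc_h, toolset_idc_l)."""
    raw = base64.b64decode(value, validate=True)
    if len(raw) != 8:
        raise ValueError("toolset-id must decode to exactly 8 bytes")
    high, low = struct.unpack(">II", raw)
    return high, low
```

A receiver could compare the decoded bit mask against the set of tools it implements before accepting a stream.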
max-recv-level-id: | max-recv-level-id: | |||
This parameter MAY be used to indicate the highest level a | This parameter MAY be used to indicate the highest level a | |||
receiver supports. | receiver supports. | |||
The value of max-recv-level-id MUST be in the range of 0 to 255, | The value of max-recv-level-id MUST be in the range of 0 to 255, | |||
inclusive. | inclusive. | |||
When max-recv-level-id is not present, the value is inferred to be | When max-recv-level-id is not present, the value is inferred to be | |||
equal to level-id. | equal to level-id. | |||
max-recv-level-id MUST NOT be present when the highest level the | max-recv-level-id MUST NOT be present when the highest level the | |||
receiver supports is not higher than the default level. | receiver supports is not higher than the default level. | |||
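The inference rules for level-id and max-recv-level-id can be sketched as below. This is a minimal illustration, assuming the parameters have already been parsed into a dictionary of strings; the helper name highest_receive_level is hypothetical.

```python
DEFAULT_LEVEL_ID = 90  # inferred when level-id is absent (level 3)


def highest_receive_level(params: dict) -> int:
    """Return the highest level a receiver declares, applying the defaults."""
    level_id = int(params.get("level-id", DEFAULT_LEVEL_ID))
    if "max-recv-level-id" not in params:
        # Absent max-recv-level-id is inferred to be equal to level-id.
        return level_id
    max_recv = int(params["max-recv-level-id"])
    if not 0 <= max_recv <= 255:
        raise ValueError("max-recv-level-id must be in the range 0..255")
    if max_recv <= level_id:
        # The parameter MUST NOT be present unless it exceeds the default level.
        raise ValueError("max-recv-level-id must be higher than level-id")
    return max_recv
```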
sprop-sps: | sprop-sps: | |||
This parameter MAY be used to convey sequence parameter set NAL | This parameter MAY be used to convey sequence parameter set NAL | |||
units of the bitstream for out-of-band transmission of sequence | units of the bitstream for out-of-band transmission of sequence | |||
parameter sets. The value of the parameter is a comma-separated | parameter sets. The value of the parameter is a comma-separated | |||
(',') list of base64 encoding (Section 4 of [RFC4648]) | (',') list of base64-encoding representations (Section 4 of | |||
representations of the sequence parameter set NAL units as | [RFC4648]) of the sequence parameter set NAL units as specified in | |||
specified in Section 7.3.2.1 of [EVC]. | Section 7.3.2.1 of [EVC]. | |||
sprop-pps: | sprop-pps: | |||
This parameter MAY be used to convey picture parameter set NAL | This parameter MAY be used to convey picture parameter set NAL | |||
units of the bitstream for out-of-band transmission of picture | units of the bitstream for out-of-band transmission of picture | |||
parameter sets. The value of the parameter is a comma-separated | parameter sets. The value of the parameter is a comma-separated | |||
(',') list of base64 encoding (Section 4 of [RFC4648]) | (',') list of base64-encoding representations (Section 4 of | |||
representations of the picture parameter set NAL units as | [RFC4648]) of the picture parameter set NAL units as specified in | |||
specified in Section 7.3.2.2 of [EVC]. | Section 7.3.2.2 of [EVC]. | |||
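Both sprop-sps and sprop-pps carry a comma-separated list of base64 strings, so one helper can decode either. The following is a sketch under that reading; the function name is hypothetical, and the byte strings used with it below are dummy data rather than real parameter sets.

```python
import base64
import binascii
from typing import List


def parse_sprop_list(value: str) -> List[bytes]:
    """Split a sprop-sps/sprop-pps value on ',' and base64-decode each NAL unit."""
    nal_units = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            raise ValueError("empty entry in parameter set list")
        try:
            nal_units.append(base64.b64decode(item, validate=True))
        except binascii.Error as exc:
            raise ValueError("invalid base64 in parameter set list") from exc
    return nal_units
```

For instance, parse_sprop_list("AQI=,Aw==") yields two byte strings, which would then be passed to the decoder as out-of-band parameter set NAL units.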
sprop-sei: | sprop-sei: | |||
This parameter MAY be used to convey one or more SEI messages that | This parameter MAY be used to convey one or more SEI messages that | |||
describe bitstream characteristics. When present, a decoder can | describe bitstream characteristics. When present, a decoder can | |||
rely on the bitstream characteristics that are described in the | rely on the bitstream characteristics that are described in the | |||
SEI messages for the entire duration of the session, independently | SEI messages for the entire duration of the session, independently | |||
from the persistence scopes of the SEI messages as specified in | from the persistence scopes of the SEI messages as specified in | |||
[VSEI]. | [VSEI]. | |||
The value of the parameter is a comma-separated (',') list of | The value of the parameter is a comma-separated (',') list of | |||
base64 encoding (Section 4 of [RFC4648]) representations of SEI | base64-encoding representations (Section 4 of [RFC4648]) of SEI | |||
NAL units as specified in [VSEI]. | NAL units as specified in [VSEI]. | |||
Informative note: Intentionally, no list of applicable or | | Informative note: Intentionally, no list of applicable or | |||
inapplicable SEI messages is specified here. Conveying certain | | inapplicable SEI messages is specified here. Conveying | |||
SEI messages in sprop-sei may be sensible in some application | | certain SEI messages in sprop-sei may be sensible in some | |||
scenarios and meaningless in others. However, a few examples | | application scenarios and meaningless in others. However, a | |||
are described below: | | couple of examples are described below. | |||
| | ||||
1) In an environment where the bitstream was created from film- | | 1. In an environment where the bitstream was created from | |||
based source material, and no splicing is going to occur during | | film-based source material, and no splicing is going to | |||
the lifetime of the session, the film grain characteristics SEI | | occur during the lifetime of the session, the film grain | |||
message is likely meaningful, and sending it in sprop-sei | | characteristics SEI message is likely meaningful; and | |||
rather than in the bitstream at each entry point may help with | | sending it in sprop-sei rather than in the bitstream at | |||
saving bits and allows one to configure the renderer only once, | | each entry point may help with saving bits and allow one | |||
avoiding unwanted artifacts. | | to configure the renderer only once, avoiding unwanted | |||
| artifacts. | ||||
2) Examples for SEI messages that would be meaningless to be | | | |||
conveyed in sprop-sei include the decoded picture hash SEI | | 2. Examples for SEI messages that would be meaningless to | |||
message (it is close to impossible that all decoded pictures | | be conveyed in sprop-sei include the decoded picture | |||
have the same hash) or the filler payload SEI message (as | | hash SEI message (it is close to impossible that all | |||
there is no point in just having more bits in SDP). | | decoded pictures have the same hash) or the filler | |||
| payload SEI message (as there is no point in just having | ||||
| more bits in SDP). | ||||
sprop-max-don-diff: | sprop-max-don-diff: | |||
If there is no NAL unit naluA that is followed in transmission | If there is no NAL unit naluA that is followed in transmission | |||
order by any NAL unit preceding naluA in decoding order (i.e., the | order by any NAL unit preceding naluA in decoding order (i.e., the | |||
transmission order of the NAL units is the same as the decoding | transmission order of the NAL units is the same as the decoding | |||
order), the value of this parameter MUST be equal to 0. | order), the value of this parameter MUST be equal to 0. | |||
Otherwise, this parameter specifies the maximum absolute | Otherwise, this parameter specifies the maximum absolute | |||
difference between the decoding order number (i.e., AbsDon) values | difference between the decoding order number (i.e., AbsDon) values | |||
of any two NAL units naluA and naluB, where naluA follows naluB in | of any two NAL units naluA and naluB, where naluA follows naluB in | |||
decoding order and precedes naluB in transmission order. | decoding order and precedes naluB in transmission order. | |||
The value of sprop-max-don-diff MUST be an integer in the range of | The value of sprop-max-don-diff MUST be an integer in the range of | |||
0 to 32767, inclusive. | 0 to 32767, inclusive. | |||
When not present, the value of sprop-max-don-diff is inferred to | When not present, the value of sprop-max-don-diff is inferred to | |||
be equal to 0. | be equal to 0. | |||
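A sender could derive sprop-max-don-diff from the AbsDon values of its NAL units listed in transmission order. The sketch below assumes that a larger AbsDon means later in decoding order; the function name is hypothetical. For every NAL unit, it measures how far its AbsDon lies below the highest AbsDon already transmitted, which is the largest difference to any NAL unit that precedes it in transmission order but follows it in decoding order.

```python
from typing import Iterable


def max_don_diff(absdon_in_transmission_order: Iterable[int]) -> int:
    """Return the value to signal in sprop-max-don-diff (0 if not interleaved)."""
    max_diff = 0
    highest_seen = None
    for absdon in absdon_in_transmission_order:
        if highest_seen is None or absdon > highest_seen:
            highest_seen = absdon
        else:
            # An earlier-transmitted NAL unit follows this one in decoding order.
            max_diff = max(max_diff, highest_seen - absdon)
    return max_diff
```

The result still has to fall in the range of 0 to 32767 before it can be signaled.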
sprop-depack-buf-bytes: | sprop-depack-buf-bytes: | |||
This parameter signals the required size of the de-packetization | This parameter signals the required size of the de-packetization | |||
buffer in units of bytes. The value of the parameter MUST be | buffer in units of bytes. The value of the parameter MUST be | |||
greater than or equal to the maximum buffer occupancy (in units of | greater than or equal to the maximum buffer occupancy (in units of | |||
bytes) of the de-packetization buffer as specified in Section 6. | bytes) of the de-packetization buffer as specified in Section 6. | |||
The value of sprop-depack-buf-bytes MUST be an integer in the | The value of sprop-depack-buf-bytes MUST be an integer in the | |||
range of 0 to 4294967295, inclusive. | range of 0 to 4294967295, inclusive. | |||
When sprop-max-don-diff is present and greater than 0, this | When sprop-max-don-diff is present and greater than 0, this | |||
parameter MUST be present and the value MUST be greater than 0. | parameter MUST be present and the value MUST be greater than 0. | |||
When not present, the value of sprop-depack-buf-bytes is inferred | When not present, the value of sprop-depack-buf-bytes is inferred | |||
to be equal to 0. | to be equal to 0. | |||
Informative note: The value of sprop-depack-buf-bytes indicates | | Informative note: The value of sprop-depack-buf-bytes | |||
the required size of the de-packetization buffer only. When | | indicates the required size of the de-packetization buffer | |||
network jitter can occur, an appropriately sized jitter buffer | | only. When network jitter can occur, an appropriately sized | |||
has to be available as well. | | jitter buffer has to be available as well. | |||
depack-buf-cap: | depack-buf-cap: | |||
This parameter signals the capabilities of a receiver | This parameter signals the capabilities of a receiver | |||
implementation and indicates the amount of de-packetization buffer | implementation and indicates the amount of de-packetization buffer | |||
space in units of bytes that the receiver has available for | space in units of bytes that the receiver has available for | |||
reconstructing the NAL unit decoding order from NAL units carried | reconstructing the NAL unit decoding order from NAL units carried | |||
in the RTP stream. A receiver is able to handle any RTP stream | in the RTP stream. A receiver is able to handle any RTP stream | |||
for which the value of the sprop-depack-buf-bytes parameter is | for which the value of the sprop-depack-buf-bytes parameter is | |||
smaller than or equal to this parameter. | smaller than or equal to this parameter. | |||
When not present, the value of depack-buf-cap is inferred to be | When not present, the value of depack-buf-cap is inferred to be | |||
equal to 4294967295. The value of depack-buf-cap MUST be an | equal to 4294967295. The value of depack-buf-cap MUST be an | |||
integer in the range of 1 to 4294967295, inclusive. | integer in the range of 1 to 4294967295, inclusive. | |||
Informative note: depack-buf-cap indicates the maximum possible | | Informative note: The value of depack-buf-cap indicates the | |||
size of the de-packetization buffer of the receiver only, | | maximum possible size of the de-packetization buffer of the | |||
without allowing for network jitter. | | receiver only, without allowing for network jitter. When | |||
| network jitter occurs, an appropriately sized jitter buffer | ||||
| has to be available as well. | ||||
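The relationship between sprop-max-don-diff, sprop-depack-buf-bytes, and depack-buf-cap can be checked mechanically. The following sketch assumes the parameters are available as dictionaries of strings; both function names are hypothetical.

```python
U32_MAX = 4294967295


def check_sender_buffer_params(params: dict) -> tuple:
    """Validate the sender-side values and return (max_don_diff, buf_bytes)."""
    max_don_diff = int(params.get("sprop-max-don-diff", 0))      # inferred 0
    buf_bytes = int(params.get("sprop-depack-buf-bytes", 0))     # inferred 0
    if not 0 <= max_don_diff <= 32767:
        raise ValueError("sprop-max-don-diff out of range")
    if not 0 <= buf_bytes <= U32_MAX:
        raise ValueError("sprop-depack-buf-bytes out of range")
    if max_don_diff > 0 and buf_bytes == 0:
        # Interleaved streams must signal a non-zero buffer size.
        raise ValueError("sprop-depack-buf-bytes must be present and > 0")
    return max_don_diff, buf_bytes


def receiver_can_handle(sender_params: dict, receiver_params: dict) -> bool:
    """True if the stream's buffer requirement fits the receiver's capability."""
    _, buf_bytes = check_sender_buffer_params(sender_params)
    cap = int(receiver_params.get("depack-buf-cap", U32_MAX))    # inferred maximum
    if not 1 <= cap <= U32_MAX:
        raise ValueError("depack-buf-cap out of range")
    return buf_bytes <= cap
```

Neither value accounts for network jitter, so a separate jitter buffer is still needed.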
7.3. SDP Parameters | 7.3. SDP Parameters | |||
The receiver MUST ignore any parameter unspecified in this document. | The receiver MUST ignore any parameter unspecified in this document. | |||
7.3.1. Mapping of Payload Type Parameters to SDP | 7.3.1. Mapping of Payload Type Parameters to SDP | |||
The media type video/evc string is mapped to fields in the Session | The media type video/evc string is mapped to fields in the Session | |||
Description Protocol (SDP) [RFC8866] as follows: | Description Protocol (SDP) [RFC8866] as follows: | |||
* The media name in the "m=" line of SDP MUST be video. | * The media name in the "m=" line of SDP MUST be video. | |||
* The encoding name in the "a=rtpmap" line of SDP MUST be evc (the | * The encoding name in the "a=rtpmap" line of SDP MUST be evc (the | |||
media subtype). | media subtype). | |||
* The clock rate in the "a=rtpmap" line MUST be 90000. | * The clock rate in the "a=rtpmap" line MUST be 90000. | |||
* The OPTIONAL parameters profile-id, level-id, toolset-id, max- | * The OPTIONAL parameters profile-id, level-id, toolset-id, max- | |||
recv-level-id, sprop-max-don-diff, sprop-depack-buf-bytes, and | recv-level-id, sprop-max-don-diff, sprop-depack-buf-bytes, and | |||
depack-buf-cap, when present, MUST be included in the "a=fmtp" | depack-buf-cap, when present, MUST be included in the "a=fmtp" | |||
line of SDP. The fmtp line is expressed as a media type string, | line of SDP. The "a=fmtp" line is expressed as a media type | |||
in the form of a semicolon-separated list of parameter=value | string, in the form of a semicolon-separated list of | |||
pairs. | parameter=value pairs. | |||
* The OPTIONAL parameters sprop-sps, sprop-pps, and sprop-sei, when | * The OPTIONAL parameters sprop-sps, sprop-pps, and sprop-sei, when | |||
present, MUST be included in the "a=fmtp" line of SDP or conveyed | present, MUST be included in the "a=fmtp" line of SDP or conveyed | |||
using the "fmtp" source attribute as specified in Section 6.3 of | using the "fmtp" source attribute as specified in Section 6.3 of | |||
[RFC5576]. For a particular media format (i.e., RTP payload | [RFC5576]. For a particular media format (i.e., RTP payload | |||
type), sprop-sps, sprop-pps, or sprop-sei MUST NOT be both | type), sprop-sps, sprop-pps, or sprop-sei MUST NOT be both | |||
included in the "a=fmtp" line of SDP and conveyed using the "fmtp" | included in the "a=fmtp" line of SDP and conveyed using the "fmtp" | |||
source attribute. When included in the "a=fmtp" line of SDP, | source attribute. When included in the "a=fmtp" line of SDP, | |||
those parameters are expressed as a media type string, in the form | those parameters are expressed as a media type string, in the form | |||
of a semicolon-separated list of parameter=value pairs. When | of a semicolon-separated list of parameter=value pairs. When | |||
conveyed in the "a=fmtp" line of SDP for a particular payload | conveyed in the "a=fmtp" line of SDP for a particular payload | |||
type, the parameters sprop-sps, sprop-pps, and sprop-sei MUST be | type, the parameters sprop-sps, sprop-pps, and sprop-sei MUST be | |||
applied to each SSRC with the payload type. When conveyed using | applied to each SSRC with the payload type. When conveyed using | |||
the "fmtp" source attribute, these parameters are only associated | the "fmtp" source attribute, these parameters are only associated | |||
with the given source and payload type as parts of the "fmtp" | with the given source and payload type as parts of the "fmtp" | |||
source attribute. | source attribute. | |||
Informative note: Conveyance of sprop-sps and sprop-pps using the | | Informative note: Conveyance of sprop-sps and sprop-pps using | |||
"fmtp" source attribute allows for out-of-band transport of | | the "fmtp" source attribute allows for out-of-band transport of | |||
parameter sets in topologies like Topo-Video-switch-MCU, as | | parameter sets in topologies like Topo-Video-switch-MCU, as | |||
specified in [RFC7667]. | | specified in [RFC7667]. | |||
A general usage of media representation in SDP is as follows: | A general usage of media representation in SDP is as follows: | |||
m=video 49170 RTP/AVP 98 | m=video 49170 RTP/AVP 98 | |||
a=rtpmap:98 evc/90000 | a=rtpmap:98 evc/90000 | |||
a=fmtp:98 profile-id=1; | a=fmtp:98 profile-id=1; | |||
sprop-sps=<sequence parameter set data>; | sprop-sps=<sequence parameter set data>; | |||
sprop-pps=<picture parameter set data>; | sprop-pps=<picture parameter set data>; | |||
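A receiver has to split such an "a=fmtp" line into parameter=value pairs. The sketch below is one possible approach, not a complete SDP parser; the function name is hypothetical. It splits each pair on the first "=" only, because base64 values such as toolset-id may end in "=" padding.

```python
from typing import Dict, Tuple


def parse_fmtp(line: str) -> Tuple[int, Dict[str, str]]:
    """Parse 'a=fmtp:<pt> name=value; name=value' into (payload type, params)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an a=fmtp line")
    payload_type, _, rest = line[len(prefix):].partition(" ")
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError("parameter without a value: " + item)
        params[name.strip()] = value.strip()
    return int(payload_type), params
```

Parameters that this document does not specify would simply be ignored by the caller after parsing.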
A SIP offer/answer exchange wherein both parties are expected to both | A SIP Offer/Answer exchange wherein both parties are expected to both | |||
send and receive could look like the following. Only the media | send and receive could look like the following. Only the media | |||
codec-specific parts of the SDP are shown. | codec-specific parts of the SDP are shown. | |||
Offerer->Answerer: | Offerer->Answerer: | |||
m=video 49170 RTP/AVP 98 | m=video 49170 RTP/AVP 98 | |||
a=rtpmap:98 evc/90000 | a=rtpmap:98 evc/90000 | |||
a=fmtp:98 profile-id=1; level-id=90; | a=fmtp:98 profile-id=1; level-id=90; | |||
The above represents an offer for symmetric video communication | The above represents an offer for symmetric video communication using | |||
using [EVC] and its payload specification at the main profile and | [EVC] and its payload specification at the main profile and level 3. | |||
level 3.0. Informally speaking, this offer tells the receiver of | Informally speaking, this offer tells the receiver of the offer that | |||
the offer that the sender is willing to receive up to xKpxx | the sender is willing to receive up to xKpxx resolution at the | |||
resolution at the maximum bitrates specified in [EVC]. At the | maximum bitrates specified in [EVC]. At the same time, if this offer | |||
same time, if this offer were accepted "as is", the offerer can | were accepted "as is", the Offerer can expect that the Answerer would | |||
expect that the answerer would be able to receive and properly | be able to receive and properly decode EVC media up to and including | |||
decode EVC media up to and including level 3.0. | level 3. | |||
Answerer->Offerer: | Answerer->Offerer: | |||
m=video 49170 RTP/AVP 98 | m=video 49170 RTP/AVP 98 | |||
a=rtpmap:98 evc/90000 | a=rtpmap:98 evc/90000 | |||
a=fmtp:98 profile-id=1; level-id=60 | a=fmtp:98 profile-id=1; level-id=60 | |||
Informative note: level-id shall be set equal to a value of 30 | | Informative note: level-id shall be set equal to a value of 30 | |||
times the level number specified in Table A.1 of EVC. | | times the level number specified in Table A.1 of [EVC]. | |||
With this answer to the offer above, the system receiving the offer | With this answer to the offer above, the system receiving the offer | |||
advises the offerer that it is incapable of handling EVC at level 3.0 | advises the Offerer that it is incapable of handling EVC at level 3 | |||
but is capable of decoding level 2. As EVC video codecs must support | but is capable of decoding level 2. As EVC video codecs must support | |||
decoding at all levels below the maximum level they implement, the | decoding at all levels below the maximum level they implement, the | |||
resulting user experience would likely be that both systems send | resulting user experience would likely be that both systems send | |||
video at level 2. However, nothing prevents an encoder from further | video at level 2. However, nothing prevents an encoder from further | |||
downgrading its sending to, for example, level 1 if it were short of | downgrading its sending to, for example, level 1 if it were short of | |||
cycles or bandwidth or for other reasons. | cycles or bandwidth or for other reasons. | |||
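The arithmetic behind this exchange is small enough to sketch. The code below assumes level numbers are given as decimal strings and uses the "30 times the level number" rule from the informative note; both function names are hypothetical.

```python
from decimal import Decimal


def level_id_from_level(level: str) -> int:
    """Map a level number such as "3" or "2" to the level-id value (30 x level)."""
    value = Decimal(level) * 30
    if value != value.to_integral_value():
        raise ValueError("level number does not map to an integer level-id")
    return int(value)


def common_level_id(offer_level_id: int, answer_level_id: int) -> int:
    """Highest level both ends have declared they can decode."""
    return min(offer_level_id, answer_level_id)
```

With the offer at level 3 and the answer at level 2, common_level_id(90, 60) gives 60, which matches the outcome described above. An encoder remains free to send at a lower level.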
7.3.2. Usage with SDP Offer/Answer Model | 7.3.2. Usage with SDP Offer/Answer Model | |||
This section describes the negotiation of unicast messages using the | This section describes the negotiation of unicast messages using the | |||
offer/answer model described in [RFC3264] and its updates. | Offer/Answer model described in [RFC3264] and its updates. | |||
This section applies to all profiles defined in [EVC], specifically | This section applies to all profiles defined in [EVC], specifically | |||
to Baseline, Main, and the associated still image profiles. | to Baseline, Main, and the associated still image profiles. | |||
The following limitations and rules pertaining to the media | The following limitations and rules pertaining to the media | |||
configuration apply: | configuration apply: | |||
The parameters identifying a media format configuration for EVC are | The parameters identifying a media format configuration for EVC are | |||
profile-id and level-id. The profile-id parameter MUST be used symmetrically. | profile-id and level-id. The profile-id parameter MUST be used symmetrically. | |||
The answerer MUST structure its answer according to one of the | The Answerer MUST structure its answer according to one of the | |||
following two options: | following two options: | |||
- maintain all configuration parameters with the values remaining | * maintain all configuration parameters with the values remaining | |||
the same as in the offer for the media format (payload type), | the same as in the offer for the media format (payload type), with | |||
with the exception that the value of level-id is changeable as | the exception that the value of level-id is changeable as long as | |||
long as the highest level indicated by the answer is not higher | the highest level indicated by the answer is not higher than that | |||
than that indicated by the offer; or | indicated by the offer; or | |||
- remove the media format (payload type) completely (when one or | * remove the media format (payload type) completely (when one or | |||
more of the parameter values are not supported). | more of the parameter values are not supported). | |||
Informative note: The above requirement for symmetric use does not | | Informative note: The above requirement for symmetric use does | |||
apply for level-id and does not apply for the other bitstream or RTP | | not apply for level-id and does not apply for the other | |||
stream properties and capability parameters, as described in | | bitstream or RTP stream properties and capability parameters, | |||
Section 7.3.2.1 (Payload format config) below. | | as described in Section 7.3.2.1 ("Payload Format | |||
| Configuration"). | ||||
To simplify handling and matching of these configurations, the same | To simplify handling and matching of these configurations, the same | |||
RTP payload type number used in the offer SHOULD also be used in the | RTP payload type number used in the offer SHOULD also be used in the | |||
answer, as specified in [RFC3264]. | answer, as specified in [RFC3264]. | |||
The answer MUST NOT contain a payload type number used in the offer | The answer MUST NOT contain a payload type number used in the offer | |||
for the media subtype unless the configuration is the same as in the | for the media subtype unless the configuration is the same as in the | |||
offer or the configuration in the answer only differs from that in | offer or the configuration in the answer only differs from that in | |||
the offer with a different value of level-id. | the offer with a different value of level-id. | |||
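The two answer options can be sketched as a small decision function. This is an illustration under simplifying assumptions: only profile-id and level-id are considered, max-recv-level-id is ignored, and the function name and its arguments are hypothetical.

```python
from typing import Optional, Set


def build_answer(offer: dict, supported_profiles: Set[int],
                 local_max_level_id: int) -> Optional[dict]:
    """Return the answer's configuration, or None to remove the payload type."""
    profile_id = int(offer.get("profile-id", 0))    # Baseline inferred if absent
    offer_level_id = int(offer.get("level-id", 90))  # level 3 inferred if absent
    if profile_id not in supported_profiles:
        # profile-id is used symmetrically, so an unsupported value means removal.
        return None
    # level-id may change, but must not exceed the level indicated by the offer.
    answer_level_id = min(offer_level_id, local_max_level_id)
    return {"profile-id": str(profile_id), "level-id": str(answer_level_id)}
```

A returned dictionary would be placed on the "a=fmtp" line of the answer, preferably under the same payload type number as in the offer.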
7.3.2.1. Payload Format Configuration | 7.3.2.1. Payload Format Configuration | |||
The following limitations and rules pertain to the configuration of | The following limitations and rules pertain to the configuration of | |||
the payload format buffer management. | the payload format buffer management. | |||
The parameters sprop-max-don-diff and sprop-depack-buf-bytes describe | * The parameters sprop-max-don-diff and sprop-depack-buf-bytes | |||
the properties of an RTP stream that the offerer or the answerer is | describe the properties of an RTP stream that the Offerer or the | |||
sending for the media format configuration. This differs from the | Answerer is sending for the media format configuration. This | |||
normal usage of the offer/answer parameters; normally, such | differs from the normal usage of the Offer/Answer parameters; | |||
parameters declare the properties of the bitstream or RTP stream that | normally, such parameters declare the properties of the bitstream | |||
the offerer or the answerer is able to receive. When dealing with | or RTP stream that the Offerer or the Answerer is able to receive. | |||
EVC, the offerer assumes that the answerer will be able to receive | When dealing with EVC, the Offerer assumes that the Answerer will | |||
media encoded using the configuration being offered. | be able to receive media encoded using the configuration being | |||
offered. | ||||
Informative note: The above parameters apply for any RTP stream, when | | Informative note: The above parameters apply for any RTP | |||
present, sent by a declaring entity with the same configuration. In | | stream, when present, sent by a declaring entity with the same | |||
other words, the applicability of the above parameters to RTP streams | | configuration. In other words, the applicability of the above | |||
depends on the source endpoint. Rather than being bound to the | | parameters to RTP streams depends on the source endpoint. | |||
payload type, the values may have to be applied to another payload | | Rather than being bound to the payload type, the values may | |||
type when being sent, as they apply for the configuration. | | have to be applied to another payload type when being sent, as | |||
| they apply for the configuration. | ||||
When an offerer offers an interleaved stream, indicated by the | * When an Offerer offers an interleaved stream, indicated by the | |||
presence of sprop-max-don-diff with a value larger than zero, the | presence of sprop-max-don-diff with a value larger than zero, the | |||
offerer MUST include the size of the de-packetization buffer sprop- | Offerer MUST include the size of the de-packetization buffer | |||
depack-buf-bytes. | sprop-depack-buf-bytes. | |||
To enable the offerer and answerer to inform each other about their | * To enable the Offerer and Answerer to inform each other about | |||
capabilities for de-packetization buffering in receiving RTP streams, | their capabilities for de-packetization buffering in receiving RTP | |||
both parties are RECOMMENDED to include depack-buf-cap. | streams, both parties are RECOMMENDED to include depack-buf-cap. | |||
The parameters sprop-sps, or sprop-pps, when present (included in the | * The parameters sprop-sps or sprop-pps, when present (included in | |||
"a=fmtp" line of SDP or conveyed using the "fmtp" source attribute, | the "a=fmtp" line of SDP or conveyed using the "fmtp" source | |||
as specified in Section 6.3 of [RFC5576]), are used for out-of-band | attribute, as specified in Section 6.3 of [RFC5576]), are used for | |||
transport of the parameter sets (SPS or PPS, respectively). The | out-of-band transport of the parameter sets (SPS or PPS, | |||
answerer MAY use either out-of-band or in-band transport of parameter | respectively). The Answerer MAY use either out-of-band or in-band | |||
sets for the bitstream it is sending, regardless of whether out-of- | transport of parameter sets for the bitstream it is sending, | |||
band parameter sets transport has been used in the offerer-to- | regardless of whether out-of-band parameter sets transport has | |||
answerer direction. Parameter sets included in an answer are | been used in the Offerer-to-Answerer direction. Parameter sets | |||
independent of those parameter sets included in the offer, as they | included in an answer are independent of those parameter sets | |||
are used for decoding two different bitstreams; one from the answerer | included in the offer, as they are used for decoding two different | |||
to the offerer and the other in the opposite direction. In case some | bitstreams: one from the Answerer to the Offerer, and the other in | |||
RTP packets are sent before the SDP offer/answer settles down, in- | the opposite direction. In case some RTP packets are sent before | |||
band parameter sets MUST be used for those RTP stream parts sent | the SDP Offer/Answer settles down, in-band parameter sets MUST be | |||
before the SDP offer/answer. | used for those RTP stream parts sent before the SDP Offer/Answer. | |||
The following rules apply to transport of parameter sets in the | * The following rules apply to transport of parameter sets in the | |||
offerer-to-answerer direction. | Offerer-to-Answerer direction. | |||
An offer MAY include sprop-sps, and/or sprop-pps. If none of these | - An offer MAY include sprop-sps and/or sprop-pps. If none of | |||
parameters are present in the offer, then only in-band transport of | these parameters are present in the offer, then only in-band | |||
parameter sets is used. | transport of parameter sets is used. | |||
If the level to use in the offerer-to-answerer direction is equal to | - If the level to use in the Offerer-to-Answerer direction is | |||
the default level in the offer, the answerer MUST be prepared to use | equal to the default level in the offer, the Answerer MUST be | |||
the parameter sets included in sprop-sps, and sprop-pps (either | prepared to use the parameter sets included in sprop-sps and | |||
included in the "a=fmtp" line of SDP or conveyed using the "fmtp" | sprop-pps (either included in the "a=fmtp" line of SDP or | |||
source attribute) for decoding the incoming bitstream, e.g., by | conveyed using the "fmtp" source attribute) for decoding the | |||
passing these parameter set NAL units to the video decoder before | incoming bitstream, e.g., by passing these parameter set NAL | |||
passing any NAL units carried in the RTP streams. Otherwise, the | units to the video decoder before passing any NAL units carried | |||
answerer MUST ignore sprop-sps and sprop-pps (either | in the RTP streams. Otherwise, the Answerer MUST ignore | |||
included in the "a=fmtp" line of SDP or conveyed using the "fmtp" | sprop-sps and sprop-pps (either included in the "a=fmtp" | |||
source attribute) and the offerer MUST transmit parameter sets in- | line of SDP or conveyed using the "fmtp" source attribute), and | |||
band. | the Offerer MUST transmit parameter sets in-band. | |||
The following rules apply to transport of parameter sets in the | * The following rules apply to transport of parameter sets in the | |||
answerer-to-offerer direction. | Answerer-to-Offerer direction. | |||
An answer MAY include sprop-sps, and/or sprop-pps. If none of these | - An answer MAY include sprop-sps and/or sprop-pps. If none of | |||
parameters are present in the answer, then only in-band transport of | these parameters are present in the answer, then only in-band | |||
parameter sets is used. | transport of parameter sets is used. | |||
The offerer MUST be prepared to use the parameter sets included in | - The Offerer MUST be prepared to use the parameter sets included | |||
sprop-sps and sprop-pps (either included in the "a=fmtp" line of SDP | in sprop-sps and sprop-pps (either included in the "a=fmtp" | |||
or conveyed using the "fmtp" source attribute) for decoding the | line of SDP or conveyed using the "fmtp" source attribute) for | |||
incoming bitstream, e.g., by passing these parameter set NAL units to | decoding the incoming bitstream, e.g., by passing these | |||
the video decoder before passing any NAL units carried in the RTP | parameter set NAL units to the video decoder before passing any | |||
streams. | NAL units carried in the RTP streams. | |||
When sprop-sps and/or sprop-pps are conveyed using the "fmtp" source | * When sprop-sps and/or sprop-pps are conveyed using the "fmtp" | |||
attribute, as specified in Section 6.3 of [RFC5576], the receiver of | source attribute, as specified in Section 6.3 of [RFC5576], the | |||
the parameters MUST store the parameter sets included in sprop-sps | receiver of the parameters MUST store the parameter sets included | |||
and/or sprop-pps and associate them with the source given as part of | in sprop-sps and/or sprop-pps and associate them with the source | |||
the "fmtp" source attribute. Parameter sets associated with one | given as part of the "fmtp" source attribute. Parameter sets | |||
source (given as part of the "fmtp" source attribute) MUST only be | associated with one source (given as part of the "fmtp" source | |||
used to decode NAL units conveyed in RTP packets from the same source | attribute) MUST only be used to decode NAL units conveyed in RTP | |||
(given as part of the "fmtp" source attribute). When this mechanism | packets from the same source (given as part of the "fmtp" source | |||
is in use, SSRC collision detection and resolution MUST be performed | attribute). When this mechanism is in use, SSRC collision | |||
as specified in [RFC5576]. | detection and resolution MUST be performed as specified in | |||
[RFC5576]. | ||||
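The offerer-to-answerer rules above can be sketched as follows. This is a minimal Python illustration, not part of the specification. The function name `usable_out_of_band_parameter_sets` and the dictionary representation of the fmtp parameters are hypothetical. The decoding step assumes that sprop-sps and sprop-pps carry comma-separated base64 representations of parameter set NAL units.

```python
import base64


def usable_out_of_band_parameter_sets(offer_fmtp, level_in_use, default_level):
    """Return the parameter set NAL units an answerer may pass to its decoder.

    offer_fmtp:    dict of fmtp parameters taken from the offer (hypothetical shape)
    level_in_use:  level chosen for the offerer-to-answerer direction
    default_level: default level signalled in the offer
    """
    # If the level in use differs from the offer's default level, the sprop
    # parameter sets are ignored and the offerer transmits them in-band.
    if level_in_use != default_level:
        return []

    nal_units = []
    for name in ("sprop-sps", "sprop-pps"):
        value = offer_fmtp.get(name)
        if not value:
            continue  # parameter absent: in-band transport for this type
        # Assumption: comma-separated base64 strings, one per NAL unit.
        for item in value.split(","):
            nal_units.append(base64.b64decode(item.strip()))
    return nal_units
```

An empty result means that the decoder has to wait for parameter sets delivered in-band before it can decode the stream.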
Figure 11 lists the interpretation of all the parameters that MAY be | Figure 11 lists the interpretation of all the parameters that MAY be | |||
used for the various combinations of offer, answer, and direction | used for the various combinations of offer, answer, and direction | |||
attributes. | attributes. | |||
sendonly --+ | sendonly --+ | |||
recvonly --+ | | recvonly --+ | | |||
sendrecv --+ | | | sendrecv --+ | | | |||
| | | | | | | | |||
profile-id C C P | profile-id C C P | |||
skipping to change at page 40, line 23 ¶ | skipping to change at line 1830 ¶ | |||
sprop-max-don-diff P - P | sprop-max-don-diff P - P | |||
sprop-depack-buf-bytes P - P | sprop-depack-buf-bytes P - P | |||
depack-buf-cap R R - | depack-buf-cap R R - | |||
sprop-sei P - P | sprop-sei P - P | |||
sprop-sps P - P | sprop-sps P - P | |||
sprop-pps P - P | sprop-pps P - P | |||
Legend: | Legend: | |||
C: configuration for sending and receiving bitstreams | C: configuration for sending and receiving bitstreams | |||
D: changeable configuration, same as C, except possible to | D: changeable configuration; same as C, except possible to | |||
answer with a different but consistent value (see the semantics | answer with a different but consistent value (see the semantics | |||
of the level-id parameter on these parameters being | of the level-id parameter on these parameters being | |||
consistent-basically, level down-grading is allowed) | consistent -- basically, level down-grading is allowed) | |||
P: properties of the bitstream to be sent | P: properties of the bitstream to be sent | |||
R: receiver capabilities | R: receiver capabilities | |||
-: not usable, when present MUST be ignored | -: not usable; when present MUST be ignored | |||
Interpretation of Parameters for Various Combinations of | ||||
Offers, Answers, and Direction Attributes. | ||||
Figure 11 | Figure 11: Interpretation of Parameters for Various Combinations | |||
of Offers, Answers, and Direction Attributes | ||||
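As a rough illustration of how an implementation might consult Figure 11, the sketch below encodes only the rows that are visible in this excerpt; rows elided from the comparison are deliberately left out. The column order (sendrecv, recvonly, sendonly) is an assumption read from the figure header, and the names `FIGURE_11_ROWS`, `interpretation`, and `drop_unusable` are hypothetical.

```python
# Column order assumed from the header of Figure 11.
DIRECTIONS = ("sendrecv", "recvonly", "sendonly")

# Only the rows visible in this excerpt; elided rows are not guessed.
FIGURE_11_ROWS = {
    "profile-id": ("C", "C", "P"),
    "sprop-max-don-diff": ("P", "-", "P"),
    "sprop-depack-buf-bytes": ("P", "-", "P"),
    "depack-buf-cap": ("R", "R", "-"),
    "sprop-sei": ("P", "-", "P"),
    "sprop-sps": ("P", "-", "P"),
    "sprop-pps": ("P", "-", "P"),
}

LEGEND = {
    "C": "configuration for sending and receiving bitstreams",
    "D": "changeable configuration",
    "P": "properties of the bitstream to be sent",
    "R": "receiver capabilities",
    "-": "not usable; when present MUST be ignored",
}


def interpretation(parameter, direction):
    """Return the Figure 11 code for a parameter under a direction attribute."""
    return FIGURE_11_ROWS[parameter][DIRECTIONS.index(direction)]


def drop_unusable(fmtp, direction):
    """Remove parameters marked '-' for this direction; keep everything else."""
    column = DIRECTIONS.index(direction)
    unknown = ("?", "?", "?")  # parameters not covered by this partial table
    return {
        name: value
        for name, value in fmtp.items()
        if FIGURE_11_ROWS.get(name, unknown)[column] != "-"
    }
```

Parameters that do not appear in the partial table are passed through unchanged, since this sketch cannot say how they are to be interpreted.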
Parameters used for declaring receiver capabilities are, in general, | Parameters used for declaring receiver capabilities are, in general, | |||
downgradable, i.e., they express the upper limit for a sender's | downgradable, i.e., they express the upper limit for a sender's | |||
possible behavior. Thus, a sender MAY select to set its encoder | possible behavior. Thus, a sender MAY select to set its encoder | |||
using only lower/lesser or equal values of these parameters. | using only lower/lesser or equal values of these parameters. | |||
When a sender's capabilities are declared with the configuration | When a sender's capabilities are declared with the configuration | |||
parameters, these parameters express a configuration that is | parameters, these parameters express a configuration that is | |||
acceptable for the sender to receive bitstreams. In order to achieve | acceptable for the sender to receive bitstreams. In order to achieve | |||
high interoperability levels, it is often advisable to offer multiple | high interoperability levels, it is often advisable to offer multiple | |||
skipping to change at page 41, line 11 ¶ | skipping to change at line 1862 ¶ | |||
configurations in a single payload type. Thus, when multiple | configurations in a single payload type. Thus, when multiple | |||
configuration offers are made, each offer requires its own RTP | configuration offers are made, each offer requires its own RTP | |||
payload type associated with the offer. | payload type associated with the offer. | |||
An implementation SHOULD be able to understand all media type | An implementation SHOULD be able to understand all media type | |||
parameters (including all optional media type parameters), even if it | parameters (including all optional media type parameters), even if it | |||
doesn't support the functionality related to the parameter. This, in | doesn't support the functionality related to the parameter. This, in | |||
conjunction with proper application logic in the implementation, | conjunction with proper application logic in the implementation, | |||
allows the implementation, after having received an offer, to create | allows the implementation, after having received an offer, to create | |||
an answer by potentially downgrading one or more of the optional | an answer by potentially downgrading one or more of the optional | |||
parameters to the point where the implementation can cope, leading to | parameters to the point where the implementation can cope. This | |||
higher chances of interoperability beyond the most basic interop | leads to higher chances of interoperability beyond the most basic | |||
points (for which, as described above, no optional parameters are | interop points (for which, as described above, no optional parameters | |||
necessary). | are necessary). | |||
Informative note: In implementations of various H.26x video coding | | Informative note: In implementations of various H.26x video | |||
payload Formats including those for [AVC] and [HEVC], it was | | coding payload formats including those for [AVC] and [HEVC], it | |||
occasionally observed that implementations were incapable of parsing | | was occasionally observed that implementations were incapable | |||
most (or all) of the optional parameters and hence rejected offers | | of parsing most (or all) of the optional parameters and hence | |||
other than the most basic offers. As a result, the offer/answer | | rejected offers other than the most basic offers. As a result, | |||
exchange resulted in a baseline performance (using the default values | | the Offer/Answer exchange resulted in a baseline performance | |||
for the optional parameters) with the resulting suboptimal user | | (using the default values for the optional parameters) with the | |||
experience. However, there are valid reasons to forego the | | resulting suboptimal user experience. However, there are valid | |||
implementation complexity of implementing the parsing of some or all | | reasons to forego the implementation complexity of implementing | |||
of the optional parameters, for example, when there is predetermined | | the parsing of some or all of the optional parameters, for | |||
knowledge, not negotiated by an SDP-based offer/answer process, of | | example, when there is predetermined knowledge, not negotiated | |||
the capabilities of the involved systems (walled gardens, baseline | | by an SDP-based Offer/Answer process, of the capabilities of | |||
requirements defined in application standards higher up in the stack, | | the involved systems (walled gardens, baseline requirements | |||
and similar). | | defined in application standards higher up in the stack, and | |||
| similar). | ||||
An answerer MAY extend the offer with additional media format | An Answerer MAY extend the offer with additional media format | |||
configurations. However, to enable their usage, in most cases, a | configurations. However, to enable their usage, in most cases, a | |||
second offer is required from the offerer to provide the bitstream | second offer is required from the Offerer to provide the bitstream | |||
property parameters that the media sender will use. This also has | property parameters that the media sender will use. This also has | |||
the effect that the offerer has to be able to receive this media | the effect that the Offerer has to be able to receive this media | |||
format configuration, not only to send it. | format configuration, and not only to send it. | |||
7.3.3. Multicast | 7.3.3. Multicast | |||
For bitstreams being delivered over multicast, the following rules | For bitstreams being delivered over multicast, the following rules | |||
apply: | apply: | |||
The media format configuration is identified by profile-id and level- | * The media format configuration is identified by profile-id and | |||
id. These media format configuration parameters, including level-id, | level-id. These media format configuration parameters, including | |||
MUST be used symmetrically; that is, the answerer MUST either | level-id, MUST be used symmetrically; that is, the Answerer MUST | |||
maintain all configuration parameters or remove the media format | either maintain all configuration parameters or remove the media | |||
(payload type) completely. Note that this implies that the level-id | format (payload type) completely. Note that this implies that the | |||
for offer/answer in multicast is not changeable. | level-id for Offer/Answer in multicast is not changeable. | |||
To simplify the handling and matching of these configurations, the | * To simplify the handling and matching of these configurations, the | |||
same RTP payload type number used in the offer SHOULD also be used in | same RTP payload type number used in the offer SHOULD also be used | |||
the answer, as specified in [RFC3264]. An answer MUST NOT contain a | in the answer, as specified in [RFC3264]. An answer MUST NOT | |||
payload type number used in the offer unless the configuration is the | contain a payload type number used in the offer unless the | |||
same as in the offer. | configuration is the same as in the offer. | |||
Parameter sets received MUST be associated with the originating | * Parameter sets received MUST be associated with the originating | |||
source and MUST only be used in decoding the incoming bitstream from | source and MUST only be used in decoding the incoming bitstream | |||
the same source. | from the same source. | |||
The rules for other parameters are the same as above for unicast as | * The rules for other parameters are the same as above for unicast | |||
long as the three above rules are obeyed. | as long as the three above rules are obeyed. | |||
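The symmetric-use rule for multicast can be sketched as a filter over the offered payload types. This is a minimal Python illustration under stated assumptions: the function name `multicast_answer` and the mapping of payload type numbers to (profile-id, level-id) tuples are hypothetical, and all other SDP handling is omitted.

```python
def multicast_answer(offered, supported):
    """Build the payload type map for a multicast answer.

    offered:   dict mapping RTP payload type number -> (profile_id, level_id)
    supported: set of (profile_id, level_id) pairs the answerer can use as-is
    """
    # Each media format is either kept with its configuration and payload
    # type number unchanged, or removed completely. The level is never
    # downgraded, because level-id is not changeable in multicast.
    return {pt: config for pt, config in offered.items() if config in supported}
```

Because an entry is either copied verbatim or dropped, the answer cannot reuse an offered payload type number with a different configuration.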
7.3.4. Usage in Declarative Session Descriptions | 7.3.4. Usage in Declarative Session Descriptions | |||
When EVC over RTP is offered with SDP in a declarative style, as in | When EVC over RTP is offered with SDP in a declarative style, as in | |||
Real Time Streaming Protocol (RTSP) [RFC7826] or Session Announcement | the Real-Time Streaming Protocol (RTSP) [RFC7826] or Session | |||
Protocol (SAP) [RFC2974], the following considerations apply. | Announcement Protocol (SAP) [RFC2974], the following considerations | |||
apply. | ||||
All parameters capable of indicating both bitstream properties and | * All parameters capable of indicating both bitstream properties and | |||
receiver capabilities are used to indicate only bitstream properties. | receiver capabilities are used to indicate only bitstream | |||
For example, in this case, the parameters profile-id and level-id | properties. For example, in this case, the parameters profile-id | |||
declare the values used by the bitstream, not the capabilities for | and level-id declare the values used by the bitstream, not the | |||
receiving bitstreams. As a result, the following interpretation of | capabilities for receiving bitstreams. As a result, the following | |||
the parameters MUST be used: | interpretation of the parameters MUST be used: | |||
Declaring actual configuration or bitstream properties: | - Declaring actual configuration or bitstream properties: | |||
profile-id level-id sprop-sps sprop-pps sprop-max-don-diff sprop- | o profile-id | |||
depack-buf-bytes sprop-sei | o level-id | |||
o sprop-sps | ||||
o sprop-pps | ||||
o sprop-max-don-diff | ||||
o sprop-depack-buf-bytes | ||||
o sprop-sei | ||||
Not usable (when present, they MUST be ignored): | - Not usable (when present, they MUST be ignored): | |||
depack-buf-cap recv-sublayer-id | o depack-buf-cap | |||
o recv-sublayer-id | ||||
A receiver of the SDP is required to support all parameters and | - A receiver of the SDP is required to support all parameters and | |||
values of the parameters provided; otherwise, the receiver MUST | values of the parameters provided; otherwise, the receiver MUST | |||
reject (RTSP) or not participate in (SAP) the session. It falls on | reject (RTSP) or not participate in (SAP) the session. It | |||
the creator of the session to use values that are expected to be | falls on the creator of the session to use values that are | |||
supported by the receiving application. | expected to be supported by the receiving application. | |||
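The declarative-use considerations above could be handled along the following lines. This is a sketch only: `accept_declarative_session` and the `is_supported` callback are hypothetical names, and the fmtp parameters are assumed to have already been parsed into a dictionary.

```python
# Parameters that MUST be ignored in declarative session descriptions.
IGNORED_IN_DECLARATIVE = {"depack-buf-cap", "recv-sublayer-id"}


def accept_declarative_session(fmtp, is_supported):
    """Return the declared bitstream properties, or None to decline.

    fmtp:         dict of parameter name -> value from the session description
    is_supported: callable(name, value) -> bool supplied by the application
    """
    properties = {}
    for name, value in fmtp.items():
        if name in IGNORED_IN_DECLARATIVE:
            continue  # not usable in declarative style
        if not is_supported(name, value):
            # Reject (RTSP) or do not participate in (SAP) the session.
            return None
        properties[name] = value
    return properties
```

Returning `None` leaves the protocol-specific reaction to the caller, since rejecting an RTSP session and staying out of a SAP-announced session are different actions.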
7.3.5. Considerations for Parameter Sets | 7.3.5. Considerations for Parameter Sets | |||
When out-of-band transport of parameter sets is used, parameter sets | When out-of-band transport of parameter sets is used, parameter sets | |||
MAY still be additionally transported in-band unless explicitly | MAY still be additionally transported in-band unless explicitly | |||
disallowed by an application, and some of these additional parameter | disallowed by an application, and some of these additional parameter | |||
sets may update some of the out-of-band transported parameter sets. | sets may update some of the out-of-band transported parameter sets. | |||
An update of a parameter set refers to the sending of a parameter set | An update of a parameter set refers to the sending of a parameter set | |||
of the same type using the same parameter set ID but with different | of the same type using the same parameter set ID but with different | |||
values for at least one other parameter of the parameter set. | values for at least one other parameter of the parameter set. | |||
8. Use with Feedback Messages | 8. Use with Feedback Messages | |||
The following subsections define the use of the Picture Loss | The following subsections define the use of the Picture Loss | |||
Indication (PLI) and Full Intra Request (FIR) feedback messages with | Indication (PLI) [RFC4585] and Full Intra Request (FIR) [RFC5104] | |||
[EVC]. The PLI is defined in [RFC4585], and the FIR message is | feedback messages with [EVC]. | |||
defined in [RFC5104]. | ||||
In accordance with this document, a sender MUST NOT send Slice Loss | In accordance with this document, a sender MUST NOT send Slice Loss | |||
Indication (SLI) or Reference Picture Selection Indication (RPSI), | Indication (SLI) or Reference Picture Selection Indication (RPSI); | |||
and a receiver MUST ignore RPSI and MUST treat a received SLI as a | and a receiver MUST ignore RPSI and MUST treat a received SLI as a | |||
received PLI, ignoring the "First", "Number", and "PictureID" fields | received PLI, ignoring the "First", "Number", and "PictureID" fields | |||
of the SLI. | of the SLI. | |||
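The receiver-side treatment of payload-specific feedback described above can be sketched as a small classifier. The FMT values are those assigned to payload-specific feedback messages in [RFC4585] and [RFC5104]; the function name `classify_feedback` and the string return values are hypothetical.

```python
# FMT values of payload-specific feedback messages.
FMT_PLI = 1   # Picture Loss Indication [RFC4585]
FMT_SLI = 2   # Slice Loss Indication [RFC4585]
FMT_RPSI = 3  # Reference Picture Selection Indication [RFC4585]
FMT_FIR = 4   # Full Intra Request [RFC5104]


def classify_feedback(fmt):
    """Map a received feedback FMT value to the action this format defines."""
    if fmt in (FMT_PLI, FMT_SLI):
        # A received SLI is treated as a PLI; its fields are not inspected.
        return "PLI"
    if fmt == FMT_FIR:
        return "FIR"
    # RPSI is ignored. Other FMT values are outside the scope of this sketch.
    return None
```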
8.1. Picture Loss Indication (PLI) | 8.1. Picture Loss Indication (PLI) | |||
As specified in Section 6.3.1 of [RFC4585], the reception of a PLI by | As specified in Section 6.3.1 of [RFC4585], the reception of a PLI by | |||
a media sender indicates "the loss of an undefined amount of coded | a media sender indicates "the loss of an undefined amount of coded | |||
video data belonging to one or more pictures". Without having any | video data belonging to one or more pictures". Without having any | |||
specific knowledge of the setup of the bitstream (such as use and | specific knowledge of the setup of the bitstream (such as use and | |||
location of in-band parameter sets, IDR picture locations, picture | location of in-band parameter sets, IDR picture locations, picture | |||
structures, and so forth), a reaction to the reception of a PLI by a | structures, and so forth), a reaction to the reception of a PLI by an | |||
EVC sender SHOULD be to send an IDR picture and relevant parameter | EVC sender SHOULD be to send an IDR picture and relevant parameter | |||
sets, potentially with sufficient redundancy so to ensure correct | sets, potentially with sufficient redundancy so as to ensure correct | |||
reception. However, sometimes information about the bitstream | reception. However, sometimes information about the bitstream | |||
structure is known. For example, such information can be parameter | structure is known. For example, such information can be parameter | |||
sets that have been conveyed out of band through mechanisms not | sets that have been conveyed out of band through mechanisms not | |||
defined in this document and that are known to stay static for the | defined in this document and that are known to stay static for the | |||
duration of the session. In that case, it is obviously unnecessary | duration of the session. In that case, it is obviously unnecessary | |||
to send them in-band as a result of the reception of a PLI. Other | to send them in-band as a result of the reception of a PLI. Other | |||
examples could be devised based on a priori knowledge of different | examples could be devised based on a priori knowledge of different | |||
aspects of the bitstream structure. In all cases, the timing and | aspects of the bitstream structure. In all cases, the timing and | |||
congestion control mechanisms of [RFC4585] MUST be observed. | congestion-control mechanisms of [RFC4585] MUST be observed. | |||
8.2. Full Intra Request (FIR) | 8.2. Full Intra Request (FIR) | |||
The purpose of the FIR message is to force an encoder to send an | The purpose of the FIR message is to force an encoder to send an | |||
independent decoder refresh point as soon as possible while observing | independent decoder refresh point as soon as possible while observing | |||
applicable congestion-control-related constraints, such as those set | applicable congestion-control-related constraints, such as those set | |||
out in [RFC8082]. | out in [RFC8082]. | |||
Upon reception of a FIR, a sender MUST send an IDR picture. | Upon reception of a FIR, a sender MUST send an IDR picture. | |||
Parameter sets MUST also be sent, except when there is a priori | Parameter sets MUST also be sent, except when there is a priori | |||
skipping to change at page 44, line 26 ¶ | skipping to change at line 2013 ¶ | |||
receiver, established by means outside this document, that parameter | receiver, established by means outside this document, that parameter | |||
sets are exclusively sent out of band. | sets are exclusively sent out of band. | |||
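A sender's reaction to PLI and FIR might be planned as in the sketch below. It is illustrative only: `plan_refresh` and its arguments are hypothetical, the single flag stands in for whatever a priori knowledge an application has about out-of-band parameter sets, and the timing and congestion-control rules of [RFC4585] and [RFC8082] are not modeled.

```python
def plan_refresh(feedback, parameter_sets_known_out_of_band):
    """Decide what a sender transmits in response to PLI or FIR.

    feedback: "PLI" or "FIR"
    parameter_sets_known_out_of_band: True when it is known a priori that
        the receiver already holds static parameter sets conveyed out of band
    """
    if feedback not in ("PLI", "FIR"):
        raise ValueError("unsupported feedback message: %r" % (feedback,))

    to_send = []
    if not parameter_sets_known_out_of_band:
        to_send.append("parameter sets")
    to_send.append("IDR picture")

    # The reaction to FIR is a MUST; the reaction to PLI is a SHOULD.
    return {"send": to_send, "mandatory": feedback == "FIR"}
```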
9. Security Considerations | 9. Security Considerations | |||
The scope of this section is limited to the payload format itself and | The scope of this section is limited to the payload format itself and | |||
to one feature of [EVC] that may pose a particularly serious security | to one feature of [EVC] that may pose a particularly serious security | |||
risk if implemented naively. The payload format, in isolation, does | risk if implemented naively. The payload format, in isolation, does | |||
not form a complete system. Implementers are advised to read and | not form a complete system. Implementers are advised to read and | |||
understand relevant security-related documents, especially those | understand relevant security-related documents, especially those | |||
pertaining to RTP (see the Security Considerations section in | pertaining to RTP (see the Security Considerations in Section 14 of | |||
[RFC3550]) and the security of the call-control stack chosen (that | [RFC3550]) and the security of the call-control stack chosen (that | |||
may make use of the media type registration of this document). | may make use of the media type registration of this document). | |||
Implementers should also consider known security vulnerabilities of | Implementers should also consider known security vulnerabilities of | |||
video coding and decoding implementations in general and avoid those. | video coding and decoding implementations in general and avoid those. | |||
Within this RTP payload format, and with the exception of the user | Within this RTP payload format, and with the exception of the user | |||
data SEI message as described below, no security threats other than | data SEI message as described below, no security threats other than | |||
those common to RTP payload formats are known. In other words, | those common to RTP payload formats are known. In other words, | |||
neither the various media-plane-based mechanisms nor the signaling | neither the various media-plane-based mechanisms nor the signaling | |||
part of this document seem to pose a security risk beyond those | part of this document seem to pose a security risk beyond those | |||
common to all RTP-based systems. | common to all RTP-based systems. | |||
RTP packets using the payload format defined in this specification | RTP packets using the payload format defined in this specification | |||
are subject to the security considerations discussed in the RTP | are subject to the security considerations discussed in the RTP | |||
specification [RFC3550], and in any applicable RTP profile such as | specification [RFC3550] and in any applicable RTP profile such as | |||
RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/ | RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/ | |||
SAVPF [RFC5124]. However, as "Securing the RTP Framework: Why RTP | SAVPF [RFC5124]. However, as "Securing the RTP Framework: Why RTP | |||
Does Not Mandate a Single Media Security Solution" [RFC7202] | Does Not Mandate a Single Media Security Solution" [RFC7202] | |||
discusses, it is not an RTP payload format's responsibility to | discusses, it is not an RTP payload format's responsibility to | |||
discuss or mandate what solutions are used to meet the basic security | discuss or mandate what solutions are used to meet the basic security | |||
goals like confidentiality, integrity and source authenticity for RTP | goals like confidentiality, integrity, and source authenticity for | |||
in general. This responsibility lays on anyone using RTP in an | RTP in general. This responsibility lies on anyone using RTP in an | |||
application. They can find guidance on available security mechanisms | application. They can find guidance on available security mechanisms | |||
and important considerations in "Options for Securing RTP Sessions" | and important considerations in "Options for Securing RTP Sessions" | |||
[RFC7201]. Applications SHOULD use one or more appropriate strong | [RFC7201]. Applications SHOULD use one or more appropriate strong | |||
security mechanisms. The rest of this section discusses the | security mechanisms. The rest of this section discusses the | |||
security-impacting properties of the payload format itself. | security-impacting properties of the payload format itself. | |||
Because the data compression used with this payload format is applied | Because the data compression used with this payload format is applied | |||
end-to-end, any encryption needs to be performed after compression. | end to end, any encryption needs to be performed after compression. | |||
A potential denial-of-service threat exists for data encodings using | A potential denial-of-service threat exists for data encodings using | |||
compression techniques that have non-uniform receiver-end | compression techniques that have non-uniform receiver-end | |||
computational load. The attacker can inject pathological datagrams | computational load. The attacker can inject pathological datagrams | |||
into the bitstream that are complex to decode and that cause the | into the bitstream that are complex to decode and that cause the | |||
receiver to be overloaded. | receiver to be overloaded. | |||
EVC is particularly vulnerable to such attacks, as it is extremely | EVC is particularly vulnerable to such attacks, as it is extremely | |||
simple to generate datagrams containing NAL units that affect the | simple to generate datagrams containing NAL units that affect the | |||
decoding process of many future NAL units. Therefore, the usage of | decoding process of many future NAL units. Therefore, the usage of | |||
data origin authentication and data integrity protection of at least | data origin authentication and data integrity protection of at least | |||
the RTP packet is RECOMMENDED based on the thoughts of [RFC7202]. | the RTP packet is RECOMMENDED based on [RFC7202]. | |||
Like HEVC [RFC7798] and [VVC], [EVC] includes a user data | Like HEVC [RFC7798] and VVC [VVC], EVC [EVC] includes a user data | |||
Supplemental Enhancement Information (SEI) message. This SEI message | Supplemental Enhancement Information (SEI) message. This SEI message | |||
allows inclusion of an arbitrary bitstring into the video bitstream. | allows inclusion of an arbitrary bitstring into the video bitstream. | |||
Such a bitstring could include JavaScript, machine code, and other | Such a bitstring could include JavaScript, machine code, and other | |||
active content. | active content. | |||
[EVC] leaves the handling of this SEI message to the receiving | EVC [EVC] leaves the handling of this SEI message to the receiving | |||
system. In order to avoid harmful side effects of the user data SEI | system. In order to avoid harmful side effects of the user data SEI | |||
message, decoder implementations cannot naively trust its content. | message, decoder implementations cannot naively trust its content. | |||
For example, forwarding all received JavaScript code detected by a | For example, forwarding all received JavaScript code detected by a | |||
decoder implementation to a web-browser unchecked would be a bad and | decoder implementation to a web browser unchecked would be a bad and | |||
insecure implementation practice. The safest way to deal with user | insecure implementation practice. The safest way to deal with user | |||
data SEI messages is to simply discard them, but that can have | data SEI messages is to simply discard them, but that can have | |||
negative side effects on the quality of experience by the user. | negative side effects on the quality of experience by the user. | |||
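One possible middle ground between trusting user data SEI content and discarding all of it is a default-discard policy with an explicit allowlist. The sketch below is an assumption-laden illustration: `handle_user_data_sei`, the `kind` label, and the handler table are hypothetical, and how an application identifies the kind of user data is outside this example.

```python
def handle_user_data_sei(kind, payload, trusted_handlers):
    """Process a user data SEI payload only if the application trusts its kind.

    kind:             application-defined label for the user data (hypothetical)
    payload:          raw bytes carried in the SEI message
    trusted_handlers: dict mapping kind -> callable(bytes) that validates
                      and interprets the payload
    """
    handler = trusted_handlers.get(kind)
    if handler is None:
        # Safest default: discard content nobody has agreed to interpret.
        return None
    # The payload is handed over as inert bytes; it is never executed or
    # forwarded unchecked by this function.
    return handler(bytes(payload))
```

Each registered handler remains responsible for validating its own input, so adding a handler widens the attack surface and deserves the same review as any other parser of untrusted data.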
End-to-end security with authentication, integrity, or | End-to-end security with authentication, integrity, or | |||
confidentiality protection will prevent a MANE from performing media- | confidentiality protection will prevent a MANE from performing media- | |||
aware operations other than discarding complete packets. In the case | aware operations other than discarding complete packets. In the case | |||
of confidentiality protection, it will even be prevented from | of confidentiality protection, it will even be prevented from | |||
discarding packets in a media-aware way. To be allowed to perform | discarding packets in a media-aware way. To be allowed to perform | |||
such operations, a MANE is required to be a trusted entity that is | such operations, a MANE is required to be a trusted entity that is | |||
skipping to change at page 46, line 19 ¶ | skipping to change at line 2089 ¶ | |||
10. Congestion Control | 10. Congestion Control | |||
Congestion control for RTP SHALL be used in accordance with RTP | Congestion control for RTP SHALL be used in accordance with RTP | |||
[RFC3550] and with any applicable RTP profile, e.g., AVP [RFC3551] or | [RFC3550] and with any applicable RTP profile, e.g., AVP [RFC3551] or | |||
AVPF [RFC4585]. If best-effort service is being used, an additional | AVPF [RFC4585]. If best-effort service is being used, an additional | |||
requirement is that users of this payload format MUST monitor packet | requirement is that users of this payload format MUST monitor packet | |||
loss to ensure that the packet loss rate is within an acceptable | loss to ensure that the packet loss rate is within an acceptable | |||
range. Packet loss is considered acceptable if a TCP flow across the | range. Packet loss is considered acceptable if a TCP flow across the | |||
same network path and experiencing the same network conditions would | same network path and experiencing the same network conditions would | |||
achieve an average throughput, measured on a reasonable timescale, | achieve an average throughput, measured on a reasonable timescale, | |||
that is not less than all RTP streams combined are achieved. This | that is not less than all RTP streams combined. This condition can | |||
condition can be satisfied by implementing congestion-control | be satisfied by implementing congestion-control mechanisms to adapt | |||
mechanisms to adapt the transmission rate, by adapting the number | the transmission rate by adapting the number of layers subscribed | |||
of layers subscribed for a layered multicast session, or by arranging | for a layered multicast session or by arranging for a receiver to | |||
for a receiver to leave the session if the loss rate is unacceptably | leave the session if the loss rate is unacceptably high. | |||
high. | ||||
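The acceptability test above compares the combined RTP rate with what a TCP flow would achieve on the same path. This document does not say how to estimate that TCP throughput. The sketch below substitutes the widely used simplified steady-state approximation, throughput = (MSS / RTT) * sqrt(3 / (2p)), purely as an assumption; the function names are hypothetical.

```python
import math


def tcp_throughput_estimate_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput in bits per second.

    Uses the simplified model (MSS / RTT) * sqrt(3 / (2 * p)), which is an
    assumption of this sketch and not a formula given by this document.
    """
    if loss_rate <= 0:
        return math.inf  # no observed loss: the model gives no upper bound
    return 8 * mss_bytes / rtt_s * math.sqrt(1.5 / loss_rate)


def loss_is_acceptable(combined_rtp_rate_bps, mss_bytes, rtt_s, loss_rate):
    """True if the modeled TCP flow would do at least as well as all RTP streams."""
    estimate = tcp_throughput_estimate_bps(mss_bytes, rtt_s, loss_rate)
    return estimate >= combined_rtp_rate_bps
```

When the check fails, the sender would reduce its rate or a receiver would leave the session, as described above.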
The bitrate adaptation necessary for obeying the congestion control | The bitrate adaptation necessary for obeying the congestion control | |||
principle is easily achievable when real-time encoding is used, for | principle is easily achievable when real-time encoding is used, for | |||
example, by adequately tuning the quantization parameter. However, | example, by adequately tuning the quantization parameter. However, | |||
when pre-encoded content is being transmitted, bandwidth adaptation | when pre-encoded content is being transmitted, bandwidth adaptation | |||
requires the pre-coded bitstream to be tailored for such adaptivity. | requires the pre-coded bitstream to be tailored for such adaptivity. | |||
The key mechanism available in [EVC] is temporal scalability. A | The key mechanism available in [EVC] is temporal scalability. A | |||
media sender can remove NAL units belonging to higher temporal sub- | media sender can remove NAL units belonging to higher temporal sub- | |||
layers (i.e., those NAL units with a large value of TID) until the | layers (i.e., those NAL units with a large value of TID) until the | |||
sending bitrate drops to an acceptable range. | sending bitrate drops to an acceptable range. | |||
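The temporal sub-layer thinning described above can be sketched as follows. This is an illustrative sketch only, not part of either document version: the `NalUnit` record, the `thin_to_bitrate` helper, and the fixed measurement window are assumptions made for the example, and a real sender would read the TID from the NAL unit header and measure bitrate over a sliding window.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class NalUnit:
    """Hypothetical record for one NAL unit queued for transmission."""
    tid: int          # temporal ID of the NAL unit
    size_bytes: int   # size of the NAL unit in bytes


def thin_to_bitrate(nal_units: Sequence[NalUnit],
                    window_seconds: float,
                    target_bps: float) -> Tuple[int, List[NalUnit]]:
    """Drop the highest temporal sub-layers until the bitrate fits.

    Returns (highest TID kept, NAL units kept). If even the base
    sub-layer (TID 0) exceeds the target, only TID 0 is kept, because
    temporal scalability cannot reduce the bitrate any further.
    """
    max_tid = max((n.tid for n in nal_units), default=0)
    for limit in range(max_tid, -1, -1):
        kept = [n for n in nal_units if n.tid <= limit]
        bitrate = sum(n.size_bytes for n in kept) * 8 / window_seconds
        if bitrate <= target_bps:
            return limit, kept
    return 0, [n for n in nal_units if n.tid == 0]


if __name__ == "__main__":
    queue = [NalUnit(0, 1000), NalUnit(2, 500), NalUnit(1, 500), NalUnit(2, 500)]
    limit, kept = thin_to_bitrate(queue, window_seconds=1.0, target_bps=12000)
    print(limit, len(kept))
```

Dropping whole sub-layers from the top down keeps the remaining bitstream decodable, since lower temporal sub-layers do not depend on higher ones.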
The mechanisms mentioned above generally work within a defined | The mechanisms mentioned above generally work within a defined | |||
profile and level; therefore no renegotiation of the channel is | profile and level; therefore, no renegotiation of the channel is | |||
required. Only when non-downgradable parameters (such as profile) | required. Only when non-downgradable parameters (such as the | |||
are required to be changed does it become necessary to terminate and | profile) are required to be changed does it become necessary to | |||
restart the RTP stream(s). This may be accomplished by using | terminate and restart the RTP streams. This may be accomplished by | |||
different RTP payload types. | using different RTP payload types. | |||
MANEs MAY remove certain unusable packets from the RTP stream when | MANEs MAY remove certain unusable packets from the RTP stream when | |||
that RTP stream was damaged due to previous packet losses. This can | that RTP stream was damaged due to previous packet losses. This can | |||
help reduce the network load in certain special cases. For example, | help reduce the network load in certain special cases. For example, | |||
MANEs can remove those FUs where the leading FUs belonging to the | MANEs can remove those FUs where the leading FUs belonging to the | |||
same NAL unit have been lost, because the trailing FUs are | same NAL unit have been lost, because the trailing FUs are | |||
meaningless to most decoders. MANE can also remove higher temporal | meaningless to most decoders. MANE can also remove higher temporal | |||
scalable layers if the outbound transmission (from the MANE's | scalable layers if the outbound transmission (from the MANE's | |||
viewpoint) experiences congestion. | viewpoint) experiences congestion. | |||
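The FU example above can be illustrated with a small sketch of how a MANE might discard trailing FUs whose leading FUs were lost. The `Packet` record and the `drop_orphaned_fus` helper are hypothetical, and the sketch assumes that packets arrive in sequence-number order and that the start and end flags of each FU have already been parsed from the FU header.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-parsed view of one received RTP packet."""
    seq: int               # 16-bit RTP sequence number
    is_fu: bool = False    # True if the payload is a fragmentation unit
    start: bool = False    # FU carries the first fragment of a NAL unit
    end: bool = False      # FU carries the last fragment of a NAL unit


def drop_orphaned_fus(received: Iterable[Packet]) -> List[Packet]:
    """Forward packets, dropping FUs whose preceding fragment is missing."""
    forwarded: List[Packet] = []
    # Sequence number expected next in an intact FU chain, or None.
    expected_seq: Optional[int] = None
    for pkt in received:
        if not pkt.is_fu:
            expected_seq = None
            forwarded.append(pkt)
            continue
        if pkt.start:
            intact = True
        else:
            intact = expected_seq is not None and pkt.seq == expected_seq
        if intact:
            forwarded.append(pkt)
            # Keep tracking the chain until its last fragment is seen.
            expected_seq = None if pkt.end else (pkt.seq + 1) % 65536
        else:
            # A fragment was lost; the rest of this NAL unit is unusable.
            expected_seq = None
    return forwarded
```

Fragments that precede a loss are still forwarded, because the MANE cannot know about the loss when it handles them; only the fragments that follow the gap are removed.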
11. IANA Considerations | 11. IANA Considerations | |||
A new media type, as specified in Section 7.1 of this document, has | The media type specified in Section 7.1 has been registered with | |||
been registered with IANA. | IANA. | |||
12. Acknowledgements | ||||
Large parts of this specification share text with the RTP payload | ||||
format for VVC [RFC9328]. Roman Chernyak is thanksed for his | ||||
valueable review comments. We thank the authors of that | ||||
specification for their excellent work. | ||||
13. References | ||||
13.1. Normative References | 12. References | |||
[EVC] "ISO/IEC 23094-1 Essential Video Coding", 2020, | 12.1. Normative References | |||
<https://www.iso.org/standard/57797.html>. | ||||
[ISO23094-1] | [EVC] "Information technology -- General video coding -- Part 1: | |||
"ISO/IEC DIS Information technology --- General video | Essential video coding", ISO/IEC 23094-1:2020, October | |||
coding --- Part 1 Essential video coding", n.d., | 2020, <https://www.iso.org/standard/57797.html>. | |||
<https://www.iso.org/standard/57797.html>. | ||||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/rfc/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model | [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model | |||
with Session Description Protocol (SDP)", RFC 3264, | with Session Description Protocol (SDP)", RFC 3264, | |||
DOI 10.17487/RFC3264, June 2002, | DOI 10.17487/RFC3264, June 2002, | |||
<https://www.rfc-editor.org/rfc/rfc3264>. | <https://www.rfc-editor.org/info/rfc3264>. | |||
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | |||
Jacobson, "RTP: A Transport Protocol for Real-Time | Jacobson, "RTP: A Transport Protocol for Real-Time | |||
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, | Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, | |||
July 2003, <https://www.rfc-editor.org/rfc/rfc3550>. | July 2003, <https://www.rfc-editor.org/info/rfc3550>. | |||
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and | [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and | |||
Video Conferences with Minimal Control", STD 65, RFC 3551, | Video Conferences with Minimal Control", STD 65, RFC 3551, | |||
DOI 10.17487/RFC3551, July 2003, | DOI 10.17487/RFC3551, July 2003, | |||
<https://www.rfc-editor.org/rfc/rfc3551>. | <https://www.rfc-editor.org/info/rfc3551>. | |||
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. | [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. | |||
Norrman, "The Secure Real-time Transport Protocol (SRTP)", | Norrman, "The Secure Real-time Transport Protocol (SRTP)", | |||
RFC 3711, DOI 10.17487/RFC3711, March 2004, | RFC 3711, DOI 10.17487/RFC3711, March 2004, | |||
<https://www.rfc-editor.org/rfc/rfc3711>. | <https://www.rfc-editor.org/info/rfc3711>. | |||
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, | [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, | |||
"Extended RTP Profile for Real-time Transport Control | "Extended RTP Profile for Real-time Transport Control | |||
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, | Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, | |||
DOI 10.17487/RFC4585, July 2006, | DOI 10.17487/RFC4585, July 2006, | |||
<https://www.rfc-editor.org/rfc/rfc4585>. | <https://www.rfc-editor.org/info/rfc4585>. | |||
[RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data | [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data | |||
Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, | Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, | |||
<https://www.rfc-editor.org/rfc/rfc4648>. | <https://www.rfc-editor.org/info/rfc4648>. | |||
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, | [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, | |||
"Codec Control Messages in the RTP Audio-Visual Profile | "Codec Control Messages in the RTP Audio-Visual Profile | |||
with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104, | with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104, | |||
February 2008, <https://www.rfc-editor.org/rfc/rfc5104>. | February 2008, <https://www.rfc-editor.org/info/rfc5104>. | |||
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for | [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for | |||
Real-time Transport Control Protocol (RTCP)-Based Feedback | Real-time Transport Control Protocol (RTCP)-Based Feedback | |||
(RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February | (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February | |||
2008, <https://www.rfc-editor.org/rfc/rfc5124>. | 2008, <https://www.rfc-editor.org/info/rfc5124>. | |||
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific | [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific | |||
Media Attributes in the Session Description Protocol | Media Attributes in the Session Description Protocol | |||
(SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009, | (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009, | |||
<https://www.rfc-editor.org/rfc/rfc5576>. | <https://www.rfc-editor.org/info/rfc5576>. | |||
[RFC7826] Schulzrinne, H., Rao, A., Lanphier, R., Westerlund, M., | [RFC7826] Schulzrinne, H., Rao, A., Lanphier, R., Westerlund, M., | |||
and M. Stiemerling, Ed., "Real-Time Streaming Protocol | and M. Stiemerling, Ed., "Real-Time Streaming Protocol | |||
Version 2.0", RFC 7826, DOI 10.17487/RFC7826, December | Version 2.0", RFC 7826, DOI 10.17487/RFC7826, December | |||
2016, <https://www.rfc-editor.org/rfc/rfc7826>. | 2016, <https://www.rfc-editor.org/info/rfc7826>. | |||
[RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | [RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, | |||
"Using Codec Control Messages in the RTP Audio-Visual | "Using Codec Control Messages in the RTP Audio-Visual | |||
Profile with Feedback with Layered Codecs", RFC 8082, | Profile with Feedback with Layered Codecs", RFC 8082, | |||
DOI 10.17487/RFC8082, March 2017, | DOI 10.17487/RFC8082, March 2017, | |||
<https://www.rfc-editor.org/rfc/rfc8082>. | <https://www.rfc-editor.org/info/rfc8082>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[RFC8866] Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP: | [RFC8866] Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP: | |||
Session Description Protocol", RFC 8866, | Session Description Protocol", RFC 8866, | |||
DOI 10.17487/RFC8866, January 2021, | DOI 10.17487/RFC8866, January 2021, | |||
<https://www.rfc-editor.org/rfc/rfc8866>. | <https://www.rfc-editor.org/info/rfc8866>. | |||
[RFC9328] Zhao, S., Wenger, S., Sanchez, Y., Wang, Y.-K., and M. M. | [RFC9328] Zhao, S., Wenger, S., Sanchez, Y., Wang, Y.-K., and M. M. | |||
Hannuksela, "RTP Payload Format for Versatile Video Coding | Hannuksela, "RTP Payload Format for Versatile Video Coding | |||
(VVC)", RFC 9328, DOI 10.17487/RFC9328, December 2022, | (VVC)", RFC 9328, DOI 10.17487/RFC9328, December 2022, | |||
<https://www.rfc-editor.org/rfc/rfc9328>. | <https://www.rfc-editor.org/info/rfc9328>. | |||
[VSEI] "Versatile supplemental enhancement information messages | [VSEI] ITU-T, "Versatile supplemental enhancement information | |||
for coded video bitstreams", 2020, | messages for coded video bitstreams", ITU-T | |||
Recommendation H.274, March 2024, | ||||
<https://www.itu.int/rec/T-REC-H.274>. | <https://www.itu.int/rec/T-REC-H.274>. | |||
13.2. Informative References | 12.2. Informative References | |||
[AVC] "ITU-T Recommendation H.264 - Advanced video coding for | [AVC] ITU-T, "Part 10: Advanced video coding", ITU-T | |||
generic audiovisual services", 2014, | Recommendation H.264, October 2014, | |||
<https://www.iso.org/standard/66069.html>. | <https://www.iso.org/standard/66069.html>. | |||
[HEVC] "High efficiency video coding, ITU-T Recommendation | [HEVC] ITU-T, "High efficiency video coding", ITU-T | |||
H.265", 2019, <https://www.itu.int/rec/T-REC-H.265>. | Recommendation H.265, November 2019, | |||
<https://www.itu.int/rec/T-REC-H.265>. | ||||
[MPEG2S] ISO/IEC, "Information technology - Generic coding ofmoving | [MPEG2S] ISO/IEC, "Information technology - Generic coding of | |||
pictures and associated audio information - Part | moving pictures and associated audio information - Part 1: | |||
1:Systems, ISO International Standard 13818-1", 2013. | Systems", ISO/IEC 13818-1:2013, June 2013. | |||
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session | [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session | |||
Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974, | Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974, | |||
October 2000, <https://www.rfc-editor.org/rfc/rfc2974>. | October 2000, <https://www.rfc-editor.org/info/rfc2974>. | |||
[RFC6184] Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP | [RFC6184] Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP | |||
Payload Format for H.264 Video", RFC 6184, | Payload Format for H.264 Video", RFC 6184, | |||
DOI 10.17487/RFC6184, May 2011, | DOI 10.17487/RFC6184, May 2011, | |||
<https://www.rfc-editor.org/rfc/rfc6184>. | <https://www.rfc-editor.org/info/rfc6184>. | |||
[RFC6190] Wenger, S., Wang, Y.-K., Schierl, T., and A. | [RFC6190] Wenger, S., Wang, Y.-K., Schierl, T., and A. | |||
Eleftheriadis, "RTP Payload Format for Scalable Video | Eleftheriadis, "RTP Payload Format for Scalable Video | |||
Coding", RFC 6190, DOI 10.17487/RFC6190, May 2011, | Coding", RFC 6190, DOI 10.17487/RFC6190, May 2011, | |||
<https://www.rfc-editor.org/rfc/rfc6190>. | <https://www.rfc-editor.org/info/rfc6190>. | |||
[RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP | [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP | |||
Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014, | Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014, | |||
<https://www.rfc-editor.org/rfc/rfc7201>. | <https://www.rfc-editor.org/info/rfc7201>. | |||
[RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP | [RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP | |||
Framework: Why RTP Does Not Mandate a Single Media | Framework: Why RTP Does Not Mandate a Single Media | |||
Security Solution", RFC 7202, DOI 10.17487/RFC7202, April | Security Solution", RFC 7202, DOI 10.17487/RFC7202, April | |||
2014, <https://www.rfc-editor.org/rfc/rfc7202>. | 2014, <https://www.rfc-editor.org/info/rfc7202>. | |||
[RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and | |||
B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms | |||
for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | for Real-Time Transport Protocol (RTP) Sources", RFC 7656, | |||
DOI 10.17487/RFC7656, November 2015, | DOI 10.17487/RFC7656, November 2015, | |||
<https://www.rfc-editor.org/rfc/rfc7656>. | <https://www.rfc-editor.org/info/rfc7656>. | |||
[RFC7667] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667, | [RFC7667] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667, | |||
DOI 10.17487/RFC7667, November 2015, | DOI 10.17487/RFC7667, November 2015, | |||
<https://www.rfc-editor.org/rfc/rfc7667>. | <https://www.rfc-editor.org/info/rfc7667>. | |||
[RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | [RFC7798] Wang, Y.-K., Sanchez, Y., Schierl, T., Wenger, S., and M. | |||
M. Hannuksela, "RTP Payload Format for High Efficiency | M. Hannuksela, "RTP Payload Format for High Efficiency | |||
Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | Video Coding (HEVC)", RFC 7798, DOI 10.17487/RFC7798, | |||
March 2016, <https://www.rfc-editor.org/rfc/rfc7798>. | March 2016, <https://www.rfc-editor.org/info/rfc7798>. | |||
[VVC] "Versatile Video Coding, ITU-T Recommendation H.266", | [VIDEO-CODING] | |||
2020, <http://www.itu.int/rec/T-REC-H.266>. | ITU-T, "Video coding for low bit rate communication", | |||
ITU-T Recommendation H.263, January 2005, | ||||
<https://www.itu.int/rec/T-REC-H.263>. | ||||
[VVC] ITU-T, "Versatile video coding", ITU-T | ||||
Recommendation H.266, August 2020, | ||||
<http://www.itu.int/rec/T-REC-H.266>. | ||||
Acknowledgements | ||||
Large parts of this specification share text with the RTP payload | ||||
format for VVC [RFC9328]. Roman Chernyak is thanked for his valuable | ||||
review comments. We thank the authors of that specification for | ||||
their excellent work. | ||||
Authors' Addresses | Authors' Addresses | |||
Shuai Zhao | Shuai Zhao | |||
Intel | Intel | |||
2200 Mission College Blvd | 2200 Mission College Blvd | |||
Santa Clara, 95054 | Santa Clara, California 95054 | |||
United States of America | United States of America | |||
Email: shuai.zhao@ieee.org | Email: shuai.zhao@ieee.org | |||
Stephan Wenger | Stephan Wenger | |||
Tencent | Tencent | |||
2747 Park Blvd | 2747 Park Blvd | |||
Palo Alto, 94588 | Palo Alto, California 94588 | |||
United States of America | United States of America | |||
Email: stewe@stewe.org | Email: stewe@stewe.org | |||
Youngkwon Lim | Youngkwon Lim | |||
Samsung Electronics | Samsung Electronics | |||
6625 Excellence Way | 6625 Excellence Way | |||
Plano, 75013 | Plano, Texas 75013 | |||
United States of America | United States of America | |||
Email: yklwhite@gmail.com | Email: yklwhite@gmail.com | |||
End of changes. 327 change blocks. | ||||
1011 lines changed or deleted | 1006 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |