rfc9071.original | rfc9071.txt | |||
---|---|---|---|---|
AVTCore G. Hellstrom | Internet Engineering Task Force (IETF) G. Hellström | |||
Internet-Draft Gunnar Hellstrom Accessible Communication | Request for Comments: 9071 GHAccess | |||
Updates: 4103 (if approved) 26 May 2021 | Updates: 4103 June 2021 | |||
Intended status: Standards Track | Category: Standards Track | |||
Expires: 27 November 2021 | ISSN: 2070-1721 | |||
RTP-mixer formatting of multiparty Real-time text | RTP-Mixer Formatting of Multiparty Real-Time Text | |||
draft-ietf-avtcore-multi-party-rtt-mix-20 | ||||
Abstract | Abstract | |||
This document provides enhancements for RFC 4103 real-time text | This document provides enhancements of real-time text (as specified | |||
mixing suitable for a centralized conference model that enables | in RFC 4103) suitable for mixing in a centralized conference model, | |||
source identification and rapidly interleaved transmission of text | enabling source identification and rapidly interleaved transmission | |||
from different sources. The intended use is for real-time text | of text from different sources. The intended use is for real-time | |||
mixers and participant endpoints capable of providing an efficient | text mixers and participant endpoints capable of providing an | |||
presentation or other treatment of a multiparty real-time text | efficient presentation or other treatment of a multiparty real-time | |||
session. The specified mechanism builds on the standard use of the | text session. The specified mechanism builds on the standard use of | |||
Contributing Source (CSRC) list in the Realtime Protocol (RTP) packet | the Contributing Source (CSRC) list in the Real-time Transport | |||
for source identification. The method makes use of the same "text/ | Protocol (RTP) packet for source identification. The method makes | |||
t140" and "text/red" formats as for two-party sessions. | use of the same "text/t140" and "text/red" formats as for two-party | |||
| | sessions. | |||
Solutions using multiple RTP streams in the same RTP session are | Solutions using multiple RTP streams in the same RTP session are | |||
briefly mentioned, as they could have some benefits over the RTP- | briefly mentioned, as they could have some benefits over the RTP- | |||
mixer model. The possibility to implement the solution in a wide | mixer model. The RTP-mixer model was selected to be used for the | |||
range of existing RTP implementations made the RTP-mixer model be | fully specified solution in this document because it can be applied | |||
selected to be fully specified in this document. | to a wide range of existing RTP implementations. | |||
A capability exchange is specified so that it can be verified that a | A capability exchange is specified so that it can be verified that a | |||
mixer and a participant can handle the multiparty-coded real-time | mixer and a participant can handle the multiparty-coded real-time | |||
text stream using the RTP-mixer method. The capability is indicated | text stream using the RTP-mixer method. The capability is indicated | |||
by use of an RFC 8866 Session Description Protocol (SDP) media | by the use of a Session Description Protocol (SDP) (RFC 8866) media | |||
attribute "rtt-mixer". | attribute, "rtt-mixer". | |||
The document updates RFC 4103 "RTP Payload for Text Conversation". | This document updates RFC 4103 ("RTP Payload for Text Conversation"). | |||
A specification of how a mixer can format text for the case when the | A specification for how a mixer can format text for the case when the | |||
endpoint is not multiparty-aware is also provided. | endpoint is not multiparty aware is also provided. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
| | Internet Standards is available in Section 2 of RFC 7841. | |||
This Internet-Draft will expire on 27 November 2021. | Information about the current status of this document, any errata, | |||
| | and how to provide feedback on it may be obtained at | |||
| | https://www.rfc-editor.org/info/rfc9071. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2021 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Simplified BSD License text | to this document. Code Components extracted from this document must | |||
as described in Section 4.e of the Trust Legal Provisions and are | include Simplified BSD License text as described in Section 4.e of | |||
provided without warranty as described in the Simplified BSD License. | the Trust Legal Provisions and are provided without warranty as | |||
| | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction | |||
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.1. Terminology | |||
1.2. Selected solution and considered alternatives . . . . . . 7 | 1.2. Main Method, Fallback Method, and Considered Alternatives | |||
1.3. Intended application . . . . . . . . . . . . . . . . . . 9 | 1.3. Intended Application | |||
2. Overview of the two specified solutions and selection of | 2. Overview of the Two Specified Solutions and Selection of Method | |||
method . . . . . . . . . . . . . . . . . . . . . . . . . 10 | 2.1. The RTP-Mixer-Based Solution for Multiparty-Aware Endpoints | |||
2.1. The RTP-mixer-based solution for multiparty-aware | 2.2. Mixing for Multiparty-Unaware Endpoints | |||
endpoints . . . . . . . . . . . . . . . . . . . . . . . . 10 | 2.3. Offer/Answer Considerations | |||
2.2. Mixing for multiparty-unaware endpoints . . . . . . . . . 11 | 2.4. Actions Depending on Capability Negotiation Result | |||
2.3. Offer/answer considerations . . . . . . . . . . . . . . . 11 | 3. Details for the RTP-Mixer-Based Mixing Method for | |||
2.4. Actions depending on capability negotiation result . . . 13 | Multiparty-Aware Endpoints | |||
3. Details for the RTP-mixer-based mixing method for | 3.1. Use of Fields in the RTP Packets | |||
multiparty-aware endpoints . . . . . . . . . . . . . . . 13 | 3.2. Initial Transmission of a BOM Character | |||
3.1. Use of fields in the RTP packets . . . . . . . . . . . . 13 | 3.3. Keep-Alive | |||
3.2. Initial transmission of a BOM character . . . . . . . . . 14 | 3.4. Transmission Interval | |||
3.3. Keep-alive . . . . . . . . . . . . . . . . . . . . . . . 14 | 3.5. Only One Source per Packet | |||
3.4. Transmission interval . . . . . . . . . . . . . . . . . . 14 | 3.6. Do Not Send Received Text to the Originating Source | |||
3.5. Only one source per packet . . . . . . . . . . . . . . . 15 | 3.7. Clean Incoming Text | |||
3.6. Do not send received text to the originating source . . . 15 | 3.8. Principles of Redundant Transmission | |||
3.7. Clean incoming text . . . . . . . . . . . . . . . . . . . 16 | 3.9. Text Placement in Packets | |||
3.8. Redundant transmission principles . . . . . . . . . . . . 16 | 3.10. Empty T140blocks | |||
3.9. Text placement in packets . . . . . . . . . . . . . . . . 16 | 3.11. Creation of the Redundancy | |||
3.10. Empty T140blocks . . . . . . . . . . . . . . . . . . . . 17 | 3.12. Timer Offset Fields | |||
3.11. Creation of the redundancy . . . . . . . . . . . . . . . 17 | 3.13. Other RTP Header Fields | |||
3.12. Timer offset fields . . . . . . . . . . . . . . . . . . . 18 | 3.14. Pause in Transmission | |||
3.13. Other RTP header fields . . . . . . . . . . . . . . . . . 18 | 3.15. RTCP Considerations | |||
3.14. Pause in transmission . . . . . . . . . . . . . . . . . . 18 | 3.16. Reception of Multiparty Contents | |||
3.15. RTCP considerations . . . . . . . . . . . . . . . . . . . 19 | 3.17. Performance Considerations | |||
3.16. Reception of multiparty contents . . . . . . . . . . . . 19 | 3.18. Security for Session Control and Media | |||
3.17. Performance considerations . . . . . . . . . . . . . . . 21 | 3.19. SDP Offer/Answer Examples | |||
3.18. Security for session control and media . . . . . . . . . 21 | 3.20. Packet Sequence Example from Interleaved Transmission | |||
3.19. SDP offer/answer examples . . . . . . . . . . . . . . . . 22 | 3.21. Maximum Character Rate "cps" Setting | |||
3.20. Packet sequence example from interleaved transmission . . 23 | 4. Presentation-Level Considerations | |||
3.21. Maximum character rate "cps" . . . . . . . . . . . . . . 26 | 4.1. Presentation by Multiparty-Aware Endpoints | |||
4. Presentation level considerations . . . . . . . . . . . . . . 26 | 4.2. Multiparty Mixing for Multiparty-Unaware Endpoints | |||
4.1. Presentation by multiparty-aware endpoints . . . . . . . 27 | 5. Relationship to Conference Control | |||
4.2. Multiparty mixing for multiparty-unaware endpoints . . . 29 | 5.1. Use with SIP Centralized Conferencing Framework | |||
5. Relation to Conference Control . . . . . . . . . . . . . . . 35 | 5.2. Conference Control | |||
5.1. Use with SIP centralized conferencing framework . . . . . 36 | 6. Gateway Considerations | |||
5.2. Conference control . . . . . . . . . . . . . . . . . . . 36 | 6.1. Gateway Considerations with Textphones | |||
6. Gateway Considerations . . . . . . . . . . . . . . . . . . . 36 | 6.2. Gateway Considerations with WebRTC | |||
6.1. Gateway considerations with Textphones . . . . . . . . . 36 | 7. Updates to RFC 4103 | |||
6.2. Gateway considerations with WebRTC . . . . . . . . . . . 36 | 8. Congestion Considerations | |||
7. Updates to RFC 4103 . . . . . . . . . . . . . . . . . . . . . 37 | 9. IANA Considerations | |||
8. Congestion considerations . . . . . . . . . . . . . . . . . . 38 | 9.1. Registration of the "rtt-mixer" SDP Media Attribute | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 38 | 10. Security Considerations | |||
9.1. Registration of the "rtt-mixer" SDP media attribute . . . 38 | 11. References | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 39 | 11.1. Normative References | |||
11. Change history . . . . . . . . . . . . . . . . . . . . . . . 40 | 11.2. Informative References | |||
11.1. Changes included in | Acknowledgements | |||
draft-ietf-avtcore-multi-party-rtt-mix-20 . . . . . . . 40 | Author's Address | |||
11.2. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-19 . . . . . . . 40 | ||||
11.3. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-18 . . . . . . . 40 | ||||
11.4. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-17 . . . . . . . 40 | ||||
11.5. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-16 . . . . . . . 40 | ||||
11.6. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-15 . . . . . . . 41 | ||||
11.7. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-14 . . . . . . . 41 | ||||
11.8. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-13 . . . . . . . 41 | ||||
11.9. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-12 . . . . . . . 42 | ||||
11.10. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-11 . . . . . . . 42 | ||||
11.11. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-10 . . . . . . . 42 | ||||
11.12. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-09 . . . . . . . 42 | ||||
11.13. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-08 . . . . . . . 43 | ||||
11.14. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-07 . . . . . . . 43 | ||||
11.15. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-06 . . . . . . . 43 | ||||
11.16. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-05 . . . . . . . 43 | ||||
11.17. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-04 . . . . . . . 43 | ||||
11.18. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-03 . . . . . . . 44 | ||||
11.19. Changes included in | ||||
draft-ietf-avtcore-multi-party-rtt-mix-02 . . . . . . . 45 | ||||
11.20. Changes to draft-ietf-avtcore-multi-party-rtt-mix-01 . . 45 | ||||
11.21. Changes from | ||||
draft-hellstrom-avtcore-multi-party-rtt-source-03 to | ||||
draft-ietf-avtcore-multi-party-rtt-mix-00 . . . . . . . 45 | ||||
11.22. Changes from | ||||
draft-hellstrom-avtcore-multi-party-rtt-source-02 to | ||||
-03 . . . . . . . . . . . . . . . . . . . . . . . . . . 45 | ||||
11.23. Changes from | ||||
draft-hellstrom-avtcore-multi-party-rtt-source-01 to | ||||
-02 . . . . . . . . . . . . . . . . . . . . . . . . . . 46 | ||||
11.24. Changes from | ||||
draft-hellstrom-avtcore-multi-party-rtt-source-00 to | ||||
-01 . . . . . . . . . . . . . . . . . . . . . . . . . . 47 | ||||
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 47 | ||||
12.1. Normative References . . . . . . . . . . . . . . . . . . 47 | ||||
12.2. Informative References . . . . . . . . . . . . . . . . . 48 | ||||
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 49 | ||||
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 49 | ||||
1. Introduction | 1. Introduction | |||
"RTP Payload for Text Conversation" [RFC4103] specifies use of the | "RTP Payload for Text Conversation" [RFC4103] specifies the use of | |||
Real-Time Transport Protocol (RTP) [RFC3550] for transmission of | the Real-time Transport Protocol (RTP) [RFC3550] for transmission of | |||
real-time text (RTT) and the "text/t140" format. It also specifies a | real-time text (often called RTT) and the "text/t140" format. It | |||
redundancy format "text/red" for increased robustness. The "text/ | also specifies a redundancy format, "text/red", for increased | |||
red" format is registered in [RFC4102]. | robustness. The "text/red" format is registered in [RFC4102]. | |||
Real-time text is usually provided together with audio and sometimes | Real-time text is usually provided together with audio and sometimes | |||
with video in conversational sessions. | with video in conversational sessions. | |||
A requirement related to multiparty sessions from the presentation | A requirement related to multiparty sessions from the presentation- | |||
level standard T.140 [T140] for real-time text is: "The display of | level standard T.140 [T140] for real-time text is as follows: | |||
text from the members of the conversation should be arranged so that | ||||
the text from each participant is clearly readable, and its source | \| The display of text from the members of the conversation should be | |||
and the relative timing of entered text is visualized in the | \| arranged so that the text from each participant is clearly | |||
display." | \| readable, and its source and the relative timing of entered text | |||
| | \| is visualized in the display. | |||
Another requirement is that the mixing procedure must not introduce | Another requirement is that the mixing procedure must not introduce | |||
delays in the text streams that are experienced to be disturbing the | delays in the text streams that could be perceived as disruptive to | |||
real-time experience of the receiving users. | the real-time experience of the receiving users. | |||
Use of RTT is increasing, and specifically, use in emergency calls is | The use of real-time text is increasing, and specifically, use in | |||
increasing. Emergency call use requires multiparty mixing because it | emergency calls is increasing. Emergency call use requires | |||
is common that one agent needs to transfer the call to another | multiparty mixing, because it is common that one agent needs to | |||
specialized agent but is obliged to stay on the call at least to | transfer the call to another specialized agent but is obliged to stay | |||
verify that the transfer was successful. Mixer implementations for | on the call to at least verify that the transfer was successful. | |||
RFC 4103 "RTP Payload for Text Conversation" can use traditional RFC | Mixer implementations for RFC 4103 ("RTP Payload for Text | |||
3550 RTP functions for mixing and source identification, but the | Conversation") can use traditional RTP functions (RFC 3550) for | |||
performance of the mixer when giving turns for the different sources | mixing and source identification, but the performance of the mixer | |||
to transmit is limited when using the default transmission | when giving turns for the different sources to transmit is limited | |||
characteristics with redundancy. | when using the default transmission characteristics with redundancy. | |||
The redundancy scheme of [RFC4103] enables efficient transmission of | The redundancy scheme described in [RFC4103] enables efficient | |||
earlier transmitted redundant text in packets together with new text. | transmission of earlier transmitted redundant text in packets | |||
However, the redundancy header format has no source indicators for | together with new text. However, the redundancy header format has no | |||
the redundant transmissions. The redundant parts in a packet must | source indicators for the redundant transmissions. The redundant | |||
therefore be from the same source as the new text. The recommended | parts in a packet must therefore be from the same source as the new | |||
transmission is one new and two redundant generations of text | text. The recommended transmission is one new and two redundant | |||
(T140blocks) in each packet and the recommended transmission interval | generations of text (T140blocks) in each packet, and the recommended | |||
for two-party use is 300 ms. | transmission interval for two-party use is 300 ms. | |||
Real-time text mixers for multiparty sessions need to include the | Real-time text mixers for multiparty sessions need to include the | |||
source with each transmitted group of text from a conference | source with each transmitted group of text from a conference | |||
participant so that the text can be transmitted interleaved with text | participant so that the text can be transmitted interleaved with text | |||
groups from different sources at the rate they are created. This | groups from different sources at the rate at which they are created. | |||
enables the text groups to be presented by endpoints in suitable | This enables the text groups to be presented by endpoints in a | |||
grouping with other text from the same source. | suitable grouping with other text from the same source. | |||
The presentation can then be arranged so that text from different | The presentation can then be arranged so that text from different | |||
sources can be presented in real-time and easily read. At the same | sources can be presented in real time and easily read. At the same | |||
time it is possible for a reading user to perceive approximately when | time, it is possible for a reading user to perceive approximately | |||
the text was created in real time by the different parties. The | when the text was created in real time by the different parties. The | |||
transmission and mixing is intended to be done in a general way, so | transmission and mixing are intended to be done in a general way, so | |||
that presentation can be arranged in a layout decided by the | that presentation can be arranged in a layout decided upon by the | |||
endpoint. | receiving endpoint. | |||
There are existing implementations of RFC 4103 in endpoints without | Existing implementations of RFC 4103 in endpoints that do not | |||
the updates from this document. These will not be able to receive | implement the updates specified in this document cannot be expected | |||
and present real-time text mixed for multiparty-aware endpoints. | to properly present real-time text mixed for multiparty-aware | |||
| | endpoints. | |||
A negotiation mechanism is therefore needed for verification if the | A negotiation mechanism is therefore needed to verify if the parties | |||
parties are able to handle a common method for multiparty | (1) are able to handle a common method for multiparty transmissions | |||
transmission and agreeing on using that method. | and (2) can agree on using that method. | |||
A fallback mixing procedure is also needed for cases when the | A fallback mixing procedure is also needed for cases when the | |||
negotiation result indicates that a receiving endpoint is not capable | negotiation result indicates that a receiving endpoint is not capable | |||
of handling the mixed format. Multiparty-unaware endpoints would | of handling the mixed format. Multiparty-unaware endpoints would | |||
possibly otherwise present all received multiparty mixed text as if | possibly otherwise present all received multiparty mixed text as if | |||
it came from the same source regardless of any accompanying source | it came from the same source regardless of any accompanying source | |||
indication coded in fields in the packet. Or they may have other | indication coded in fields in the packet. Or, they may have other | |||
undesirable ways of acting on the multiparty content. The fallback | undesirable ways of acting on the multiparty content. The fallback | |||
method is called the mixing procedure for multiparty-unaware | method is called the mixing procedure for multiparty-unaware | |||
endpoints. The fallback method is naturally not expected to meet all | endpoints. The fallback method is naturally not expected to meet all | |||
performance requirements placed on the mixing procedure for | performance requirements placed on the mixing procedure for | |||
multiparty-aware endpoints. | multiparty-aware endpoints. | |||
The document updates [RFC4103] by introducing an attribute for | This document updates [RFC4103] by introducing an attribute for | |||
declaring support of the RTP-mixer-based multiparty mixing case and | declaring support of the RTP-mixer-based multiparty-mixing case and | |||
rules for source indications and interleaving of text from different | rules for source indications and interleaving of text from different | |||
sources. | sources. | |||
1.1. Terminology | 1.1. Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown above. | capitals, as shown here. | |||
The terms Source Description (SDES), Canonical name (CNAME), Name | The terms "Source Description" (SDES), "Canonical Name" (CNAME), | |||
(NAME), Synchronization Source (SSRC), Contributing Source (CSRC), | "Name" (NAME), "Synchronization Source" (SSRC), "Contributing Source" | |||
CSRC list, CSRC count [CC], Real-Time control protocol (RTCP), RTP- | (CSRC), "CSRC list", "CSRC count" (CC), "RTP Control Protocol" | |||
mixer, RTP-translator are defined in [RFC3550]. | (RTCP), and "RTP mixer" are defined in [RFC3550]. | |||
"real-time text" (RTT) is text transmitted instantly as it is typed | ||||
or created. Recipients can immediately read the message while it is | ||||
being written, without waiting. | ||||
The term "T140block" is defined in [RFC4103] to contain one or more | The term "T140block" is defined in [RFC4103] to contain one or more | |||
T.140 code elements. | T.140 code elements. | |||
"TTY" stands for a textphone type used in North America. | "TTY" stands for a textphone type used in North America. | |||
Web based real-time communication (WebRTC) is specified by the World | Web Real-Time Communication (WebRTC) is specified by the World Wide | |||
Wide Web Consortium (W3C) and IETF. See [RFC8825]. | Web Consortium (W3C) and the IETF. See [RFC8825]. | |||
"DTLS-SRTP" is a Datagram Transport Layer Security (DTLS) extension | "DTLS-SRTP" is a Datagram Transport Layer Security (DTLS) extension | |||
for use with Secure Real-Time Transport Protocol/Secure Real-Time | for use with the Secure Real-time Transport Protocol / Secure Real- | |||
Control Protocol (SRTP/SRTCP) specified in [RFC5764]. | time Transport Control Protocol (SRTP/SRTCP) as specified in | |||
| | [RFC5764]. | |||
"multiparty-aware" describes an endpoint receiving real-time text | The term "multiparty aware" describes an endpoint that (1) receives | |||
from multiple sources through a common conference mixer being able to | real-time text from multiple sources through a common conference | |||
present the text in real-time, separated by source, and presented so | mixer, (2) is able to present the text in real time, separated by | |||
that a user can get an impression of the approximate relative timing | source, and (3) presents the text so that a user can get an | |||
of text from different parties. | impression of the approximate relative timing of text from different | |||
| | parties. | |||
"multiparty-unaware" describes an endpoint not itself being able to | The term "multiparty unaware" describes an endpoint that cannot | |||
separate text from different sources when received through a common | itself separate text from different sources when the text is received | |||
conference mixer. | through a common conference mixer. | |||
1.2. Selected solution and considered alternatives | 1.2. Main Method, Fallback Method, and Considered Alternatives | |||
A number of alternatives were considered when searching an efficient | A number of alternatives were considered when searching for an | |||
and easily implemented multiparty method for real-time text. This | efficient and easily implemented multiparty method for real-time | |||
section explains a few of them briefly. | text. This section briefly explains a few of them. | |||
Multiple RTP streams, one per participant | Multiple RTP streams, one per participant: | |||
One RTP stream per source would be sent in the same RTP session | One RTP stream per source would be sent in the same RTP session | |||
with the "text/red" format. From some points of view, use of | with the "text/red" format. From some points of view, the use of | |||
multiple RTP streams, one for each source, sent in the same RTP | multiple RTP streams, one for each source, sent in the same RTP | |||
session would be efficient, and would use exactly the same packet | session would be efficient and would use exactly the same packet | |||
format as [RFC4103] and the same payload type. A couple of | format as [RFC4103] and the same payload type. A couple of | |||
relevant scenarios using multiple RTP-streams are specified in | relevant scenarios using multiple RTP streams are specified in | |||
"RTP Topologies" [RFC7667]. One possibility of special interest | "RTP Topologies" [RFC7667]. One possibility of special interest | |||
is the Selective Forwarding Middlebox (SFM) topology specified in | is the Selective Forwarding Middlebox (SFM) topology specified in | |||
RFC 7667 section 3.7 that could enable end-to-end encryption. In | Section 3.7 of [RFC7667], which could enable end-to-end | |||
contrast to audio and video, real-time text is only transmitted | encryption. In contrast to audio and video, real-time text is | |||
when the users actually transmit information. Thus, an SFM | only transmitted when the users actually transmit information. | |||
solution would not need to exclude any party from transmission | Thus, an SFM solution would not need to exclude any party from | |||
under normal conditions. In order to allow the mixer to convey | transmission under normal conditions. In order to allow the mixer | |||
the packets with the payload preserved and encrypted, an SFM | to convey the packets with the payload preserved and encrypted, an | |||
solution would need to act on some specific characteristics of the | SFM solution would need to act on some specific characteristics of | |||
"text/red" format. The redundancy headers are part of the | the "text/red" format. The redundancy headers are part of the | |||
payload, so the receiver would need to just assume that the | payload, so the receiver would need to just assume that the | |||
payload type number in the redundancy header is for "text/t140". | payload type number in the redundancy header is for "text/t140". | |||
The characters per second parameter (cps) would need to act per | The characters per second ("cps") parameter would need to act per | |||
stream. The relation between the SSRC and the source would need | stream. The relationship between the SSRC and the source would | |||
to be conveyed in some specified way, e.g., in the CSRC. Recovery | need to be conveyed in some specified way, e.g., in the CSRC. | |||
and loss detection would preferably be based on sequence number | Recovery and loss detection would preferably be based on RTP | |||
gap detection. Thus, sequence number gaps in the incoming stream | sequence number gap detection. Thus, sequence number gaps in the | |||
to the mixer would need to be reflected in the stream to the | incoming stream to the mixer would need to be reflected in the | |||
participant, with no new gaps created by the mixer. However, the | stream to the participant, with no new gaps created by the mixer. | |||
RTP implementation in both mixers and endpoints need to support | However, the RTP implementation in both mixers and endpoints needs | |||
multiple streams in the same RTP session in order to use this | to support multiple streams in the same RTP session in order to | |||
mechanism. For best deployment opportunity, it should be possible | use this mechanism. To provide the best opportunities for | |||
to upgrade existing endpoint solutions to be multiparty-aware with | deployment, it should be possible to upgrade existing endpoint | |||
a reasonable effort. There is currently a lack of support for | solutions to be multiparty aware with a reasonable amount of | |||
multi-stream RTP in certain implementations. This fact led to | effort. There is currently a lack of support for multi-stream RTP | |||
this solution being only briefly mentioned in this document as an | in certain implementations. This fact led to only brief mention | |||
option for further study. | of this solution in this document as an option for further study. | |||
RTP-mixer-based method for multiparty-aware endpoints | RTP-mixer-based method for multiparty-aware endpoints: | |||
The "text/red" format in RFC 4103 is sent with a shorter | The "text/red" format as defined in RFC 4102 and applied in RFC | |||
transmission interval with the RTP-mixer method and indicating the | 4103 is sent with the RTP-mixer method indicating the source in | |||
source in the CSRC field. The "text/red" format with a "text/ | the CSRC field. The "text/red" format with a "text/t140" payload | |||
t140" payload in a single RTP stream can be sent when text is | in a single RTP stream can be sent when text is available from the | |||
available from the call participants instead of at the regular 300 | call participants instead of at the regular 300 ms intervals. | |||
ms. Transmission of packets with text from different sources can | Transmission of packets with text from different sources can then | |||
then be done smoothly while simultaneous transmission occurs as | be done smoothly while simultaneous transmission occurs as long as | |||
long as it is not limited by the maximum character rate "cps". | it is not limited by the maximum character rate "cps" value. With | |||
With ten participants sending text simultaneously, the switching | ten participants sending text simultaneously, the switching and | |||
and transmission performance is good. With more simultaneously | transmission performance is good. With more simultaneously | |||
sending participants, and with receivers having the default | sending participants and with receivers at default capacity, there | |||
capacity there will be a noticeable jerkiness and delay in text | will be a noticeable jerkiness and delay in text presentation. | |||
presentation. The jerkiness will be more expressed the more | The more participants who send text simultaneously, the more | |||
participants who send text simultaneously. Two seconds jerkiness | jerkiness will occur. Two seconds of jerkiness will be noticeable | |||
will be noticeable and slightly unpleasant, but it corresponds in | and slightly unpleasant, but it corresponds in time to what typing | |||
time to what typing humans often cause by hesitation or changing | humans often cause by hesitating or changing position while | |||
position while typing. A benefit of this method is that no new | typing. A benefit of this method is that no new packet format | |||
packet format needs to be introduced and implemented. Since | needs to be introduced and implemented. Since simultaneous typing | |||
simultaneous typing by more than two parties is expected to be | by more than two parties is expected to be very rare -- as | |||
very rare as described in Section 1.3, this method can be used | described in Section 1.3 -- this method can be used successfully | |||
successfully with good performance. Recovery of text in case of | with good performance. Recovery of text in the case of packet | |||
packet loss is based on analysis of timestamps of received | loss is based on analysis of timestamps of received redundancy | |||
redundancy versus earlier received text. Negotiation is based on | versus earlier received text. Negotiation is based on a new SDP | |||
a new SDP media attribute "rtt-mixer". This method is selected to | media attribute, "rtt-mixer". This method was selected to be the | |||
be the main one specified in this document. | main method specified in this document. | |||
Multiple sources per packet | Multiple sources per packet: | |||
A new "text" media subtype would be specified with up to 15 | A new "text" media subtype would be specified with up to 15 | |||
sources in each packet. The mechanism would make use of the RTP | sources in each packet. The mechanism would make use of the RTP- | |||
mixer model specified in RTP [RFC3550]. The sources are indicated | mixer model specified in RTP [RFC3550]. The sources would be | |||
in strict order in the CSRC list of the RTP packets. The CSRC | indicated in strict order in the CSRC list of the RTP packets. | |||
list can have up to 15 members. Therefore, text from up to 15 | The CSRC list can have up to 15 members. Therefore, text from up | |||
sources can be included in each packet. Packets are normally sent | to 15 sources can be included in each packet. Packets are | |||
with 300 ms intervals. The mean delay will be 150 ms. A new | normally sent at 300 ms intervals. The mean delay would be 150 | |||
redundancy packet format is specified. This method would result | ms. A new redundancy packet format would be specified. This | |||
in good performance, but would require standardization and | method would result in good performance but would require | |||
implementation of new releases in the target technologies that | standardization and implementation of new releases in the target | |||
would take more time than desirable to complete. It was therefore | technologies; these would take more time than desirable to | |||
not selected to be included in this document. | complete. It was therefore not selected to be included in this | |||
document. | ||||
Mixing for multiparty-unaware endpoints | Mixing for multiparty-unaware endpoints: | |||
Presentation of text from multiple parties is prepared by the | The presentation of text from multiple parties is prepared by the | |||
mixer in one single stream. It is desirable to have a method that | mixer in one single stream. It is desirable to have a method that | |||
does not require any modifications in existing user devices | does not require any modifications in existing user devices | |||
implementing RFC 4103 for RTT without explicit support of | implementing RFC 4103 for real-time text without explicit support | |||
multiparty sessions. This is possible by having the mixer insert | of multiparty sessions. This is made possible by having the mixer | |||
a new line and a text formatted source label before each switch of | insert a new line and a text-formatted source label before each | |||
text source in the stream. Switch of source can only be done in | switch of text source in the stream. Switching the source can | |||
places in the text where it does not disturb the perception of the | only be done in places in the text where it does not disturb the | |||
contents. Text from only one source can be presented in real time | perception of the contents. Text from only one source at a time | |||
at a time. The delay will therefore vary. The method also has | can be presented in real time. The delay will therefore vary. In | |||
other limitations, but is included in this document as a fallback | calls where parties take turns properly by ending their entries | |||
method. In calls where parties take turns properly by ending | with a new line, the limitations will have limited influence on | |||
their entries with a new line, the limitations will have limited | the user experience. When only two parties send text, these two | |||
influence on the user experience. when only two parties send text, | will see the text in real time with no delay. Although this | |||
these two will see the text in real time with no delay. This | method also has other limitations, it is included in this document | |||
method is specified as a fallback method in this document. | as a fallback method. | |||
RTT transport in WebRTC | Real-time text transport in WebRTC: | |||
Transport of real-time text in the WebRTC technology is specified | [RFC8865] specifies how the WebRTC data channel can be used to | |||
to use the WebRTC data channel in [RFC8865]. That specification | transport real-time text. That specification contains a section | |||
contains a section briefly describing its use in multiparty | briefly describing its use in multiparty sessions. The focus of | |||
sessions. The focus of this document is RTP transport. | this document is RTP transport. Therefore, even if the WebRTC | |||
Therefore, even if the WebRTC transport provides good multiparty | transport provides good multiparty performance, it is only | |||
performance, it is just mentioned in this document in relation to | mentioned in this document in relation to providing gateways with | |||
providing gateways with multiparty capabilities between RTP and | multiparty capabilities between RTP and WebRTC technologies. | |||
WebRTC technologies. | ||||
1.3. Intended application | 1.3. Intended Application | |||
The method for multiparty real-time text specified in this document | The method for multiparty real-time text specified in this document | |||
is primarily intended for use in transmission between mixers and | is primarily intended for use in transmissions between mixers and | |||
endpoints in centralized mixing configurations. It is also | endpoints in centralized mixing configurations. It is also | |||
applicable between mixers. An often mentioned application is for | applicable between mixers. An often-mentioned application is for | |||
emergency service calls with real-time text and voice, where a call | emergency service calls with real-time text and voice, where a call | |||
taker wants to make an attended handover of a call to another agent, | taker wants to make an attended handover of a call to another agent | |||
and stay to observe the session. Multimedia conference sessions with | and stay on the call to observe the session. Multimedia conference | |||
support for participants to contribute in text is another | sessions with support for participants to contribute with text is | |||
application. Conferences with central support for speech-to-text | another example. Conferences with central support for speech-to-text | |||
conversion is yet another mentioned application. | conversion represent yet another example. | |||
In all these applications, normally only one participant at a time | In all these applications, normally only one participant at a time | |||
will send long text utterances. In some cases, one other participant | will send long text comments. In some cases, one other participant | |||
will occasionally contribute with a longer comment simultaneously. | will occasionally contribute with a longer comment simultaneously. | |||
That may also happen in some rare cases when text is interpreted to | That may also happen in some rare cases when text is translated to | |||
text in another language in a conference. Apart from these cases, | text in another language in a conference. Apart from these cases, | |||
other participants are only expected to contribute with very brief | other participants are only expected to contribute with very brief | |||
utterings while others are sending text. | comments while others are sending text. | |||
Users expect that the text they send is presented in real-time in a | Users expect the text they send to be presented in real time in a | |||
readable way to the other participants even if they send | readable way to the other participants even if they send | |||
simultaneously with other users and even when they make brief edit | simultaneously with other users and even when they make brief edit | |||
operations of their text by backspacing and correcting their text. | operations of their text by backspacing and correcting their text. | |||
Text is supposed to be human generated, by some text input means, | Text is supposed to be human generated, by some means of text input, | |||
such as typing on a keyboard or using speech-to-text technology. | such as typing on a keyboard or using speech-to-text technology. | |||
Occasional small cut-and-paste operations may appear even if that is | Occasional small cut-and-paste operations may appear even if that is | |||
not the initial purpose of real-time text. | not the initial purpose of real-time text. | |||
The real-time characteristics of real-time text is essential for the | The real-time characteristics of real-time text are essential for the | |||
participants to be able to contribute to a conversation. If the text | participants to be able to contribute to a conversation. If the text | |||
is too much delayed from typing a letter to its presentation, then, | is delayed too much between the typing of a character and its | |||
in some conference situations, the opportunity to comment will be | presentation, then, in some conference situations, the opportunity to | |||
gone and someone else will grab the turn. A delay of more than one | comment will be gone and someone else will grab the turn. A delay of | |||
second in such situations is an obstacle for good conversation. | more than one second in such situations is an obstacle to good | |||
conversation. | ||||
2. Overview of the two specified solutions and selection of method | 2. Overview of the Two Specified Solutions and Selection of Method | |||
This section contains a brief introduction of the two methods | This section contains a brief introduction of the two methods | |||
specified in this document. | specified in this document. | |||
2.1. The RTP-mixer-based solution for multiparty-aware endpoints | 2.1. The RTP-Mixer-Based Solution for Multiparty-Aware Endpoints | |||
This method specifies negotiated use of the RFC 4103 format for | This method specifies the negotiated use of the formats described in | |||
multiparty transmission in a single RTP stream. The main purpose of | RFC 4103, for multiparty transmissions in a single RTP stream. The | |||
this document is to specify a method for true multiparty real-time | main purpose of this document is to specify a method for true | |||
text mixing for multiparty-aware endpoints that can be widely | multiparty real-time text mixing for multiparty-aware endpoints that | |||
deployed. The RTP-mixer-based method makes use of the current format | can be widely deployed. The RTP-mixer-based method makes use of the | |||
for real-time text in [RFC4103]. It is an update of RFC 4103 by a | current format for real-time text as provided in [RFC4103]. This | |||
clarification on one way to use it in the multiparty situation. That | method updates RFC 4103 by clarifying one way to use it in the | |||
is done by completing a negotiation for this kind of multiparty | multiparty situation. That is done by completing a negotiation for | |||
capability and by interleaving packets from different sources. The | this kind of multiparty capability and by interleaving packets from | |||
source is indicated in the CSRC element in the RTP packets. Specific | different sources. The source is indicated in the CSRC element in | |||
considerations are made to be able to recover text after packet loss. | the RTP packets. Specific considerations are made regarding the | |||
ability to recover text after packet loss. | ||||
The detailed procedures for the RTP-mixer-based multiparty-aware case | The detailed procedures for the RTP-mixer-based multiparty-aware case | |||
are specified in Section 3. | are specified in Section 3. | |||
Please use [RFC4103] as reference when reading the specification. | Please refer to [RFC4103] when reading this document. | |||
2.2. Mixing for multiparty-unaware endpoints | 2.2. Mixing for Multiparty-Unaware Endpoints | |||
A method is also specified in this document for cases when the | This document also specifies a method to be used in cases when the | |||
endpoint participating in a multiparty call does not itself implement | endpoint participating in a multiparty call does not itself implement | |||
any solution, or not the same, as the mixer. The method requires the | any solution or does not implement the same solution as the mixer. | |||
mixer to insert text dividers and readable labels and only send text | This method requires the mixer to insert text dividers and readable | |||
from one source at a time until a suitable point appears for source | labels and only send text from one source at a time until a suitable | |||
change. This solution is a fallback method with functional | point appears for changing the source. This solution is a fallback | |||
limitations. It acts on the presentation level. | method with functional limitations. It operates at the presentation | |||
level. | ||||
A mixer SHOULD by default format and transmit text to a call | A mixer SHOULD by default format and transmit text to a call | |||
participant to be suitable to present on a multiparty-unaware | participant so that the text is suitable for presentation on a | |||
endpoint which has not negotiated any method for true multiparty RTT | multiparty-unaware endpoint that has not negotiated any method for | |||
handling, but negotiated a "text/red" or "text/t140" format in a | true multiparty real-time text handling but has negotiated a "text/ | |||
session. This SHOULD be done if nothing else is specified for the | red" or "text/t140" format in a session. This SHOULD be done if | |||
application in order to maintain interoperability. Section 4.2 | nothing else is specified for the application, in order to maintain | |||
specifies how this mixing is done. | interoperability. Section 4.2 specifies how this mixing is done. | |||
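The fallback behaviour described in Section 2.2 can be sketched as follows. This is only an illustration and not the procedure of Section 4.2: the function name `mix_for_unaware`, the `[label]` bracket format, and the assumption that each input entry already ends at a point suitable for a source switch are all hypothetical.

```python
def mix_for_unaware(entries):
    """Merge (label, text) entries into one stream for a multiparty-unaware endpoint.

    Assumption: each entry is already a unit after which a source switch is
    acceptable, so this sketch never splits text in the middle of an entry.
    """
    out = []
    current = None
    for label, text in entries:
        if label != current:
            # A switch of source: start a new line (except at the very start)
            # and insert a readable, text-formatted source label.
            if current is not None:
                out.append("\n")
            out.append(f"[{label}] ")
            current = label
        out.append(text)
    return "".join(out)


if __name__ == "__main__":
    print(mix_for_unaware([("Alice", "Hi "), ("Alice", "there"), ("Bob", "Hello")]))
```

Because only one source is shown at a time, text from the other sources has to be buffered until a switch point, which is why the delay varies with this method.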
2.3. Offer/answer considerations | 2.3. Offer/Answer Considerations | |||
RTP Payload for Text Conversation [RFC4103] specifies use of RTP | "RTP Payload for Text Conversation" [RFC4103] specifies the use of | |||
[RFC3550], and a redundancy format "text/red" for increased | RTP [RFC3550] and a redundancy format ("text/red", as defined in | |||
robustness of real-time text transmission. This document updates | [RFC4102]) for increased robustness of real-time text transmission. | |||
[RFC4103] by introducing a capability negotiation for handling | This document updates [RFC4103] by introducing a capability | |||
multiparty real-time text, a way to indicate the source of | negotiation for handling multiparty real-time text, a way to indicate | |||
transmitted text, and rules for efficient timing of the transmissions | the source of transmitted text, and rules for efficient timing of the | |||
interleaved from different sources. | transmissions interleaved from different sources. | |||
The capability negotiation for the "RTP-mixer-based multiparty | The capability negotiation for the RTP-mixer-based multiparty method | |||
method" is based on use of the SDP media attribute "rtt-mixer". | is based on the use of the SDP media attribute "rtt-mixer". | |||
The syntax is as follows: | The syntax is as follows: | |||
"a=rtt-mixer" | ||||
If any other method for RTP-based multiparty real-time text gets | a=rtt-mixer | |||
specified by additional work, it is assumed that it will be | ||||
If in the future any other method for RTP-based multiparty real-time | ||||
text is specified by additional work, it is assumed that it will be | ||||
recognized by some specific SDP feature exchange. | recognized by some specific SDP feature exchange. | |||
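As an illustration of the attribute syntax above, the following minimal sketch checks whether the "text" media section of an SDP body carries "a=rtt-mixer". The function name and the sample SDP (port, payload type numbers) are assumptions made for the example, not values taken from this document.

```python
def text_media_has_rtt_mixer(sdp: str) -> bool:
    """Return True if a "text" media section contains the a=rtt-mixer attribute."""
    in_text_section = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # Each m= line opens a new media section; track whether it is "text".
            in_text_section = line.startswith("m=text ")
        elif in_text_section and line == "a=rtt-mixer":
            return True
    return False


# Hypothetical SDP fragment with one audio and one text media section.
SAMPLE_SDP = "\r\n".join([
    "m=audio 49170 RTP/AVP 0",
    "m=text 11000 RTP/AVP 100 98",
    "a=rtpmap:98 t140/1000",
    "a=rtpmap:100 red/1000",
    "a=fmtp:100 98/98/98",
    "a=rtt-mixer",
])

if __name__ == "__main__":
    print(text_media_has_rtt_mixer(SAMPLE_SDP))
```

The attribute is only honoured inside the "text" media section here because the negotiation is described per media section.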
2.3.1. Initial offer | 2.3.1. Initial Offer | |||
A party intending to set up a session and being willing to use the | A party that intends to set up a session and is willing to use the | |||
RTP-mixer-based method of this specification for sending or receiving | RTP-mixer-based method provided in this specification for sending, | |||
or both sending and receiving real-time text SHALL include the "rtt- | receiving, or both sending and receiving real-time text SHALL include | |||
mixer" SDP attribute in the corresponding "text" media section in the | the "rtt-mixer" SDP attribute in the corresponding "text" media | |||
initial offer. | section in the initial offer. | |||
The party MAY indicate capability for both the RTP-mixer-based method | The party MAY indicate its capability regarding both the RTP-mixer- | |||
of this specification and other methods. | based method provided in this specification and other methods. | |||
When the offeror has sent the offer including the "rtt-mixer" | When the offerer has sent the offer, which includes the "rtt-mixer" | |||
attribute, it MUST be prepared to receive and handle real-time text | attribute, it MUST be prepared to receive and handle real-time text | |||
formatted according to both the method for multiparty-aware parties | formatted according to both the method for multiparty-aware parties | |||
specified in Section 3 in this specification and two-party formatted | specified in Section 3 and two-party formatted real-time text. | |||
real-time text. | ||||
2.3.2. Answering the offer | 2.3.2. Answering the Offer | |||
A party receiving an offer containing the "rtt-mixer" SDP attribute | A party that receives an offer containing the "rtt-mixer" SDP | |||
and being willing to use the RTP-mixer-based method of this | attribute and is willing to use the RTP-mixer-based method provided | |||
specification for sending or receiving or both sending and receiving | in this specification for sending, receiving, or both sending and | |||
SHALL include the "rtt-mixer" SDP attribute in the corresponding | receiving real-time text SHALL include the "rtt-mixer" SDP attribute | |||
"text" media section in the answer. | in the corresponding "text" media section in the answer. | |||
If the offer did not contain the "rtt-mixer" attribute, the answer | If the offer did not contain the "rtt-mixer" attribute, the answer | |||
MUST NOT contain the "rtt-mixer" attribute. | MUST NOT contain the "rtt-mixer" attribute. | |||
Even when the "rtt-mixer" attribute is successfully negotiated, the | Even when the "rtt-mixer" attribute is successfully negotiated, the | |||
parties MAY send and receive two-party coded real-time text. | parties MAY send and receive two-party coded real-time text. | |||
An answer MUST NOT include acceptance of more than one method for | An answer MUST NOT include acceptance of more than one method for | |||
multiparty real-time text in the same RTP session. | multiparty real-time text in the same RTP session. | |||
When the answer including acceptance is transmitted, the answerer | When the answer, which includes acceptance, is transmitted, the | |||
MUST be prepared to act on received text in the negotiated session | answerer MUST be prepared to act on received text in the negotiated | |||
according to the method for multiparty-aware parties specified in | session according to the method for multiparty-aware parties | |||
Section 3 of this specification. Reception of text for a two-party | specified in Section 3. Reception of text for a two-party session | |||
session SHALL also be supported. | SHALL also be supported. | |||
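The answering rules above reduce to a small decision, sketched below. The function name and its boolean parameters are hypothetical; the sketch covers only the "rtt-mixer" attribute and ignores any other multiparty method that might be offered.

```python
def answer_attributes(offer_has_rtt_mixer: bool, willing: bool) -> list:
    """Return the multiparty-related attribute lines for the "text" media answer."""
    # The answer MUST NOT contain rtt-mixer unless the offer contained it.
    if offer_has_rtt_mixer and willing:
        return ["a=rtt-mixer"]
    return []


if __name__ == "__main__":
    print(answer_attributes(offer_has_rtt_mixer=True, willing=True))
```

Whatever the outcome, the answerer in this sketch is still expected to accept two-party formatted real-time text, as stated above.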
2.3.3. Offeror processing the answer | 2.3.3. Offerer Processing the Answer | |||
When the answer is processed by the offeror, it MUST act as specified | When the answer is processed by the offerer, the offerer MUST follow | |||
in Section 2.4 | the requirements listed in Section 2.4. | |||
2.3.4. Modifying a session | 2.3.4. Modifying a Session | |||
A session MAY be modified at any time by any party offering a | A session MAY be modified at any time by any party offering a | |||
modified SDP with or without the "rtt-mixer" SDP attribute expressing | modified SDP with or without the "rtt-mixer" SDP attribute expressing | |||
a desired change in the support of multiparty real-time text. | a desired change in the support of multiparty real-time text. | |||
If the modified offer adds indication of support for multiparty real- | If the modified offer adds the indication of support for multiparty | |||
time text by including the "rtt-mixer" SDP attribute, the procedures | real-time text by including the "rtt-mixer" SDP attribute, the | |||
specified in the previous subsections SHALL be applied. | procedures specified in the previous subsections SHALL be applied. | |||
If the modified offer deletes indication of support for multiparty | If the modified offer deletes the indication of support for | |||
real-time text by excluding the "rtt-mixer" SDP attribute, the answer | multiparty real-time text by excluding the "rtt-mixer" SDP attribute, | |||
MUST NOT contain the "rtt-mixer" attribute. After processing this | the answer MUST NOT contain the "rtt-mixer" attribute. After | |||
SDP exchange, the parties MUST NOT send real-time text formatted for | processing this SDP exchange, the parties MUST NOT send real-time | |||
multiparty-aware parties according to this specification. | text formatted for multiparty-aware parties according to this | |||
specification. | ||||
2.4. Actions depending on capability negotiation result | 2.4. Actions Depending on Capability Negotiation Result | |||
A transmitting party SHALL send text according to the RTP-mixer-based | A transmitting party SHALL send text according to the RTP-mixer-based | |||
multiparty method only when the negotiation for that method was | multiparty method only when the negotiation for that method was | |||
successful and when it conveys text for another source. In all other | successful and when it conveys text for another source. In all other | |||
cases, the packets SHALL be populated and interpreted as for a two- | cases, the packets SHALL be populated and interpreted as for a two- | |||
party session. | party session. | |||
A party which has negotiated the "rtt-mixer" SDP media attribute MUST | A party that has negotiated the "rtt-mixer" SDP media attribute and | |||
populate the CSRC-list, and format the packets according to Section 3 | acts as an RTP mixer sending multiparty text MUST (1) populate the | |||
if it acts as an rtp-mixer and sends multiparty text. | CSRC list and (2) format the packets according to Section 3. | |||
A party which has negotiated the "rtt-mixer" SDP media attribute MUST | A party that has negotiated the "rtt-mixer" SDP media attribute MUST | |||
interpret the contents of the "CC" field, the CSRC-list and the | interpret the contents of the CC field, the CSRC list, and the | |||
packets according to Section 3 in received RTP packets in the | packets according to Section 3 in received RTP packets in the | |||
corresponding RTP stream. | corresponding RTP stream. | |||
A party which has not successfully completed the negotiation of the | A party that has not successfully completed the negotiation of the | |||
"rtt-mixer" SDP media attribute MUST NOT transmit packets interleaved | "rtt-mixer" SDP media attribute MUST NOT transmit packets interleaved | |||
from different sources in the same RTP stream as specified in | from different sources in the same RTP stream, as specified in | |||
Section 3. If the party is a mixer and did declare the "rtt-mixer" | Section 3. If the party is a mixer and did declare the "rtt-mixer" | |||
SDP media attribute, it SHOULD perform the procedure for multiparty- | SDP media attribute, it SHOULD perform the procedure for multiparty- | |||
unaware endpoints. If the party is not a mixer, it SHOULD transmit | unaware endpoints. If the party is not a mixer, it SHOULD transmit | |||
as in a two-party session according to [RFC4103]. | as in a two-party session according to [RFC4103]. | |||
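One possible reading of the rules in Section 2.4, for the sending direction, is sketched below. The enum, the function name, and the three boolean inputs are hypothetical names introduced for the example; the sketch does not model the SHOULD-level exceptions an application might apply.

```python
from enum import Enum


class SendMode(Enum):
    MULTIPARTY_AWARE = "RTP-mixer-based method (Section 3)"
    MULTIPARTY_UNAWARE = "mixing for multiparty-unaware endpoints"
    TWO_PARTY = "two-party session (RFC 4103)"


def select_send_mode(negotiated: bool, is_mixer: bool, conveys_other_source: bool) -> SendMode:
    """Pick how outgoing text is formatted after the capability negotiation."""
    if negotiated and conveys_other_source:
        # Successful negotiation and text conveyed on behalf of another source.
        return SendMode.MULTIPARTY_AWARE
    if not negotiated and is_mixer and conveys_other_source:
        # A mixer without successful negotiation falls back to presentation-level mixing.
        return SendMode.MULTIPARTY_UNAWARE
    # In all other cases, packets are populated as for a two-party session.
    return SendMode.TWO_PARTY


if __name__ == "__main__":
    print(select_send_mode(negotiated=True, is_mixer=True, conveys_other_source=True))
```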
3. Details for the RTP-mixer-based mixing method for multiparty-aware | 3. Details for the RTP-Mixer-Based Mixing Method for Multiparty-Aware | |||
endpoints | Endpoints | |||
3.1. Use of fields in the RTP packets | 3.1. Use of Fields in the RTP Packets | |||
The CC field SHALL show the number of members in the CSRC list, which | The CC field SHALL show the number of members in the CSRC list, which | |||
SHALL be one (1) in transmissions from a mixer when conveying text | SHALL be one (1) in transmissions from a mixer when conveying text | |||
from other sources in a multiparty session, and otherwise 0. | from other sources in a multiparty session, and otherwise 0. | |||
When text is conveyed by a mixer during a multiparty session, a CSRC | When text is conveyed by a mixer during a multiparty session, a CSRC | |||
list SHALL be included in the packet. The single member in the CSRC- | list SHALL be included in the packet. The single member in the CSRC | |||
list SHALL contain the SSRC of the source of the T140blocks in the | list SHALL contain the SSRC of the source of the T140blocks in the | |||
packet. | packet. | |||
When redundancy is used, the RECOMMENDED level of redundancy is to | When redundancy is used, the RECOMMENDED level of redundancy is to | |||
use one primary and two redundant generations of T140blocks. In some | use one primary and two redundant generations of T140blocks. In some | |||
cases, a primary or redundant T140block is empty, but is still | cases, a primary or redundant T140block is empty but is still | |||
represented by a member in the redundancy header. | represented by a member in the redundancy header. | |||
In other regards, the contents of the RTP packets are equal to what | In other respects, the contents of the RTP packets will be as | |||
is specified in [RFC4103]. | specified in [RFC4103]. | |||
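The use of the CC field and the CSRC list can be illustrated with a minimal sketch that builds the fixed RTP header defined in RFC 3550, followed by at most one CSRC entry. The function name, parameter names, and the example payload type and SSRC values are assumptions; padding, header extensions, and the payload itself are left out.

```python
import struct


def build_rtp_header(pt, seq, timestamp, ssrc, source_ssrc=None, marker=False):
    """Build an RTP header; source_ssrc is the SSRC of the original text source, if any."""
    # A mixer conveying text from another source uses exactly one CSRC member;
    # otherwise the CSRC list is empty and CC is 0.
    csrc_list = [] if source_ssrc is None else [source_ssrc]
    cc = len(csrc_list)
    byte0 = (2 << 6) | cc                      # version 2, no padding, no extension, CC
    byte1 = (0x80 if marker else 0) | (pt & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    for csrc in csrc_list:
        header += struct.pack("!I", csrc & 0xFFFFFFFF)
    return header


if __name__ == "__main__":
    hdr = build_rtp_header(pt=100, seq=1, timestamp=0,
                           ssrc=0x11111111, source_ssrc=0x22222222)
    print(len(hdr), hdr[0] & 0x0F)
```

A receiver that has negotiated "rtt-mixer" would read the low four bits of the first octet as CC and, when it is 1, take the 32-bit word after the SSRC as the source of the T140blocks.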
3.2. Initial transmission of a BOM character | 3.2. Initial Transmission of a BOM Character | |||
As soon as a participant is known to participate in a session with | As soon as a participant is known to participate in a session with | |||
another entity and is available for text reception, a Unicode Byte- | another entity and is available for text reception, a Unicode byte | |||
Order Mark (BOM) character SHALL be sent to it by the other entity | order mark (BOM) character SHALL be sent to it by the other entity | |||
according to the procedures in this section. This is useful in many | according to the procedures in this section. This is useful in many | |||
configurations to open ports and firewalls and setting up the | configurations for opening ports and firewalls and for setting up the | |||
connection between the application and the network. If the | connection between the application and the network. If the | |||
transmitter is a mixer, then the source of this character SHALL be | transmitter is a mixer, then the source of this character SHALL be | |||
indicated to be the mixer itself. | indicated to be the mixer itself. | |||
Note that the BOM character SHALL be transmitted with the same | Note that the BOM character SHALL be transmitted with the same | |||
redundancy procedures as any other text. | redundancy procedures as any other text. | |||
3.3. Keep-alive | 3.3. Keep-Alive | |||
After that, the transmitter SHALL send keep-alive traffic to the | After that, the transmitter SHALL send keep-alive traffic to the | |||
receiver(s) at regular intervals when no other traffic has occurred | receiver(s) at regular intervals when no other traffic has occurred | |||
during that interval, if that is decided for the actual connection. | during that interval, if that is decided upon for the actual | |||
It is RECOMMENDED to use the keep-alive solution from [RFC6263]. The | connection. It is RECOMMENDED to use the keep-alive solution | |||
consent check of [RFC7675] is a possible alternative if it is used | provided in [RFC6263]. The consent check [RFC7675] is a possible | |||
anyway for other reasons. | alternative if it is used anyway for other reasons. | |||
3.4. Transmission interval | 3.4. Transmission Interval | |||
A "text/red" or "text/t140" transmitter in a mixer SHALL send packets | A "text/red" or "text/t140" transmitter in a mixer SHALL send packets | |||
distributed in time as long as there is something (new or redundant | distributed over time as long as there is something (new or redundant | |||
T140blocks) to transmit. The maximum transmission interval between | T140blocks) to transmit. The maximum transmission interval between | |||
text transmissions from the same source SHALL then be 330 ms, when no | text transmissions from the same source SHALL then be 330 ms, when no | |||
other limitations cause a longer interval to be temporarily used. It | other limitations cause a longer interval to be temporarily used. It | |||
is RECOMMENDED to send the next packet to a receiver as soon as new | is RECOMMENDED to send the next packet to a receiver as soon as new | |||
text to that receiver is available, as long as the mean character | text to that receiver is available, as long as the mean character | |||
rate of new text to the receiver calculated over the last 10 one- | rate of new text to the receiver calculated over the last 10 one- | |||
second intervals does not exceed the "cps" value of the receiver. | second intervals does not exceed the "cps" value of the receiver. | |||
The intention is to keep the latency low and network load limited | The intention is to keep the latency low and network load limited | |||
while keeping good protection against text loss in bursty packet loss | while keeping good protection against text loss in bursty packet loss | |||
conditions. The main purpose of the 330 ms interval is for timing of | conditions. The main purpose of the 330 ms interval is for the | |||
redundant transmission, when no new text from the same source is | timing of redundant transmissions, when no new text from the same | |||
available. | source is available. | |||
The reason for the value 330 ms is that many sources of text will | The value of 330 ms is used, because many sources of text will | |||
transmit new text with 300 ms intervals during periods of continuous | transmit new text at 300 ms intervals during periods of continuous | |||
user typing, and then reception in the mixer of such new text will | user typing, and then reception in the mixer of such new text will | |||
cause a combined transmission of the new text and the unsent | cause a combined transmission of the new text and the unsent | |||
redundancy from the previous transmission. Only when the user stops | redundancy from the previous transmission. Only when the user stops | |||
typing, the 330 ms interval will be applied to send the redundancy. | typing will the 330 ms interval be applied to send the redundancy. | |||
If the Characters Per Second (cps) value is reached, a longer | If the characters per second ("cps") value is reached, a longer | |||
transmission interval SHALL be applied for text from all sources as | transmission interval SHALL be applied for text from all sources as | |||
specified in [RFC4103] and only as much of the text queued for | specified in [RFC4103] and only as much of the text queued for | |||
transmission SHALL be sent at the end of each transmission interval | transmission SHALL be sent at the end of each transmission interval | |||
as can be allowed without exceeding the "cps" value. Division of | as can be allowed without exceeding the "cps" value. Division of | |||
text for partial transmission MUST then be made at T140block borders. | text for partial transmission MUST then be made at T140block borders. | |||
When the transmission rate falls under the "cps" value again, the | When the transmission rate falls below the "cps" value again, the | |||
transmission intervals SHALL be returned to 330 ms and transmission | transmission intervals SHALL be reset to 330 ms and transmission of | |||
of new text SHALL return to be made as soon as new text is available. | new text SHALL again be made as soon as new text is available. | |||
NOTE: that extending the transmission intervals during high load | | NOTE: Extending the transmission intervals during periods of | |||
periods does not change the number of characters to be conveyed. It | | high load does not change the number of characters to be | |||
just evens out the load in time and reduces the number of packets per | | conveyed. It just evens out the load over time and reduces the | |||
second. With human created conversational text, the sending user | | number of packets per second. With human-created | |||
will eventually take a pause letting transmission catch up. | | conversational text, the sending user will eventually take a | |||
| | pause, letting transmission catch up. | |||
See also Section 8. | See also Section 8. | |||
For a transmitter not acting as a mixer, the transmission interval | For a transmitter not acting as a mixer, the transmission interval | |||
principles from [RFC4103] apply, and the normal transmission interval | principles provided in [RFC4103] apply, and the normal transmission | |||
SHALL be 300 ms. | interval SHALL be 300 ms. | |||
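The rate check in Section 3.4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name CpsLimiter, its method names, and the one-second bucket layout are hypothetical and not defined by the specification, and the default of 30 is the "cps" default from RFC 4103. It shows one way a mixer might decide whether new text can be sent at once or has to wait for a longer interval.

```python
from collections import deque


class CpsLimiter:
    """Hypothetical helper: counts new-text characters sent to one receiver
    in one-second buckets and averages them over the last 10 buckets."""

    def __init__(self, cps=30, window_seconds=10):
        self.cps = cps                      # negotiated "cps" value (assumed 30)
        self.window_seconds = window_seconds
        self.buckets = deque()              # (second_index, chars_sent)

    def _expire(self, now):
        # Drop buckets that fall outside the last `window_seconds` seconds.
        oldest = int(now) - self.window_seconds + 1
        while self.buckets and self.buckets[0][0] < oldest:
            self.buckets.popleft()

    def mean_rate(self, now):
        """Mean characters per second over the window ending at `now`."""
        self._expire(now)
        return sum(n for _, n in self.buckets) / self.window_seconds

    def record(self, now, chars):
        """Register `chars` new characters sent at time `now` (seconds)."""
        self._expire(now)
        sec = int(now)
        if self.buckets and self.buckets[-1][0] == sec:
            self.buckets[-1] = (sec, self.buckets[-1][1] + chars)
        else:
            self.buckets.append((sec, chars))

    def may_send_now(self, now, chars):
        """True if sending `chars` more characters keeps the mean within cps."""
        self._expire(now)
        total = sum(n for _, n in self.buckets) + chars
        return total / self.window_seconds <= self.cps
```

When may_send_now() returns False, a caller would hold the text back, split it at a T140block border, and retry at the end of the extended transmission interval.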
3.5. Only one source per packet | 3.5. Only One Source per Packet | |||
New text and redundant copies of earlier text from one source SHALL | New text and redundant copies of earlier text from one source SHALL | |||
be transmitted in the same packet if available for transmission at | be transmitted in the same packet if available for transmission at | |||
the same time. Text from different sources MUST NOT be transmitted | the same time. Text from different sources MUST NOT be transmitted | |||
in the same packet. | in the same packet. | |||
3.6. Do not send received text to the originating source | 3.6. Do Not Send Received Text to the Originating Source | |||
Text received by a mixer from a participant SHOULD NOT be included in | Text received by a mixer from a participant SHOULD NOT be included in | |||
transmission from the mixer to that participant, because the normal | transmissions from the mixer to that participant, because for text | |||
behavior of the endpoint is to present locally-produced text locally. | that is produced locally, the normal behavior of the endpoint is to | |||
| present such text directly when it is produced. | |||
3.7. Clean incoming text | 3.7. Clean Incoming Text | |||
A mixer SHALL handle reception, recovery from packet loss, deletion | A mixer SHALL handle reception, recovery from packet loss, deletion | |||
of superfluous redundancy, marking of possible text loss and deletion | of superfluous redundancy, marking of possible text loss, and | |||
of 'BOM' characters from each participant before queueing received | deletion of BOM characters from each participant before queueing | |||
text for transmission to receiving participants as specified in | received text for transmission to receiving participants as specified | |||
[RFC4103] for single-party sources and Section 3.16 for multiparty | in [RFC4103] for single-party sources and Section 3.16 for multiparty | |||
sources (chained mixers). | sources (chained mixers). | |||
3.8. Redundant transmission principles | 3.8. Principles of Redundant Transmission | |||
A transmitting party using redundancy SHALL send redundant | A transmitting party using redundancy SHALL send redundant | |||
repetitions of T140blocks already transmitted in earlier packets. | repetitions of T140blocks already transmitted in earlier packets. | |||
The number of redundant generations of T140blocks to include in | The number of redundant generations of T140blocks to include in | |||
transmitted packets SHALL be deduced from the SDP negotiation. It | transmitted packets SHALL be deduced from the SDP negotiation. It | |||
SHALL be set to the minimum of the number declared by the two parties | SHALL be set to the minimum of the number declared by the two parties | |||
negotiating a connection. It is RECOMMENDED to declare and transmit | negotiating a connection. It is RECOMMENDED to declare and transmit | |||
one original and two redundant generations of the T140blocks because | one original and two redundant generations of the T140blocks, because | |||
that provides good protection against text loss in case of packet | this provides good protection against text loss in the case of packet | |||
loss, and low overhead. | loss and also provides low overhead. | |||
3.9. Text placement in packets | 3.9. Text Placement in Packets | |||
The mixer SHALL compose and transmit an RTP packet to a receiver when | The mixer SHALL compose and transmit an RTP packet to a receiver when | |||
one or more of the following conditions have occurred: | one or more of the following conditions have occurred: | |||
* The transmission interval is the normal 330 ms and there is newly | * The transmission interval is the normal 330 ms (no matter whether | |||
| the transmission interval has passed or not), and there is newly | |||
received unsent text available for transmission to that receiver. | received unsent text available for transmission to that receiver. | |||
* The current transmission interval has passed and is longer than | * The current transmission interval has passed and is longer than | |||
the normal 330 ms and there is newly received unsent text | the normal 330 ms, and there is newly received unsent text | |||
available for transmission to that receiver. | available for transmission to that receiver. | |||
* The current transmission interval ( normally 330 ms) has passed | * The current transmission interval (normally 330 ms) has passed | |||
since already transmitted text was queued for transmission as | since already-transmitted text was queued for transmission as | |||
redundant text. | redundant text. | |||
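The three conditions above can be sketched as a single predicate. This is a hedged illustration, not the specified algorithm: the function name should_send, its parameters, and the use of integer milliseconds are assumptions made for the example.

```python
NORMAL_INTERVAL_MS = 330  # normal mixer transmission interval from Section 3.4


def should_send(now_ms, interval_ms, last_sent_ms, has_new_text,
                redundancy_queued_ms):
    """Return True if the mixer should compose a packet for this receiver.

    All parameter names are hypothetical. `redundancy_queued_ms` is the time
    at which already-transmitted text was queued as redundant text, or None
    if no redundancy is waiting.
    """
    # Condition 1: normal interval and new unsent text; send at once.
    if has_new_text and interval_ms == NORMAL_INTERVAL_MS:
        return True
    # Condition 2: extended interval has passed and new unsent text exists.
    if (has_new_text and interval_ms > NORMAL_INTERVAL_MS
            and now_ms - last_sent_ms >= interval_ms):
        return True
    # Condition 3: the current interval has passed since text was queued
    # for redundant transmission.
    if (redundancy_queued_ms is not None
            and now_ms - redundancy_queued_ms >= interval_ms):
        return True
    return False
```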
The principles from [RFC4103] apply for populating the header, the | The principles provided in [RFC4103] apply for populating the header, | |||
redundancy header and the data in the packet with specifics specified | the redundancy header, and the data in the packet with specific | |||
here and in the following sections. | information, as detailed here and in the following sections. | |||
At the time of transmission, the mixer SHALL populate the RTP packet | At the time of transmission, the mixer SHALL populate the RTP packet | |||
with all T140blocks queued for transmission originating from the | with all T140blocks queued for transmission originating from the | |||
source in turn for transmission as long as this is not in conflict | source selected for transmission as long as this is not in conflict | |||
with the allowed number of characters per second ("cps") or the | with the allowed number of characters per second ("cps") or the | |||
maximum packet size. In this way, the latency of the latest received | maximum packet size. In this way, the latency of the latest received | |||
text is kept low even in moments of simultaneous transmission from | text is kept low even in moments of simultaneous transmission from | |||
many sources. | many sources. | |||
Redundant text SHALL also be included, and the assessment of how much | Redundant text SHALL also be included, and the assessment of how much | |||
new text can be included within the maximum packet size MUST take | new text can be included within the maximum packet size MUST take | |||
into account that the redundancy has priority to be transmitted in | into account that the redundancy has priority to be transmitted in | |||
its entirety. See Section 3.4 | its entirety. See Section 3.4. | |||
The SSRC of the source SHALL be placed as the only member in the | The SSRC of the source SHALL be placed as the only member in the CSRC | |||
CSRC-list. | list. | |||
Note: The CSRC-list in an RTP packet only includes the participant | | Note: The CSRC list in an RTP packet only includes the | |||
whose text is included in text blocks. It is not the same as the | | participant whose text is included in text blocks. It is not | |||
total list of participants in a conference. With audio and video | | the same as the total list of participants in a conference. | |||
media, the CSRC-list would often contain all participants who are not | | With audio and video media, the CSRC list would often contain | |||
muted whereas text participants that don't type are completely silent | | all participants who are not muted, whereas text participants | |||
and thus are not represented in RTP packet CSRC-lists. | | that don't type are completely silent and thus are not | |||
| | represented in RTP packet CSRC lists. | |||
3.10. Empty T140blocks | 3.10. Empty T140blocks | |||
If no unsent T140blocks were available for a source at the time of | If no unsent T140blocks were available for a source at the time of | |||
populating a packet, but T140blocks are available which have not yet | populating a packet but already-transmitted T140blocks are available | |||
been sent the full intended number of redundant transmissions, then | that have not yet been sent the full intended number of redundant | |||
the primary T140block for that source is composed of an empty | transmissions, then the primary area in the packet is composed of an | |||
T140block, and populated (without taking up any length) in a packet | empty T140block and included (without taking up any length) in the | |||
for transmission. The corresponding SSRC SHALL be placed as usual in | packet for transmission. The corresponding SSRC SHALL be placed as | |||
its place in the CSRC-list. | usual in its place in the CSRC list. | |||
The first packet in the session, the first after a source switch, and | The first packet in the session, the first after a source switch, and | |||
the first after a pause SHALL be populated with the available | the first after a pause SHALL be populated with the available | |||
T140blocks for the source in turn to be sent as primary, and empty | T140blocks for the source selected to be sent as the primary, and | |||
T140blocks for the agreed number of redundancy generations. | empty T140blocks for the agreed-upon number of redundancy | |||
| generations. | |||
3.11. Creation of the redundancy | 3.11. Creation of the Redundancy | |||
The primary T140block from a source in the latest transmitted packet | The primary T140block from a source in the latest transmitted packet | |||
is saved for populating the first redundant T140block for that source | is saved for populating the first redundant T140block for that source | |||
in the next transmission of text from that source. The first | in the next transmission of text from that source. The first | |||
redundant T140block for that source from the latest transmission is | redundant T140block for that source from the latest transmission is | |||
saved for populating the second redundant T140block in the next | saved for populating the second redundant T140block in the next | |||
transmission of text from that source. | transmission of text from that source. | |||
Usually this is the level of redundancy used. If a higher level of | Usually, this is the level of redundancy used. If a higher level of | |||
redundancy is negotiated, then the procedure SHALL be maintained | redundancy is negotiated, then the procedure SHALL be continued until | |||
until all available redundant levels of T140blocks are placed in the | all available redundant levels of T140blocks are placed in the | |||
packet. If a receiver has negotiated a lower number of "text/red" | packet. If a receiver has negotiated a lower number of "text/red" | |||
generations, then that level SHALL be the maximum used by the | generations, then that level SHALL be the maximum used by the | |||
transmitter. | transmitter. | |||
The T140blocks saved for transmission as redundant data are assigned | The T140blocks saved for transmission as redundant data are assigned | |||
a planned transmission time 330 ms after the current time, but SHOULD | a planned transmission time of 330 ms after the current time but | |||
be transmitted earlier if new text for the same source gets in turn | SHOULD be transmitted earlier if new text for the same source gets | |||
for transmission before that time. | selected for transmission before that time. | |||
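The generation shift described in Section 3.11, together with the empty T140blocks of Section 3.10, might look like the following Python sketch. The class SourceRedundancy and its method build_payload are hypothetical names introduced for illustration; the block order (oldest redundant generation first, primary last) follows the reception order given in Section 3.16.3.

```python
class SourceRedundancy:
    """Hypothetical per-source store of T140blocks awaiting redundant sending."""

    def __init__(self, generations=2):
        # redundant[0] is the first (youngest) redundant generation.
        # Empty byte strings stand for empty T140blocks.
        self.redundant = [b""] * generations

    def build_payload(self, primary):
        """Return the blocks for the next packet from this source.

        `primary` is the new T140block, or b"" when only redundancy remains.
        """
        # Packet order: oldest redundant generation first, primary last.
        blocks = list(reversed(self.redundant)) + [primary]
        # Shift: primary becomes first redundant, first becomes second, ...
        self.redundant = ([primary] + self.redundant)[:len(self.redundant)]
        return blocks

    def pending(self):
        """True while some saved block still needs redundant transmission."""
        return any(self.redundant)
```

With two redundant generations, sending "a" and then "b" followed by two redundancy-only packets yields the payloads ["", "", "a"], ["", "a", "b"], ["a", "b", ""], and ["b", "", ""], after which pending() is False and transmission can pause.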
3.12. Timer offset fields | 3.12. Timer Offset Fields | |||
The timestamp offset values SHALL be inserted in the redundancy | The timestamp offset values SHALL be inserted in the redundancy | |||
header, with the time offset from the RTP timestamp in the packet | header, with the time offset from the RTP timestamp in the packet | |||
when the corresponding T140block was sent as primary. | when the corresponding T140block was sent as the primary. | |||
The timestamp offsets are expressed in the same clock tick units as | The timestamp offsets are expressed in the same clock tick units as | |||
the RTP timestamp. | the RTP timestamp. | |||
The timestamp offset values for empty T140blocks have no relevance | The timestamp offset values for empty T140blocks have no relevance | |||
but SHOULD be assigned realistic values. | but SHOULD be assigned realistic values. | |||
3.13. Other RTP header fields | 3.13. Other RTP Header Fields | |||
The number of members in the CSRC list (0 or 1) SHALL be placed in | The number of members in the CSRC list (0 or 1) SHALL be placed in | |||
the "CC" header field. Only mixers place value 1 in the "CC" field. | the CC header field. Only mixers place value 1 in the CC field. A | |||
A value of "0" indicates that the source is the transmitting device | value of 0 indicates that the source is the transmitting device | |||
itself and that the source is indicated by the SSRC field. This | itself and that the source is indicated by the SSRC field. This | |||
value is used by endpoints, and by mixers sending self-sourced data. | value is used by endpoints and also by mixers sending self-sourced | |||
| data. | |||
The current time SHALL be inserted in the timestamp. | The current time SHALL be inserted in the timestamp. | |||
The SSRC header field SHALL contain the SSRC of the RTP session where | The SSRC header field SHALL contain the SSRC of the RTP session where | |||
the packet will be transmitted. | the packet will be transmitted. | |||
The M-bit SHALL be handled as specified in [RFC4103]. | The M-bit SHALL be handled as specified in [RFC4103]. | |||
3.14. Pause in transmission | 3.14. Pause in Transmission | |||
When there is no new T140block to transmit, and no redundant | When there is no new T140block to transmit and no redundant T140block | |||
T140block that has not been retransmitted the intended number of | that has not been retransmitted the intended number of times from any | |||
times from any source, the transmission process SHALL be stopped | source, the transmission process SHALL be stopped until either new | |||
until either new T140blocks arrive, or a keep-alive method calls for | T140blocks arrive or a keep-alive method calls for transmission of | |||
transmission of keep-alive packets. | keep-alive packets. | |||
3.15. RTCP considerations | 3.15. RTCP Considerations | |||
A mixer SHALL send RTCP reports with SDES, CNAME, and NAME | A mixer SHALL send RTCP reports with SDES, CNAME, and NAME | |||
information about the sources in the multiparty call. This makes it | information about the sources in the multiparty call. This makes it | |||
possible for participants to compose a suitable label for text from | possible for participants to compose a suitable label for text from | |||
each source. | each source. | |||
Privacy considerations SHALL be taken when composing these fields. | Privacy considerations SHALL be taken when composing these fields. | |||
They contain name and address information that may be sensitive to | They contain name and address information that may be considered | |||
transmit in its entirety, e.g., to unauthenticated participants. | sensitive if the information is transmitted in its entirety, e.g., to | |||
| unauthenticated participants. | |||
3.16. Reception of multiparty contents | 3.16. Reception of Multiparty Contents | |||
The "text/red" receiver included in an endpoint with presentation | The "text/red" receiver included in an endpoint with presentation | |||
functions will receive RTP packets in the single stream from the | functions will receive RTP packets in the single stream from the | |||
mixer, and SHALL distribute the T140blocks for presentation in | mixer and SHALL distribute the T140blocks for presentation in | |||
presentation areas for each source. Other receiver roles, such as | presentation areas for each source. Other receiver roles, such as | |||
gateways or chained mixers, are also feasible. They require | gateways or chained mixers, are also feasible. Whether the stream | |||
considerations if the stream shall just be forwarded, or distributed | will only be forwarded or will be distributed based on the different | |||
based on the different sources. | sources must be taken into consideration. | |||
3.16.1. Acting on the source of the packet contents | 3.16.1. Acting on the Source of the Packet Contents | |||
If the "CC" field value of a received packet is 1, it indicates that | If the CC field value of a received packet is 1, it indicates that | |||
the text is conveyed from a source indicated in the single member in | the text is conveyed from a source indicated in the single member in | |||
the CSRC-list, and the receiver MUST act on the source according to | the CSRC list, and the receiver MUST act on the source according to | |||
its role. If the CC value is 0, the source is indicated in the SSRC | its role. If the CC value is 0, the source is indicated in the SSRC | |||
field. | field. | |||
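The source selection in Section 3.16.1 can be illustrated by reading the CC field and the SSRC or CSRC from the fixed RTP header (layout per RFC 3550). The function name text_source is an assumption for this sketch, and header extensions, padding, and payload parsing are deliberately left out.

```python
import struct


def text_source(packet: bytes) -> int:
    """Return the identifier of the text source for a received RTP packet.

    Hypothetical helper: CC == 1 means the single CSRC entry names the
    source; CC == 0 means the SSRC field names it.
    """
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    cc = packet[0] & 0x0F                      # low 4 bits of the first octet
    ssrc = struct.unpack_from("!I", packet, 8)[0]
    if cc == 0:
        return ssrc                            # endpoint or self-sourced mixer data
    if cc != 1 or len(packet) < 16:
        raise ValueError("expected exactly one CSRC entry")
    return struct.unpack_from("!I", packet, 12)[0]  # first CSRC entry
```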
3.16.2. Detection and indication of possible text loss | 3.16.2. Detection and Indication of Possible Text Loss | |||
The receiver SHALL monitor the RTP sequence numbers of the received | The receiver SHALL monitor the RTP sequence numbers of the received | |||
packets for gaps and packets out of order. If a sequence number gap | packets for gaps and for packets received out of order. If a | |||
appears and still exists after some defined short time for jitter and | sequence number gap appears and still exists after some defined short | |||
reordering resolution, the packets in the gap SHALL be regarded as | time for jitter and reordering resolution, the packets in the gap | |||
lost. | SHALL be regarded as lost. | |||
If it is known that only one source is active in the RTP session, | If it is known that only one source is active in the RTP session, | |||
then it is likely that a gap equal to or larger than the agreed | then it is likely that a gap equal to or larger than the agreed-upon | |||
number of redundancy generations (including the primary) causes text | number of redundancy generations (including the primary) causes text | |||
loss. In that case, the receiver SHALL create a t140block with a | loss. In that case, the receiver SHALL create a T140block with a | |||
marker for possible text loss [T140ad1] and associate it with the | marker for possible text loss [T140ad1], associate it with the | |||
source and insert it in the reception buffer for that source. | source, and insert it in the reception buffer for that source. | |||
If it is known that more than one source is active in the RTP | If it is known that more than one source is active in the RTP | |||
session, then it is not possible in general to evaluate if text was | session, then it is not possible in general to evaluate if text was | |||
lost when packets were lost. With two active sources and the | lost when packets were lost. With two active sources and the | |||
recommended number of redundancy generations (3), it can take a gap | recommended number of redundancy generations (one original and two | |||
of five consecutive lost packets until any text may be lost, but text | redundant), it can take a gap of five consecutive lost packets before | |||
loss can also appear if three non-consecutive packets are lost when | any text may be lost, but text loss can also appear if three non- | |||
they contained consecutive data from the same source. A simple | consecutive packets are lost when they contained consecutive data | |||
method to decide when there is risk for resulting text loss is to | from the same source. A simple method for deciding when there is a | |||
evaluate if three or more packets were lost within one second. If | risk of resulting text loss is to evaluate if three or more packets | |||
this simple method is used, then a t140block SHOULD be created with a | were lost within one second. If this simple method is used, then a | |||
marker for possible text loss [T140ad1] and associated with the SSRC | T140block SHOULD be created with a marker for possible text loss | |||
of the RTP session as a general input from the mixer. | [T140ad1] and associated with the SSRC of the RTP session as a | |||
| general input from the mixer. | |||
Implementations MAY apply more refined methods for more reliable | Implementations MAY apply more refined methods for more reliable | |||
detection of whether text was lost or not. Any refined method SHOULD | detection of whether text was lost or not. Any refined method SHOULD | |||
prefer marking possible loss rather than not marking when it is | prefer marking possible loss rather than not marking when it is | |||
uncertain if there was loss. | uncertain if there was loss. | |||
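The simple method in Section 3.16.2 (three or more packets lost within one second) could be sketched as below. The class SimpleLossMonitor and its method report_lost are hypothetical; the sketch only decides when a marker for possible text loss is warranted and leaves creation of the marker T140block to the caller.

```python
from collections import deque


class SimpleLossMonitor:
    """Hypothetical tracker for packets declared lost in a multi-source session."""

    def __init__(self, threshold=3, window=1.0):
        self.threshold = threshold   # packets lost ...
        self.window = window         # ... within this many seconds
        self.loss_times = deque()

    def report_lost(self, now, count=1):
        """Record `count` packets declared lost at time `now` (seconds).

        Returns True when a marker for possible text loss should be inserted.
        """
        self.loss_times.extend([now] * count)
        # Forget losses older than the evaluation window.
        while self.loss_times and now - self.loss_times[0] > self.window:
            self.loss_times.popleft()
        return len(self.loss_times) >= self.threshold
```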
3.16.3. Extracting text and handling recovery | 3.16.3. Extracting Text and Handling Recovery | |||
When applying the following procedures, the effects MUST be | When applying the following procedures, the effects of possible | |||
considered of possible timestamp wrap around and the RTP session | timestamp wraparound and the RTP session possibly changing the SSRC | |||
possibly changing SSRC. | MUST be considered. | |||
When a packet is received in an RTP session using the packetization | When a packet is received in an RTP session using the packetization | |||
for multiparty-aware endpoints, its T140blocks SHALL be extracted in | for multiparty-aware endpoints, its T140blocks SHALL be extracted as | |||
the following way. | described below. | |||
The source SHALL be extracted from the CSRC-list if available, | The source SHALL be extracted from the CSRC list if available, and | |||
otherwise from the SSRC. | otherwise from the SSRC. | |||
If the received packet is the first packet received from the source, | If the received packet is the first packet received from the source, | |||
then all T140blocks in the packet SHALL be retrieved and assigned to | then all T140blocks in the packet SHALL be retrieved and assigned to | |||
a receive buffer for the source beginning with the oldest available | a receive buffer for that source, beginning with the oldest available | |||
redundant generation, continuing with the younger redundant | redundant generation, continuing with the younger redundant | |||
generations in age order and finally the primary. | generations in age order, and finally ending with the primary. | |||
Note: The normal case is that in the first packet, only the primary | | Note: The normal case is that in the first packet, only the | |||
data has contents. The redundant data has contents in the first | | primary data has contents. The redundant data has contents in | |||
received packet from a source only after initial packet loss. | | the first received packet from a source only after initial | |||
| | packet loss. | |||
If the packet is not the first packet from a source, then if | If the packet is not the first packet from a source, then if | |||
redundant data is available, the process SHALL start with the oldest | redundant data is available, the process SHALL start with the oldest | |||
generation. The timestamp of that redundant data SHALL be created by | generation. The timestamp of that redundant data SHALL be created by | |||
subtracting its timestamp offset from the RTP timestamp. If the | subtracting its timestamp offset from the RTP timestamp. If the | |||
resulting timestamp is later than the latest retrieved data from the | resulting timestamp is later than the latest retrieved data from the | |||
same source, then the redundant data SHALL be retrieved and appended | same source, then the redundant data SHALL be retrieved and appended | |||
to the receive buffer. The process SHALL be continued in the same | to the receive buffer. The process SHALL be continued in the same | |||
way for all younger generations of redundant data. After that, the | way for all younger generations of redundant data. After that, the | |||
timestamp of the packet SHALL be compared with the timestamp of the | timestamp of the packet SHALL be compared with the timestamp of the | |||
latest retrieved data from the same source and if it is later, then | latest retrieved data from the same source and if it is later, then | |||
the primary data SHALL be retrieved from the packet and appended to | the primary data SHALL be retrieved from the packet and appended to | |||
the receive buffer for the source. | the receive buffer for the source. | |||
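The extraction and recovery procedure in Section 3.16.3 is sketched below for a single source. The names is_later and SourceReceiver are assumptions, the packet is taken as already parsed into a timestamp, a list of (timestamp offset, T140block) pairs ordered oldest first, and a primary block, and the comparison treats RTP timestamps as 32-bit values to allow for wraparound.

```python
def is_later(a: int, b: int) -> bool:
    """True if 32-bit RTP timestamp `a` is later than `b`, allowing wraparound."""
    return a != b and ((a - b) & 0xFFFFFFFF) < 0x80000000


class SourceReceiver:
    """Hypothetical receive buffer for one source (one CSRC or SSRC value)."""

    def __init__(self):
        self.buffer = bytearray()
        self.latest = None  # timestamp of the latest retrieved data

    def on_packet(self, rtp_timestamp, redundant, primary):
        """`redundant` is a list of (timestamp_offset, t140block), oldest first."""
        if self.latest is None:
            # First packet from this source: take every block, oldest first.
            for _, data in redundant:
                self.buffer += data
            self.buffer += primary
            self.latest = rtp_timestamp
            return
        # Later packets: take only blocks newer than what was already retrieved.
        blocks = [((rtp_timestamp - off) & 0xFFFFFFFF, data)
                  for off, data in redundant]
        blocks.append((rtp_timestamp, primary))
        for ts, data in blocks:
            if is_later(ts, self.latest):
                self.buffer += data
                self.latest = ts
```

If the packet carrying "ll" at timestamp 1300 is lost, the next packet at 1600 still delivers it as first-generation redundancy, and a duplicate of that packet adds nothing.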
3.16.4. Delete 'BOM' | 3.16.4. Delete BOM | |||
Unicode character 'BOM' is used as a start indication and sometimes | The Unicode BOM character is used as a start indication and is | |||
used as a filler or keep alive by transmission implementations. | sometimes used as a filler or keep-alive by transmission | |||
These SHALL be deleted after extraction from received packets. | implementations. Any BOM characters SHALL be deleted after | |||
| extraction from received packets. | |||
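Deleting BOM characters after extraction, as required in Section 3.16.4, can be done directly on the UTF-8 bytes of a T140block. The function name strip_bom is hypothetical, and the sketch assumes each T140block holds only complete UTF-8 characters.

```python
BOM_UTF8 = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def strip_bom(t140block: bytes) -> bytes:
    """Remove every BOM character from an extracted T140block."""
    # In well-formed UTF-8 this byte sequence can only encode U+FEFF,
    # so a plain byte replacement does not damage other characters.
    return t140block.replace(BOM_UTF8, b"")
```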
3.17. Performance considerations | 3.17. Performance Considerations | |||
This solution has good performance with low text delays, as long as | This solution has good performance with low text delays, as long as | |||
the mean number of characters per second sent during any 10-second | the mean number of characters per second sent during any 10-second | |||
interval from a number of simultaneously sending participants to a | interval from a number of simultaneously sending participants to a | |||
receiving participant, does not reach the "cps" value. At higher | receiving participant does not reach the "cps" value. At higher | |||
numbers of sent characters per second, a jerkiness is visible in the | numbers of sent characters per second, a jerkiness is visible in the | |||
presentation of text. The solution is therefore suitable for | presentation of text. The solution is therefore suitable for | |||
emergency service use, relay service use, and small or well-managed | emergency service use, relay service use, and small or well-managed | |||
larger multimedia conferences. Only in large unmanaged conferences | larger multimedia conferences. In large unmanaged conferences with a | |||
with a high number of participants there may on very rare occasions | high number of participants only, on very rare occasions, situations | |||
appear situations when many participants happen to send text | might arise where many participants happen to send text | |||
simultaneously. In such circumstances, the result may be | simultaneously. In such circumstances, the result may be | |||
unpleasantly jerky presentation of text from each sending | unpleasantly jerky presentation of text from each sending | |||
participant. It should be noted that it is only the number of users | participant. It should be noted that it is only the number of users | |||
sending text within the same moment that causes jerkiness, not the | sending text within the same moment that causes jerkiness, not the | |||
total number of users with RTT capability. | total number of users with real-time text capability. | |||
3.18. Security for session control and media | 3.18. Security for Session Control and Media | |||
Security mechanisms to provide confidentiality and integrity | Security mechanisms to provide confidentiality, integrity protection, | |||
protection and peer authentication SHOULD be applied when possible | and peer authentication SHOULD be applied when possible regarding the | |||
regarding the capabilities of the participating devices by use of SIP | capabilities of the participating devices by using the Session | |||
over TLS by default according to [RFC5630] section 3.1.3 on the | Initiation Protocol (SIP) over TLS by default according to | |||
session control level and by default using DTLS-SRTP [RFC5764] on the | Section 3.1.3 of [RFC5630] on the session control level and by | |||
media level. In applications where legacy endpoints without security | default using DTLS-SRTP [RFC5764] at the media level. In | |||
are allowed, a negotiation SHOULD be performed to decide if | applications where legacy endpoints without security are allowed, a | |||
encryption on the media level will be applied. If no other security | negotiation SHOULD be performed to decide if encryption at the media | |||
solution is mandated for the application, then OSRTP [RFC8643] is a | level will be applied. If no other security solution is mandated for | |||
suitable method to be applied to negotiate SRTP media security with | the application, then the Opportunistic Secure Real-time Transport | |||
DTLS. Most SDP examples below are for simplicity expressed without | Protocol (OSRTP) [RFC8643] is a suitable method to be applied to | |||
the security additions. The principles (but not all details) for | negotiate SRTP media security with DTLS. For simplicity, most SDP | |||
applying DTLS-SRTP [RFC5764] security are shown in a couple of the | examples below are expressed without the security additions. The | |||
following examples. | principles (but not all details) for applying DTLS-SRTP security | |||
[RFC5764] are shown in a couple of the following examples. | ||||
Further general security considerations are covered in Section 10. | Further general security considerations are covered in Section 10. | |||
End-to-end encryption would require further work and could be based | End-to-end encryption would require further work and could be based | |||
on WebRTC as specified in Section 1.2 or on double encryption as | on WebRTC as specified in Section 1.2 or on double encryption as | |||
specified in [RFC8723]. | specified in [RFC8723]. | |||
3.19. SDP offer/answer examples | 3.19. SDP Offer/Answer Examples | |||
This section shows some examples of SDP for session negotiation of | This section shows some examples of SDP for session negotiation of | |||
the real-time text media in SIP sessions. Audio is usually provided | the real-time text media in SIP sessions. Audio is usually provided | |||
in the same session, and sometimes also video. The examples only | in the same session, and sometimes also video. The examples only | |||
show the part of importance for the real-time text media. The | show the part of importance for the real-time text media. The | |||
examples relate to the single RTP stream mixing for multiparty-aware | examples relate to the single RTP stream mixing for multiparty-aware | |||
endpoints and for multiparty-unaware endpoints. | endpoints and for multiparty-unaware endpoints. | |||
Note: Multiparty RTT MAY also be provided through other methods, | | Note: Multiparty real-time text MAY also be provided through | |||
e.g., by a Selective Forwarding Middlebox (SFM). In that case, the | | other methods, e.g., by a Selective Forwarding Middlebox (SFM). | |||
SDP of the offer will include something specific for that method, | | In that case, the SDP of the offer will include something | |||
e.g., an SDP attribute or another media format. An answer selecting | | specific for that method, e.g., an SDP attribute or another | |||
the use of that method would accept it by a corresponding | | media format. An answer selecting the use of that method would | |||
acknowledgement included in the SDP. The offer may contain also the | | accept it via a corresponding acknowledgement included in the | |||
"rtt-mixer" SDP media attribute for the main RTT media when the | | SDP. The offer may also contain the "rtt-mixer" SDP media | |||
offeror has capability for both multiparty methods, while an answer, | | attribute for the main real-time text media when the offerer | |||
selecting to use SFM will not include the "rtt-mixer" SDP media | | has this capability for both multiparty methods, while an | |||
attribute. | | answer, choosing to use SFM, will not include the "rtt-mixer" | |||
| SDP media attribute. | ||||
Offer example for "text/red" format and multiparty support: | Offer example for the "text/red" format, multiparty support, and | |||
capability for 90 characters per second: | ||||
m=text 11000 RTP/AVP 100 98 | m=text 11000 RTP/AVP 100 98 | |||
a=rtpmap:98 t140/1000 | a=rtpmap:98 t140/1000 | |||
a=rtpmap:100 red/1000 | a=fmtp:98 cps=90 | |||
a=fmtp:100 98/98/98 | a=rtpmap:100 red/1000 | |||
a=rtt-mixer | a=fmtp:100 98/98/98 | |||
a=rtt-mixer | ||||
Answer example from a multiparty-aware device | Answer example from a multiparty-aware device: | |||
m=text 14000 RTP/AVP 100 98 | ||||
a=rtpmap:98 t140/1000 | ||||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
a=rtt-mixer | ||||
Offer example for "text/red" format including multiparty | m=text 14000 RTP/AVP 100 98 | |||
and security: | a=rtpmap:98 t140/1000 | |||
a=fingerprint: (fingerprint1) | a=fmtp:98 cps=90 | |||
m=text 11000 RTP/AVP 100 98 | a=rtpmap:100 red/1000 | |||
a=rtpmap:98 t140/1000 | a=fmtp:100 98/98/98 | |||
a=rtpmap:100 red/1000 | a=rtt-mixer | |||
a=fmtp:100 98/98/98 | ||||
a=rtt-mixer | Offer example for the "text/red" format, including multiparty and | |||
security: | ||||
a=fingerprint: (fingerprint1) | ||||
m=text 11000 RTP/AVP 100 98 | ||||
a=rtpmap:98 t140/1000 | ||||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
a=rtt-mixer | ||||
The "fingerprint" is sufficient to offer DTLS-SRTP, with the media | The "fingerprint" is sufficient to offer DTLS-SRTP, with the media | |||
line still indicating RTP/AVP. | line still indicating RTP/AVP. | |||
Note: For brevity, the entire value of the SDP fingerprint attribute | | Note: For brevity, the entire value of the SDP "fingerprint" | |||
is not shown in this and the following example. | | attribute is not shown in this and the following example. | |||
Answer example from a multiparty-aware device with security | Answer example from a multiparty-aware device with security: | |||
a=fingerprint: (fingerprint2) | ||||
m=text 16000 RTP/AVP 100 98 | ||||
a=rtpmap:98 t140/1000 | ||||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
a=rtt-mixer | ||||
With the "fingerprint" the device acknowledges use of SRTP/DTLS. | a=fingerprint: (fingerprint2) | |||
m=text 16000 RTP/AVP 100 98 | ||||
a=rtpmap:98 t140/1000 | ||||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
a=rtt-mixer | ||||
Answer example from a multiparty-unaware device that also | With the "fingerprint", the device acknowledges the use of DTLS-SRTP. | |||
does not support security: | ||||
m=text 12000 RTP/AVP 100 98 | Answer example from a multiparty-unaware device that also does not | |||
a=rtpmap:98 t140/1000 | support security: | |||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
3.20. Packet sequence example from interleaved transmission | m=text 12000 RTP/AVP 100 98 | |||
a=rtpmap:98 t140/1000 | ||||
a=rtpmap:100 red/1000 | ||||
a=fmtp:100 98/98/98 | ||||
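The difference between the multiparty-aware and multiparty-unaware answers above is the presence of the "rtt-mixer" attribute in the "text" media section. The following is a minimal Python sketch of that check, not a complete SDP parser. The function names `text_media_attributes` and `is_multiparty_aware` are hypothetical and chosen for illustration only.

```python
def text_media_attributes(sdp: str) -> list[str]:
    """Return the a= attribute values found in the first m=text section."""
    attrs = []
    in_text = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            if in_text:
                break  # the first text section has ended
            in_text = line.startswith("m=text ")
        elif in_text and line.startswith("a="):
            attrs.append(line[2:])
    return attrs


def is_multiparty_aware(sdp: str) -> bool:
    """True if the text media section carries the rtt-mixer attribute."""
    return "rtt-mixer" in text_media_attributes(sdp)


AWARE_ANSWER = """\
m=text 14000 RTP/AVP 100 98
a=rtpmap:98 t140/1000
a=fmtp:98 cps=90
a=rtpmap:100 red/1000
a=fmtp:100 98/98/98
a=rtt-mixer
"""

UNAWARE_ANSWER = """\
m=text 12000 RTP/AVP 100 98
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=fmtp:100 98/98/98
"""
```

A mixer could use such a check on each answer to decide between the multiparty-aware format and the best-effort mixing for multiparty-unaware endpoints.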
This example shows a symbolic flow of packets from a mixer including | 3.20. Packet Sequence Example from Interleaved Transmission | |||
This example shows a symbolic flow of packets from a mixer, including | ||||
loss and recovery. The sequence includes interleaved transmission of | loss and recovery. The sequence includes interleaved transmission of | |||
text from two RTT sources A and B. P indicates primary data. R1 is | text from two real-time text sources: A and B. P indicates primary | |||
first redundant generation data and R2 is the second redundant | data. R1 is the first redundant generation of data, and R2 is the | |||
generation data. A1, B1, A2 etc. are text chunks (T140blocks) | second redundant generation of data. A1, B1, A2, etc. are text | |||
received from the respective sources and sent on to the receiver by | chunks (T140blocks) received from the respective sources and sent on | |||
the mixer. X indicates a dropped packet between the mixer and a | to the receiver by the mixer. X indicates a dropped packet between | |||
receiver. The session is assumed to use original and two redundant | the mixer and a receiver. The session is assumed to use the original | |||
generations of RTT. | and two redundant generations of real-time text. | |||
|-----------------------| | |-----------------------| | |||
|Seq no 101, Time=20400 | | |Seq no 101, Time=20400 | | |||
|CC=1 | | |CC=1 | | |||
|CSRC list A | | |CSRC list A | | |||
|R2: A1, Offset=600 | | |R2: A1, Offset=600 | | |||
|R1: A2, Offset=300 | | |R1: A2, Offset=300 | | |||
|P: A3 | | |P: A3 | | |||
|-----------------------| | |-----------------------| | |||
Assuming that earlier packets (with text A1 and A2) were received in | Assuming that earlier packets (with text A1 and A2) were received in | |||
sequence, text A3 is received from packet 101 and assigned to | sequence, text A3 is received from packet 101 and assigned to | |||
reception buffer A. The mixer is now assumed to have received | reception buffer A. The mixer is now assumed to have received | |||
initial text from source B 100 ms after packet 101 and will send that | initial text from source B 100 ms after packet 101 and will send that | |||
text. Transmission of A2 and A3 as redundancy is planned for 330 ms | text. Transmission of A2 and A3 as redundancy is planned for 330 ms | |||
after packet 101 if no new text from A is ready to be sent before | after packet 101 if no new text from A is ready to be sent before | |||
that. | that. | |||
|-----------------------| | |-----------------------| | |||
|Seq no 102, Time=20500 | | |Seq no 102, Time=20500 | | |||
|CC=1 | | |CC=1 | | |||
|CSRC list B | | |CSRC list B | | |||
|R2: Empty, Offset=600 | | |R2: Empty, Offset=600 | |||
|R1: Empty, Offset=300 | | |R1: Empty, Offset=300 | | |||
|P: B1 | | |P: B1 | | |||
|-----------------------| | |-----------------------| | |||
Packet 102 is received. | ||||
B1 is retrieved from this packet. Redundant transmission of | ||||
B1 is planned 330 ms after packet 102. | ||||
X------------------------| | Packet 102 is received. | |||
X Seq no 103, Time=20730 | | ||||
X CC=1 | | ||||
X CSRC list A | | ||||
X R2: A2, Offset=630 | | ||||
X R1: A3, Offset=330 | | ||||
X P: Empty | | ||||
X------------------------| | ||||
Packet 103 is assumed to be lost due to network problems. | ||||
It contains redundancy for A. Sending A3 as second level | ||||
redundancy is planned for 330 ms after packet 103. | ||||
X------------------------| | B1 is retrieved from this packet. Redundant transmission of B1 is | |||
X Seq no 104, Time=20800 | | planned 330 ms after packet 102. | |||
X CC=1 | | ||||
X CSRC list B | | ||||
X R2: Empty, Offset=600 | | ||||
X R1: B1, Offset=300 | | ||||
X P: B2 | | ||||
X------------------------| | ||||
Packet 104 contains text from B, including new B2 and | ||||
redundant B1. It is assumed dropped due to network | ||||
problems. | ||||
The mixer has A3 redundancy to send, but no new text | ||||
appears from A and therefore the redundancy is sent | ||||
330 ms after the previous packet with text from A. | ||||
|------------------------| | X------------------------| | |||
| Seq no 105, Time=21060 | | X Seq no 103, Time=20730 | |||
| CC=1 | | X CC=1 | | |||
| CSRC list A | | X CSRC list A | | |||
| R2: A3, Offset=660 | | X R2: A2, Offset=630 | | |||
| R1: Empty, Offset=330 | | X R1: A3, Offset=330 | | |||
| P: Empty | | X P: Empty | | |||
|------------------------| | X------------------------| | |||
Packet 105 is received. | ||||
A gap for lost packets 103 and 104 is detected. | ||||
Assume that no other loss was detected during the last second. | ||||
Then it can be concluded that nothing was totally lost. | ||||
R2 is checked. Its original time was 21060-660=20400. | Packet 103 is assumed to be lost due to network problems. | |||
A packet with text from A was received with that | ||||
timestamp, so nothing needs to be recovered. | ||||
B1 and B2 still need to be transmitted as redundancy. | It contains redundancy for A. Sending A3 as second-level | |||
This is planned 330 ms after packet 104. That | redundancy is planned for 330 ms after packet 103. | |||
would be at 21130. | ||||
|-----------------------| | X------------------------| | |||
|Seq no 106, Time=21130 | | X Seq no 104, Time=20800 | |||
|CC=1 | | X CC=1 | | |||
|CSRC list B | | X CSRC list B | | |||
| R2: B1, Offset=630 | | X R2: Empty, Offset=600 | | |||
| R1: B2, Offset=330 | | X R1: B1, Offset=300 | | |||
| P: Empty | | X P: B2 | | |||
|-----------------------| | X------------------------| | |||
Packet 106 is received. | Packet 104 contains text from B, including new B2 and redundant | |||
B1. It is assumed dropped due to network problems. | ||||
The second level redundancy in packet 106 is B1 and has timestamp | The mixer has A3 redundancy to send, but no new text appears from | |||
offset 630 ms. The timestamp of packet 106 minus 630 is 20500 which | A, and therefore the redundancy is sent 330 ms after the previous | |||
is the timestamp of packet 102 that was received. So B1 does not | packet with text from A. | |||
need to be retrieved. The first level redundancy in packet 106 has | ||||
offset 330. The timestamp of packet 106 minus 330 is 20800. That is | ||||
later than the latest received packet with source B. Therefore B2 is | ||||
retrieved and assigned to the input buffer for source B. No primary | ||||
is available in packet 106. | ||||
After this sequence, A3 and B1 and B2 have been received. In this | |------------------------| | |||
case no text was lost. | | Seq no 105, Time=21060 | | |||
| CC=1 | | ||||
| CSRC list A | | ||||
| R2: A3, Offset=660 | | ||||
| R1: Empty, Offset=330 | | ||||
| P: Empty | | ||||
|------------------------| | ||||
3.21. Maximum character rate "cps" | Packet 105 is received. | |||
The default maximum rate of reception of "text/t140" real-time text | A gap for lost packets 103 and 104 is detected. Assume that no | |||
is in [RFC4103] specified to be 30 characters per second. The actual | other loss was detected during the last second. It can then be | |||
concluded that nothing was totally lost. | ||||
R2 is checked. Its original time was 21060-660=20400. A packet | ||||
with text from A was received with that timestamp, so nothing | ||||
needs to be recovered. | ||||
B1 and B2 still need to be transmitted as redundancy. This is | ||||
planned 330 ms after packet 104. That would be at 21130. | ||||
|-----------------------| | ||||
|Seq no 106, Time=21130 | | ||||
|CC=1 | | ||||
|CSRC list B | | ||||
| R2: B1, Offset=630 | | ||||
| R1: B2, Offset=330 | | ||||
| P: Empty | | ||||
|-----------------------| | ||||
Packet 106 is received. | ||||
The second-level redundancy in packet 106 is B1 and has a | ||||
timestamp offset of 630 ms. The timestamp of packet 106 minus 630 | ||||
is 20500, which is the timestamp of packet 102 that was received. | ||||
So, B1 does not need to be retrieved. The first-level redundancy | ||||
in packet 106 has an offset of 330. The timestamp of packet 106 | ||||
minus 330 is 20800. That is later than the latest received packet | ||||
with source B. Therefore, B2 is retrieved and assigned to the | ||||
input buffer for source B. No primary is available in packet 106. | ||||
After this sequence, A3, B1, and B2 have been received. In this | ||||
case, no text was lost. | ||||
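The recovery reasoning in this packet sequence can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a definitive receiver implementation: the `Packet` class and `receive` function are hypothetical, empty blocks are modelled as empty strings, and the receiver is assumed to have already received A1 and A2, with A2 at time 20100 (20400 minus the offset of 300). For each redundant generation, the original time is the packet time minus the offset, and the block is recovered only if that time is later than the latest packet already received from the same source.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    time: int                          # RTP timestamp in ms
    source: str                        # single CSRC entry
    redundant: list[tuple[str, int]]   # (text, offset), oldest generation first
    primary: str                       # "" means Empty


def receive(packets, last_time=None):
    """Return per-source lists of newly obtained text blocks."""
    last_time = dict(last_time or {})
    buffers = {}
    for p in packets:
        last = last_time.get(p.source, float("-inf"))
        out = buffers.setdefault(p.source, [])
        for text, offset in p.redundant:
            original = p.time - offset
            # Recover only text first sent after the latest received packet.
            if text and original > last:
                out.append(text)
                last = original
        if p.primary:
            out.append(p.primary)
        last_time[p.source] = p.time
    return buffers


# Packets 103 and 104 are lost, so only these four reach the receiver.
received = [
    Packet(101, 20400, "A", [("A1", 600), ("A2", 300)], "A3"),
    Packet(102, 20500, "B", [("", 600), ("", 300)], "B1"),
    Packet(105, 21060, "A", [("A3", 660), ("", 330)], ""),
    Packet(106, 21130, "B", [("B1", 630), ("B2", 330)], ""),
]
buffers = receive(received, {"A": 20100})
```

With these inputs, A3 and B1 are taken from primary data and B2 is recovered from the first-level redundancy in packet 106, matching the conclusion that no text was lost.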
3.21. Maximum Character Rate "cps" Setting | ||||
The default maximum rate of reception of "text/t140" real-time text, | ||||
as specified in [RFC4103], is 30 characters per second. The actual | ||||
rate is calculated without regard to any redundant text transmission | rate is calculated without regard to any redundant text transmission | |||
and is in the multiparty case evaluated for all sources contributing | and is, in the multiparty case, evaluated for all sources | |||
to transmission to a receiver. The value MAY be modified in the | contributing to transmission to a receiver. The value MAY be | |||
"cps" parameter of the FMTP attribute in the media section for the | modified in the "cps" parameter of the "fmtp" attribute for the | |||
"text/t140" media. A mixer combining real-time text from a number of | "text/t140" format of the "text" media section. | |||
sources may occasionally have a higher combined flow of text coming | ||||
from the sources. Endpoints SHOULD therefore specify a suitable | ||||
higher value for the "cps" parameter, corresponding to their real | ||||
reception capability. A value for "cps" of 90 SHALL be the default | ||||
for the "text/t140" stream in the "text/red" format when multiparty | ||||
real-time text is negotiated. See [RFC4103] for the format and use | ||||
of the "cps" parameter. The same rules apply for the multiparty case | ||||
except for the default value. | ||||
4. Presentation level considerations | A mixer combining real-time text from a number of sources may | |||
occasionally have a higher combined flow of text coming from the | ||||
sources. Endpoints SHOULD therefore include a suitable higher value | ||||
for the "cps" parameter, corresponding to their real reception | ||||
capability. The default "cps" value 30 can be assumed to be | ||||
sufficient for small meetings and well-managed larger conferences | ||||
with users only making manual text entry. A "cps" value of 90 can be | ||||
assumed to be sufficient even for large unmanaged conferences and for | ||||
cases when speech-to-text technologies are used for text entry. This | ||||
is also a reachable performance for receivers in modern technologies, | ||||
and 90 is therefore the RECOMMENDED "cps" value. See [RFC4103] for | ||||
the format and use of the "cps" parameter. The same rules apply for | ||||
the multiparty case. | ||||
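As an illustration of how a sender might determine the receiver's maximum character rate, the following Python sketch reads the "cps" parameter from the "fmtp" attribute of the "text/t140" payload type and falls back to the RFC 4103 default of 30 when the parameter is absent. The function name `cps_for_t140` and the attribute-list input are assumptions made for this example. The semicolon-separated parameter handling is a simplification.

```python
import re

DEFAULT_CPS = 30  # default from RFC 4103 when no cps parameter is given


def cps_for_t140(attrs: list[str], payload_type: int) -> int:
    """Return the cps value for the given text/t140 payload type."""
    for attr in attrs:
        match = re.fullmatch(rf"fmtp:{payload_type}\s+(.*)", attr)
        if match:
            for param in match.group(1).split(";"):
                name, _, value = param.strip().partition("=")
                if name == "cps" and value.isdigit():
                    return int(value)
    return DEFAULT_CPS


with_cps = ["rtpmap:98 t140/1000", "fmtp:98 cps=90",
            "rtpmap:100 red/1000", "fmtp:100 98/98/98", "rtt-mixer"]
without_cps = ["rtpmap:98 t140/1000",
               "rtpmap:100 red/1000", "fmtp:100 98/98/98"]
```

The "fmtp" line for the "text/red" payload type (100 here) lists redundancy generations rather than a "cps" value, so the lookup is keyed on the "text/t140" payload type only.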
4. Presentation-Level Considerations | ||||
"Protocol for multimedia application text conversation" [T140] | "Protocol for multimedia application text conversation" [T140] | |||
provides the presentation level requirements for the [RFC4103] | provides the presentation-level requirements for RTP transport as | |||
transport. Functions for erasure and other formatting functions are | described in [RFC4103]. Functions for erasure and other formatting | |||
specified in [T140] which has the following general statement for the | functions are specified in [T140], which has the following general | |||
presentation: | statement for the presentation: | |||
"The display of text from the members of the conversation should be | | The display of text from the members of the conversation should be | |||
arranged so that the text from each participant is clearly readable, | | arranged so that the text from each participant is clearly | |||
and its source and the relative timing of entered text is visualized | | readable, and its source and the relative timing of entered text | |||
in the display. Mechanisms for looking back in the contents from the | | is visualized in the display. Mechanisms for looking back in the | |||
current session should be provided. The text should be displayed as | | contents from the current session should be provided. The text | |||
soon as it is received." | | should be displayed as soon as it is received. | |||
Strict application of [T140] is of essence for the interoperability | ||||
of real-time text implementations and to fulfill the intention that | ||||
the session participants have the same information conveyed in the | ||||
text contents of the conversation without necessarily having the | ||||
exact same layout of the conversation. | ||||
[T140] specifies a set of presentation control codes to include in | Strict application of [T140] is essential for the interoperability of | |||
the stream. Some of them are optional. Implementations MUST ignore | real-time text implementations and to fulfill the intention that the | |||
optional control codes that they do not support. | session participants have the same information conveyed in the text | |||
contents of the conversation without necessarily having the exact | ||||
same layout of the conversation. | ||||
[T140] specifies a set of presentation control codes (Section 4.2.4) | ||||
to include in the stream. Some of them are optional. | ||||
Implementations MUST ignore optional control codes that they do not | ||||
support. | ||||
There is no strict "message" concept in real-time text. The Unicode | There is no strict "message" concept in real-time text. The Unicode | |||
Line Separator character SHALL be used as a separator allowing a part | Line Separator character SHALL be used as a separator allowing a part | |||
of received text to be grouped in presentation. The characters | of received text to be grouped in a presentation. The character | |||
"CRLF" may be used by other implementations as a replacement for Line | combination "CRLF" may be used by other implementations as a | |||
Separator. The "CRLF" combination SHALL be erased by just one | replacement for the Line Separator. The "CRLF" combination SHALL be | |||
erasing action, the same as the Line Separator. Presentation | erased by just one erasing action, the same as the Line Separator. | |||
functions are allowed to group text for presentation in smaller | Presentation functions are allowed to group text for presentation in | |||
groups than the line separators imply and present such groups with | smaller groups than the Line Separators imply and present such groups | |||
source indication together with text groups from other sources (see | with a source indication together with text groups from other sources | |||
the following presentation examples). Erasure has no specific limit | (see the following presentation examples). Erasure has no specific | |||
by any delimiter in the text stream. | limit by any delimiter in the text stream. | |||
4.1. Presentation by multiparty-aware endpoints | 4.1. Presentation by Multiparty-Aware Endpoints | |||
A multiparty-aware receiving party, presenting real-time text MUST | A multiparty-aware receiving party presenting real-time text MUST | |||
separate text from different sources and present them in separate | separate text from different sources and present them in separate | |||
presentation fields. The receiving party MAY separate presentation | presentation fields. The receiving party MAY separate the | |||
of parts of text from a source in readable groups based on other | presentation of parts of text from a source in readable groups based | |||
criteria than line separator and merge these groups in the | on criteria other than a Line Separator and merge these groups in the | |||
presentation area when it benefits the user to most easily find and | presentation area when it benefits the user to most easily find and | |||
read text from the different participants. The criteria MAY e.g., be | read text from the different participants. The criteria MAY, for | |||
a received comma, full stop, or other phrase delimiters, or a long | example, be a received comma, a full stop, some other type of phrase | |||
pause. | delimiter, or a long pause. | |||
When text is received from multiple original sources, the | When text is received from multiple original sources, the | |||
presentation SHALL provide a view where text is added in multiple | presentation SHALL provide a view where text is added in multiple | |||
presentation fields. | presentation fields. | |||
If the presentation presents text from different sources in one | If the presentation presents text from different sources in one | |||
common area, the presenting endpoint SHOULD insert text from the | common area, the presenting endpoint SHOULD insert text from the | |||
local user ended at suitable points merged with received text to | local user, where the text ends at suitable points and is merged | |||
indicate the relative timing for when the text groups were completed. | properly with received text to indicate the relative timing for when | |||
In this presentation mode, the receiving endpoint SHALL present the | the text groups were completed. In this presentation mode, the | |||
source of the different groups of text. This presentation style is | receiving endpoint SHALL present the source of the different groups | |||
called the "chat" style here and provides a possibility to follow | of text. This presentation style is called the "chat" style here and | |||
text arriving from multiple parties and the approximate relative time | provides the possibility of following text arriving from multiple | |||
that text is received related to text from the local user. | parties and the approximate relative time that text is received as | |||
related to text from the local user. | ||||
A view of a three-party RTT call in chat style is shown in this | A view of a three-party real-time text call in chat style is shown in | |||
example. | this example. | |||
_________________________________________________ | _________________________________________________ | |||
| |^| | | |^| | |||
|[Alice] Hi, Alice here. |-| | |[Alice] Hi, Alice here. |-| | |||
| | | | | | | | |||
|[Bob] Bob as well. | | | |[Bob] Bob as well. | | | |||
| | | | | | | | |||
|[Eve] Hi, this is Eve, calling from Paris. | | | |[Eve] Hi, this is Eve, calling from Paris. | | | |||
| I thought you should be here. | | | | I thought you should be here. | | | |||
| | | | | | | | |||
|[Alice] I am coming on Thursday, my | | | |[Alice] I am coming on Thursday, my | | | |||
| performance is not until Friday morning.| | | | performance is not until Friday morning.| | | |||
| | | | | | | | |||
|[Bob] And I on Wednesday evening. | | | |[Bob] And I on Wednesday evening. | | | |||
| | | | | | | | |||
|[Alice] Can we meet on Thursday evening? | | | |[Alice] Can we meet on Thursday evening? | | | |||
| | | | | | | | |||
|[Eve] Yes, definitely. How about 7pm. | | | |[Eve] Yes, definitely. How about 7pm. | | | |||
| at the entrance of the restaurant | | | | at the entrance of the restaurant | | | |||
| Le Lion Blanc? | | | | Le Lion Blanc? | | | |||
|[Eve] we can have dinner and then take a walk |-| | |[Eve] we can have dinner and then take a walk |-| | |||
|______________________________________________|v| | |______________________________________________|v| | |||
| <Eve-typing> But I need to be back to |^| | | <Eve-typing> But I need to be back to |^| | |||
| the hotel by 11 because I need |-| | | the hotel by 11 because I need |-| | |||
| | | | | | | | |||
| <Bob-typing> I wou |-| | | <Bob-typing> I wou |-| | |||
|______________________________________________|v| | |______________________________________________|v| | |||
| of course, I underst | | | of course, I underst | | |||
|________________________________________________| | |________________________________________________| | |||
Figure 3: Example of a three-party RTT call presented in chat style | Figure 1: Example of a Three-Party Real-Time Text Call Presented | |||
seen at participant Alice's endpoint. | in Chat Style Seen at Participant Alice's Endpoint | |||
Other presentation styles than the chat style MAY be arranged. | Presentation styles other than the chat style MAY be arranged. | |||
This figure shows how a coordinated column view MAY be presented. | Figure 2 shows how a coordinated column view MAY be presented. | |||
_____________________________________________________________________ | _____________________________________________________________________ | |||
| Bob | Eve | Alice | | | Bob | Eve | Alice | | |||
|____________________|______________________|_______________________| | |____________________|______________________|_______________________| | |||
| | |I will arrive by TGV. | | | | |I will arrive by TGV. | | |||
|My flight is to Orly| |Convenient to the main | | |My flight is to Orly| |Convenient to the main | | |||
| |Hi all, can we plan |station. | | | |Hi all, can we plan |station. | | |||
| |for the seminar? | | | | |for the seminar? | | | |||
|Eve, will you do | | | | |Eve, will you do | | | | |||
|your presentation on| | | | |your presentation on| | | | |||
|Friday? |Yes, Friday at 10. | | | |Friday? |Yes, Friday at 10. | | | |||
|Fine, wo | |We need to meet befo | | |Fine, wo | |We need to meet befo | | |||
|___________________________________________________________________| | |___________________________________________________________________| | |||
Figure 4: An example of a coordinated column-view of a three-party | Figure 2: An Example of a Coordinated Column View of a | |||
session with entries ordered vertically in approximate time-order. | Three-Party Session with Entries Ordered Vertically in | |||
Approximate Time Order | ||||
4.2. Multiparty mixing for multiparty-unaware endpoints | 4.2. Multiparty Mixing for Multiparty-Unaware Endpoints | |||
When the mixer has indicated RTT multiparty capability in an SDP | When the mixer has indicated multiparty real-time text capability in | |||
negotiation, but the multiparty capability negotiation fails with an | an SDP negotiation but the multiparty capability negotiation fails | |||
endpoint, then the agreed "text/red" or "text/t140" format SHALL be | with an endpoint, the agreed-upon "text/red" or "text/t140" format | |||
used and the mixer SHOULD compose a best-effort presentation of | SHALL be used and the mixer SHOULD compose a best-effort presentation | |||
multiparty real-time text in one stream intended to be presented by | of multiparty real-time text in one stream intended to be presented | |||
an endpoint with no multiparty awareness, when that is desired in the | by an endpoint with no multiparty awareness, when that is desired in | |||
actual implementation. The following specifies a procedure which MAY | the actual implementation. The following specifies a procedure that | |||
be applied in that situation. | MAY be applied in that situation. | |||
This presentation format has functional limitations and SHOULD be | This presentation format has functional limitations and SHOULD be | |||
used only to enable participation in multiparty calls by legacy | used only to enable participation in multiparty calls by legacy | |||
deployed endpoints implementing only RFC 4103 without any multiparty | deployed endpoints implementing only RFC 4103 without any multiparty | |||
extensions specified in this document. | extensions specified in this document. | |||
The principles and procedures below do not specify any new protocol | The principles and procedures below do not specify any new protocol | |||
elements. They are instead composed of information from [T140] and | elements. They are instead composed of information provided in | |||
an ambition to provide a best-effort presentation on an endpoint | [T140] and an ambition to provide a best-effort presentation on an | |||
which has functions originally intended only for two-party calls. | endpoint that has functions originally intended only for two-party | |||
calls. | ||||
The mixer mixing for multiparty-unaware endpoints SHALL compose a | The mixer performing the mixing for multiparty-unaware endpoints | |||
simulated, limited multiparty RTT view suitable for presentation in | SHALL compose a simulated, limited multiparty real-time text view | |||
one presentation area. The mixer SHALL group text in suitable groups | suitable for presentation in one presentation area. The mixer SHALL | |||
and prepare for presentation of them by inserting a line separator | group text in suitable groups and prepare them for presentation by | |||
between them if the transmitted text did not already end with a new | inserting a Line Separator between them if the transmitted text did | |||
line (line separator or CRLF). A presentable label SHALL be composed | not already end with a new line (Line Separator or CRLF). A | |||
and sent for the source initially in the session and after each | presentable label SHALL be composed and sent for the source initially | |||
source switch. With this procedure the time for switching from | in the session and after each source switch. With this procedure, | |||
transmission of text from one source to transmission of text from | the time for switching from transmission of text from one source to | |||
another source depends on the actions of the users. In order to | transmission of text from another source depends on the actions of | |||
expedite source switching, a user can, for example, end its turn with | the users. In order to expedite source switching, a user can, for | |||
a new line. | example, end its turn with a new line. | |||
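The grouping and labelling described above can be sketched in Python as follows. This is a simplified illustration, not the full procedure: it switches source immediately whenever text from another source arrives, whereas the real procedure depends on the actions of the users. The function name `compose_for_unaware` and the bracketed label format are assumptions made for this example.

```python
LINE_SEPARATOR = "\u2028"  # Unicode Line Separator


def compose_for_unaware(chunks):
    """Merge (source_label, text) chunks into one labelled text stream."""
    out = []
    current = None
    ends_with_new_line = True  # nothing sent yet, so no separator is needed
    for label, text in chunks:
        if not text:
            continue
        if label != current:
            # Separate groups unless the previous text already ended a line.
            if not ends_with_new_line:
                out.append(LINE_SEPARATOR)
            out.append(f"[{label}] ")  # hypothetical presentable label
            current = label
        out.append(text)
        ends_with_new_line = text.endswith((LINE_SEPARATOR, "\r\n"))
    return "".join(out)


mixed = compose_for_unaware([
    ("Alice", "Hi"),
    ("Alice", " there"),
    ("Bob", "Hello" + LINE_SEPARATOR),
    ("Alice", "ok"),
])
```

In this sketch, Bob's text already ends with a Line Separator, so no extra separator is inserted before the next label for Alice.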
4.2.1. Actions by the mixer at reception from the call participants | 4.2.1. Actions by the Mixer at Reception from the Call Participants | |||
When text is received by the mixer from the different participants, | When text is received by the mixer from the different participants, | |||
the mixer SHALL recover text from redundancy if any packets are lost. | the mixer SHALL recover text from redundancy if any packets are lost. | |||
The mark for lost text [T140ad1] SHALL be inserted in the stream if | The marker for lost text [T140ad1] SHALL be inserted in the stream if | |||
unrecoverable loss appears. Any Unicode "BOM" characters, possibly | unrecoverable loss appears. Any Unicode BOM characters, possibly | |||
used for keep-alive, SHALL be deleted. The time of creation of text | used for keep-alives, SHALL be deleted. The time of creation of text | |||
(retrieved from the RTP timestamp) SHALL be stored together with the | (retrieved from the RTP timestamp) SHALL be stored together with the | |||
received text from each source in queues for transmission to the | received text from each source in queues for transmission to the | |||
recipients in order to be able to evaluate text loss. | recipients in order to be able to evaluate text loss. | |||
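The reception steps above can be sketched as follows. This is a minimal illustration, not a normative procedure. The names QueuedText and enqueue_received are hypothetical. Recovery from redundancy is assumed to have been attempted already, and the caller is assumed to pass lost=True when a gap could not be recovered.

```python
from collections import deque
from dataclasses import dataclass

BOM = "\ufeff"           # byte order mark, possibly used for keep-alive
MISSING_TEXT = "\ufffd"  # marker for lost text (replacement character)


@dataclass
class QueuedText:
    rtp_timestamp: int  # creation time taken from the RTP timestamp
    text: str


def enqueue_received(queue, rtp_timestamp, text, lost=False):
    """Store text from one source, ready for transmission to recipients.

    `lost` is assumed to be set by the caller when packets before this
    text were lost and could not be recovered from redundancy.
    """
    if lost:
        queue.append(QueuedText(rtp_timestamp, MISSING_TEXT))
    cleaned = text.replace(BOM, "")  # BOM characters are deleted
    if cleaned:
        queue.append(QueuedText(rtp_timestamp, cleaned))
    return queue
```

One queue per source would be kept, so that the age of waiting text can be compared between sources when a switch is considered.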
4.2.2. Actions by the mixer for transmission to the recipients | 4.2.2. Actions by the Mixer for Transmission to the Recipients | |||
The following procedure SHALL be applied for each multiparty-unaware | The following procedure SHALL be applied for each multiparty-unaware | |||
recipient of multiparty text from the mixer. | recipient of multiparty text from the mixer. | |||
The text for transmission SHALL be formatted by the mixer for each | The text for transmission SHALL be formatted by the mixer for each | |||
receiving user for presentation in one single presentation area. | receiving user for presentation in one single presentation area. | |||
Text received from a participant SHOULD NOT be included in | Text received from a participant SHOULD NOT be included in | |||
transmission to that participant because it is usually presented | transmissions to that participant, because it is usually presented | |||
locally at transmission time. When there is text available for | locally at transmission time. When there is text available for | |||
transmission from the mixer to a receiving party from more than one | transmission from the mixer to a receiving party from more than one | |||
participant, the mixer SHALL switch between transmission of text from | participant, the mixer SHALL switch between transmission of text from | |||
the different sources at suitable points in the transmitted stream. | the different sources at suitable points in the transmitted stream. | |||
When switching source, the mixer SHALL insert a line separator if the | When switching the source, the mixer SHALL insert a Line Separator if | |||
already transmitted text did not end with a new line (line separator | the already-transmitted text did not end with a new line (Line | |||
or CRLF). A label SHALL be composed of information in the CNAME and | Separator or CRLF). A label SHALL be composed of information in the | |||
NAME fields in RTCP reports from the participant to have its text | CNAME and NAME fields in RTCP reports from the participant to have | |||
transmitted, or from other session information for that user. The | its text transmitted, or from other session information for that | |||
label SHALL be delimited by suitable characters (e.g., '[ ]') and | user. The label SHALL be delimited by suitable characters (e.g., | |||
transmitted. The CSRC SHALL indicate the selected source. Then text | "[ ]") and transmitted. The CSRC SHALL indicate the selected source. | |||
from that selected participant SHALL be transmitted until a new | Then, text from that selected participant SHALL be transmitted until | |||
suitable point for switching source is reached. | a new suitable point for switching the source is reached. | |||
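As an illustration of the switching rule above, the following sketch composes the text sent at a source switch: a Line Separator when one is needed, followed by a delimited label. The function names are hypothetical. Preferring NAME over CNAME, the trailing space after the label, and sending no Line Separator when nothing has been transmitted yet are assumptions of this example. The privacy filtering of label content discussed in the text is not shown.

```python
LINE_SEPARATOR = "\u2028"


def ends_with_new_line(sent_text):
    """True if the already-transmitted text ended with a new line."""
    return sent_text.endswith(LINE_SEPARATOR) or sent_text.endswith("\r\n")


def compose_label(name, cname, delimiters="[]"):
    """Build a presentable label, preferring the RTCP NAME over CNAME."""
    shown = name or cname or "?"
    return f"{delimiters[0]}{shown}{delimiters[1]} "


def switch_prefix(sent_text, name, cname):
    """Text to send at a source switch, before the new source's text."""
    if not sent_text or ends_with_new_line(sent_text):
        new_line = ""
    else:
        new_line = LINE_SEPARATOR
    return new_line + compose_label(name, cname)
```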
Information available to the mixer for composing the label may | Information available to the mixer for composing the label may | |||
contain sensitive personal information that SHOULD NOT be revealed in | contain sensitive personal information that SHOULD NOT be revealed in | |||
sessions not securely authenticated and confidentiality protected. | sessions not securely authenticated and confidentiality protected. | |||
Privacy considerations regarding how much personal information is | Privacy considerations regarding how much personal information is | |||
included in the label SHOULD therefore be taken when composing the | included in the label SHOULD therefore be taken when composing the | |||
label. | label. | |||
Seeking a suitable point for switching source SHALL be done when | Seeking a suitable point for switching the source SHALL be done when | |||
there is older text waiting for transmission from any party than the | there is older text waiting for transmission from any party than the | |||
age of the last transmitted text. Suitable points for switching are: | age of the last transmitted text. Suitable points for switching are: | |||
* A completed phrase ended by comma | * A completed phrase ending with a comma. | |||
* A completed sentence | * A completed sentence. | |||
* A new line (line separator or CRLF) | * A new line (Line Separator or CRLF). | |||
* A long pause (e.g., > 10 seconds) in received text from the | * A long pause (e.g., > 10 seconds) in received text from the | |||
currently transmitted source | currently transmitted source. | |||
* If text from one participant has been transmitted with text from | * If text from one participant has been transmitted with text from | |||
other sources waiting for transmission for a long time (e.g., > 1 | other sources waiting for transmission for a long time (e.g., > 1 | |||
minute) and none of the other suitable points for switching has | minute) and none of the other suitable points for switching has | |||
occurred, a source switch MAY be forced by the mixer at the next | occurred, a source switch MAY be forced by the mixer at the next | |||
word delimiter, and also even if a word delimiter does not occur | word delimiter, and also even if a word delimiter does not occur | |||
within a time (e.g., 15 seconds) after the scan for a word | within some period of time (e.g., 15 seconds) after the scan for a | |||
delimiter started. | word delimiter started. | |||
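The list of suitable switching points can be expressed as a single predicate. The sketch below is one possible reading, under stated assumptions: the function name, the set of sentence-ending characters, and the use of a space as the word delimiter are choices made for this example, and the thresholds are the example values from the list. The forced switch is optional (MAY) in the text.

```python
LINE_SEPARATOR = "\u2028"
SENTENCE_END = (".", "!", "?")  # assumed sentence delimiters
LONG_PAUSE_S = 10.0             # example value from the list above
FORCE_AFTER_S = 60.0            # example value from the list above
FORCE_SCAN_LIMIT_S = 15.0       # example value from the list above


def is_switch_point(sent_text, idle_s, others_waiting_s, scan_s=0.0):
    """Return True if the stream is at a suitable point for a source switch.

    sent_text: text transmitted so far from the current source
    idle_s: seconds since text was last received from the current source
    others_waiting_s: seconds that text from other sources has been waiting
    scan_s: seconds since the scan for a word delimiter started
    """
    if sent_text.endswith(","):
        return True  # completed phrase
    if sent_text.endswith(SENTENCE_END):
        return True  # completed sentence
    if sent_text.endswith(LINE_SEPARATOR) or sent_text.endswith("\r\n"):
        return True  # new line
    if idle_s > LONG_PAUSE_S:
        return True  # long pause from the current source
    if others_waiting_s > FORCE_AFTER_S:
        # Forced switch: at a word delimiter, or when the scan times out.
        return sent_text.endswith(" ") or scan_s > FORCE_SCAN_LIMIT_S
    return False
```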
When switching source, the source which has the oldest text in queue | When switching the source, the source that has the oldest text in | |||
SHALL be selected to be transmitted. A character display count SHALL | queue SHALL be selected to be transmitted. A character display count | |||
be maintained for the currently transmitted source, starting at zero | SHALL be maintained for the currently transmitted source, starting at | |||
after the label is transmitted for the currently transmitted source. | zero after the label is transmitted for the currently transmitted | |||
source. | ||||
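Selecting the source with the oldest queued text might look like the sketch below. The function name and the queue layout are hypothetical. Creation times are assumed to have been converted to a clock common to all sources, since raw RTP timestamps from different sources are not directly comparable.

```python
def select_next_source(queues):
    """Return the source id whose queue head has the oldest creation time.

    `queues` maps a source id (e.g., its SSRC) to a list of
    (creation_time, text) tuples in arrival order. Returns None when
    no source has text waiting.
    """
    waiting = {src: q[0][0] for src, q in queues.items() if q}
    if not waiting:
        return None
    return min(waiting, key=waiting.get)
```

After the label for the selected source has been sent, the display count for that source would start again at zero.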
The status SHALL be maintained for the latest control code for Select | The status SHALL be maintained for the latest control code for Select | |||
Graphic Rendition (SGR) from each source. If there is an SGR code | Graphic Rendition (SGR) from each source. If there is an SGR code | |||
stored as the status for the current source before the source switch | stored as the status for the current source before the source switch | |||
is done, a reset of SGR SHALL be sent by the sequence SGR 0 [009B | is done, a reset of SGR SHALL be sent by the sequence SGR 0 [U+009B | |||
0000 006D] after the new line and before the new label during a | U+0000 U+006D] after the new line and before the new label during a | |||
source switch. See SGR below for an explanation. This transmission | source switch. See Section 4.2.4 for an explanation. This | |||
does not influence the display count. | transmission does not influence the display count. | |||
If there is an SGR code stored for the new source after the source | If there is an SGR code stored for the new source after the source | |||
switch, that SGR code SHALL be transmitted to the recipient before | switch, that SGR code SHALL be transmitted to the recipient before | |||
the label. This transmission does not influence the display count. | the label. This transmission does not influence the display count. | |||
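The SGR handling at a source switch can be sketched as below. The function name and the sgr_status mapping are hypothetical. The reset sequence is copied from the code points listed in the text. ECMA-48 writes the parameter 0 as the digit "0" (U+0030), so an implementation may need to adjust the constant.

```python
CSI = "\u009b"
# Reset sequence copied from the text above (U+009B U+0000 U+006D).
# ECMA-48 writes the parameter 0 as the digit "0" (U+0030); adjust if needed.
SGR_RESET = CSI + "\u0000" + "m"


def sgr_switch_codes(sgr_status, old_source, new_source):
    """Return the SGR control codes to send before the new label.

    `sgr_status` maps a source id to the latest SGR code stored for it
    (a string). Neither code influences the display count.
    """
    codes = ""
    if sgr_status.get(old_source):
        codes += SGR_RESET  # sent after the new line, before the label
    codes += sgr_status.get(new_source, "")  # stored SGR of the new source
    return codes
```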
4.2.3. Actions on transmission of text | 4.2.3. Actions on Transmission of Text | |||
Text from a source sent to the recipient SHALL increase the display | Text from a source sent to the recipient SHALL increase the display | |||
count by one per transmitted character. | count by one per transmitted character. | |||
4.2.4. Actions on transmission of control codes | 4.2.4. Actions on Transmission of Control Codes | |||
The following control codes specified by T.140 require specific | The following control codes, as specified by T.140 [T140], require | |||
actions. They SHALL cause specific considerations in the mixer. | specific actions. They SHALL cause specific considerations in the | |||
Note that the codes presented here are expressed in UCS-16, while | mixer. Note that the codes presented here are expressed in UTF-16, | |||
transmission is made in the UTF-8 encoding of these codes. | while transmission is made in the UTF-8 encoding of these codes. | |||
BEL 0007 Bell Alert in session. Provides for alerting during an | BEL (U+0007): Bell. Alert in session. Provides for alerting during | |||
active session. The display count SHALL NOT be altered. | an active session. The display count SHALL NOT be altered. | |||
NEW LINE 2028 Line separator. Check and perform a source switch if | NEW LINE (U+2028): Line Separator. Check and perform a source | |||
appropriate. Increase the display count by 1. | switch if appropriate. Increase the display count by 1. | |||
CR LF 000D 000A A supported but not preferred way of requesting a | CR LF (U+000D U+000A): A supported, but not preferred, way of | |||
new line. Check and perform a source switch if appropriate. | requesting a new line. Check and perform a source switch if | |||
Increase the display count by 1. | appropriate. Increase the display count by 1. | |||
INT ESC 0061 Interrupt (used to initiate the mode negotiation | INT (ESC U+0061): Interrupt (used to initiate the mode negotiation | |||
procedure). The display count SHALL NOT be altered. | procedure). The display count SHALL NOT be altered. | |||
SGR 009B Ps 006D Select graphic rendition. Ps is the rendition | SGR (U+009B Ps U+006D): Select Graphic Rendition. Ps represents the | |||
parameters specified in ISO 6429. The display count SHALL NOT be | rendition parameters specified in [ISO6429]. (For freely | |||
altered. The SGR code SHOULD be stored for the current source. | available equivalent information, please see [ECMA-48].) The | |||
display count SHALL NOT be altered. The SGR code SHOULD be stored | ||||
for the current source. | ||||
SOS 0098 Start of string, used as a general protocol element | SOS (U+0098): Start of String. Used as a general protocol element | |||
introducer, followed by a maximum 256-byte string and the ST. The | introducer, followed by a maximum 256-byte string and the ST. The | |||
display count SHALL NOT be altered. | display count SHALL NOT be altered. | |||
ST 009C String terminator, end of SOS string. The display count | ST (U+009C): String Terminator. End of SOS string. The display | |||
SHALL NOT be altered. | count SHALL NOT be altered. | |||
ESC 001B Escape - used in control strings. The display count SHALL | ESC (U+001B): Escape. Used in control strings. The display count | |||
NOT be altered for the complete escape code. | SHALL NOT be altered for the complete escape code. | |||
Byte order mark "BOM" (U+FEFF) "Zero width, no break space", used | Byte order mark (BOM) (U+FEFF): "Zero width no-break space". Used | |||
for synchronization and keep-alive. It SHALL be deleted from | for synchronization and keep-alive. It SHALL be deleted from | |||
incoming streams. It SHALL also be sent first after session | incoming streams. It SHALL also be sent first after session | |||
establishment to the recipient. The display count SHALL NOT be | establishment to the recipient. The display count SHALL NOT be | |||
altered. | altered. | |||
Missing text mark (U+FFFD) "Replacement character", represented as a | Missing text mark (U+FFFD): "Replacement character". Represented as | |||
question mark in a rhombus, or if that is not feasible, replaced | a question mark in a rhombus, or, if that is not feasible, | |||
by an apostrophe '. It marks the place in the stream of possible | replaced by an apostrophe ('). It marks the place in the stream | |||
text loss. This mark SHALL be inserted by the reception procedure | of possible text loss. This mark SHALL be inserted by the | |||
in case of unrecoverable loss of packets. The display count SHALL | reception procedure in the case of unrecoverable loss of packets. | |||
be increased by one when sent as for any other character. | The display count SHALL be increased by one when sent as for any | |||
other character. | ||||
SGR If a control code for selecting graphic rendition (SGR) other | SGR: If a control code for SGR other than a reset of the graphic | |||
than reset of the graphic rendition (SGR 0) is sent to a | rendition (SGR 0) is sent to a recipient, that control code SHALL | |||
recipient, that control code SHALL also be stored as the status | also be stored as the status for the source in the storage for SGR | |||
for the source in the storage for SGR status. If a reset graphic | status. If a reset graphic rendition (SGR 0) originating from a | |||
rendition (SGR 0) originating from a source is sent, then the SGR | source is sent, then the SGR status storage for that source SHALL | |||
status storage for that source SHALL be cleared. The display | be cleared. The display count SHALL NOT be increased. | |||
count SHALL NOT be increased. | ||||
BS (U+0008) Back Space, intended to erase the last entered character | BS (U+0008): "Back Space". Intended to erase the last entered | |||
by a source. Erasure by backspace cannot always be performed as | character by a source. Erasure by backspace cannot always be | |||
the erasing party intended. If an erasing action erases all text | performed as the erasing party intended. If an erasing action | |||
up to the end of the leading label after a source switch, then the | erases all text up to the end of the leading label after a source | |||
mixer MUST NOT transmit more backspaces. Instead, it is | switch, then the mixer MUST NOT transmit more backspaces. | |||
RECOMMENDED that a letter "X" is inserted in the text stream for | Instead, it is RECOMMENDED that a letter "X" be inserted in the | |||
each backspace as an indication of the intent to erase more. A | text stream for each backspace as an indication of the intent to | |||
new line is usually coded by a Line Separator, but the character | erase more. A new line is usually coded by a Line Separator, but | |||
combination "CRLF" MAY be used instead. Erasure of a new line is | the character combination "CRLF" MAY be used instead. Erasure of | |||
in both cases done by just one erasing action (Backspace). If the | a new line is, in both cases, done by just one erasing action | |||
display count has a positive value it SHALL be decreased by one | (backspace). If the display count has a positive value, it SHALL | |||
when the BS is sent. If the display count is at zero, it SHALL | be decreased by one when the BS is sent. If the display count is | |||
NOT be altered. | at zero, it SHALL NOT be altered. | |||
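The display count rules for ordinary characters, new lines, BEL, and BS can be illustrated with the simplified sketch below. The function name is hypothetical. SGR, SOS/ST, and ESC sequences are assumed to be handled elsewhere without counting, and BOM characters are assumed to have been deleted at reception.

```python
BS = "\u0008"
BEL = "\u0007"


def forward_text(text, display_count):
    """Return (text_to_send, new_display_count) for one source's text.

    Simplified: only BEL, BS, CR LF and ordinary characters (including
    the Line Separator and the missing text mark) are handled here.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if text.startswith("\r\n", i):
            out.append("\r\n")
            display_count += 1  # CR LF counts as one new line
            i += 2
            continue
        if ch == BS:
            if display_count > 0:
                out.append(BS)
                display_count -= 1
            else:
                out.append("X")  # cannot erase past the label; count stays 0
        elif ch == BEL:
            out.append(ch)       # alert only; count is not altered
        else:
            out.append(ch)
            display_count += 1
        i += 1
    return "".join(out), display_count
```

For example, with a display count of 1, two received backspaces are forwarded as one BS followed by one "X", and the count ends at zero.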
4.2.5. Packet transmission | 4.2.5. Packet Transmission | |||
A mixer transmitting to a multiparty-unaware terminal SHALL send | A mixer transmitting to a multiparty-unaware endpoint SHALL send | |||
primary data only from one source per packet. The SSRC SHALL be the | primary data only from one source per packet. The SSRC SHALL be the | |||
SSRC of the mixer. The CSRC list SHALL contain one member and be the | SSRC of the mixer. The CSRC list MAY contain one member and be the | |||
SSRC of the source of the primary data. | SSRC of the source of the primary data. | |||
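A packet of this form can be sketched with the RTP fixed header layout from RFC 3550. The function name is hypothetical, the payload type 98 is an arbitrary dynamic value chosen for the example (the real value comes from SDP negotiation), and redundancy formatting ("text/red") is not shown.

```python
import struct


def build_rtp_packet(mixer_ssrc, seq, timestamp, payload, source_ssrc=None,
                     payload_type=98, marker=False):
    """Pack an RTP packet carrying primary data from a single source.

    The SSRC is the mixer's SSRC; the CSRC list holds at most one
    member, the SSRC of the source of the primary data.
    """
    csrcs = [] if source_ssrc is None else [source_ssrc]
    byte0 = (2 << 6) | len(csrcs)  # version 2, no padding, no extension
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, mixer_ssrc)
    for csrc in csrcs:
        header += struct.pack("!I", csrc)
    return header + payload
```

With one CSRC, the header is 16 bytes long and the first byte is 0x81 (version 2, CSRC count 1).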
4.2.6. Functional limitations | 4.2.6. Functional Limitations | |||
When a multiparty-unaware endpoint presents a conversation in one | When a multiparty-unaware endpoint presents a conversation in one | |||
display area in a chat style, it inserts source indications for | display area in a chat style, it inserts source indications for | |||
remote text and local user text as they are merged in completed text | remote text and local user text as they are merged in completed text | |||
groups. When an endpoint using this layout receives and presents | groups. When an endpoint using this layout receives and presents | |||
text mixed for multiparty-unaware endpoints, there will be two levels | text mixed for multiparty-unaware endpoints, there will be two levels | |||
of source indicators for the received text; one generated by the | of source indicators for the received text: one generated by the | |||
mixer and inserted in a label after each source switch, and another | mixer and inserted in a label after each source switch, and another | |||
generated by the receiving endpoint and inserted after each switch | generated by the receiving endpoint and inserted after each switch | |||
between local and remote source in the presentation area. This will | between the local source and the remote source in the presentation | |||
waste display space and look inconsistent to the reader. | area. This will waste display space and look inconsistent to the | |||
reader. | ||||
New text can be presented only from one source at a time. Switch of | New text can be presented from only one source at a time. Switching | |||
source to be presented takes place at suitable places in the text, | the source to be presented takes place at suitable places in the | |||
such as end of phrase, end of sentence, line separator and | text, such as the end of a phrase, the end of a sentence, or a Line | |||
inactivity. Therefore, the time to switch to present waiting text | Separator, or upon detecting inactivity. Therefore, the time to | |||
from other sources may become long and will vary and depend on the | switch to present waiting text from other sources may grow long, and | |||
actions of the currently presented source. | it will vary and depend on the actions of the currently presented | |||
source. | ||||
Erasure can only be done up to the latest source switch. If a user | Erasure can only be done up to the latest source switch. If a user | |||
tries to erase more text, the erasing actions will be presented as | tries to erase more text, the erasing actions will be presented as a | |||
letter X after the label. | letter "X" after the label. | |||
Text loss because of network errors may hit the label between entries | Text loss because of network errors may hit the label between entries | |||
from different parties, causing risk for misunderstanding from which | from different parties, causing the risk of a misunderstanding | |||
source a piece of text is. | regarding which source provided a piece of text. | |||
These facts make it strongly RECOMMENDED implementing multiparty | Because of these facts, it is strongly RECOMMENDED that multiparty | |||
awareness in RTT endpoints. The use of the mixing method for | awareness be implemented in real-time text endpoints. The use of the | |||
multiparty-unaware endpoints should be left for use with endpoints | mixing method for multiparty-unaware endpoints should be left for use | |||
which are impossible to upgrade to become multiparty-aware. | with endpoints that are impossible to upgrade to become multiparty | |||
aware. | ||||
4.2.7. Example views of presentation on multiparty-unaware endpoints | 4.2.7. Example Views of Presentation on Multiparty-Unaware Endpoints | |||
The following pictures are examples of the view on a participant's | The following pictures are examples of the view on a participant's | |||
display for the multiparty-unaware case. | display for the multiparty-unaware case. | |||
_________________________________________________ | Figure 3 shows how a coordinated column view MAY be presented on | |||
| Conference | Alice | | Alice's device in a view with two columns. The mixer inserts labels | |||
|________________________|_________________________| | ||||
| |I will arrive by TGV. | | ||||
|[Bob]:My flight is to |Convenient to the main | | ||||
|Orly. |station. | | ||||
|[Eve]:Hi all, can we | | | ||||
|plan for the seminar. | | | ||||
| | | | ||||
|[Bob]:Eve, will you do | | | ||||
|your presentation on | | | ||||
|Friday? | | | ||||
|[Eve]:Yes, Friday at 10.| | | ||||
|[Bob]: Fine, wo |We need to meet befo | | ||||
|________________________|_________________________| | ||||
Figure 5: Alice who has a conference-unaware client is receiving the | ||||
multiparty real-time text in a single-stream. | ||||
This figure shows how a coordinated column view MAY be presented on | ||||
Alice's device in a view with two-columns. The mixer inserts labels | ||||
to show how the sources alternate in the column with received text. | to show how the sources alternate in the column with received text. | |||
The mixer alternates between the sources at suitable points in the | The mixer alternates between the sources at suitable points in the | |||
text exchange so that text entries from each party can be | text exchange so that text entries from each party can be | |||
conveniently read. | conveniently read. | |||
_________________________________________________ | ___________________________________________________ | |||
| |^| | | Conference | Alice | | |||
|(Alice) Hi, Alice here. |-| | |_________________________|_________________________| | |||
| | | | | |I will arrive by TGV. | | |||
|(mix)[Bob)] Bob as well. | | | |[Bob]: My flight is to |Convenient to the main | | |||
| | | | |Orly. |station. | | |||
|[Eve] Hi, this is Eve, calling from Paris | | | |[Eve]: Hi all, can we | | | |||
| I thought you should be here. | | | |plan for the seminar. | | | |||
| | | | | | | | |||
|(Alice) I am coming on Thursday, my | | | |[Bob]: Eve, will you do | | | |||
| performance is not until Friday morning.| | | |your presentation on | | | |||
| | | | |Friday? | | | |||
|(mix)[Bob] And I on Wednesday evening. | | | |[Eve]: Yes, Friday at 10.| | | |||
| | | | |[Bob]: Fine, wo |We need to meet befo | | |||
|[Eve] we can have dinner and then walk | | | |_________________________|_________________________| | |||
| | | | ||||
|[Eve] But I need to be back to | | | ||||
| the hotel by 11 because I need | | | ||||
| |-| | ||||
|______________________________________________|v| | ||||
| of course, I underst | | ||||
|________________________________________________| | ||||
Figure 6: An example of a view of the multiparty-unaware presentation | Figure 3: Alice, Who Has a Conference-Unaware Client, Is | |||
in chat style. Alice is the local user. | Receiving the Multiparty Real-Time Text in a Single Stream | |||
In this view, there is a tradition in receiving applications to | In Figure 4, there is a tradition in receiving applications to | |||
include a label showing the source of the text, here shown with | include a label showing the source of the text, here shown with | |||
parenthesis "()". The mixer also inserts source labels for the | parentheses "()". The mixer also inserts source labels for the | |||
multiparty call participants, here shown with brackets "[]". | multiparty call participants, here shown with brackets "[]". | |||
5. Relation to Conference Control | _________________________________________________ | |||
5.1. Use with SIP centralized conferencing framework | | |^| | |||
|(Alice) Hi, Alice here. |-| | ||||
| | | | ||||
|(mix)[Bob] Bob as well. | | | ||||
| | | | ||||
|[Eve] Hi, this is Eve, calling from Paris | | | ||||
| I thought you should be here. | | | ||||
| | | | ||||
|(Alice) I am coming on Thursday, my | | | ||||
| performance is not until Friday morning.| | | ||||
| | | | ||||
|(mix)[Bob] And I on Wednesday evening. | | | ||||
| | | | ||||
|[Eve] we can have dinner and then walk | | | ||||
| | | | ||||
|[Eve] But I need to be back to | | | ||||
| the hotel by 11 because I need | | | ||||
| |-| | ||||
|______________________________________________|v| | ||||
| of course, I underst | | ||||
|________________________________________________| | ||||
Figure 4: An Example of a View of the Multiparty-Unaware | ||||
Presentation in Chat Style, Where Alice Is the Local User | ||||
5. Relationship to Conference Control | ||||
5.1. Use with SIP Centralized Conferencing Framework | ||||
The Session Initiation Protocol (SIP) conferencing framework, mainly | The Session Initiation Protocol (SIP) conferencing framework, mainly | |||
specified in [RFC4353], [RFC4579] and [RFC4575] is suitable for | specified in [RFC4353], [RFC4579], and [RFC4575], is suitable for | |||
coordinating sessions including multiparty RTT. The RTT stream | coordinating sessions, including multiparty real-time text. The | |||
between the mixer and a participant is one and the same during the | real-time text stream between the mixer and a participant is one and | |||
conference. Participants get announced by notifications when | the same during the conference. Participants get announced by | |||
participants are joining or leaving, and further user information may | notifications when participants are joining or leaving, and further | |||
be provided. The SSRC of the text to expect from joined users MAY be | user information may be provided. The SSRC of the text to expect | |||
included in a notification. The notifications MAY be used both for | from joined users MAY be included in a notification. The | |||
security purposes and for translation to a label for presentation to | notifications MAY be used for both security purposes and translation | |||
other users. | to a label for presentation to other users. | |||
5.2. Conference control | 5.2. Conference Control | |||
In managed conferences, control of the real-time text media SHOULD be | In managed conferences, control of the real-time text media SHOULD be | |||
provided in the same way as other for media, e.g., for muting and | provided in the same way as for other media, e.g., for muting and | |||
unmuting by the direction attributes in SDP [RFC8866]. | unmuting by the direction attributes in SDP [RFC8866]. | |||
Note that floor control functions may be of value for RTT users as | Note that floor control functions may be of value for real-time text | |||
well as for users of other media in a conference. | users as well as for users of other media in a conference. | |||
6. Gateway Considerations | 6. Gateway Considerations | |||
6.1. Gateway considerations with Textphones | Multiparty real-time text sessions may involve gateways of different | |||
kinds. Gateways involved in setting up sessions SHALL correctly | ||||
reflect the multiparty capability or unawareness of the combination | ||||
of the gateway and the remote endpoint beyond the gateway. | ||||
multiparty RTT sessions may involve gateways of different kinds. | 6.1. Gateway Considerations with Textphones | |||
Gateways involved in setting up sessions SHALL correctly reflect the | ||||
multiparty capability or unawareness of the combination of the | ||||
gateway and the remote endpoint beyond the gateway. | ||||
One case that may occur is a gateway to Public Switched Telephone | One case that may occur is a gateway to the Public Switched Telephone | |||
Network (PSTN) for communication with textphones (e.g., TTYs). | Network (PSTN) for communication with textphones (e.g., TTYs). | |||
Textphones are limited devices with no multiparty awareness, and it | Textphones are limited devices with no multiparty awareness, and it | |||
SHOULD therefore be suitable for the gateway to not indicate | SHOULD therefore be appropriate for the gateway to not indicate | |||
multiparty awareness for that case. Another solution is that the | multiparty awareness for that case. Another solution is that the | |||
gateway indicates multiparty capability towards the mixer, and | gateway indicates multiparty capability towards the mixer and | |||
includes the multiparty mixer function for multiparty-unaware | includes the multiparty mixer function for multiparty-unaware | |||
endpoints itself. This solution makes it possible to adapt to the | endpoints itself. This solution makes it possible to adapt to the | |||
functional limitations of the textphone. | functional limitations of the textphone. | |||
More information on gateways to textphones is found in [RFC5194] | More information on gateways to textphones is found in [RFC5194]. | |||
6.2. Gateway considerations with WebRTC | 6.2. Gateway Considerations with WebRTC | |||
Gateway operation to real-time text in WebRTC may also be required. | Gateway operation between RTP-mixer-based multiparty real-time text | |||
In WebRTC, RTT is specified in [RFC8865]. | and WebRTC-based real-time text may also be required. Real-time text | |||
transport in WebRTC is specified in [RFC8865]. | ||||
A multiparty bridge may have functionality for communicating by RTT | A multiparty bridge may have functionality for communicating via | |||
both in RTP streams with RTT and WebRTC T.140 data channels. Other | real-time text in both (1) RTP streams with real-time text and (2) | |||
configurations may consist of a multiparty bridge with either | WebRTC T.140 data channels. Other configurations may consist of a | |||
technology for RTT transport and a separate gateway for conversion of | multiparty bridge with either technology for real-time text transport | |||
the text communication streams between RTP and T.140 data channel. | and a separate gateway for conversion of the text communication | |||
streams between RTP and T.140 data channels. | ||||
In WebRTC, it is assumed that for a multiparty session, one T.140 | In WebRTC, it is assumed that for a multiparty session, one T.140 | |||
data channel is established for each source from a gateway or bridge | data channel is established for each source from a gateway or bridge | |||
to each participant. Each participant also has a data channel with a | to each participant. Each participant also has a data channel with a | |||
two-way connection with the gateway or bridge. | two-way connection with the gateway or bridge. | |||
The T.140 data channel used both ways is for text from the WebRTC | A T.140 data channel used for two-way communication is for text from | |||
user and from the bridge or gateway itself to the WebRTC user. The | the WebRTC user and from the bridge or gateway itself to the WebRTC | |||
label parameter of this T.140 data channel is used as the NAME field | user. The label parameter of this T.140 data channel is used as the | |||
in RTCP to participants on the RTP side. The other T.140 data | NAME field in RTCP to participants on the RTP side. The other T.140 | |||
channels are only for text from other participants to the WebRTC | data channels are only for text from other participants to the WebRTC | |||
user. | user. | |||
When a new participant has entered the session with RTP transport of | When a new participant has entered the session with RTP transport of | |||
RTT, a new T.140 channel SHOULD be established to WebRTC users with | real-time text, a new T.140 data channel SHOULD be established to | |||
the label parameter composed of information from the NAME field in | WebRTC users with the label parameter composed of information from | |||
RTCP on the RTP side. | the NAME field in RTCP on the RTP side. | |||
When a new participant has entered the multiparty session with RTT | When a new participant has entered the multiparty session with real- | |||
transport in a WebRTC T.140 data channel, the new participant SHOULD | time text transport in a WebRTC T.140 data channel, the new | |||
be announced by a notification to RTP users. The label parameter | participant SHOULD be announced by a notification to RTP users. The | |||
from the WebRTC side SHOULD be used as the NAME RTCP field on the RTP | label parameter from the WebRTC side or other suitable information | |||
side, or other available session information. | from the session or stream establishment procedure SHOULD be used to | |||
compose the NAME RTCP field on the RTP side. | ||||
When a participant on the RTP side is disconnected from the | When a participant on the RTP side is disconnected from the | |||
multiparty session, the corresponding T.140 data channel(s) SHOULD be | multiparty session, the corresponding T.140 data channel(s) SHOULD be | |||
closed. | closed. | |||
When a WebRTC user of T.140 data channels disconnects from the mixer, | When a WebRTC user of T.140 data channels disconnects from the mixer, | |||
the corresponding RTP streams or sources in an RTP-mixed stream | the corresponding RTP streams or sources in an RTP-mixed stream | |||
SHOULD be closed. | SHOULD be closed. | |||
T.140 data channels MAY be opened and closed by negotiation or | T.140 data channels MAY be opened and closed by negotiation or | |||
renegotiation of the session or by any other valid means as specified | renegotiation of the session, or by any other valid means, as | |||
in section 1 of [RFC8865]. | specified in Section 1 of [RFC8865]. | |||
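
The channel handling described above can be sketched as simple gateway bookkeeping. This is only an illustration, not part of either document version: the class name `GatewayState`, the use of the SSRC as key, and the fallback label used when the NAME field is empty are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GatewayState:
    """Hypothetical record of T.140 data channels held open toward one WebRTC user."""

    # Maps an RTP-side source identifier (assumed here to be the SSRC)
    # to the label of the T.140 data channel opened for that source.
    channels: Dict[int, str] = field(default_factory=dict)

    def rtp_participant_joined(self, ssrc: int, rtcp_name: str) -> str:
        """Open a channel for a new RTP-side participant.

        The label is composed from the RTCP NAME field; the fallback
        label for an empty NAME is an assumption of this sketch.
        """
        label = rtcp_name.strip() or f"participant-{ssrc:08x}"
        self.channels[ssrc] = label
        return label

    def rtp_participant_left(self, ssrc: int) -> Optional[str]:
        """Close the channel of a departed RTP-side participant, if any."""
        return self.channels.pop(ssrc, None)
```

A real gateway would also open and close the actual data channels by negotiation; the sketch only tracks which labels correspond to which sources.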
7. Updates to RFC 4103 | 7. Updates to RFC 4103 | |||
This document updates [RFC4103] by introducing an SDP media attribute | This document updates [RFC4103] by introducing an SDP media | |||
"rtt-mixer" for negotiation of multiparty-mixing capability with the | attribute, "rtt-mixer", for negotiation of multiparty-mixing | |||
[RFC4103] format, and by specifying the rules for packets when | capability with the format described in [RFC4103] and by specifying | |||
multiparty capability is negotiated and in use. | the rules for packets when multiparty capability is negotiated and in | |||
use. | ||||
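
As a rough illustration of the capability indication, the following sketch checks whether a text media section of an SDP description carries the "rtt-mixer" attribute. The function name and the simplified line-based parsing are assumptions of this example; a real implementation would use a complete SDP parser and follow the offer/answer rules in Section 2.3.

```python
def text_media_supports_rtt_mixer(sdp: str) -> bool:
    """Return True if any m=text section contains the a=rtt-mixer attribute."""
    in_text_section = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media section starts; later attributes belong to it.
            in_text_section = line.startswith("m=text ")
        elif in_text_section and line == "a=rtt-mixer":
            return True
    return False


# Illustrative media section; port and payload type numbers are examples only.
EXAMPLE_SDP = """\
m=text 11000 RTP/AVP 100 98
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=fmtp:100 98/98/98
a=rtt-mixer
"""
```

The attribute is only honored at the media level of a text section, matching the "Usage level: media" entry in the registration.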
8. Congestion considerations | 8. Congestion Considerations | |||
The congestion considerations and recommended actions from [RFC4103] | The congestion considerations and recommended actions provided in | |||
are also valid in multiparty situations. | [RFC4103] are also valid in multiparty situations. | |||
The time values SHALL then be applied per source of text sent to a | The time values SHALL then be applied per source of text sent to a | |||
receiver. | receiver. | |||
If the very unlikely situation appears that many participants in a | In the very unlikely event that many participants in a conference | |||
conference send text simultaneously for a long period, a delay may | send text simultaneously for a long period of time, a delay may build | |||
build up for presentation of text at the receivers if the limitation | up for the presentation of text at the receivers if the limitation in | |||
in characters per second ("cps") to be transmitted to the | characters per second ("cps") to be transmitted to the participants | |||
participants is exceeded. More delay than 7 seconds can cause | is exceeded. A delay of more than 15 seconds can cause confusion in | |||
confusion in the session. It is therefore RECOMMENDED that an RTP- | the session. It is therefore RECOMMENDED that an RTP mixer discard | |||
mixer-based mixer discards such text causing excessive delays and | such text causing excessive delays and insert a general indication of | |||
inserts a general indication of possible text loss [T140ad1] in the | possible text loss [T140ad1] in the session. If the main text | |||
session. If the main text contributor is indicated in any way, the | contributor is indicated in any way, the mixer MAY avoid deleting | |||
mixer MAY avoid deleting text from that participant. It should | text from that participant. It should, however, be noted that human | |||
however be noted that human creation of text normally contains | creation of text normally contains pauses, when the transmission can | |||
pauses, when the transmission can catch up, so that the transmission | catch up, so that transmission-overload situations are expected to be | |||
overload situations are expected to be very rare. | very rare. | |||
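
The discard recommendation above can be sketched as a per-source queue in the mixer. This is a minimal sketch under stated assumptions: the class `SourceQueue`, the character `budget` standing in for the "cps" limit, and the use of U+FFFD as the loss indication are choices made for the example. The delay threshold is a parameter because the two compared versions give different values (7 seconds in the draft, 15 seconds in the RFC).

```python
from collections import deque

# Assumption: U+FFFD is used as the general indication of possible text loss.
LOSS_MARK = "\ufffd"


class SourceQueue:
    """Hypothetical queue of text from one source waiting for transmission."""

    def __init__(self, max_delay_s: float):
        self.max_delay_s = max_delay_s
        self._pending = deque()  # entries are (arrival_time, text), oldest first

    def enqueue(self, now: float, text: str) -> None:
        self._pending.append((now, text))

    def drain(self, now: float, budget: int, protected: bool = False) -> str:
        """Return the text to send now, limited to `budget` characters.

        Text that has waited longer than max_delay_s is discarded and
        replaced by one loss mark, unless the source is protected
        (for example, an indicated main contributor).
        """
        out = []
        dropped = False
        while (not protected and self._pending
               and now - self._pending[0][0] > self.max_delay_s):
            self._pending.popleft()
            dropped = True
        if dropped:
            # The loss mark is sent even if the budget is exhausted.
            out.append(LOSS_MARK)
            budget -= 1
        while self._pending and budget >= len(self._pending[0][1]):
            _, text = self._pending.popleft()
            out.append(text)
            budget -= len(text)
        return "".join(out)
```

Keeping one queue per source follows the statement that the time values are applied per source of text sent to a receiver.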
9. IANA Considerations | 9. IANA Considerations | |||
9.1. Registration of the "rtt-mixer" SDP media attribute | 9.1. Registration of the "rtt-mixer" SDP Media Attribute | |||
[RFC EDITOR NOTE: Please replace all instances of RFCXXXX with the | ||||
RFC number of this document.] | ||||
IANA is asked to register the new SDP attribute "rtt-mixer". | IANA has registered the new SDP attribute "rtt-mixer". | |||
Contact name: IESG | Contact name: IESG | |||
Contact email: iesg@ietf.org | Contact email: iesg@ietf.org | |||
Attribute name: rtt-mixer | Attribute name: rtt-mixer | |||
Attribute semantics: See RFCXXXX Section 2.3 | Attribute semantics: See RFC 9071, Section 2.3 | |||
Attribute value: none | Attribute value: none | |||
Usage level: media | Usage level: media | |||
Purpose: Indicate support by mixer and endpoint of multiparty mixing | Purpose: To indicate mixer and endpoint support of multiparty mixing | |||
for real-time text transmission, using a common RTP-stream for | for real-time text transmission, using a common RTP stream for | |||
transmission of text from a number of sources mixed with one | transmission of text from a number of sources mixed with one | |||
source at a time and the source indicated in a single CSRC-list | source at a time and where the source is indicated in a single | |||
member. | CSRC-list member. | |||
Charset Dependent: no | Charset Dependent: no | |||
O/A procedure: See RFCXXXX Section 2.3 | O/A procedures: See RFC 9071, Section 2.3 | |||
Mux Category: normal | Mux Category: normal | |||
Reference: RFCXXXX | Reference: RFC 9071 | |||
10. Security Considerations | 10. Security Considerations | |||
The RTP-mixer model requires the mixer to be allowed to decrypt, | The RTP-mixer model requires the mixer to be allowed to decrypt, | |||
pack, and encrypt secured text from the conference participants. | pack, and encrypt secured text from conference participants. | |||
Therefore, the mixer needs to be trusted to maintain confidentiality | Therefore, the mixer needs to be trusted to maintain confidentiality | |||
and integrity of the RTT data. This situation is similar to the | and integrity of the real-time text data. This situation is similar | |||
situation for handling audio and video media in centralized mixers. | to the situation for handling audio and video media in centralized | |||
mixers. | ||||
The requirement to transfer information about the user in RTCP | The requirement to transfer information about the user in RTCP | |||
reports in SDES, CNAME, and NAME fields, and in conference | reports in SDES, CNAME, and NAME fields, and in conference | |||
notifications, may have privacy concerns as already stated in RFC | notifications, may have privacy concerns, as already stated in RFC | |||
3550 [RFC3550], and may be restricted for privacy reasons. When used | 3550 [RFC3550], and may be restricted for privacy reasons. When used | |||
for creation of readable labels in the presentation, the receiving | for the creation of readable labels in the presentation, the | |||
user will then get a more symbolic label for the source. | receiving user will then get a more symbolic label for the source. | |||
The services available through the RTT mixer may have special | The services available through the real-time text mixer may be of | |||
interest for deaf and hard-of-hearing persons. Some users may want | special interest to deaf and hard-of-hearing individuals. Some users | |||
to refrain from revealing such characteristics broadly in | may want to refrain from revealing such characteristics broadly in | |||
conferences. The design of the conference systems where the mixer is | conferences. Conference systems where the mixer is included MAY need | |||
included MAY need to be made with confidentiality of such | to be designed with the confidentiality of such characteristics in | |||
characteristics in mind. | mind. | |||
Participants with malicious intentions may appear and e.g., disturb | Participants with malicious intentions may appear and, for example, | |||
the multiparty session by emitting a continuous flow of text. They | disrupt the multiparty session by emitting a continuous flow of text. | |||
may also send text that appears to originate from other participants. | They may also send text that appears to originate from other | |||
Counteractions should be to require secure signaling, media and | participants. Countermeasures should include requiring secure | |||
authentication, and to provide higher-layer conference functions | signaling, media, and authentication, and providing higher-layer | |||
e.g., for blocking, muting, and expelling participants. | conference functions, e.g., for blocking, muting, and expelling | |||
participants. | ||||
Participants with malicious intentions may also try to disturb the | Participants with malicious intentions may also try to disrupt the | |||
presentation by sending incomplete or malformed control codes. | presentation by sending incomplete or malformed control codes. | |||
Handling of text from the different sources by the receivers MUST | Handling of text from the different sources by the receivers MUST | |||
therefore be well separated so that the effects of such actions only | therefore be well separated so that the effects of such actions only | |||
affect text from the source causing the action. | affect text from the source causing the action. | |||
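
One way to picture the separation requirement is a receiver that keeps decoding state per source, so that an unfinished control sequence from one participant cannot swallow text from another. The sketch below is illustrative only: the class `SeparatedReceiver` is hypothetical, and the rule that a control sequence starts with CSI (U+009B) and ends at the first letter is a simplification assumed for the example.

```python
# Assumption: a control sequence starts with CSI and ends with a letter.
CSI = "\u009b"


class SeparatedReceiver:
    """Hypothetical receiver keeping text and parser state per source."""

    def __init__(self):
        self._display = {}  # source id -> text presented so far
        self._partial = {}  # source id -> unfinished control sequence

    def receive(self, source: int, chunk: str) -> None:
        shown = self._display.get(source, "")
        partial = self._partial.get(source, "")
        for ch in chunk:
            if partial:
                partial += ch
                if ch.isalpha():
                    # Sequence complete; this sketch simply drops it.
                    partial = ""
            elif ch == CSI:
                partial = ch
            else:
                shown += ch
        self._display[source] = shown
        self._partial[source] = partial

    def text(self, source: int) -> str:
        return self._display.get(source, "")
```

Because the pending sequence is stored under the sending source, an incomplete code from source 1 leaves the presentation of source 2 untouched.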
Care should be taken that if use of the mixer is allowed for users | Care should be taken to avoid the possibility of attacks by | |||
both with and without security procedures, opens for possible attacks | unauthenticated call participants, and even eavesdropping and | |||
by both unauthenticated call participants and even eavesdropping and | manipulation of content by non-participants, if the use of the mixer | |||
manipulating of content non-participants. | is permitted for users both with and without security procedures. | |||
As already stated in Section 3.18, security in media SHOULD be | As already stated in Section 3.18, security in media SHOULD be | |||
applied by using DTLS-SRTP [RFC5764] on the media level. | applied by using DTLS-SRTP [RFC5764] at the media level. | |||
Further security considerations specific for this application are | Further security considerations specific to this application are | |||
specified in Section 3.18. | specified in Section 3.18. | |||
11. Change history | 11. References | |||
[RFC Editor: Please remove this section prior to publication.] | ||||
11.1. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-20 | ||||
Inclusion of edits as a response to a comment by Benjamin Kaduk in | ||||
section 3.16.3 to make the recovery procedure generic. | ||||
Added persons to the acknowledgements and moved acknowledgements to | ||||
last in the document. | ||||
11.2. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-19 | ||||
Edits because of comments in a review by Francesca Palombini. | ||||
Edits because of comments from Benjamin Kaduk. | ||||
Proposed to not change anything because of Robert Wilton's comments. | ||||
Two added sentences in the security section to meet comments by Roman | ||||
Danyliw. | ||||
11.3. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-18 | ||||
Edits of nits as proposed in a review by Lars Eggert. | ||||
Edits as response to review by Martin Duke. | ||||
11.4. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-17 | ||||
Actions on Gen-ART review comments. | ||||
Actions on SecDir review comments. | ||||
11.5. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-16 | ||||
Improvements in the offer/answer considerations section by adding | ||||
subsections for each phase in the negotiation as requested by IANA | ||||
expert review. | ||||
11.6. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-15 | ||||
Actions on review comments from Jurgen Schonwalder: | ||||
A bit more about congestion situations and that they are expected to | ||||
be very rare. | ||||
Explanation of differences in security between the conference-aware | ||||
and the conference-unaware case added in security section. | ||||
Presentation examples with source labels made less confusing, and | ||||
explained. | ||||
Reference to T.140 inserted at first mentioning of T.140. | ||||
Reference to RFC 8825 inserted to explain WebRTC | ||||
Nit in wording in terminology section adjusted. | ||||
11.7. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-14 | ||||
Changes from comments by Murray Kucherawy during AD review. | ||||
Many SHOULD in section 4.2 on multiparty-unaware mixing changed to | ||||
SHALL, and the whole section instead specified to be optional | ||||
depending on the application. | ||||
Some SHOULD in section 3 either explained or changed to SHALL. | ||||
In order to have explainable conditions behind SHOULDs, the | ||||
transmission interval in 3.4 is changed to as soon as text is | ||||
available as a main principle. The call participants send with 300 | ||||
ms interval so that will create realistic load conditions anyway. | ||||
11.8. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-13 | ||||
Changed year to 2021. | ||||
Changed reference to draft on RTT in WebRTC to recently published RFC | ||||
8865. | ||||
Changed label brackets in example from "[]" to "()" to avoid nits | ||||
comment. | ||||
Changed reference "RFC 4566" to recently published "RFC 8866" | ||||
11.9. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-12 | ||||
Changes according to responses on comments from Brian Rosen in | ||||
Avtcore list on 2020-12-05 and -06. | ||||
Changes according to responses to comments by Bernard Aboba in | ||||
avtcore list 2020-12-06. | ||||
Introduction of an optional RTP multi-stream mixing method for further | ||||
study as proposed by Bernard Aboba. | ||||
Changes clarifying how to open and close T.140 data channels included | ||||
in 6.2 after comments by Lorenzo Miniero. | ||||
Changes to satisfy nits check. Some "not" changed to "NOT" in | ||||
normative wording combinations. Some lower case normative words | ||||
changed to upper case. A normative reference deleted from the | ||||
abstract. Two informative documents moved from normative references | ||||
to informative references. | ||||
11.10. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-11 | ||||
Timestamps and timestamp offsets added to the packet examples in | ||||
section 3.23, and the description corrected. | ||||
A number of minor corrections added in sections 3.10 - 3.23. | ||||
11.11. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-10 | ||||
The packet composition was modified for interleaving packets from | ||||
different sources. | ||||
The packet reception was modified for the new interleaving method. | ||||
The packet sequence examples were adjusted for the new interleaving | ||||
method. | ||||
Modifications according to responses to Brian Rosen of 2020-11-03 | ||||
11.12. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-09 | ||||
Changed name on the SDP media attribute to "rtt-mixer" | ||||
Restructure of section 2 for balance between aware and unaware cases. | ||||
Moved conference control to own section. | ||||
Improved clarification of recovery and loss in the packet sequence | ||||
example. | ||||
A number of editorial corrections and improvements. | ||||
11.13. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-08 | ||||
Deleted the method requiring a new packet format "text/rex" because | ||||
of the longer standardization and implementation period it needs. | ||||
Focus on use of RFC 4103 text/red format with shorter transmission | ||||
interval, and source indicated in CSRC. | ||||
11.14. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-07 | ||||
Added a method based on the "text/red" format and single source per | ||||
packet, negotiated by the "rtt-mixer" SDP attribute. | ||||
Added reasoning and recommendation about indication of loss. | ||||
The highest number of sources in one packet is 15, not 16. Changed. | ||||
Added in information on update to RFC 4103 that RFC 4103 explicitly | ||||
allows addition of FEC method. The redundancy is a kind of forward | ||||
error correction. | ||||
11.15. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-06 | ||||
Improved definitions list format. | ||||
The format of the media subtype parameters is made to match the | ||||
requirements. | ||||
The mapping of media subtype parameters to SDP is included. | ||||
The "cps" parameter belongs to the t140 subtype and does not need to | ||||
be registered here. | ||||
11.16. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-05 | ||||
nomenclature and editorial improvements | ||||
"this document" used consistently to refer to this document. | ||||
11.17. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-04 | ||||
'Redundancy header' renamed to 'data header'. | ||||
More clarifications added. | ||||
Language and figure number corrections. | ||||
11.18. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-03 | ||||
Mention possible need to mute and raise hands as for other media. | ||||
---done ---- | ||||
Make sure that use in two-party calls is also possible and explained. | ||||
- may need more wording - | ||||
Clarify the RTT is often used together with other media. --done-- | ||||
Tell that text mixing is N-1. A user's own text is not received in | ||||
the mix. -done- | ||||
In 3. correct the interval to: A "text/rex" transmitter SHOULD send | ||||
packets distributed in time as long as there is something (new or | ||||
redundant T140blocks) to transmit. The maximum transmission interval | ||||
SHOULD then be 300 ms. It is RECOMMENDED to send a packet to a | ||||
receiver as soon as new text to that receiver is available, as long | ||||
as the time after the latest sent packet to the same receiver is more | ||||
than 150 ms, and also the maximum character rate to the receiver is | ||||
not exceeded. The intention is to keep the latency low while keeping | ||||
a good protection against text loss in bursty packet loss conditions. | ||||
-done- | ||||
In 1.3 say that the format is used both ways. -done- | ||||
In 13.1 change presentation area to presentation field so that reader | ||||
does not think it shall be totally separated. -done- | ||||
In Performance and intro, tell the performance in number of | ||||
simultaneous sending users and introduced delay 16, 150 vs | ||||
requirements 5 vs 500. -done -- | ||||
Clarify redundancy level per connection. -done- | ||||
Timestamp also for the last data header. To make it possible for all | ||||
text to have time offset as for transmission from the source. Make | ||||
that header equal to the others. -done- | ||||
Mixer always use the CSRC list, even for its own BOM. -done- | ||||
Combine all talk about transmission interval (300 ms vs when text has | ||||
arrived) in section 3 in one paragraph or close to each other. -done- | ||||
Documents the goal of good performance with low delay for 5 | ||||
simultaneous typers in the introduction. -done- | ||||
Describe better that only primary text shall be sent on to receivers. | ||||
Redundancy and loss must be resolved by the mixer. -done- | ||||
11.19. Changes included in draft-ietf-avtcore-multi-party-rtt-mix-02 | ||||
SDP and better description and visibility of security by OSRTP RFC | ||||
8634 needed. | ||||
The description of gatewaying to WebRTC extended. | ||||
The description of the data header in the packet is improved. | ||||
11.20. Changes to draft-ietf-avtcore-multi-party-rtt-mix-01 | ||||
2,5,6 More efficient format "text/rex" introduced and attribute | ||||
a=rtt-mix deleted. | ||||
3. Brief about use of OSRTP for security included- More needed. | ||||
4. Brief motivation for the solution and why not rtp-translator is | ||||
used added to intro. | ||||
7. More limitations for the multiparty-unaware mixing method | ||||
inserted. | ||||
8. Updates to RFC 4102 and 4103 more clearly expressed. | ||||
9. Gateway to WebRTC started. More needed. | ||||
11.21. Changes from draft-hellstrom-avtcore-multi-party-rtt-source-03 | ||||
to draft-ietf-avtcore-multi-party-rtt-mix-00 | ||||
Changed file name to draft-ietf-avtcore-multi-party-rtt-mix-00 | ||||
Replaced CDATA in IANA registration table with better coding. | ||||
Converted to xml2rfc version 3. | ||||
11.22. Changes from draft-hellstrom-avtcore-multi-party-rtt-source-02 | ||||
to -03 | ||||
Changed company and e-mail of the author. | ||||
Changed title to "RTP-mixer formatting of multi-party Real-time text" | ||||
to better match contents. | ||||
Check and modification where needed of use of RFC 2119 words SHALL | ||||
etc. | ||||
More about the CC value in sections on transmitters and receivers so | ||||
that 1-to-1 sessions do not use the mixer format. | ||||
Enhanced section on presentation for multiparty-unaware endpoints | ||||
A paragraph recommending cps=150 inserted in the performance section. | ||||
11.23. Changes from draft-hellstrom-avtcore-multi-party-rtt-source-01 | ||||
to -02 | ||||
In Abstract and 1. Introduction: Introduced wording about regulatory | ||||
requirements. | ||||
In section 5: The transmission interval is decreased to 100 ms when | ||||
there is text from more than one source to transmit. | ||||
In section 11 about SDP negotiation, a SHOULD-requirement is | ||||
introduced that the mixer should make a mix for multiparty-unaware | ||||
endpoints if the negotiation is not successful. And a reference to a | ||||
later chapter about it. | ||||
The presentation considerations chapter 14 is extended with more | ||||
information about presentation on multiparty-aware endpoints, and a | ||||
new section on the multiparty-unaware mixing with low functionality | ||||
but SHOULD be implemented in mixers. Presentation examples are | ||||
added. | ||||
A short chapter 15 on gateway considerations is introduced. | ||||
Clarification about the text/t140 format included in chapter 10. | ||||
This sentence added to the chapter 10 about use without redundancy. | ||||
"The text/red format SHOULD be used unless some other protection | ||||
against packet loss is utilized, for example a reliable network or | ||||
transport." | ||||
Note about deviation from RFC 2198 added in chapter 4. | ||||
In chapter 9. "Use with SIP centralized conferencing framework" the | ||||
following note is inserted: Note: The CSRC-list in an RTP packet only | ||||
includes participants whose text is included in one or more text | ||||
blocks. It is not the same as the list of participants in a | ||||
conference. With audio and video media, the CSRC-list would often | ||||
contain all participants who are not muted whereas text participants | ||||
that don't type are completely silent and so don't show up in RTP | ||||
packet CSRC-lists. | ||||
11.24. Changes from draft-hellstrom-avtcore-multi-party-rtt-source-00 | ||||
to -01 | ||||
Editorial cleanup. | ||||
Changed capability indication from fmtp-parameter to SDP attribute | ||||
"rtt-mix". | ||||
Swapped order of redundancy elements in the example to match reality. | ||||
Increased the SDP negotiation section | 11.1. Normative References | |||
12. References | [ECMA-48] Ecma International, "ECMA-48: Control functions for coded | |||
character sets", 5th edition, June 1991, | ||||
<https://www.ecma-international.org/publications-and- | ||||
standards/standards/ecma-48/>. | ||||
12.1. Normative References | [ISO6429] ISO/IEC, "Information technology - Control functions for | |||
coded character sets", ISO/IEC 6429:1992, December | ||||
1992, <https://www.iso.org/obp/ui/#iso:std:iso- | ||||
iec:6429:ed-3:v1:en>. | ||||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | |||
Jacobson, "RTP: A Transport Protocol for Real-Time | Jacobson, "RTP: A Transport Protocol for Real-Time | |||
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, | Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, | |||
July 2003, <https://www.rfc-editor.org/info/rfc3550>. | July 2003, <https://www.rfc-editor.org/info/rfc3550>. | |||
skipping to change at page 48, line 36 ¶ | skipping to change at line 1852 ¶ | |||
[RFC8865] Holmberg, C. and G. Hellström, "T.140 Real-Time Text | [RFC8865] Holmberg, C. and G. Hellström, "T.140 Real-Time Text | |||
Conversation over WebRTC Data Channels", RFC 8865, | Conversation over WebRTC Data Channels", RFC 8865, | |||
DOI 10.17487/RFC8865, January 2021, | DOI 10.17487/RFC8865, January 2021, | |||
<https://www.rfc-editor.org/info/rfc8865>. | <https://www.rfc-editor.org/info/rfc8865>. | |||
[RFC8866] Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP: | [RFC8866] Begen, A., Kyzivat, P., Perkins, C., and M. Handley, "SDP: | |||
Session Description Protocol", RFC 8866, | Session Description Protocol", RFC 8866, | |||
DOI 10.17487/RFC8866, January 2021, | DOI 10.17487/RFC8866, January 2021, | |||
<https://www.rfc-editor.org/info/rfc8866>. | <https://www.rfc-editor.org/info/rfc8866>. | |||
[T140] ITU-T, "Recommendation ITU-T T.140 (02/1998), Protocol for | [T140] ITU-T, "Protocol for multimedia application text | |||
multimedia application text conversation", February 1998, | conversation", ITU-T Recommendation T.140, February 1998, | |||
<https://www.itu.int/rec/T-REC-T.140-199802-I/en>. | <https://www.itu.int/rec/T-REC-T.140-199802-I/en>. | |||
[T140ad1] ITU-T, "Recommendation ITU-T.140 Addendum 1 - (02/2000), | [T140ad1] ITU-T, "Recommendation T.140 Addendum", February 2000, | |||
Protocol for multimedia application text conversation", | ||||
February 2000, | ||||
<https://www.itu.int/rec/T-REC-T.140-200002-I!Add1/en>. | <https://www.itu.int/rec/T-REC-T.140-200002-I!Add1/en>. | |||
12.2. Informative References | 11.2. Informative References | |||
[RFC4353] Rosenberg, J., "A Framework for Conferencing with the | [RFC4353] Rosenberg, J., "A Framework for Conferencing with the | |||
Session Initiation Protocol (SIP)", RFC 4353, | Session Initiation Protocol (SIP)", RFC 4353, | |||
DOI 10.17487/RFC4353, February 2006, | DOI 10.17487/RFC4353, February 2006, | |||
<https://www.rfc-editor.org/info/rfc4353>. | <https://www.rfc-editor.org/info/rfc4353>. | |||
[RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A | [RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A | |||
Session Initiation Protocol (SIP) Event Package for | Session Initiation Protocol (SIP) Event Package for | |||
Conference State", RFC 4575, DOI 10.17487/RFC4575, August | Conference State", RFC 4575, DOI 10.17487/RFC4575, August | |||
2006, <https://www.rfc-editor.org/info/rfc4575>. | 2006, <https://www.rfc-editor.org/info/rfc4575>. | |||
skipping to change at page 49, line 43 ¶ | skipping to change at line 1904 ¶ | |||
DOI 10.17487/RFC8723, April 2020, | DOI 10.17487/RFC8723, April 2020, | |||
<https://www.rfc-editor.org/info/rfc8723>. | <https://www.rfc-editor.org/info/rfc8723>. | |||
[RFC8825] Alvestrand, H., "Overview: Real-Time Protocols for | [RFC8825] Alvestrand, H., "Overview: Real-Time Protocols for | |||
Browser-Based Applications", RFC 8825, | Browser-Based Applications", RFC 8825, | |||
DOI 10.17487/RFC8825, January 2021, | DOI 10.17487/RFC8825, January 2021, | |||
<https://www.rfc-editor.org/info/rfc8825>. | <https://www.rfc-editor.org/info/rfc8825>. | |||
Acknowledgements | Acknowledgements | |||
The author want to thank the following persons for support, reviews | The author wants to thank the following persons for support, reviews, | |||
and valuable comments: Bernard Aboba, Amanda Baber, Roman Danyliw, | and valuable comments: Bernard Aboba, Amanda Baber, Roman Danyliw, | |||
Spencer Dawkins, Martin Duke, Lars Eggert, James Hamlin, Benjamin | Spencer Dawkins, Martin Duke, Lars Eggert, James Hamlin, Benjamin | |||
Kaduk, Murray Kucherawy, Paul Kyziwat, Jonathan Lennox, Lorenzo | Kaduk, Murray Kucherawy, Paul Kyzivat, Jonathan Lennox, Lorenzo | |||
Miniero, Dan Mongrain, Francesca Palombini, Colin Perkins, Brian | Miniero, Dan Mongrain, Francesca Palombini, Colin Perkins, Brian | |||
Rosen, Juergen Schoenwaelder, Rich Salz, Robert Wilton, Dale Worley, | Rosen, Rich Salz, Jürgen Schönwälder, Robert Wilton, Dale Worley, | |||
Peter Yee and Yong Xin. | Yong Xin, and Peter Yee. | |||
Author's Address | Author's Address | |||
Gunnar Hellstrom | ||||
Gunnar Hellstrom Accessible Communication | Gunnar Hellström | |||
SE-13670 Vendelso | Gunnar Hellström Accessible Communication | |||
SE-13670 Vendelsö | ||||
Sweden | Sweden | |||
Email: gunnar.hellstrom@ghaccess.se | Email: gunnar.hellstrom@ghaccess.se | |||
End of changes. 302 change blocks. | ||||
1423 lines changed or deleted | 1091 lines changed or added | |||