Network Working Group                                       Zhenjie Deng
Internet Draft                                                      UCAS
Intended status: Standards Track                        December 2, 2013
Expires: May 3, 2014


     Non-Renegable Selective Acknowledgements (NR-SACKs) for MPTCP
                     draft-deng-mptcp-nrsack-00.txt


Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may not be modified, and derivative works of it may
   not be created, and it may not be published except as an
   Internet-Draft.

   This document may not be modified, and derivative works of it may
   not be created, except to publish it as an RFC and to translate it
   into languages other than English.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s)
   controlling the copyright in such materials, this document may not
   be modified outside the IETF Standards Process, and derivative
   works of it may not be created outside the IETF Standards Process,
   except to format it for publication as an RFC or to translate it
   into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."
Deng, et al               Expires May 3, 2014                   [Page 1]

Internet-Draft        Non-renegable SACK for MPTCP         December 2013

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on May 3, 2014.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Abstract

   Multipath Transmission Control Protocol (MPTCP) [RFC6824] adopts
   Selective Acknowledgements (SACKs) at the subflow level to allow an
   MPTCP receiver to acknowledge the receipt of out-of-order data. In
   MPTCP, SACK information is expected (but not mandated). Although
   SACKs notify a data sender of the reception of specific
   out-of-order data, that data cannot be delivered to the application
   layer until it has been cumulatively acknowledged at the connection
   level. The MPTCP data receiver is permitted to later discard
   out-of-order data cached in the receive buffer; such data is called
   renegable.

   Since SACKed out-of-order data is renegable, the sender has to keep
   copies of SACKed data in the send buffer until that data is
   cumulatively acked. As a result, send buffer space is wasted and
   the transmission rate is restricted even when the network is not
   congested.

   In current MPTCP, only packets that have been cumulatively acked
   and delivered to the application are considered non-renegable. The
   transport sender has to keep renegable packets in the
   retransmission queue in case they must be retransmitted, so SACKs
   in MPTCP inevitably result in send buffer wastage. Interestingly,
   an MPTCP implementation can be configured such that the MPTCP
   receiver is not allowed to renege and therefore never discards
   out-of-order data.

   This document specifies an extension to MPTCP's acknowledgement
   mechanism called Non-Renegable Selective Acknowledgements
   (NR-SACKs). This mechanism allows a data receiver to explicitly
   inform the data sender of non-renegable out-of-order data. That is,
   the data receiver will not discard the out-of-order data, so no
   retransmission is required for it. The data sender can therefore
   remove the NR-SACKed out-of-order data from the retransmission
   queue, and the application can write new data to the send buffer.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Negotiation
   4. Non-renegable SACK (NR-SACK)
   5. IANA Consideration
   6. Security Considerations
   7. References
      7.1. Normative References
      7.2. Informative References
   8. Acknowledgments

1. Introduction

   To provide full end-to-end reliable data transfer, MPTCP specifies
   a connection-level acknowledgement that acts as a cumulative ACK
   for the connection as a whole. This is the "Data ACK" field of the
   Data Sequence Signal (DSS) option. Meanwhile, each MPTCP subflow
   acts as a standard TCP connection with its own subflow-level
   sequence number space (i.e., the regular sequence numbers in the
   TCP header). The data sequence mapping defines the mapping from the
   subflow sequence number to the data sequence number. Through the
   use of this mapping, the data stream can be reassembled and
   delivered to the application layer. SACK information is advisory at
   the subflow level and serves to improve efficiency. In this
   document, we refer to the "Data ACK" at the connection level as
   "cum-ack" and to the selective acknowledgement at the subflow level
   as "gap-ack".

   In current MPTCP implementations, the connection-level Data ACK and
   the subflow-level acknowledgements are separate. Data is considered
   received and delivered to the application layer only once it has
   been cum-acked. Segments are placed in the receive buffer after
   they are acknowledged at the subflow level, and a receiver is free
   to drop them, for example when there is not enough memory to cache
   the out-of-order segments. Discarding a previously gap-acked
   segment is known as "reneging". Because out-of-order data is
   renegable, any gap-acked segment MUST be kept in the data sender's
   retransmission queue until it is later cum-acked at the connection
   level.

   There are some situations in which the out-of-order data in the
   receive buffer will never be discarded. That is, reneging will
   never take place.
   For example, the out-of-order data may be delivered to and handled
   by the application layer rather than being dropped from the receive
   buffer because of memory constraints. In these situations, a data
   sender can improve transmission efficiency by removing the
   out-of-order data from the retransmission queue, and the
   application can write new data into the send buffer.

   This document describes an extension to MPTCP to allow for
   Non-Renegable Selective Acknowledgements (NR-SACKs). Some MPTCP
   operations SHOULD be modified to allow this extension to be
   implemented. These modifications, including the structure of the
   NR-SACK option, are described in Section 4.

   If NR-SACKs are used, the data receiver MAY include the Data
   Sequence Number (DSN) of the delivered out-of-order segments in an
   NR-SACK message. This informs the data sender that the deliveries
   have occurred and that the out-of-order segments are non-renegable,
   allowing the data sender to remove its copies of the delivered
   segments from the retransmission queue even before the segments are
   cum-acked.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

3. Negotiation

   Before sending or receiving NR-SACKs, both peer endpoints MUST
   support and agree on using NR-SACKs. Because all MPTCP operations
   are signaled with a TCP option, this agreement MUST be negotiated
   during the initiation of an MPTCP connection, and the "MP_CAPABLE"
   option of MPTCP [RFC6824] MUST be modified to contain the
   "NR_SACK_CAPABLE" field, with which an endpoint declares its
   NR-SACK capability. The structure of the "NR_SACK_CAPABLE" option
   will be described in the next section.

   Note that the connection initiation of MPTCP begins with a
   three-way handshake consisting of a SYN, SYN/ACK, ACK exchange on
   an initiating path.
   An endpoint that supports and expects to use the NR-SACK extension
   of MPTCP MUST include both the "MP_CAPABLE" and the
   "NR_SACK_CAPABLE" options in each packet it sends during the
   initiation phase. A receiving endpoint MUST assume that the sending
   endpoint is capable of NR-SACK if it receives the "NR_SACK_CAPABLE"
   option from that peer. Both endpoints MUST support NR-SACKs for
   either endpoint to use them. If any of the SYN, SYN/ACK, and ACK
   segments does not contain the "NR_SACK_CAPABLE" option, the
   connection MUST NOT use NR-SACK and MUST fall back to standard
   MPTCP as described in [RFC6824]. After the initiation of the MPTCP
   connection, an endpoint MUST NOT negotiate the use of NR-SACK.

4. Non-renegable SACK (NR-SACK)

   In order to support the Non-renegable SACK, a new MPTCP option,
   signaled in the Data Sequence Signal (DSS) option, MUST be defined
   to carry NR-SACK information. Figure 1 shows the format of the
   NR-SACK option of MPTCP; its fields are described after the figure.
   Note that MPTCP uses the "Data ACK" in the DSS option for
   cumulative acknowledgement at the connection level; similarly, the
   NR-SACK is also signaled in the DSS option. We refer to a
   Non-renegable SACK as an "nr-gap-ack".

   Because NR-SACKs are also signaled in the DSS option, many NR-SACK
   fields are similar to those of the "Data ACK" option, and some have
   the same semantics as the corresponding "Data ACK" fields. Like the
   "Data ACK" option, the NR-SACK is sent to the peer endpoint to (1)
   acknowledge the segments received in order through cumulative
   acknowledgement, and (2) explicitly inform the peer endpoint of the
   non-renegable out-of-order segments.
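
   As a minimal, non-normative sketch, the effect of these two
   functions on the data sender's retransmission queue could look
   roughly as follows in Python. The names Segment,
   RetransmissionQueue, and on_nr_sack are hypothetical and do not
   come from this document or from any MPTCP implementation. The
   sketch assumes that the Data ACK is the next expected data sequence
   number, that each NR Gap Ack Block is a (Start, End) pair of
   offsets relative to the Data ACK with an inclusive End, and that
   sequence number wrap-around can be ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    dsn: int     # data sequence number of the first byte (hypothetical model)
    length: int  # payload length in bytes


@dataclass
class RetransmissionQueue:
    segments: list = field(default_factory=list)

    def on_nr_sack(self, data_ack, nr_blocks):
        """Release segments that are cum-acked or fully nr-gap-acked.

        data_ack  -- cumulative Data ACK, assumed to be the next expected DSN
        nr_blocks -- list of (start_offset, end_offset) pairs relative to
                     data_ack; end_offset is assumed to be inclusive
        Returns the number of bytes released from the queue.
        """
        # Convert relative offsets into absolute, inclusive DSN ranges.
        ranges = [(data_ack + start, data_ack + end) for start, end in nr_blocks]

        kept, freed = [], 0
        for seg in self.segments:
            first, last = seg.dsn, seg.dsn + seg.length - 1
            cum_acked = last < data_ack
            # Only segments entirely inside one block are treated as
            # non-renegable; partially covered segments stay queued.
            nr_acked = any(lo <= first and last <= hi for lo, hi in ranges)
            if cum_acked or nr_acked:
                freed += seg.length
            else:
                kept.append(seg)
        self.segments = kept
        return freed
```

   For example, with four 100-byte segments queued at DSNs 1000, 1100,
   1200, and 1300, a Data ACK of 1100 and a single block (100, 199)
   release the segment at 1000 (cum-acked) and the segment at 1200
   (nr-gap-acked), while the segments at 1100 and 1300 remain queued
   for possible retransmission.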

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------+---------------+-------+----------------------+
   |     Kind      |    Length     |Subtype| (rsvd) |n|N|F|m|M|a|A|
   +---------------+---------------+-------+----------------------+
   |         Data ACK (4 or 8 octets, depending on flags)         |
   +--------------------------------------------------------------+
   |               Number of NR Gap Ack Blocks = N                |
   +--------------------------------------------------------------+
   | NR Gap Ack Block #1 Start (4 or 8 octets, depending on flags)|
   +-------------------------------+------------------------------+
   | NR Gap Ack Block #1 End (4 or 8 octets, depending on flags)  |
   +-------------------------------+------------------------------+
   |                             ...                              |
   +--------------------------------------------------------------+
   | NR Gap Ack Block #N Start (4 or 8 octets, depending on flags)|
   +-------------------------------+------------------------------+
   | NR Gap Ack Block #N End (4 or 8 octets, depending on flags)  |
   +-------------------------------+------------------------------+

                       Figure 1: NR-SACK option

   The flags 'n' and 'N', when set, define the contents of this
   option, as follows:

   o  N = Non-renegable SACK present

   o  n = Non-renegable SACK fields are 8 octets (if not set, they are
      4 octets). Note that 'n' only has meaning when the corresponding
      flag 'N' is set.

   Subtype: This field contains the IANA-defined MPTCP operation for
      the "NR-SACK" option. The suggested value of this field for IANA
      is 0x2.

   Data ACK: Used for cumulative acknowledgement.

   Number of NR Gap Ack Blocks: Indicates the number of Non-Renegable
      Gap Ack Blocks included in this NR-SACK.

   NR Gap Ack Block Start: The length of this field depends on the
      flags 'n' and 'N'. When 'n' is set, the length is 8 octets (if
      not set, it is 4 octets).
      This field indicates the start offset of the data sequence
      number of the NR Gap Ack Block. This number is set relative to
      the cumulative acknowledgement number carried in the "Data ACK"
      field. The value of this field is calculated by subtracting the
      value in the "Data ACK" field from the first data sequence
      number in the NR Gap Ack Block.

   NR Gap Ack Block End: The length of this field depends on the
      flags 'n' and 'N'. When 'n' is set, the length is 8 octets (if
      not set, it is 4 octets). This field indicates the end offset of
      the data sequence number of the NR Gap Ack Block. This number is
      set relative to the cumulative acknowledgement number carried in
      the "Data ACK" field. The value of this field is calculated by
      subtracting the value in the "Data ACK" field from the last data
      sequence number in the NR Gap Ack Block.

5. IANA Consideration

   None

6. Security Considerations

   Security considerations discussed in [RFC6887] are to be taken into
   account. Security considerations discussed in [RFC6824] are to be
   taken into account when creating new TCP subflows.

7. References

7.1. Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", BCP 14, RFC 2119, March 1997.

   [2]  Ford, A., Raiciu, C., Handley, M., Barre, S., and J. Iyengar,
        "Architectural Guidelines for Multipath TCP Development",
        RFC 6182, March 2011.

   [3]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure, "TCP
        Extensions for Multipath Operation with Multiple Addresses",
        RFC 6824, January 2013.

   [4]  Natarajan, P., Ekiz, N., Yilmaz, E., et al., "Non-renegable
        selective acknowledgments (NR-SACKs) for SCTP", IEEE
        International Conference on Network Protocols (ICNP 2008),
        pp. 187-196, 2008.

7.2. Informative References

   [5]  Dreibholz, T., Becke, M., Adhari, H., et al., "Evaluation of a
        new multipath congestion control scheme using the NetPerfMeter
        tool-chain", 19th International Conference on Software,
        Telecommunications and Computer Networks (SoftCOM), IEEE,
        pp. 1-6, 2011.

   [6]  Ford, A., Raiciu, C., Handley, M., et al., "TCP Extensions for
        Multipath Operation with Multiple Addresses",
        draft-ietf-mptcp-multiaddressed-03, Roke Manor, 2011.

   [7]  Barre, S., Bonaventure, O., Raiciu, C., et al., "Experimenting
        with multipath TCP", ACM SIGCOMM Computer Communication
        Review, 40(4), pp. 443-444, 2010.

8. Acknowledgments

   This Internet Draft is the result of a great deal of constructive
   discussion with several people, notably Yinlong Liu, Shoushou Ren,
   Yahui Hu and Song Ci.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses

   Zhenjie Deng
   UCAS
   Institute of Acoustics, North Fourth Ring Road, Haidian District,
   No. 21, Beijing 100190
   P.R. China

   Email: dengzhj@hpnl.ac.cn

   Yinlong Liu
   UCAS
   Institute of Acoustics, North Fourth Ring Road, Haidian District,
   No. 21, Beijing 100190
   P.R. China

   Email: Liuyl@hpnl.ac.cn