rfc9682.original   rfc9682.txt 
CBOR C. Bormann Internet Engineering Task Force (IETF) C. Bormann
Internet-Draft Universität Bremen TZI Request for Comments: 9682 Universität Bremen TZI
Updates: 8610 (if approved) 24 June 2024 Updates: 8610 November 2024
Intended status: Standards Track Category: Standards Track
Expires: 26 December 2024 ISSN: 2070-1721
Updates to the CDDL grammar of RFC 8610 Updates to the Concise Data Definition Language (CDDL) Grammar
draft-ietf-cbor-update-8610-grammar-06
Abstract Abstract
The Concise Data Definition Language (CDDL), as defined in RFC 8610 The Concise Data Definition Language (CDDL), as defined in RFCs 8610
and RFC 9165, provides an easy and unambiguous way to express and 9165, provides an easy and unambiguous way to express structures
structures for protocol messages and data formats that are for protocol messages and data formats that are represented in
represented in CBOR or JSON. Concise Binary Object Representation (CBOR) or JSON.
The present document updates RFC 8610 by addressing errata and making
other small fixes for the ABNF grammar defined for CDDL there.
About This Document
This note is to be removed before publishing as an RFC.
The latest revision of this draft can be found at https://cbor-
wg.github.io/update-8610-grammar/. Status information for this
document may be found at https://datatracker.ietf.org/doc/draft-ietf-
cbor-update-8610-grammar/.
Discussion of this document takes place on the CBOR Working Group
mailing list (mailto:cbor@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/cbor/. Subscribe at
https://www.ietf.org/mailman/listinfo/cbor/.
Source for this draft and an issue tracker can be found at This document updates RFC 8610 by addressing related errata reports
https://github.com/cbor-wg/update-8610-grammar. and making other small fixes for the ABNF grammar defined for CDDL.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on 26 December 2024. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9682.
Copyright Notice Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document. Code Components extracted from this document must
described in Section 4.e of the Trust Legal Provisions and are include Revised BSD License text as described in Section 4.e of the
provided without warranty as described in the Revised BSD License. Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction
1.1. Conventions and Definitions . . . . . . . . . . . . . . . 3 1.1. Conventions and Definitions
2. Clarifications and Changes based on Errata Reports . . . . . 3 2. Clarifications and Changes Based on Errata Reports
2.1. Updates to String Literal Grammar . . . . . . . . . . . . 3 2.1. Updates to String Literal Grammar
Err6527 (Text String Literals) . . . . . . . . . . . . . . . 3 2.1.1. Erratum ID 6527 (Text String Literals)
Err6278 (Consistent String Literals) . . . . . . . . . . . . 5 2.1.2. Erratum ID 6278 (Consistent String Literals)
Addressing Err6526, Err6543 . . . . . . . . . . . . . . . . . 5 2.1.3. Addressing Erratum ID 6526 and Erratum ID 6543
2.2. Examples Demonstrating the Updated String Syntaxes . . . 5 2.2. Examples Demonstrating the Updated String Syntaxes
3. Small Enabling Grammar Changes . . . . . . . . . . . . . . . 6 3. Small Enabling Grammar Changes
3.1. Empty data models . . . . . . . . . . . . . . . . . . . . 7 3.1. Empty Data Models
3.2. Non-literal Tag Numbers, Simple Values . . . . . . . . . 7 3.2. Non-Literal Tag Numbers and Simple Values
4. Security Considerations . . . . . . . . . . . . . . . . . . . 8 4. Security Considerations
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 5. IANA Considerations
6. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 6. References
6.1. Normative References . . . . . . . . . . . . . . . . . . 9 6.1. Normative References
6.2. Informative References . . . . . . . . . . . . . . . . . 9 6.2. Informative References
Appendix A. Updated Collected ABNF for CDDL . . . . . . . . . . 11 Appendix A. Updated Collected ABNF for CDDL
Appendix B. Details about Covering Errata Report 6543 . . . . . 13 Appendix B. Details about Covering Erratum ID 6543
Change Proposed By Errata Report 6543 . . . . . . . . . . . . . 13 B.1. Change Proposed by Erratum ID 6543
No Further Change Needed After Updating String Literal Grammar B.2. No Further Change Needed after Updating String Literal
(Section 2.1) . . . . . . . . . . . . . . . . . . . . . . . 14 Grammar
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 15 Acknowledgments
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 15 Author's Address
1. Introduction 1. Introduction
The Concise Data Definition Language (CDDL), as defined in [RFC8610] The Concise Data Definition Language (CDDL), as defined in [RFC8610]
and [RFC9165], provides an easy and unambiguous way to express and [RFC9165], provides an easy and unambiguous way to express
structures for protocol messages and data formats that are structures for protocol messages and data formats that are
represented in CBOR or JSON. represented in CBOR or JSON.
The present document updates [RFC8610] by addressing errata and This document updates [RFC8610] by addressing errata reports and
making other small fixes for the ABNF grammar defined for CDDL there. making other small fixes for the ABNF grammar defined for CDDL. The
The body of this document motivates and explains the updates; the body of this document explains and shows motivation for the updates;
updated collected ABNF syntax in Figure 11 in Appendix A replaces the the updated collected ABNF syntax in Figure 11 in Appendix A replaces
collected ABNF syntax in Appendix B of [RFC8610]. the collected ABNF syntax in Appendix B of [RFC8610].
1.1. Conventions and Definitions 1.1. Conventions and Definitions
The Terminology from [RFC8610] applies. The grammar in [RFC8610] is The terminology from [RFC8610] applies. The grammar in [RFC8610] is
based on ABNF, which is defined in [STD68] and [RFC7405]. based on ABNF, which is defined in [STD68] and [RFC7405].
2. Clarifications and Changes based on Errata Reports 2. Clarifications and Changes Based on Errata Reports
A number of errata reports have been made around some details of text A number of errata reports have been made regarding some details of
string and byte string literal syntax: [Err6527] and [Err6543]. text string and byte string literal syntax: for example, [Err6527]
These are being addressed in this section, updating details of the and [Err6543]. These are being addressed in this section, updating
ABNF for these literal syntaxes. Also, [Err6526] needs to be applied details of the ABNF for these literal syntaxes. Also, the changes
(backslashes have been lost during RFC processing in some text described in [Err6526] need to be applied (backslashes have been lost
explaining backslash escaping). during the RFC publication process of Appendix G.2 of [RFC8610],
garbling the text explaining backslash escaping).
These changes are intended to mirror the way existing implementations These changes are intended to mirror the way existing implementations
have dealt with the errata. They also use the opportunity presented have dealt with the errata reports. This document also uses the
by the necessary cleanup of the grammar of string literals for a opportunity presented by the necessary cleanup of the grammar of
backward compatible addition to the syntax for hexadecimal escapes. string literals for a backward-compatible addition to the syntax for
The latter change is not automatically forward compatible (i.e., CDDL hexadecimal escapes. The latter change is not automatically forward
specifications that make use of this syntax do not necessarily work compatible (i.e., CDDL specifications that make use of this syntax do
with existing implementations until these are updated, which this not necessarily work with existing implementations until these are
specification recommends). updated, which is recommended by this specification).
2.1. Updates to String Literal Grammar 2.1. Updates to String Literal Grammar
Err6527 (Text String Literals) 2.1.1. Erratum ID 6527 (Text String Literals)
The ABNF used in [RFC8610] for the content of text string literals is The ABNF used in [RFC8610] for the content of text string literals is
rather permissive: rather permissive:
; RFC 8610 ABNF: ; ABNF from RFC 8610:
text = %x22 *SCHAR %x22 text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD) SESC = "\" (%x20-7E / %x80-10FFFD)
Figure 1: Original RFC 8610 ABNF for strings with permissive ABNF
for SESC, but not allowing hex escapes Figure 1: Original ABNF from RFC 8610 for Strings with Permissive
ABNF for SESC (Which Did Not Allow Hex Escapes)
This allows almost any non-C0 character to be escaped by a backslash, This allows almost any non-C0 character to be escaped by a backslash,
but critically misses out on the \uXXXX and \uHHHH\uLLLL forms that but critically misses out on the \uXXXX and \uHHHH\uLLLL forms that
JSON allows to specify characters in hex (which should be applying JSON allows to specify characters in hex (which should apply here
here according to Bullet 6 of Section 3.1 of [RFC8610]). (Note that according to item 6 of Section 3.1 of [RFC8610]). (Note that CDDL
CDDL imports from JSON the unwieldy \uHHHH\uLLLL syntax, which imports from JSON the unwieldy \uHHHH\uLLLL syntax, which represents
represents Unicode code points beyond U+FFFF by making them look like Unicode code points beyond U+FFFF by making them look like UTF-16
UTF-16 surrogate pairs; CDDL text strings are not using UTF-16 or surrogate pairs; CDDL text strings do not use UTF-16 or surrogates.)
surrogates.)
Both can be solved by updating the SESC rule. This document uses the Both can be solved by updating the SESC rule. This document uses the
opportunity to add a popular form of directly specifying characters opportunity to add a popular form of directly specifying characters
in strings using hexadecimal escape sequences of the form \u{hex}, in strings using hexadecimal escape sequences of the form \u{hex},
where hex is the hexadecimal representation of the Unicode scalar where hex is the hexadecimal representation of the Unicode scalar
value. The result is the new set of rules defining SESC in Figure 2: value. The result is the new set of rules defining SESC in Figure 2.
; new rules collectively defining SESC: ; new rules collectively defining SESC:
SESC = "\" ( %x22 / "/" / "\" / ; \" \/ \\ SESC = "\" ( %x22 / "/" / "\" / ; \" \/ \\
%x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
(%x75 hexchar) ) ; \uXXXX (%x75 hexchar) ) ; \uXXXX
hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" / hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" /
non-surrogate / (high-surrogate "\" %x75 low-surrogate) non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) / non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
("D" %x30-37 2HEXDIG ) ("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG
/ non-surrogate / 1*3HEXDIG / non-surrogate / 1*3HEXDIG
HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F" HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
Figure 2: Update to string ABNF in Appendix B of RFC 8610: allow Figure 2: Update to String ABNF in Appendix B of [RFC8610]: Allow
hex escapes Hex Escapes
(Notes: In ABNF, strings such as "A", "B" etc. are case-insensitive, | Notes: In ABNF, strings such as "A", "B", etc., are case
as is intended here. The rules above could, instead of %x62, also | insensitive, as is intended here. The rules above could have
have used %s"b" etc., but didn't, in order to maximize ABNF tool | also used %s"b", etc., instead of %x62, but didn't, in order to
compatibility.) | maximize compatibility with ABNF tools.
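The escape forms admitted by the updated SESC rule can be illustrated with a short decoder. The following Python is a minimal sketch, not part of the specification: the names unescape_sesc, _ESCAPE, and _SIMPLE are hypothetical, and the sketch assumes that the string body has already been extracted from between its quotes. It checks the \u{hex} form by value (at most 10FFFF, not a surrogate), which is intended to accept the same inputs as the hexscalar rule with optional leading zeros.

```python
import re

# Single-character escapes allowed by the updated SESC rule.
_SIMPLE = {'"': '"', '/': '/', '\\': '\\',
           'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}

# One alternative per escape form; the surrogate pair is tried before \uXXXX.
_ESCAPE = re.compile(
    r'\\(?:'
    r'(?P<simple>["/\\bfnrt])'
    r'|u\{(?P<brace>[0-9A-Fa-f]+)\}'
    r'|u(?P<hi>[Dd][89ABab][0-9A-Fa-f]{2})\\u(?P<lo>[Dd][C-Fc-f][0-9A-Fa-f]{2})'
    r'|u(?P<bmp>[0-9A-Fa-f]{4})'
    r')'
)


def _is_surrogate(cp: int) -> bool:
    return 0xD800 <= cp <= 0xDFFF


def unescape_sesc(body: str) -> str:
    """Decode the SESC escapes in the body of a string literal."""
    out = []
    i = 0
    while i < len(body):
        if body[i] != '\\':
            out.append(body[i])
            i += 1
            continue
        m = _ESCAPE.match(body, i)
        if m is None:
            raise ValueError(f"invalid escape at offset {i}")
        if m.group('simple') is not None:
            out.append(_SIMPLE[m.group('simple')])
        elif m.group('brace') is not None:
            cp = int(m.group('brace'), 16)
            if cp > 0x10FFFF or _is_surrogate(cp):
                raise ValueError(f"not a Unicode scalar value at offset {i}")
            out.append(chr(cp))
        elif m.group('hi') is not None:
            # JSON-style surrogate pair: combine into one code point.
            hi = int(m.group('hi'), 16) - 0xD800
            lo = int(m.group('lo'), 16) - 0xDC00
            out.append(chr(0x10000 + (hi << 10) + lo))
        else:
            cp = int(m.group('bmp'), 16)
            if _is_surrogate(cp):
                raise ValueError(f"lone surrogate at offset {i}")
            out.append(chr(cp))
        i = m.end()
    return ''.join(out)
```

Note that \' is deliberately not handled here, since it is not part of SESC; a decoder for byte string literals would need to accept it in addition.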
Now that SESC is more restrictively formulated, this also requires an Now that SESC is more restrictively formulated, an update to the
update to the BCHAR rule used in the ABNF syntax for byte string BCHAR rule used in the ABNF syntax for byte string literals is also
literals: required:
; RFC 8610 ABNF: ; ABNF from RFC 8610:
bytes = [bsqual] %x27 *BCHAR %x27 bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
bsqual = "h" / "b64" bsqual = "h" / "b64"
Figure 3: Original RFC 8610 ABNF for BCHAR
With the SESC updated as above, \' is no longer allowed in BCHAR; Figure 3: ABNF from RFC 8610 for BCHAR
this now needs to be explicitly included; see below.
Err6278 (Consistent String Literals) With the SESC updated as above, \' is no longer allowed in BCHAR and
now needs to be explicitly included there; see Figure 4.
2.1.2. Erratum ID 6278 (Consistent String Literals)
Updating BCHAR also provides an opportunity to address [Err6278], Updating BCHAR also provides an opportunity to address [Err6278],
which points to an inconsistency in treating U+007F (DEL) between which points to an inconsistency in treating U+007F (DEL) between
SCHAR and BCHAR. As U+007F is not printable, including it in a byte SCHAR and BCHAR. As U+007F is not printable, including it in a byte
string literal is as confusing as for a text string literal, and it string literal is as confusing as for a text string literal;
should therefore be excluded from BCHAR as it is from SCHAR. The therefore, it should be excluded from BCHAR as it is from SCHAR. The
same reasoning also applies to the C1 control characters, so the same reasoning also applies to the C1 control characters, so the
updated ABNF actually excludes the entire range from U+007F to updated ABNF actually excludes the entire range from U+007F to
U+009F. The same reasoning then also applies to text in comments U+009F. The same reasoning also applies to text in comments (PCHAR).
(PCHAR). For completeness, all these should also explicitly exclude For completeness, all these rules should also explicitly exclude the
the code points that have been set aside for UTF-16's surrogates. code points that have been set aside for UTF-16 surrogates.
; new rules for SCHAR, BCHAR, and PCHAR: ; new rules for SCHAR, BCHAR, and PCHAR:
SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC
BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / "\'" / CRLF
PCHAR = %x20-7E / NONASCII PCHAR = %x20-7E / NONASCII
NONASCII = %xA0-D7FF / %xE000-10FFFD NONASCII = %xA0-D7FF / %xE000-10FFFD
Figure 4: Update to ABNF in Appendix B of RFC 8610: BCHAR, SCHAR, Figure 4: Update to ABNF in Appendix B of [RFC8610]: BCHAR,
and PCHAR SCHAR, and PCHAR
(Note that, apart from addressing the inconsistencies, there is no (Note that, apart from addressing the inconsistencies, there is no
attempt to further exclude non-printable characters from the ABNF; attempt to further exclude non-printable characters from the ABNF;
doing this properly would draw in complexity from the ongoing doing this properly would draw in complexity from the ongoing
evolution of the Unicode standard that is not needed here.) evolution of the Unicode standard [UNICODE] that is not needed here.)
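The character ranges in the new rules can be restated as code point predicates. The following Python is a minimal sketch under stated assumptions: the function names are hypothetical, the predicates cover only characters that appear unescaped (SESC and "\'" are handled elsewhere), and the CRLF alternative of BCHAR is not covered.

```python
def is_nonascii(cp: int) -> bool:
    """NONASCII = %xA0-D7FF / %xE000-10FFFD (no C1 controls, no surrogates)."""
    return 0xA0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFD


def is_pchar(cp: int) -> bool:
    """PCHAR = %x20-7E / NONASCII (comment text)."""
    return 0x20 <= cp <= 0x7E or is_nonascii(cp)


def is_unescaped_schar(cp: int) -> bool:
    """SCHAR without SESC: PCHAR minus the double quote and the backslash."""
    return is_pchar(cp) and cp not in (0x22, 0x5C)


def is_unescaped_bchar(cp: int) -> bool:
    """BCHAR without SESC, "\\'", and CRLF: PCHAR minus ' and the backslash."""
    return is_pchar(cp) and cp not in (0x27, 0x5C)
```

With these predicates, U+007F and the C1 controls U+0080 to U+009F are rejected uniformly for text strings, byte strings, and comments.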
Addressing Err6526, Err6543 2.1.3. Addressing Erratum ID 6526 and Erratum ID 6543
The above changes also cover [Err6543] (a proposal to split off The above changes also cover [Err6543] (a proposal to split off
qualified byte string literals from UTF-8 byte string literals) and qualified byte string literals from UTF-8 byte string literals) and
[Err6526] (lost backslashes); see Appendix B for details. [Err6526] (lost backslashes); see Appendix B for details.
2.2. Examples Demonstrating the Updated String Syntaxes 2.2. Examples Demonstrating the Updated String Syntaxes
The CDDL example in Figure 5 demonstrates various escaping techniques The CDDL example in Figure 5 demonstrates various escaping techniques
now available for (byte and text) strings in CDDL. Obviously in the now available for (byte and text) strings in CDDL. Obviously, in the
literals for a and x, there is no need to escape the second literals for a and x, there is no need to escape the second
character, an o, as \u{6f}; this is just for demonstration. character, an o, as \u{6f}; this is just for demonstration.
Similarly, as shown in c and z there also is no need to escape the 🁳 Similarly, as shown in c and z, there also is no need to escape the
or ⌘, but escaping them may be convenient in order to limit the "🁳" (DOMINO TILE VERTICAL-02-02, U+1F073) or "⌘" (PLACE OF INTEREST
character repertoire of a CDDL file itself to ASCII [STD80]. SIGN, U+2318); however, escaping them may be convenient in order to
limit the character repertoire of a CDDL file itself to ASCII
[STD80].
start = [a, b, c, x, y, z] start = [a, b, c, x, y, z]
; "🁳", DOMINO TILE VERTICAL-02-02, and ; "🁳", DOMINO TILE VERTICAL-02-02, and
; "⌘", PLACE OF INTEREST SIGN, in a text string: ; "⌘", PLACE OF INTEREST SIGN, in a text string:
a = "D\u{6f}mino's \u{1F073} + \u{2318}" ; \u{}-escape 3 chars a = "D\u{6f}mino's \u{1F073} + \u{2318}" ; \u{}-escape 3 chars
b = "Domino's \uD83C\uDC73 + \u2318" ; escape JSON-like b = "Domino's \uD83C\uDC73 + \u2318" ; escape JSON-like
c = "Domino's 🁳 + ⌘" ; unescaped c = "Domino's 🁳 + ⌘" ; unescaped
; in a byte string given as text, the ' needs to be escaped: ; in a byte string given as text, the ' needs to be escaped:
x = 'D\u{6f}mino\u{27}s \u{1F073} + \u{2318}' ; \u{}-escape 4 chars x = 'D\u{6f}mino\u{27}s \u{1F073} + \u{2318}' ; \u{}-escape 4 chars
y = 'Domino\'s \uD83C\uDC73 + \u2318' ; escape JSON-like y = 'Domino\'s \uD83C\uDC73 + \u2318' ; escape JSON-like
z = 'Domino\'s 🁳 + ⌘' ; escape ' only z = 'Domino\'s 🁳 + ⌘' ; escape ' only
Figure 5: Example text and byte string literals with various escaping Figure 5: Example Text and Byte String Literals with Various Escaping
techniques Techniques
In this example, the rules a to c and x to z all produce strings with In this example, the rules a to c and x to z all produce strings with
byte-wise identical content, where a to c are text strings, and x to byte-wise identical content: a to c are text strings and x to z are
z are byte strings. Figure 6 illustrates this by showing the output byte strings. Figure 6 illustrates this by showing the output
generated from the start rule in Figure 5, using pretty-printed generated from the start rule in Figure 5, using pretty-printed
hexadecimal. hexadecimal.
86 # array(6) 86 # array(6)
73 # text(19) 73 # text(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
73 # text(19) 73 # text(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
73 # text(19) 73 # text(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
53 # bytes(19) 53 # bytes(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
53 # bytes(19) 53 # bytes(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
53 # bytes(19) 53 # bytes(19)
446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘" 446f6d696e6f277320f09f81b3202b20e28c98 # "Domino's 🁳 + ⌘"
Figure 6: Generated CBOR from CDDL example (Pretty-Printed Figure 6: Generated CBOR from CDDL Example (Pretty-Printed
Hexadecimal) Hexadecimal)
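The hexadecimal in Figure 6 can be reproduced with a few lines of Python. This is a minimal sketch: the helper encode_short is hypothetical and only handles payloads shorter than 24 bytes, for which CBOR places the length directly in the initial byte.

```python
def encode_short(major_type: int, payload: bytes) -> bytes:
    """Encode a CBOR text (major type 3) or byte (major type 2) string
    whose length fits into the initial byte (fewer than 24 bytes)."""
    if len(payload) > 23:
        raise ValueError("sketch only covers lengths below 24")
    return bytes([(major_type << 5) | len(payload)]) + payload


content = "Domino's \U0001F073 + \u2318".encode("utf-8")
text_item = encode_short(3, content)    # rules a, b, c
bytes_item = encode_short(2, content)   # rules x, y, z

# Array of six items: initial byte 0x86, then three text and three byte strings.
encoded = bytes([0x80 | 6]) + 3 * text_item + 3 * bytes_item
print(encoded.hex())
```

The content is 19 bytes long, which gives the initial bytes 0x73 for the text strings and 0x53 for the byte strings.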
3. Small Enabling Grammar Changes 3. Small Enabling Grammar Changes
The two subsections in this section specify two small changes to the Each subsection that follows specifies a small change to the grammar
grammar that are intended to enable certain kinds of specifications. that is intended to enable certain kinds of specifications. These
These changes are backward compatible, i.e., CDDL files that comply changes are backward compatible (i.e., CDDL files that comply with
to [RFC8610] continue to match the updated grammar, but not [RFC8610] continue to match the updated grammar) but not necessarily
necessarily forward compatible, i.e., CDDL specifications that make forward compatible (i.e., CDDL specifications that make use of these
use of these changes cannot necessarily be processed by existing changes cannot necessarily be processed by existing implementations
[RFC8610] implementations. of [RFC8610]).
3.1. Empty data models 3.1. Empty Data Models
[RFC8610] requires a CDDL file to have at least one rule. [RFC8610] requires a CDDL file to have at least one rule.
; RFC 8610 ABNF: ; ABNF from RFC 8610:
cddl = S 1*(rule S) cddl = S 1*(rule S)
Figure 7: Original RFC 8610 ABNF for top-level rule cddl Figure 7: ABNF from RFC 8610 for Top-Level Rule cddl
This makes sense when the file has to stand alone, as a CDDL data This makes sense when the file has to stand alone, as a CDDL data
model needs to have at least one rule to provide an entry point model needs to have at least one rule to provide an entry point
(start rule). (i.e., a start rule).
With CDDL modules [I-D.ietf-cbor-cddl-modules], CDDL files can also With CDDL modules [CDDL-MODULES], CDDL files can also include
include directives, and these might be the source of all the rules directives, and these might be the source of all the rules that
that ultimately make up the module created by the file. Any other ultimately make up the module created by the file. Any other rule
rule content in the file has to be available for directive content in the file has to be available for directive processing,
processing, making the requirement for at least one rule cumbersome. making the requirement for at least one rule cumbersome.
Therefore, the present update extends the grammar as in Figure 8 and Therefore, the present update extends the grammar as in Figure 8 and
turns the existence of at least one rule into a semantic constraint, turns the existence of at least one rule into a semantic constraint,
to be fulfilled after processing of all directives. to be fulfilled after processing of all directives.
; new top-level rule: ; new top-level rule:
cddl = S *(rule S) cddl = S *(rule S)
Figure 8: Update to top-level ABNF in Appendices B and C of RFC 8610 Figure 8: Update to Top-Level ABNF in Appendices B and C of RFC 8610
3.2. Non-literal Tag Numbers, Simple Values 3.2. Non-Literal Tag Numbers and Simple Values
The existing ABNF syntax for expressing tags in CDDL is: The existing ABNF syntax for expressing tags in CDDL is as follows:
; extracted from RFC 8610 ABNF: ; extracted from the ABNF in RFC 8610:
type2 =/ "#" "6" ["." uint] "(" S type S ")" type2 =/ "#" "6" ["." uint] "(" S type S ")"
Figure 9: Original RFC 8610 ABNF for tag syntax Figure 9: Original ABNF from RFC 8610 for Tag Syntax
This means tag numbers can only be given as literal numbers (uints). This means tag numbers can only be given as literal numbers (uints).
Some specifications operate on ranges of tag numbers, e.g., [RFC9277] Some specifications operate on ranges of tag numbers; for example,
has a range of tag numbers 1668546817 (0x63740101) to 1668612095 [RFC9277] has a range of tag numbers 1668546817 (0x63740101) to
(0x6374FFFF) to tag specific content formats. This can currently not 1668612095 (0x6374FFFF) to tag specific content formats. This cannot
be expressed in CDDL. Similar considerations apply to simple values currently be expressed in CDDL. Similar considerations apply to
(#7.xx). simple values (#7.xx).
This update extends the syntax to: This update extends the syntax to the following:
; new rules collectively defining the tagged case: ; new rules collectively defining the tagged case:
type2 =/ "#" "6" ["." head-number] "(" S type S ")" type2 =/ "#" "6" ["." head-number] "(" S type S ")"
/ "#" "7" ["." head-number] / "#" "7" ["." head-number]
head-number = uint / ("<" type ">") head-number = uint / ("<" type ">")
Figure 10: Update to tag and simple value ABNF in Appendices B Figure 10: Update to Tag and Simple Value ABNF in Appendices B
and C of RFC 8610 and C of RFC 8610
For #6, the head-number stands for the tag number. For #7, the head- For #6, the head-number stands for the tag number. For #7, the head-
number stands for the simple value if it is in the ranges 0..23 or number stands for the simple value if it is in the ranges 0..23 or
32..255 (as per Section 3.3 of RFC 8949 [STD94] the simple values 32..255 (as per Section 3.3 of RFC 8949 [STD94], the simple values
24..31 are not used). For 24..31, the head-number stands for the 24..31 are not used). For 24..31, the head-number stands for the
"additional information", e.g., #7.25 or #7.<25> is a float16, etc. "additional information", e.g., #7.25 or #7.<25> is a float16, etc.
(All ranges mentioned here are inclusive.) (All ranges mentioned here are inclusive.)
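The interpretation of the head-number for #7 can be sketched as follows. The Python function interpret_major7 is hypothetical and makes one assumption beyond the text above: values outside 0..255 are rejected. For 24..31, only the initial byte is returned; any bytes that follow it (such as the two bytes of a float16) are not modeled.

```python
from typing import Tuple


def interpret_major7(head_number: int) -> Tuple[str, bytes]:
    """Classify the head-number of #7 and return the leading byte(s)."""
    if 0 <= head_number <= 23:
        # Simple value carried directly in the initial byte.
        return "simple value", bytes([0xE0 | head_number])
    if 24 <= head_number <= 31:
        # Not a simple value: the number is the "additional information".
        return "additional information", bytes([0xE0 | head_number])
    if 32 <= head_number <= 255:
        # Simple value carried in one byte following the initial byte 0xF8.
        return "simple value", bytes([0xF8, head_number])
    raise ValueError("head-number out of range for #7 in this sketch")
```

For example, interpret_major7(25) yields the initial byte 0xF9, which introduces a float16.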
So the above range can be expressed in a CDDL fragment such as: So the above range can be expressed in a CDDL fragment such as:
ct-tag<content> = #6.<ct-tag-number>(content) ct-tag<content> = #6.<ct-tag-number>(content)
ct-tag-number = 1668546817..1668612095 ct-tag-number = 1668546817..1668612095
; or use 0x63740101..0x6374FFFF ; or use 0x63740101..0x6374FFFF
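The range used in this fragment can be checked numerically. The following Python is a minimal sketch; the names CT_TAG_FIRST, CT_TAG_LAST, is_ct_tag_number, and encode_head are hypothetical and chosen for illustration only.

```python
CT_TAG_FIRST = 0x63740101   # 1668546817
CT_TAG_LAST = 0x6374FFFF    # 1668612095


def is_ct_tag_number(n: int) -> bool:
    """True if n lies in the inclusive range given by ct-tag-number."""
    return CT_TAG_FIRST <= n <= CT_TAG_LAST


def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a CBOR head: the initial byte plus the shortest argument."""
    if argument < 0:
        raise ValueError("argument must be non-negative")
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return (bytes([(major_type << 5) | additional_info])
                    + argument.to_bytes(size, "big"))
    raise ValueError("argument does not fit into 64 bits")


print(encode_head(6, CT_TAG_FIRST).hex())
```

Every tag number in this range needs a four-byte argument, so the tag head always starts with the byte 0xDA.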
Notes: | Notes:
|
1. This syntax reuses the angle bracket syntax for generics; this | 1. This syntax reuses the angle bracket syntax for
reuse is innocuous as a generic parameter/argument only ever | generics; this reuse is innocuous because a generic
occurs after a rule name (id), while it occurs after . here. | parameter or argument only ever occurs after a rule name
(Whether there is potential for human confusion can be debated; | (id), while it occurs after the "." (dot) character
the above example deliberately uses generics as well.) | here. (Whether there is potential for human confusion
| can be debated; the above example deliberately uses
2. The updated ABNF grammar makes it a bit more explicit that the | generics as well.)
number given after the optional dot is special, not giving the |
CBOR "additional information" for tags and simple values as it is | 2. The updated ABNF grammar makes it a bit more explicit
with other uses of # in CDDL. (Adding this observation to | that the number given after the optional dot is the
Section 2.2.3 of [RFC8610] is the subject of [Err6575]; it is | value of the argument: for tags and simple values, it is
correctly noted in Section 3.6 of [RFC8610].) In hindsight, | not giving the CBOR "additional information", as it is
maybe a different character than the dot should have been chosen | with other uses of # in CDDL. (Adding this observation
for this special case, however changing the grammar now would | to Section 2.2.3 of [RFC8610] is the subject of
have been too disruptive. | [Err6575]; it is correctly noted in Section 3.6 of
| [RFC8610].) In hindsight, maybe a different character
| than the dot should have been chosen for this special
| case; however, changing the grammar in the current
| document would have been too disruptive.
4. Security Considerations 4. Security Considerations
The grammar fixes and updates in this document are not believed to The grammar fixes and updates in this document are not believed to
create additional security considerations. The security create additional security considerations. The security
considerations in Section 5 of [RFC8610] do apply, and specifically considerations in Section 5 of [RFC8610] apply. Specifically, the
the potential for confusion is increased in an environment that uses potential for confusion is increased in an environment that uses a
a combination of CDDL tools some of which have been updated and some combination of CDDL tools, some of which have been updated and some
of which have not been, in particular based on Section 2. of which have not, in particular based on Section 2.
Attackers may want to exploit such potential confusion by crafting Attackers may want to exploit such potential confusion by crafting
CDDL models that are interpreted differently by different parts of a CDDL models that are interpreted differently by different parts of a
system. There will be a period of transition from the details that system. There will be a period of transition from the details that
the [RFC8610] grammar handled in a less well-defined way, to the the grammar in [RFC8610] handled in a less well-defined way, to the
updated grammar defined in the present document. This transition updated grammar defined in the present document. This transition
might offer one, but not the only kind of opportunity for the kind of might offer one (but not the only) type of opportunity for the kind
attack that relies on differences between implementations. of attack that relies on differences between implementations.
Implementations that make use of CDDL models operationally already Implementations that make use of CDDL models operationally already
need to ascertain the provenance (and thus authenticity and need to ascertain the provenance (and thus authenticity and
integrity) and applicability of models they employ. At the time of integrity) and applicability of models they employ. At the time of
writing, it is expected that the models will generally be processed writing, it is expected that the models will generally be processed
by a software developer, within a software development environment. by a software developer, within a software development environment.
Developers are therefore advised to treat CDDL models with the same Therefore, developers are advised to treat CDDL models with the same
care as any other source code. care as any other source code.
5. IANA Considerations 5. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
6. References 6. References
6.1. Normative References 6.1. Normative References
[RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data [RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
Definition Language (CDDL): A Notational Convention to Definition Language (CDDL): A Notational Convention to
Express Concise Binary Object Representation (CBOR) and Express Concise Binary Object Representation (CBOR) and
JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
June 2019, <https://www.rfc-editor.org/rfc/rfc8610>. June 2019, <https://www.rfc-editor.org/info/rfc8610>.
[STD68] Internet Standard 68, [STD68] Internet Standard 68,
<https://www.rfc-editor.org/info/std68>. <https://www.rfc-editor.org/info/std68>.
At the time of writing, this STD comprises the following: At the time of writing, this STD comprises the following:
Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234, Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008, DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>. <https://www.rfc-editor.org/info/rfc5234>.
skipping to change at page 10, line 5 skipping to change at line 418
<https://www.rfc-editor.org/info/std94>. <https://www.rfc-editor.org/info/std94>.
At the time of writing, this STD comprises the following: At the time of writing, this STD comprises the following:
Bormann, C. and P. Hoffman, "Concise Binary Object Bormann, C. and P. Hoffman, "Concise Binary Object
Representation (CBOR)", STD 94, RFC 8949, Representation (CBOR)", STD 94, RFC 8949,
DOI 10.17487/RFC8949, December 2020, DOI 10.17487/RFC8949, December 2020,
<https://www.rfc-editor.org/info/rfc8949>. <https://www.rfc-editor.org/info/rfc8949>.
6.2. Informative References 6.2. Informative References
[Err6278] "Errata Report 6278", RFC 8610, [CDDL-MODULES]
Bormann, C. and B. Moran, "CDDL Module Structure", Work in
Progress, Internet-Draft, draft-ietf-cbor-cddl-modules-03,
1 September 2024, <https://datatracker.ietf.org/doc/html/
draft-ietf-cbor-cddl-modules-03>.
[EDN-LITERALS]
Bormann, C., "CBOR Extended Diagnostic Notation (EDN)",
Work in Progress, Internet-Draft, draft-ietf-cbor-edn-
literals-13, 3 November 2024,
<https://datatracker.ietf.org/doc/html/draft-ietf-cbor-
edn-literals-13>.
[Err6278] RFC Errata, Erratum ID 6278, RFC 8610,
<https://www.rfc-editor.org/errata/eid6278>. <https://www.rfc-editor.org/errata/eid6278>.
[Err6526] "Errata Report 6526", RFC 8610, [Err6526] RFC Errata, Erratum ID 6526, RFC 8610,
<https://www.rfc-editor.org/errata/eid6526>. <https://www.rfc-editor.org/errata/eid6526>.
[Err6527] "Errata Report 6527", RFC 8610, [Err6527] RFC Errata, Erratum ID 6527, RFC 8610,
<https://www.rfc-editor.org/errata/eid6527>. <https://www.rfc-editor.org/errata/eid6527>.
[Err6543] "Errata Report 6543", RFC 8610, [Err6543] RFC Errata, Erratum ID 6543, RFC 8610,
<https://www.rfc-editor.org/errata/eid6543>. <https://www.rfc-editor.org/errata/eid6543>.
[Err6575] "Errata Report 6575", RFC 8610, [Err6575] RFC Errata, Erratum ID 6575, RFC 8610,
<https://www.rfc-editor.org/errata/eid6575>. <https://www.rfc-editor.org/errata/eid6575>.
[I-D.ietf-cbor-cddl-modules]
Bormann, C. and B. Moran, "CDDL Module Structure", Work in
Progress, Internet-Draft, draft-ietf-cbor-cddl-modules-02,
4 March 2024, <https://datatracker.ietf.org/doc/html/
draft-ietf-cbor-cddl-modules-02>.
[I-D.ietf-cbor-edn-literals]
Bormann, C., "CBOR Extended Diagnostic Notation (EDN):
Application-Oriented Literals, ABNF, and Media Type", Work
in Progress, Internet-Draft, draft-ietf-cbor-edn-literals-
09, 18 May 2024, <https://datatracker.ietf.org/doc/html/
draft-ietf-cbor-edn-literals-09>.
[RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF",
RFC 7405, DOI 10.17487/RFC7405, December 2014, RFC 7405, DOI 10.17487/RFC7405, December 2014,
<https://www.rfc-editor.org/rfc/rfc7405>. <https://www.rfc-editor.org/info/rfc7405>.
[RFC9165] Bormann, C., "Additional Control Operators for the Concise [RFC9165] Bormann, C., "Additional Control Operators for the Concise
Data Definition Language (CDDL)", RFC 9165, Data Definition Language (CDDL)", RFC 9165,
DOI 10.17487/RFC9165, December 2021, DOI 10.17487/RFC9165, December 2021,
<https://www.rfc-editor.org/rfc/rfc9165>. <https://www.rfc-editor.org/info/rfc9165>.
[RFC9277] Richardson, M. and C. Bormann, "On Stable Storage for [RFC9277] Richardson, M. and C. Bormann, "On Stable Storage for
Items in Concise Binary Object Representation (CBOR)", Items in Concise Binary Object Representation (CBOR)",
RFC 9277, DOI 10.17487/RFC9277, August 2022, RFC 9277, DOI 10.17487/RFC9277, August 2022,
<https://www.rfc-editor.org/rfc/rfc9277>. <https://www.rfc-editor.org/info/rfc9277>.
[STD80] Internet Standard 80, [STD80] Internet Standard 80,
<https://www.rfc-editor.org/info/std80>. <https://www.rfc-editor.org/info/std80>.
At the time of writing, this STD comprises the following: At the time of writing, this STD comprises the following:
Cerf, V., "ASCII format for network interchange", STD 80, Cerf, V., "ASCII format for network interchange", STD 80,
RFC 20, DOI 10.17487/RFC0020, October 1969, RFC 20, DOI 10.17487/RFC0020, October 1969,
<https://www.rfc-editor.org/info/rfc20>. <https://www.rfc-editor.org/info/rfc20>.
[UNICODE] The Unicode Consortium, "The Unicode Standard",
<https://www.unicode.org/versions/latest/>.
Appendix A. Updated Collected ABNF for CDDL Appendix A. Updated Collected ABNF for CDDL
This appendix is normative. This appendix is normative.
It provides the full ABNF from [RFC8610] with the updates applied in It provides the full ABNF from [RFC8610] as updated by the present
the present document. document.
cddl = S *(rule S) cddl = S *(rule S)
rule = typename [genericparm] S assignt S type rule = typename [genericparm] S assignt S type
/ groupname [genericparm] S assigng S grpent / groupname [genericparm] S assigng S grpent
typename = id typename = id
groupname = id groupname = id
assignt = "=" / "/=" assignt = "=" / "/="
assigng = "=" / "//=" assigng = "=" / "//="
skipping to change at page 13, line 29 skipping to change at line 589
S = *WS S = *WS
WS = SP / NL WS = SP / NL
SP = %x20 SP = %x20
NL = COMMENT / CRLF NL = COMMENT / CRLF
COMMENT = ";" *PCHAR CRLF COMMENT = ";" *PCHAR CRLF
PCHAR = %x20-7E / NONASCII PCHAR = %x20-7E / NONASCII
NONASCII = %xA0-D7FF / %xE000-10FFFD NONASCII = %xA0-D7FF / %xE000-10FFFD
CRLF = %x0A / %x0D.0A CRLF = %x0A / %x0D.0A
Figure 11: ABNF for CDDL as updated Figure 11: ABNF for CDDL as Updated
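The difference between the updated PCHAR rule shown above and the PCHAR rule of RFC 8610 (reproduced in Figure 14) can be sketched as two code-point predicates. This is only an illustration of the character ranges as they appear in the two grammars; the function names are hypothetical and not part of any CDDL tool.

```python
# Illustrative sketch only: function names are assumptions, not a tool API.

def is_pchar_rfc8610(ch: str) -> bool:
    """PCHAR = %x20-7E / %x80-10FFFD (RFC 8610)."""
    cp = ord(ch)
    return 0x20 <= cp <= 0x7E or 0x80 <= cp <= 0x10FFFD


def is_pchar_updated(ch: str) -> bool:
    """PCHAR = %x20-7E / NONASCII, NONASCII = %xA0-D7FF / %xE000-10FFFD."""
    cp = ord(ch)
    return (
        0x20 <= cp <= 0x7E
        or 0xA0 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0x10FFFD
    )


def comment_body_ok(text: str, pchar) -> bool:
    """Check every character of a comment body against a PCHAR predicate."""
    return all(pchar(ch) for ch in text)


if __name__ == "__main__":
    # U+0085 (a C1 control) is matched by the old rule but not the new one.
    sample = "note\u0085"
    print(comment_body_ok(sample, is_pchar_rfc8610))
    print(comment_body_ok(sample, is_pchar_updated))
```

A comment containing a C1 control character or a surrogate code point is one case where a tool following the old ranges and a tool following the updated ranges would disagree about whether a model is well-formed.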
Appendix B. Details about Covering Errata Report 6543 Appendix B. Details about Covering Erratum ID 6543
This appendix is informative. This appendix is informative.
[Err6543] observes that the ABNF used in [RFC8610] for the content of [Err6543] notes that the ABNF used in [RFC8610] for the content of
byte string literals lumps together byte strings notated as text with byte string literals lumps together byte strings notated as text with
byte strings notated in base16 (hex) or base64 (but see also updated byte strings notated in base16 (hex) or base64 (but see also updated
BCHAR rule in Figure 4): BCHAR rule in Figure 4):
; RFC 8610 ABNF: ; ABNF from RFC 8610:
bytes = [bsqual] %x27 *BCHAR %x27 bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
Figure 12: Original RFC 8610 ABNF for BCHAR Figure 12: Original ABNF from RFC 8610 for BCHAR
Change Proposed By Errata Report 6543 B.1. Change Proposed by Erratum ID 6543
Errata report 6543 proposes to handle the two cases in separate ABNF Erratum ID 6543 proposes handling the two cases in separate ABNF
rules (where, with an updated SESC, BCHAR obviously needs to be rules (where, with an updated SESC, BCHAR obviously needs to be
updated as above): updated as above):
; Err6543 proposal: ; Proposal from Erratum ID 6543:
bytes = %x27 *BCHAR %x27 bytes = %x27 *BCHAR %x27
/ bsqual %x27 *QCHAR %x27 / bsqual %x27 *QCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS QCHAR = DIGIT / ALPHA / "+" / "/" / "-" / "_" / "=" / WS
Figure 13: Errata Report 8653 Proposal to Split the Byte String Rules Figure 13: Proposal from Erratum ID 6543 to Split the Byte String
Rules
This potentially causes a subtle change, which is hidden in the WS This potentially causes a subtle change, which is hidden in the WS
rule: rule:
; RFC 8610 ABNF: ; ABNF from RFC 8610:
WS = SP / NL WS = SP / NL
SP = %x20 SP = %x20
NL = COMMENT / CRLF NL = COMMENT / CRLF
COMMENT = ";" *PCHAR CRLF COMMENT = ";" *PCHAR CRLF
PCHAR = %x20-7E / %x80-10FFFD PCHAR = %x20-7E / %x80-10FFFD
CRLF = %x0A / %x0D.0A CRLF = %x0A / %x0D.0A
Figure 14: ABNF definition of WS from RFC 8610 Figure 14: ABNF Definition of WS from RFC 8610
This allows any non-C0 character in a comment, so this fragment This allows any non-C0 character in a comment, so this fragment
becomes possible: becomes possible:
foo = h' foo = h'
43424F52 ; 'CBOR' 43424F52 ; 'CBOR'
0A ; LF, but don't use CR! 0A ; LF, but don't use CR!
' '
The current text is not unambiguously saying whether the three The current text is not unambiguously saying whether the three
apostrophes need to be escaped with a \ or not, as in: apostrophes need to be escaped with a \ or not, as in:
foo = h' foo = h'
43424F52 ; \'CBOR\' 43424F52 ; \'CBOR\'
0A ; LF, but don\'t use CR! 0A ; LF, but don\'t use CR!
' '
... which would be supported by the existing ABNF in [RFC8610]. ... which would be supported by the existing ABNF in [RFC8610].
No Further Change Needed After Updating String Literal Grammar B.2. No Further Change Needed after Updating String Literal Grammar
(Section 2.1)
This document takes the simpler approach of leaving the processing of This document takes the simpler approach of leaving the processing of
the content of the byte string literal to a semantic step after the content of the byte string literal to a semantic step after
processing the syntax of the bytes/BCHAR rules, as updated by processing the syntax of the bytes and BCHAR rules, as updated by
Figure 2 and Figure 4 in Section 2.1 (updates prompted by the Figures 2 and 4 in Section 2.1 (updates prompted by the combination
combination of [Err6527] and [Err6278]). of [Err6527] and [Err6278]).
The rules in Figure 14 (as updated by Figure 4) are therefore applied Therefore, the rules in Figure 14 (as updated by Figure 4) are
to the result of this processing where bsqual is given as h or b64. applied to the result of this processing where bsqual is given as h
or b64.
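The two-step processing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the first step resolves only the \' and \\ escapes (the full SESC rule is not reproduced here), and the whitespace handling is slightly more lenient than the WS rule in Figure 14.

```python
import base64
import re

# Illustrative sketch only: names and the reduced escape set are assumptions.

def unescape_minimal(raw: str) -> str:
    """Step 1 (simplified): process the literal's syntax, resolving \\' and \\\\."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            if i + 1 >= len(raw) or raw[i + 1] not in "'\\":
                raise ValueError("escape not handled by this sketch")
            out.append(raw[i + 1])
            i += 2
        elif ch == "'":
            raise ValueError("unescaped apostrophe inside literal")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# COMMENT = ";" *PCHAR CRLF, approximated as ';' up to and including LF.
_COMMENT = re.compile(r";[^\n]*\n")


def strip_ws(text: str) -> str:
    """Remove comments, spaces, and line ends (lenient: also drops a lone CR)."""
    return re.sub(r"[ \r\n]", "", _COMMENT.sub("", text))


def decode_bytes_literal(bsqual: str, raw: str) -> bytes:
    """Step 2: interpret the unescaped content according to bsqual."""
    text = unescape_minimal(raw)
    if bsqual == "":
        return text.encode("utf-8")
    payload = strip_ws(text)
    if bsqual == "h":
        return bytes.fromhex(payload)
    if bsqual == "b64":
        padded = payload + "=" * (-len(payload) % 4)
        return base64.urlsafe_b64decode(padded)
    raise ValueError("unknown bsqual: " + bsqual)


if __name__ == "__main__":
    content = "\n 43424F52 ; \\'CBOR\\'\n 0A ; LF, but don\\'t use CR!\n"
    print(decode_bytes_literal("h", content))
```

In this sketch, the comment syntax is only recognized after the apostrophes have been unescaped, which is why the escaped form of the example is the one that decodes.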
Note that this approach also works well with the use of byte strings Note that this approach also works well with the use of byte strings
in Section 3 of [RFC9165]. It does require some care when copy- in Section 3 of [RFC9165]. It does require some care when copying-
pasting into CDDL models from ABNF that contains single quotes (which and-pasting into CDDL models from ABNF that contains single quotes
may also hide as apostrophes in comments); these need to be escaped (which may also hide as apostrophes in comments); these need to be
or possibly replaced by %x27. escaped or possibly replaced by %x27.
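The care needed when pasting ABNF into a byte string literal can be illustrated with a small helper that escapes the characters that would otherwise end or alter the literal. This is a sketch under assumptions: the helper name is hypothetical, and it treats the backslash as needing to be doubled in addition to the apostrophe.

```python
# Illustrative sketch only: the helper name is an assumption.

def escape_for_bytes_literal(text: str) -> str:
    """Escape backslashes first, then apostrophes, for use between '...'."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


if __name__ == "__main__":
    abnf = "rule = %x27 ALPHA %x27 ; don't forget the quotes\n"
    print("'" + escape_for_bytes_literal(abnf) + "'")
```

Escaping the backslash before the apostrophe avoids doubling the backslashes that the second replacement introduces.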
Finally, the approach taken lends support to extending bsqual in CDDL Finally, the approach taken lends support to extending bsqual in CDDL
similar to the way this is done for CBOR diagnostic notation in similar to the way this is done for CBOR diagnostic notation in
[I-D.ietf-cbor-edn-literals]. (Note that the processing of string [EDN-LITERALS]. (Note that, at the time of writing, the processing
literals now is quite similar between CDDL and EDN, except that CDDL of string literals is quite similar for both CDDL and Extended
has ";"-based end-of-line comments, while EDN has two comment Diagnostic Notation (EDN), except that CDDL has end-of-line comments
syntaxes, in-line "/"-based and end-of-line "#"-based.) that are ";" based and EDN has two comment syntaxes: one in-line "/"
based and one end-of-line "#" based.)
Acknowledgments Acknowledgments
Many thanks go to the submitters of the errata reports addressed in Many thanks go to the submitters of the errata reports addressed in
this document. In one of the ensuing discussions, Doug Ewell this document. In one of the ensuing discussions, Doug Ewell
proposed to define an ABNF rule NONASCII, of which we have included proposed defining an ABNF rule "NONASCII", of which we have included
the essence. Special thanks to the reviewers Marco Tiloca, Christian the essence. Special thanks to the reviewers Marco Tiloca, Christian
Amsüss (shepherd review and further guidance), Orie Steele (AD review Amsüss (Shepherd Review and further guidance), Orie Steele (AD Review
and further guidance), and Éric Vyncke (detailed IESG review). and further guidance), and Éric Vyncke (detailed IESG review).
Author's Address Author's Address
Carsten Bormann Carsten Bormann
Universität Bremen TZI Universität Bremen TZI
Postfach 330440 Postfach 330440
D-28359 Bremen D-28359 Bremen
Germany Germany
Phone: +49-421-218-63921 Phone: +49-421-218-63921
 End of changes. 86 change blocks. 
241 lines changed or deleted 234 lines changed or added
