rfc8949v1.txt | rfc8949.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) C. Bormann | Internet Engineering Task Force (IETF) C. Bormann | |||
Request for Comments: 8949 Universität Bremen TZI | Request for Comments: 8949 Universität Bremen TZI | |||
STD: 94 P. Hoffman | STD: 94 P. Hoffman | |||
Obsoletes: 7049 ICANN | Obsoletes: 7049 ICANN | |||
Category: Standards Track November 2020 | Category: Standards Track December 2020 | |||
ISSN: 2070-1721 | ISSN: 2070-1721 | |||
Concise Binary Object Representation (CBOR) | Concise Binary Object Representation (CBOR) | |||
Abstract | Abstract | |||
The Concise Binary Object Representation (CBOR) is a data format | The Concise Binary Object Representation (CBOR) is a data format | |||
whose design goals include the possibility of extremely small code | whose design goals include the possibility of extremely small code | |||
size, fairly small message size, and extensibility without the need | size, fairly small message size, and extensibility without the need | |||
for version negotiation. These design goals make it different from | for version negotiation. These design goals make it different from | |||
skipping to change at line 105 | skipping to change at line 105 | |||
5.7. Undefined Values | 5.7. Undefined Values | |||
6. Converting Data between CBOR and JSON | 6. Converting Data between CBOR and JSON | |||
6.1. Converting from CBOR to JSON | 6.1. Converting from CBOR to JSON | |||
6.2. Converting from JSON to CBOR | 6.2. Converting from JSON to CBOR | |||
7. Future Evolution of CBOR | 7. Future Evolution of CBOR | |||
7.1. Extension Points | 7.1. Extension Points | |||
7.2. Curating the Additional Information Space | 7.2. Curating the Additional Information Space | |||
8. Diagnostic Notation | 8. Diagnostic Notation | |||
8.1. Encoding Indicators | 8.1. Encoding Indicators | |||
9. IANA Considerations | 9. IANA Considerations | |||
9.1. Simple Values Registry | 9.1. CBOR Simple Values Registry | |||
9.2. Tags Registry | 9.2. CBOR Tags Registry | |||
9.3. Media Type | 9.3. Media Types Registry | |||
9.4. CoAP Content-Format | 9.4. CoAP Content-Format Registry | |||
9.5. The +cbor Structured Syntax Suffix Registration | 9.5. Structured Syntax Suffix Registry | |||
10. Security Considerations | 10. Security Considerations | |||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
11.2. Informative References | 11.2. Informative References | |||
Appendix A. Examples of Encoded CBOR Data Items | Appendix A. Examples of Encoded CBOR Data Items | |||
Appendix B. Jump Table for Initial Byte | Appendix B. Jump Table for Initial Byte | |||
Appendix C. Pseudocode | Appendix C. Pseudocode | |||
Appendix D. Half-Precision | Appendix D. Half-Precision | |||
Appendix E. Comparison of Other Binary Formats to CBOR's Design | Appendix E. Comparison of Other Binary Formats to CBOR's Design | |||
Objectives | Objectives | |||
skipping to change at line 296 | skipping to change at line 296 | |||
Stream decoder: A process that decodes a data stream and makes each | Stream decoder: A process that decodes a data stream and makes each | |||
of the data items in the sequence available to an application as | of the data items in the sequence available to an application as | |||
they are received. | they are received. | |||
Terms and concepts for floating-point values such as Infinity, NaN | Terms and concepts for floating-point values such as Infinity, NaN | |||
(not a number), negative zero, and subnormal are defined in | (not a number), negative zero, and subnormal are defined in | |||
[IEEE754]. | [IEEE754]. | |||
Where bit arithmetic or data types are explained, this document uses | Where bit arithmetic or data types are explained, this document uses | |||
the notation familiar from the programming language C [C], except | the notation familiar from the programming language C [C], except | |||
that "**" denotes exponentiation and ".." denotes a range that | that ".." denotes a range that includes both ends given, and | |||
includes both ends given. Examples and pseudocode assume that signed | superscript notation denotes exponentiation. For example, 2 to the | |||
integers use two's complement representation and that right shifts of | power of 64 is notated: 2^(64). In the plain-text version of this | |||
signed integers perform sign extension; these assumptions are also | specification, superscript notation is not available and therefore is | |||
specified in Sections 6.8.2 and 7.6.7 of the 2020 version of C++, | rendered by a surrogate notation. That notation is not optimized for | |||
successor of [Cplusplus17]. | this RFC; it is unfortunately ambiguous with C's exclusive-or (which | |||
is only used in the appendices, which in turn do not use | ||||
exponentiation) and requires circumspection from the reader of the | ||||
plain-text version. | ||||
Examples and pseudocode assume that signed integers use two's | ||||
complement representation and that right shifts of signed integers | ||||
perform sign extension; these assumptions are also specified in | ||||
Sections 6.8.1 (basic.fundamental) and 7.6.7 (expr.shift) of the 2020 | ||||
version of C++ (currently available as a final draft, [Cplusplus20]). | ||||
Similar to the "0x" notation for hexadecimal numbers, numbers in | Similar to the "0x" notation for hexadecimal numbers, numbers in | |||
binary notation are prefixed with "0b". Underscores can be added to | binary notation are prefixed with "0b". Underscores can be added to | |||
a number solely for readability, so 0b00100001 (0x21) might be | a number solely for readability, so 0b00100001 (0x21) might be | |||
written 0b001_00001 to emphasize the desired interpretation of the | written 0b001_00001 to emphasize the desired interpretation of the | |||
bits in the byte; in this case, it is split into three bits and five | bits in the byte; in this case, it is split into three bits and five | |||
bits. Encoded CBOR data items are sometimes given in the "0x" or | bits. Encoded CBOR data items are sometimes given in the "0x" or | |||
"0b" notation; these values are first interpreted as numbers as in C | "0b" notation; these values are first interpreted as numbers as in C | |||
and are then interpreted as byte strings in network byte order, | and are then interpreted as byte strings in network byte order, | |||
including any leading zero bytes expressed in the notation. | including any leading zero bytes expressed in the notation. | |||
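
The notation described in this hunk can be tried out directly in Python, which happens to accept the same "0b" prefix and underscore grouping. The following is only an illustrative sketch: the helper name `hex_notation_to_bytes` is hypothetical and is not part of either version of the specification.

```python
# 0b001_00001 and 0x21 denote the same number; the underscore is purely visual.
initial_byte = 0b001_00001
same_value = 0x21

# Split the byte into its three high bits and five low bits.
high_three_bits = initial_byte >> 5
low_five_bits = initial_byte & 0b11111


def hex_notation_to_bytes(text: str) -> bytes:
    """Read a "0x..." literal as a byte string in network byte order.

    Leading zero bytes written in the notation are preserved, which is
    why the text is converted digit by digit instead of via int().
    Hypothetical helper for illustration only.
    """
    if not text.startswith("0x"):
        raise ValueError("expected a 0x-prefixed literal")
    digits = text[2:].replace("_", "")
    if len(digits) % 2 != 0:
        raise ValueError("expected a whole number of bytes")
    return bytes.fromhex(digits)
```

With this sketch, `hex_notation_to_bytes("0x190001")` yields the three bytes 0x19, 0x00, 0x01, keeping the zero byte in the middle and any leading zero bytes as written.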
skipping to change at line 340 | skipping to change at line 349 | |||
(which usually involves defining additional implementation data types | (which usually involves defining additional implementation data types | |||
for those data items that do not already have a natural | for those data items that do not already have a natural | |||
representation in the environment). The ability to provide generic | representation in the environment). The ability to provide generic | |||
encoders and decoders is an explicit design goal of CBOR; however, | encoders and decoders is an explicit design goal of CBOR; however, | |||
many applications will provide their own application-specific | many applications will provide their own application-specific | |||
encoders and/or decoders. | encoders and/or decoders. | |||
In the basic (unextended) generic data model defined in Section 3, a | In the basic (unextended) generic data model defined in Section 3, a | |||
data item is one of the following: | data item is one of the following: | |||
* an integer in the range -2**64..2**64-1 inclusive | * an integer in the range -2^(64)..2^(64)-1 inclusive | |||
* a simple value, identified by a number between 0 and 255, but | * a simple value, identified by a number between 0 and 255, but | |||
distinct from that number itself | distinct from that number itself | |||
* a floating-point value, distinct from an integer, out of the set | * a floating-point value, distinct from an integer, out of the set | |||
representable by IEEE 754 binary64 (including non-finites) | representable by IEEE 754 binary64 (including non-finites) | |||
[IEEE754] | [IEEE754] | |||
* a sequence of zero or more bytes ("byte string") | * a sequence of zero or more bytes ("byte string") | |||
* a sequence of zero or more Unicode code points ("text string") | * a sequence of zero or more Unicode code points ("text string") | |||
* a sequence of zero or more data items ("array") | * a sequence of zero or more data items ("array") | |||
* a mapping (mathematical function) from zero or more data items | * a mapping (mathematical function) from zero or more data items | |||
("keys") each to a data item ("values"), ("map") | ("keys") each to a data item ("values"), ("map") | |||
* a tagged data item ("tag"), comprising a tag number (an integer in | * a tagged data item ("tag"), comprising a tag number (an integer in | |||
the range 0..2**64-1) and the tag content (a data item) | the range 0..2^(64)-1) and the tag content (a data item) | |||
Note that integer and floating-point values are distinct in this | Note that integer and floating-point values are distinct in this | |||
model, even if they have the same numeric value. | model, even if they have the same numeric value. | |||
Also note that serialization variants are not visible at the generic | Also note that serialization variants are not visible at the generic | |||
data model level, including the number of bytes of the encoded | data model level. This deliberate absence of visibility includes the | |||
floating-point value or the choice of one of the ways in which an | number of bytes of the encoded floating-point value. It also | |||
integer, the length of a text or byte string, the number of elements | includes the choice of encoding for an "argument" (see Section 3) | |||
in an array or pairs in a map, or a tag number, (collectively "the | such as the encoding for an integer, the encoding for the length of a | |||
argument", see Section 3) can be encoded. | text or byte string, the encoding for the number of elements in an | |||
array or pairs in a map, or the encoding for a tag number. | ||||
2.1. Extended Generic Data Models | 2.1. Extended Generic Data Models | |||
This basic generic data model has been extended in this document by | This basic generic data model has been extended in this document by | |||
the registration of a number of simple values and tag numbers, such | the registration of a number of simple values and tag numbers, such | |||
as: | as: | |||
* "false", "true", "null", and "undefined" (simple values identified | * "false", "true", "null", and "undefined" (simple values identified | |||
by 20..23, Section 3.3) | by 20..23, Section 3.3) | |||
skipping to change at line 464 | skipping to change at line 474 | |||
CBOR format. In the present version of CBOR, the encoded item is | CBOR format. In the present version of CBOR, the encoded item is | |||
not well-formed. | not well-formed. | |||
31: No argument value is derived. If the major type is 0, 1, or 6, | 31: No argument value is derived. If the major type is 0, 1, or 6, | |||
the encoded item is not well-formed. For major types 2 to 5, the | the encoded item is not well-formed. For major types 2 to 5, the | |||
item's length is indefinite, and for major type 7, the byte does | item's length is indefinite, and for major type 7, the byte does | |||
not constitute a data item at all but terminates an indefinite- | not constitute a data item at all but terminates an indefinite- | |||
length item; all are described in Section 3.2. | length item; all are described in Section 3.2. | |||
The initial byte and any additional bytes consumed to construct the | The initial byte and any additional bytes consumed to construct the | |||
argument are collectively referred to as the "head" of the data item. | argument are collectively referred to as the _head_ of the data item. | |||
The meaning of this argument depends on the major type. For example, | The meaning of this argument depends on the major type. For example, | |||
in major type 0, the argument is the value of the data item itself | in major type 0, the argument is the value of the data item itself | |||
(and in major type 1, the value of the data item is computed from the | (and in major type 1, the value of the data item is computed from the | |||
argument); in major type 2 and 3, it gives the length of the string | argument); in major type 2 and 3, it gives the length of the string | |||
data in bytes that follow; and in major types 4 and 5, it is used to | data in bytes that follow; and in major types 4 and 5, it is used to | |||
determine the number of data items enclosed. | determine the number of data items enclosed. | |||
If the encoded sequence of bytes ends before the end of a data item, | If the encoded sequence of bytes ends before the end of a data item, | |||
that item is not well-formed. If the encoded sequence of bytes still | that item is not well-formed. If the encoded sequence of bytes still | |||
skipping to change at line 492 | skipping to change at line 502 | |||
256 defined values for the initial byte (Table 7). A decoder in a | 256 defined values for the initial byte (Table 7). A decoder in a | |||
constrained implementation can instead use the structure of the | constrained implementation can instead use the structure of the | |||
initial byte and following bytes for more compact code (see | initial byte and following bytes for more compact code (see | |||
Appendix C for a rough impression of how this could look). | Appendix C for a rough impression of how this could look). | |||
3.1. Major Types | 3.1. Major Types | |||
The following lists the major types and the additional information | The following lists the major types and the additional information | |||
and other bytes associated with the type. | and other bytes associated with the type. | |||
Major type 0: an unsigned integer in the range 0..2**64-1 inclusive. | Major type 0: | |||
The value of the encoded item is the argument itself. For | An unsigned integer in the range 0..2^(64)-1 inclusive. The value | |||
example, the integer 10 is denoted as the one byte 0b000_01010 | of the encoded item is the argument itself. For example, the | |||
(major type 0, additional information 10). The integer 500 would | integer 10 is denoted as the one byte 0b000_01010 (major type 0, | |||
be 0b000_11001 (major type 0, additional information 25) followed | additional information 10). The integer 500 would be 0b000_11001 | |||
by the two bytes 0x01f4, which is 500 in decimal. | (major type 0, additional information 25) followed by the two | |||
bytes 0x01f4, which is 500 in decimal. | ||||
Major type 1: a negative integer in the range -2**64..-1 inclusive. | Major type 1: | |||
The value of the item is -1 minus the argument. For example, the | A negative integer in the range -2^(64)..-1 inclusive. The value | |||
integer -500 would be 0b001_11001 (major type 1, additional | of the item is -1 minus the argument. For example, the integer | |||
information 25) followed by the two bytes 0x01f3, which is 499 in | -500 would be 0b001_11001 (major type 1, additional information | |||
decimal. | 25) followed by the two bytes 0x01f3, which is 499 in decimal. | |||
Major type 2: a byte string. The number of bytes in the string is | Major type 2: | |||
equal to the argument. For example, a byte string whose length is | A byte string. The number of bytes in the string is equal to the | |||
5 would have an initial byte of 0b010_00101 (major type 2, | argument. For example, a byte string whose length is 5 would have | |||
additional information 5 for the length), followed by 5 bytes of | an initial byte of 0b010_00101 (major type 2, additional | |||
binary content. A byte string whose length is 500 would have 3 | information 5 for the length), followed by 5 bytes of binary | |||
initial bytes of 0b010_11001 (major type 2, additional information | content. A byte string whose length is 500 would have 3 initial | |||
25 to indicate a two-byte length) followed by the two bytes 0x01f4 | bytes of 0b010_11001 (major type 2, additional information 25 to | |||
for a length of 500, followed by 500 bytes of binary content. | indicate a two-byte length) followed by the two bytes 0x01f4 for a | |||
length of 500, followed by 500 bytes of binary content. | ||||
Major type 3: a text string (Section 2) encoded as UTF-8 [RFC3629]. | Major type 3: | |||
The number of bytes in the string is equal to the argument. A | A text string (Section 2) encoded as UTF-8 [RFC3629]. The number | |||
string containing an invalid UTF-8 sequence is well-formed but | of bytes in the string is equal to the argument. A string | |||
invalid (Section 1.2). This type is provided for systems that | containing an invalid UTF-8 sequence is well-formed but invalid | |||
need to interpret or display human-readable text, and allows the | (Section 1.2). This type is provided for systems that need to | |||
interpret or display human-readable text, and allows the | ||||
differentiation between unstructured bytes and text that has a | differentiation between unstructured bytes and text that has a | |||
specified repertoire (that of Unicode) and encoding (UTF-8). In | specified repertoire (that of Unicode) and encoding (UTF-8). In | |||
contrast to formats such as JSON, the Unicode characters in this | contrast to formats such as JSON, the Unicode characters in this | |||
type are never escaped. Thus, a newline character (U+000A) is | type are never escaped. Thus, a newline character (U+000A) is | |||
always represented in a string as the byte 0x0a, and never as the | always represented in a string as the byte 0x0a, and never as the | |||
bytes 0x5c6e (the characters "\" and "n") nor as 0x5c7530303061 | bytes 0x5c6e (the characters "\" and "n") nor as 0x5c7530303061 | |||
(the characters "\", "u", "0", "0", "0", and "a"). | (the characters "\", "u", "0", "0", "0", and "a"). | |||
Major type 4: an array of data items. In other formats, arrays are | Major type 4: | |||
also called lists, sequences, or tuples (a "CBOR sequence" is | An array of data items. In other formats, arrays are also called | |||
something slightly different, though [RFC8742]). The argument is | lists, sequences, or tuples (a "CBOR sequence" is something | |||
the number of data items in the array. Items in an array do not | slightly different, though [RFC8742]). The argument is the number | |||
need to all be of the same type. For example, an array that | of data items in the array. Items in an array do not need to all | |||
contains 10 items of any type would have an initial byte of | be of the same type. For example, an array that contains 10 items | |||
0b100_01010 (major type 4, additional information 10 for the | of any type would have an initial byte of 0b100_01010 (major type | |||
length) followed by the 10 remaining items. | 4, additional information 10 for the length) followed by the 10 | |||
remaining items. | ||||
Major type 5: a map of pairs of data items. Maps are also called | Major type 5: | |||
tables, dictionaries, hashes, or objects (in JSON). A map is | A map of pairs of data items. Maps are also called tables, | |||
comprised of pairs of data items, each pair consisting of a key | dictionaries, hashes, or objects (in JSON). A map is comprised of | |||
that is immediately followed by a value. The argument is the | pairs of data items, each pair consisting of a key that is | |||
number of _pairs_ of data items in the map. For example, a map | immediately followed by a value. The argument is the number of | |||
that contains 9 pairs would have an initial byte of 0b101_01001 | _pairs_ of data items in the map. For example, a map that | |||
(major type 5, additional information 9 for the number of pairs) | contains 9 pairs would have an initial byte of 0b101_01001 (major | |||
followed by the 18 remaining items. The first item is the first | type 5, additional information 9 for the number of pairs) followed | |||
key, the second item is the first value, the third item is the | by the 18 remaining items. The first item is the first key, the | |||
second key, and so on. Because items in a map come in pairs, | second item is the first value, the third item is the second key, | |||
their total number is always even: a map that contains an odd | and so on. Because items in a map come in pairs, their total | |||
number of items (no value data present after the last key data | number is always even: a map that contains an odd number of items | |||
item) is not well-formed. A map that has duplicate keys may be | (no value data present after the last key data item) is not well- | |||
well-formed, but it is not valid, and thus it causes indeterminate | formed. A map that has duplicate keys may be well-formed, but it | |||
decoding; see also Section 5.6. | is not valid, and thus it causes indeterminate decoding; see also | |||
Section 5.6. | ||||
Major type 6: a tagged data item ("tag") whose tag number, an | Major type 6: | |||
integer in the range 0..2**64-1 inclusive, is the argument and | A tagged data item ("tag") whose tag number, an integer in the | |||
whose enclosed data item ("tag content") is the single encoded | range 0..2^(64)-1 inclusive, is the argument and whose enclosed | |||
data item that follows the head. See Section 3.4. | data item (_tag content_) is the single encoded data item that | |||
follows the head. See Section 3.4. | ||||
Major type 7: floating-point numbers and simple values, as well as | Major type 7: | |||
the "break" stop code. See Section 3.3. | Floating-point numbers and simple values, as well as the "break" | |||
stop code. See Section 3.3. | ||||
These eight major types lead to a simple table showing which of the | These eight major types lead to a simple table showing which of the | |||
256 possible values for the initial byte of a data item are used | 256 possible values for the initial byte of a data item are used | |||
(Table 7). | (Table 7). | |||
In major types 6 and 7, many of the possible values are reserved for | In major types 6 and 7, many of the possible values are reserved for | |||
future specification. See Section 9 for more information on these | future specification. See Section 9 for more information on these | |||
values. | values. | |||
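
The byte-level examples given for major types 0 to 5 above can be reproduced with a short Python sketch. It assumes the shortest-form head encoding (additional information 0..23 inline, then 24, 25, 26, 27 for 1, 2, 4, and 8 following bytes), of which only the value 25 appears in the examples of this hunk; the rest comes from the wider specification. The function names are hypothetical, and major type 7 is deliberately left out because its additional information is interpreted differently.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode the head (initial byte plus argument bytes) for major types 0..6."""
    if not 0 <= major_type <= 6:
        raise ValueError("this sketch only covers major types 0..6")
    if not 0 <= argument < 2**64:
        raise ValueError("argument must fit in 64 bits")
    initial = major_type << 5  # major type occupies the three high bits
    if argument < 24:
        return bytes([initial | argument])
    # Additional information 24..27 announces 1, 2, 4, or 8 argument bytes.
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 2 ** (8 * size):
            return bytes([initial | additional_info]) + argument.to_bytes(size, "big")
    raise AssertionError("unreachable: argument range was checked above")


def encode_int(value: int) -> bytes:
    """Major type 0 for non-negative values, major type 1 otherwise."""
    if value >= 0:
        return encode_head(0, value)
    return encode_head(1, -1 - value)  # the item's value is -1 minus the argument


def encode_byte_string(data: bytes) -> bytes:
    """Major type 2: the argument is the number of bytes that follow."""
    return encode_head(2, len(data)) + data


def encode_text_string(text: str) -> bytes:
    """Major type 3: the argument counts UTF-8 bytes, not code points."""
    utf8 = text.encode("utf-8")
    return encode_head(3, len(utf8)) + utf8
```

Under these assumptions, `encode_int(500)` produces 0b000_11001 followed by 0x01f4, and `encode_int(-500)` produces 0b001_11001 followed by 0x01f3, matching the examples in the text.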
Table 1 summarizes the major types defined by CBOR, ignoring | Table 1 summarizes the major types defined by CBOR, ignoring | |||
skipping to change at line 829 | skipping to change at line 845 | |||
Table 3: Values for Additional Information in Major Type 7 | Table 3: Values for Additional Information in Major Type 7 | |||
As with all other major types, the 5-bit value 24 signifies a single- | As with all other major types, the 5-bit value 24 signifies a single- | |||
byte extension: it is followed by an additional byte to represent the | byte extension: it is followed by an additional byte to represent the | |||
simple value. (To minimize confusion, only the values 32 to 255 are | simple value. (To minimize confusion, only the values 32 to 255 are | |||
used.) This maintains the structure of the initial bytes: as for the | used.) This maintains the structure of the initial bytes: as for the | |||
other major types, the length of these always depends on the | other major types, the length of these always depends on the | |||
additional information in the first byte. Table 4 lists the numeric | additional information in the first byte. Table 4 lists the numeric | |||
values assigned and available for simple values. | values assigned and available for simple values. | |||
+=========+============+ | +=========+==============+ | |||
| Value | Semantics | | | Value | Semantics | | |||
+=========+============+ | +=========+==============+ | |||
| 0..19 | Unassigned | | | 0..19 | (unassigned) | | |||
+---------+------------+ | +---------+--------------+ | |||
| 20 | False | | | 20 | false | | |||
+---------+------------+ | +---------+--------------+ | |||
| 21 | True | | | 21 | true | | |||
+---------+------------+ | +---------+--------------+ | |||
| 22 | Null | | | 22 | null | | |||
+---------+------------+ | +---------+--------------+ | |||
| 23 | Undefined | | | 23 | undefined | | |||
+---------+------------+ | +---------+--------------+ | |||
| 24..31 | Reserved | | | 24..31 | (reserved) | | |||
+---------+------------+ | +---------+--------------+ | |||
| 32..255 | Unassigned | | | 32..255 | (unassigned) | | |||
+---------+------------+ | +---------+--------------+ | |||
Table 4: Simple Values | Table 4: Simple Values | |||
An encoder MUST NOT issue two-byte sequences that start with 0xf8 | An encoder MUST NOT issue two-byte sequences that start with 0xf8 | |||
(major type 7, additional information 24) and continue with a byte | (major type 7, additional information 24) and continue with a byte | |||
less than 0x20 (32 decimal). Such sequences are not well-formed. | less than 0x20 (32 decimal). Such sequences are not well-formed. | |||
(This implies that an encoder cannot encode false, true, null, or | (This implies that an encoder cannot encode "false", "true", "null", | |||
undefined in two-byte sequences and that only the one-byte variants | or "undefined" in two-byte sequences and that only the one-byte | |||
of these are well-formed; more generally speaking, each simple value | variants of these are well-formed; more generally speaking, each | |||
only has a single representation variant). | simple value only has a single representation variant). | |||
The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit | |||
IEEE 754 binary floating-point values [IEEE754]. These floating- | IEEE 754 binary floating-point values [IEEE754]. These floating- | |||
point values are encoded in the additional bytes of the appropriate | point values are encoded in the additional bytes of the appropriate | |||
size. (See Appendix D for some information about 16-bit floating- | size. (See Appendix D for some information about 16-bit floating- | |||
point numbers.) | point numbers.) | |||
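
The rules for simple values and for the 64-bit floating-point case can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, and simple values 0..19 are assumed to use the one-byte form in the same way that 20..23 do.

```python
import struct

MAJOR_TYPE_7 = 7 << 5  # 0b111_00000


def encode_simple(value: int) -> bytes:
    """Encode a simple value using its single permitted representation."""
    if 0 <= value <= 23:
        # One-byte form: the value sits in the additional information.
        return bytes([MAJOR_TYPE_7 | value])
    if 24 <= value <= 31:
        raise ValueError("simple values 24..31 are reserved")
    if 32 <= value <= 255:
        # Two-byte form: 0xf8 (additional information 24) plus one byte.
        return bytes([MAJOR_TYPE_7 | 24, value])
    raise ValueError("simple values are identified by a number in 0..255")


def is_well_formed_two_byte_simple(encoded: bytes) -> bool:
    """Reject 0xf8 followed by a byte less than 0x20, as the text requires."""
    return len(encoded) == 2 and encoded[0] == 0xF8 and encoded[1] >= 0x20


def encode_float64(value: float) -> bytes:
    """Additional information 27: an IEEE 754 binary64 in network byte order."""
    return bytes([MAJOR_TYPE_7 | 27]) + struct.pack(">d", value)
```

In this sketch, "false", "true", "null", and "undefined" come out as the single bytes 0xf4, 0xf5, 0xf6, and 0xf7, and there is no code path that could emit them in two-byte form.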
3.4. Tagging of Items | 3.4. Tagging of Items | |||
In CBOR, a data item can be enclosed by a tag to give it some | In CBOR, a data item can be enclosed by a tag to give it some | |||
additional semantics, as uniquely identified by a "tag number". The | additional semantics, as uniquely identified by a _tag number_. The | |||
tag is major type 6, its argument (Section 3) indicates the tag | tag is major type 6, its argument (Section 3) indicates the tag | |||
number, and it contains a single enclosed data item, the "tag | number, and it contains a single enclosed data item, the _tag | |||
content". (If a tag requires further structure to its content, this | content_. (If a tag requires further structure to its content, this | |||
structure is provided by the enclosed data item.) We use the term | structure is provided by the enclosed data item.) We use the term | |||
"tag" for the entire data item consisting of both a tag number and | _tag_ for the entire data item consisting of both a tag number and | |||
the tag content: the tag content is the data item that is being | the tag content: the tag content is the data item that is being | |||
tagged. | tagged. | |||
For example, assume that a byte string of length 12 is marked with a | For example, assume that a byte string of length 12 is marked with a | |||
tag of number 2 to indicate it is a positive "bignum" | tag of number 2 to indicate it is an unsigned _bignum_ | |||
(Section 3.4.3). The encoded data item would start with a byte | (Section 3.4.3). The encoded data item would start with a byte | |||
0b110_00010 (major type 6, additional information 2 for the tag | 0b110_00010 (major type 6, additional information 2 for the tag | |||
number) followed by the encoded tag content: 0b010_01100 (major type | number) followed by the encoded tag content: 0b010_01100 (major type | |||
2, additional information 12 for the length) followed by the 12 bytes | 2, additional information 12 for the length) followed by the 12 bytes | |||
of the bignum. | of the bignum. | |||
In the extended generic data model, a tag number's definition | In the extended generic data model, a tag number's definition | |||
describes the additional semantics conveyed with the tag number. | describes the additional semantics conveyed with the tag number. | |||
These semantics may include equivalence of some tagged data items | These semantics may include equivalence of some tagged data items | |||
with other data items, including some that can be represented in the | with other data items, including some that can be represented in the | |||
basic generic data model. For instance, 0xc24101, a bignum the tag | basic generic data model. For instance, 0xc24101, a bignum the tag | |||
content of which is the byte string with the single byte 0x01, is | content of which is the byte string with the single byte 0x01, is | |||
equivalent to an integer 1, which could also be encoded as 0x01, | equivalent to an integer 1, which could also be encoded as 0x01, | |||
0x1801, or 0x190001. The tag definition may specify a preferred | 0x1801, or 0x190001. The tag definition may specify a preferred | |||
serialization (Section 4.1) that is recommended for generic encoders; | serialization (Section 4.1) that is recommended for generic encoders; | |||
this may prefer basic generic data model representations over ones | this may prefer basic generic data model representations over ones | |||
that employ a tag. | that employ a tag. | |||
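
The equivalence between 0xc24101 and the integer 1 can be checked with a small Python sketch. It assumes that the bignum's byte string is read in network byte order, as Section 3.4.3 (not part of this hunk) specifies, and it only handles tag content whose length fits in the initial byte (fewer than 24 bytes). The function name is hypothetical.

```python
def decode_unsigned_bignum(encoded: bytes) -> int:
    """Decode tag number 2 wrapping a short definite-length byte string."""
    if len(encoded) < 2:
        raise ValueError("need at least the tag head and the string head")
    if encoded[0] != 0b110_00010:  # major type 6, tag number 2
        raise ValueError("not a tag with tag number 2")
    string_head = encoded[1]
    if string_head >> 5 != 2:  # tag content must be a byte string
        raise ValueError("tag content is not a byte string")
    length = string_head & 0b11111
    if length >= 24:
        raise ValueError("this sketch only handles lengths below 24")
    content = encoded[2:]
    if len(content) != length:
        raise ValueError("byte string length does not match its argument")
    # Interpret the content as an unsigned integer in network byte order.
    return int.from_bytes(content, "big")
```

Applied to the three bytes 0xc2, 0x41, 0x01, this returns 1, the same value that the basic generic data model encodings 0x01, 0x1801, and 0x190001 represent.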
The tag definition usually restricts what kinds of nested data item | The tag definition usually defines which nested data items are valid | |||
or items are valid for such tags. Tag definitions may restrict their | for such tags. Tag definitions may restrict their content to a very | |||
content to a very specific syntactic structure, as the tags defined | specific syntactic structure, as the tags defined in this document | |||
in this document do, or they may aim at a more semantically defined | do, or they may define their content more semantically. An example | |||
definition of their content, as for instance tags 40 and 1040 do | for the latter is how tags 40 and 1040 accept multiple ways to | |||
[RFC8746]: these accept a number of different ways of representing | represent arrays [RFC8746]. | |||
arrays. | ||||
As a matter of convention, many tags do not accept null or undefined | As a matter of convention, many tags do not accept "null" or | |||
values as tag content; instead, a null or undefined value can be used | "undefined" values as tag content; instead, the expectation is that a | |||
in place of the entire tag. For example, Section 3.4.2 provides | "null" or "undefined" value can be used in place of the entire tag; | |||
guidance on the handling of this convention in application protocols | Section 3.4.2 provides some further considerations for one specific | |||
and the mapping to platform types for tag number 1. | tag about the handling of this convention in application protocols | |||
and in mapping to platform types. | ||||
Decoders do not need to understand tags of every tag number, and tags | Decoders do not need to understand tags of every tag number, and tags | |||
may be of little value in applications where the implementation | may be of little value in applications where the implementation | |||
creating a particular CBOR data item and the implementation decoding | creating a particular CBOR data item and the implementation decoding | |||
that stream know the semantic meaning of each item in the data flow. | that stream know the semantic meaning of each item in the data flow. | |||
The primary purpose of tags in this specification is to define common | The primary purpose of tags in this specification is to define common | |||
data types such as dates. A secondary purpose is to provide | data types such as dates. A secondary purpose is to provide | |||
conversion hints when it is foreseen that the CBOR data item needs to | conversion hints when it is foreseen that the CBOR data item needs to | |||
be translated into a different format, requiring hints about the | be translated into a different format, requiring hints about the | |||
content of items. Understanding the semantics of tags is optional | content of items. Understanding the semantics of tags is optional | |||
skipping to change at line 943 ¶ | skipping to change at line 959 ¶ | |||
+=======+=============+==================================+ | +=======+=============+==================================+ | |||
| Tag | Data Item | Semantics | | | Tag | Data Item | Semantics | | |||
+=======+=============+==================================+ | +=======+=============+==================================+ | |||
| 0 | text string | Standard date/time string; see | | | 0 | text string | Standard date/time string; see | | |||
| | | Section 3.4.1 | | | | | Section 3.4.1 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 1 | integer or | Epoch-based date/time; see | | | 1 | integer or | Epoch-based date/time; see | | |||
| | float | Section 3.4.2 | | | | float | Section 3.4.2 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 2 | byte string | Positive bignum; see | | | 2 | byte string | Unsigned bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 3 | byte string | Negative bignum; see | | | 3 | byte string | Negative bignum; see | | |||
| | | Section 3.4.3 | | | | | Section 3.4.3 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 4 | array | Decimal fraction; see | | | 4 | array | Decimal fraction; see | | |||
| | | Section 3.4.4 | | | | | Section 3.4.4 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
| 5 | array | Bigfloat; see Section 3.4.4 | | | 5 | array | Bigfloat; see Section 3.4.4 | | |||
+-------+-------------+----------------------------------+ | +-------+-------------+----------------------------------+ | |||
skipping to change at line 997 ¶ | skipping to change at line 1013 ¶ | |||
into the CBOR data item. This means these tags cannot be implemented | into the CBOR data item. This means these tags cannot be implemented | |||
on top of an arbitrary generic CBOR encoder/decoder (which might not | on top of an arbitrary generic CBOR encoder/decoder (which might not | |||
reflect the serialization order for entries in a map at the data | reflect the serialization order for entries in a map at the data | |||
model level and vice versa); their implementation therefore typically | model level and vice versa); their implementation therefore typically | |||
needs to be integrated into the generic encoder/decoder. The | needs to be integrated into the generic encoder/decoder. The | |||
definition of new tags with this property is NOT RECOMMENDED. | definition of new tags with this property is NOT RECOMMENDED. | |||
IANA allocated tag numbers 65535, 4294967295, and | IANA allocated tag numbers 65535, 4294967295, and | |||
18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | 18446744073709551615 (binary all-ones in 16-bit, 32-bit, and 64-bit). | |||
These can be used as a convenience for implementers who want a | These can be used as a convenience for implementers who want a | |||
single-integer data structure to indicate either the presence or | single-integer data structure to indicate either the presence of a | |||
absence of a specific tag. That allocation is described in | specific tag or absence of a tag. That allocation is described in | |||
Section 10 of [CBOR-TAGS]. These tags are not intended to occur in | Section 10 of [CBOR-TAGS]. These tags are not intended to occur in | |||
actual CBOR data items; implementations MAY flag such an occurrence | actual CBOR data items; implementations MAY flag such an occurrence | |||
as an error. | as an error. | |||
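A decoder that chooses to flag these reserved tag numbers could do so with a simple membership test. The following is a minimal sketch in Python; the names RESERVED_TAG_NUMBERS and is_reserved_tag_number are illustrative and do not come from the specification.

```python
# All-ones values in 16, 32, and 64 bits. These tag numbers are not
# intended to occur in actual CBOR data items.
RESERVED_TAG_NUMBERS = frozenset((1 << bits) - 1 for bits in (16, 32, 64))


def is_reserved_tag_number(tag_number: int) -> bool:
    """Return True if tag_number is one of the reserved all-ones values."""
    return tag_number in RESERVED_TAG_NUMBERS
```

Whether a hit is treated as an error is left to the implementation, since the text only says that implementations MAY flag it.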
Protocols can extend the generic data model (Section 2) with data | Protocols can extend the generic data model (Section 2) with data | |||
items representing points in time by using tag numbers 0 and 1, with | items representing points in time by using tag numbers 0 and 1, with | |||
arbitrarily sized integers by using tag numbers 2 and 3, and with | arbitrarily sized integers by using tag numbers 2 and 3, and with | |||
floating-point values of arbitrary size and precision by using tag | floating-point values of arbitrary size and precision by using tag | |||
numbers 4 and 5. | numbers 4 and 5. | |||
skipping to change at line 1051 ¶ | skipping to change at line 1067 ¶ | |||
non-finite values. | non-finite values. | |||
To indicate fractional seconds, floating-point values can be used | To indicate fractional seconds, floating-point values can be used | |||
within tag number 1 instead of integer values. Note that this | within tag number 1 instead of integer values. Note that this | |||
generally requires binary64 support, as binary16 and binary32 provide | generally requires binary64 support, as binary16 and binary32 provide | |||
nonzero fractions of seconds only for a short period of time around | nonzero fractions of seconds only for a short period of time around | |||
early 1970. An application that requires tag number 1 support may | early 1970. An application that requires tag number 1 support may | |||
restrict the tag content to be an integer (or a floating-point value) | restrict the tag content to be an integer (or a floating-point value) | |||
only. | only. | |||
Note that platform types for date/time may include null or undefined | Note that platform types for date/time may include "null" or | |||
values, which may also be desirable at an application protocol level. | "undefined" values, which may also be desirable at an application | |||
While emitting tag number 1 values with non-finite tag content values | protocol level. While emitting tag number 1 values with non-finite | |||
(e.g., with NaN for undefined date/time values or with Infinite for | tag content values (e.g., with NaN for undefined date/time values or | |||
an expiry date that is not set) may seem an obvious way to handle | with Infinity for an expiry date that is not set) may seem an obvious | |||
this, using untagged null or undefined avoids the use of non-finites | way to handle this, using untagged "null" or "undefined" avoids the | |||
and results in a shorter encoding. Application protocol designers | use of non-finites and results in a shorter encoding. Application | |||
are encouraged to consider these cases and include clear guidelines | protocol designers are encouraged to consider these cases and include | |||
for handling them. | clear guidelines for handling them. | |||
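The size argument above can be illustrated with a small encoder that emits either tag number 1 over an unsigned integer or an untagged null. This is only a sketch under stated assumptions: the function names are hypothetical, only non-negative integer seconds are handled, and TAGGED_NAN is included purely for a size comparison.

```python
import struct
from typing import Optional


def encode_uint(value: int) -> bytes:
    """Shortest encoding of an unsigned integer (major type 0)."""
    if value < 0:
        raise ValueError("sketch only handles non-negative integers")
    if value < 24:
        return bytes([value])
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if value < limit:
            return bytes([info]) + struct.pack(fmt, value)
    raise ValueError("value does not fit in major type 0")


def encode_optional_epoch_seconds(seconds: Optional[int]) -> bytes:
    """Tag number 1 over an integer, or untagged null when no time is set."""
    if seconds is None:
        return b"\xf6"  # simple value null, a single byte
    return b"\xc1" + encode_uint(seconds)  # 0xc1 is major type 6, tag number 1


# Tag number 1 over a half-precision NaN: four bytes instead of one.
TAGGED_NAN = b"\xc1\xf9\x7e\x00"
```

For instance, encode_optional_epoch_seconds(1363896240) yields the bytes c1 1a 51 4b 67 b0, while a missing value costs a single byte.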
3.4.3. Bignums | 3.4.3. Bignums | |||
Protocols using tag numbers 2 and 3 extend the generic data model | Protocols using tag numbers 2 and 3 extend the generic data model | |||
(Section 2) with "bignums" representing arbitrarily sized integers. | (Section 2) with "bignums" representing arbitrarily sized integers. | |||
In the basic generic data model, bignum values are not equal to | In the basic generic data model, bignum values are not equal to | |||
integers from the same model, but the extended generic data model | integers from the same model, but the extended generic data model | |||
created by this tag definition defines equivalence based on numeric | created by this tag definition defines equivalence based on numeric | |||
value, and preferred serialization (Section 4.1) never makes use of | value, and preferred serialization (Section 4.1) never makes use of | |||
bignums that also can be expressed as basic integers (see below). | bignums that also can be expressed as basic integers (see below). | |||
skipping to change at line 1089 ¶ | skipping to change at line 1105 ¶ | |||
leading zeroes. The preferred serialization of an integer that can | leading zeroes. The preferred serialization of an integer that can | |||
be represented using major type 0 or 1 is to encode it this way | be represented using major type 0 or 1 is to encode it this way | |||
instead of as a bignum (which means that the empty string never | instead of as a bignum (which means that the empty string never | |||
occurs in a bignum when using preferred serialization). Note that | occurs in a bignum when using preferred serialization). Note that | |||
this means the non-preferred choice of a bignum representation | this means the non-preferred choice of a bignum representation | |||
instead of a basic integer for encoding a number is not intended to | instead of a basic integer for encoding a number is not intended to | |||
have application semantics (just as the choice of a longer basic | have application semantics (just as the choice of a longer basic | |||
integer representation than needed, such as 0x1800 for 0x00, does | integer representation than needed, such as 0x1800 for 0x00, does | |||
not). | not). | |||
For example, the number 18446744073709551616 (2**64) is represented | For example, the number 18446744073709551616 (2^(64)) is represented | |||
as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | as 0b110_00010 (major type 6, tag number 2), followed by 0b010_01001 | |||
(major type 2, length 9), followed by 0x010000000000000000 (one byte | (major type 2, length 9), followed by 0x010000000000000000 (one byte | |||
0x01 and eight bytes 0x00). In hexadecimal: | 0x01 and eight bytes 0x00). In hexadecimal: | |||
C2 -- Tag 2 | C2 -- Tag 2 | |||
49 -- Byte string of length 9 | 49 -- Byte string of length 9 | |||
010000000000000000 -- Bytes content | 010000000000000000 -- Bytes content | |||
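The byte sequence above can be reproduced with a short sketch in Python. The function names are illustrative. The sketch assumes the usual definition that a tag number 3 bignum with content n has the value -1 - n, and it only handles byte strings shorter than 24 bytes so that the length fits in the initial byte.

```python
def encode_bignum(value: int) -> bytes:
    """Encode an integer as a tag number 2 or 3 bignum."""
    # 0xC2 and 0xC3 are major type 6 with tag numbers 2 and 3.
    tag, n = (0xC2, value) if value >= 0 else (0xC3, -1 - value)
    content = n.to_bytes((n.bit_length() + 7) // 8, "big")  # no leading zeroes
    if len(content) >= 24:
        raise ValueError("sketch only handles byte strings shorter than 24 bytes")
    return bytes([tag, 0x40 | len(content)]) + content  # 0x40 is major type 2


def decode_bignum(data: bytes) -> int:
    """Decode a bignum produced by encode_bignum."""
    tag, head, content = data[0], data[1], data[2:]
    length = head & 0x1F
    if tag not in (0xC2, 0xC3) or head >> 5 != 2 or length >= 24:
        raise ValueError("not a short-form bignum")
    if len(content) != length:
        raise ValueError("byte string length mismatch")
    n = int.from_bytes(content, "big")
    return n if tag == 0xC2 else -1 - n
```

This encoder always produces a bignum, even for values that preferred serialization would encode with major type 0 or 1; a real encoder would check the range first.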
3.4.4. Decimal Fractions and Bigfloats | 3.4.4. Decimal Fractions and Bigfloats | |||
Protocols using tag number 4 extend the generic data model with data | Protocols using tag number 4 extend the generic data model with data | |||
items representing arbitrary-length decimal fractions of the form | items representing arbitrary-length decimal fractions of the form | |||
m*(10**e). Protocols using tag number 5 extend the generic data | m*(10^(e)). Protocols using tag number 5 extend the generic data | |||
model with data items representing arbitrary-length binary fractions | model with data items representing arbitrary-length binary fractions | |||
of the form m*(2**e). As with bignums, values of different types are | of the form m*(2^(e)). As with bignums, values of different types | |||
not equal in the generic data model. | are not equal in the generic data model. | |||
Decimal fractions combine an integer mantissa with a base-10 scaling | Decimal fractions combine an integer mantissa with a base-10 scaling | |||
factor. They are most useful if an application needs the exact | factor. They are most useful if an application needs the exact | |||
representation of a decimal fraction such as 1.1 because there is no | representation of a decimal fraction such as 1.1 because there is no | |||
exact representation for many decimal fractions in binary floating- | exact representation for many decimal fractions in binary floating- | |||
point representations. | point representations. | |||
"Bigfloats" combine an integer mantissa with a base-2 scaling factor. | "Bigfloats" combine an integer mantissa with a base-2 scaling factor. | |||
They are binary floating-point values that can exceed the range or | They are binary floating-point values that can exceed the range or | |||
the precision of the three IEEE 754 formats supported by CBOR | the precision of the three IEEE 754 formats supported by CBOR | |||
(Section 3.3). Bigfloats may also be used by constrained | (Section 3.3). Bigfloats may also be used by constrained | |||
applications that need some basic binary floating-point capability | applications that need some basic binary floating-point capability | |||
without the need for supporting IEEE 754. | without the need for supporting IEEE 754. | |||
A decimal fraction or a bigfloat is represented as a tagged array | A decimal fraction or a bigfloat is represented as a tagged array | |||
that contains exactly two integer numbers: an exponent e and a | that contains exactly two integer numbers: an exponent e and a | |||
mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | mantissa m. Decimal fractions (tag number 4) use base-10 exponents; | |||
the value of a decimal fraction data item is m*(10**e). Bigfloats | the value of a decimal fraction data item is m*(10^(e)). Bigfloats | |||
(tag number 5) use base-2 exponents; the value of a bigfloat data | (tag number 5) use base-2 exponents; the value of a bigfloat data | |||
item is m*(2**e). The exponent e MUST be represented in an integer | item is m*(2^(e)). The exponent e MUST be represented in an integer | |||
of major type 0 or 1, while the mantissa can also be a bignum | of major type 0 or 1, while the mantissa can also be a bignum | |||
(Section 3.4.3). Contained items with other structures are invalid. | (Section 3.4.3). Contained items with other structures are invalid. | |||
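The relationship between the array [e, m] and the represented value can be sketched with exact arithmetic in Python. The helper names below are hypothetical, and the sketch works at the data model level only (it does not produce encoded bytes).

```python
from decimal import Decimal
from fractions import Fraction
from typing import Tuple


def decimal_fraction_parts(text: str) -> Tuple[int, int]:
    """Return (exponent e, mantissa m) such that m * 10**e equals text."""
    sign, digits, exponent = Decimal(text).as_tuple()
    if not isinstance(exponent, int):
        raise ValueError("non-finite values have no [e, m] form")
    mantissa = int("".join(map(str, digits)))
    return exponent, -mantissa if sign else mantissa


def tagged_value(tag_number: int, exponent: int, mantissa: int) -> Fraction:
    """Exact value of a tag number 4 (base 10) or 5 (base 2) item."""
    base = {4: 10, 5: 2}[tag_number]
    return mantissa * Fraction(base) ** exponent
```

With these helpers, decimal_fraction_parts("273.15") gives the pair (-2, 27315), and tagged_value(5, -1, 3) is exactly 3/2.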
An example of a decimal fraction is the representation of the number | An example of a decimal fraction is the representation of the number | |||
273.15 as 0b110_00100 (major type 6 for tag, additional information 4 | 273.15 as 0b110_00100 (major type 6 for tag, additional information 4 | |||
for the tag number), followed by 0b100_00010 (major type 4 for the | for the tag number), followed by 0b100_00010 (major type 4 for the | |||
array, additional information 2 for the length of the array), | array, additional information 2 for the length of the array), | |||
followed by 0b001_00001 (major type 1 for the first integer, | followed by 0b001_00001 (major type 1 for the first integer, | |||
additional information 1 for the value of -2), followed by | additional information 1 for the value of -2), followed by | |||
0b000_11001 (major type 0 for the second integer, additional | 0b000_11001 (major type 0 for the second integer, additional | |||
skipping to change at line 1444 ¶ | skipping to change at line 1460 ¶ | |||
using Section 3.4.5.3, tag number 32 containing a text string. This | using Section 3.4.5.3, tag number 32 containing a text string. This | |||
protocol's deterministic encoding needs either to require that the | protocol's deterministic encoding needs either to require that the | |||
tag is present or to require that it is absent, not allow either one. | tag is present or to require that it is absent, not allow either one. | |||
In a protocol that does require tags in certain places to obtain | In a protocol that does require tags in certain places to obtain | |||
specific semantics, the tag needs to appear in the deterministic | specific semantics, the tag needs to appear in the deterministic | |||
format as well. Deterministic encoding considerations also apply to | format as well. Deterministic encoding considerations also apply to | |||
the content of tags. | the content of tags. | |||
If a protocol includes a field that can express integers with an | If a protocol includes a field that can express integers with an | |||
absolute value of 2**64 or larger using tag numbers 2 or 3 | absolute value of 2^(64) or larger using tag numbers 2 or 3 | |||
(Section 3.4.3), the protocol's deterministic encoding needs to | (Section 3.4.3), the protocol's deterministic encoding needs to | |||
specify whether smaller integers are also expressed using these tags | specify whether smaller integers are also expressed using these tags | |||
or using major types 0 and 1. Preferred serialization uses the | or using major types 0 and 1. Preferred serialization uses the | |||
latter choice, which is therefore recommended. | latter choice, which is therefore recommended. | |||
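The recommended rule can be stated as a range check. The following Python sketch uses a hypothetical function name and returns a description of the representation that preferred serialization would select for a given integer.

```python
def preferred_integer_form(value: int) -> str:
    """Name the representation preferred serialization uses for value."""
    if 0 <= value < 2**64:
        return "major type 0"
    if -2**64 <= value < 0:
        return "major type 1"
    # Only integers outside the basic range fall back to bignums.
    return "tag 2" if value > 0 else "tag 3"
```

A deterministic encoding that follows this rule never emits a bignum for a value with an absolute value small enough for major types 0 and 1.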
Protocols that include floating-point values, whether represented | Protocols that include floating-point values, whether represented | |||
using basic floating-point values (Section 3.3) or using tags (or | using basic floating-point values (Section 3.3) or using tags (or | |||
both), may need to define extra requirements on their deterministic | both), may need to define extra requirements on their deterministic | |||
encodings, such as: | encodings, such as: | |||
skipping to change at line 1651 ¶ | skipping to change at line 1667 ¶ | |||
5.3. Validity of Items | 5.3. Validity of Items | |||
A well-formed but invalid CBOR data item (Section 1.2) presents a | A well-formed but invalid CBOR data item (Section 1.2) presents a | |||
problem with interpreting the data encoded in it in the CBOR data | problem with interpreting the data encoded in it in the CBOR data | |||
model. A CBOR-based protocol could be specified in several layers, | model. A CBOR-based protocol could be specified in several layers, | |||
in which the lower layers don't process the semantics of some of the | in which the lower layers don't process the semantics of some of the | |||
CBOR data they forward. These layers can't notice any validity | CBOR data they forward. These layers can't notice any validity | |||
errors in data they don't process and MUST forward that data as-is. | errors in data they don't process and MUST forward that data as-is. | |||
The first layer that does process the semantics of an invalid CBOR | The first layer that does process the semantics of an invalid CBOR | |||
item MUST make one of two choices: | item MUST pick one of two choices: | |||
1. Replace the problematic item with an error marker and continue | 1. Replace the problematic item with an error marker and continue | |||
with the next item, or | with the next item, or | |||
2. Issue an error and stop processing altogether. | 2. Issue an error and stop processing altogether. | |||
A CBOR-based protocol MUST specify which of these options its | A CBOR-based protocol MUST specify which of these options its | |||
decoders take for each kind of invalid item they might encounter. | decoders take for each kind of invalid item they might encounter. | |||
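One way a processing layer might expose the two choices is as an explicit policy. The Python sketch below is an illustration only; InvalidItemPolicy, ERROR_MARKER, and process_items are hypothetical names, and the validity check is supplied by the caller.

```python
from enum import Enum
from typing import Any, Callable, Iterable, List


class InvalidItemPolicy(Enum):
    REPLACE = "replace"  # choice 1: substitute an error marker and continue
    ABORT = "abort"      # choice 2: issue an error and stop processing


ERROR_MARKER = object()  # stand-in for whatever marker the protocol defines


def process_items(items: Iterable[Any],
                  is_valid: Callable[[Any], bool],
                  policy: InvalidItemPolicy) -> List[Any]:
    """Apply the protocol's chosen policy to each invalid item."""
    result = []
    for item in items:
        if is_valid(item):
            result.append(item)
        elif policy is InvalidItemPolicy.REPLACE:
            result.append(ERROR_MARKER)
        else:
            raise ValueError(f"invalid item: {item!r}")
    return result
```

A protocol could also select the policy per kind of invalid item rather than globally, as the requirement is stated for each kind of invalid item.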
Such problems might occur at the basic validity level of CBOR or in | Such problems might occur at the basic validity level of CBOR or in | |||
skipping to change at line 1690 ¶ | skipping to change at line 1706 ¶ | |||
that the sequence of bytes in a UTF-8 string (major type 3) is | that the sequence of bytes in a UTF-8 string (major type 3) is | |||
actually valid UTF-8 and react appropriately. | actually valid UTF-8 and react appropriately. | |||
5.3.2. Tag validity | 5.3.2. Tag validity | |||
Two additional kinds of validity errors are introduced by adding tags | Two additional kinds of validity errors are introduced by adding tags | |||
to the basic generic data model: | to the basic generic data model: | |||
Inadmissible type for tag content: Tag numbers (Section 3.4) specify | Inadmissible type for tag content: Tag numbers (Section 3.4) specify | |||
what type of data item is supposed to be used as their tag | what type of data item is supposed to be used as their tag | |||
content; for example, the tag numbers for positive or negative | content; for example, the tag numbers for unsigned or negative | |||
bignums are supposed to be put on byte strings. A decoder that | bignums are supposed to be put on byte strings. A decoder that | |||
decodes the tagged data item into a native representation (a | decodes the tagged data item into a native representation (a | |||
native big integer in this example) is expected to check the type | native big integer in this example) is expected to check the type | |||
of the data item being tagged. Even decoders that don't have such | of the data item being tagged. Even decoders that don't have such | |||
native representations available in their environment may perform | native representations available in their environment may perform | |||
the check on those tags known to them and react appropriately. | the check on those tags known to them and react appropriately. | |||
Inadmissible value for tag content: The type of data item may be | Inadmissible value for tag content: The type of data item may be | |||
admissible for a tag's content, but the specific value may not be; | admissible for a tag's content, but the specific value may not be; | |||
e.g., a value of "yesterday" is not acceptable for the content of | e.g., a value of "yesterday" is not acceptable for the content of | |||
skipping to change at line 1729 ¶ | skipping to change at line 1745 ¶ | |||
can do one of two things when it encounters such a case that it does | can do one of two things when it encounters such a case that it does | |||
not recognize: | not recognize: | |||
* It can report an error (and not return data). Note that treating | * It can report an error (and not return data). Note that treating | |||
this case as an error can cause ossification and is thus not | this case as an error can cause ossification and is thus not | |||
encouraged. This error is not a validity error, per se. This | encouraged. This error is not a validity error, per se. This | |||
kind of error is more likely to be raised by a decoder that would | kind of error is more likely to be raised by a decoder that would | |||
be performing validity checking if this were a known case. | be performing validity checking if this were a known case. | |||
* It can emit the unknown item (type, value, and, for tags, the | * It can emit the unknown item (type, value, and, for tags, the | |||
decoded tagged data item) to the application calling the decoder | decoded tagged data item) to the application calling the decoder, | |||
with an indication that the decoder did not recognize that tag | and then give the application an indication that the decoder did | |||
number or simple value. | not recognize that tag number or simple value. | |||
The latter approach, which is also appropriate for decoders that do | The latter approach, which is also appropriate for decoders that do | |||
not support validity checking, provides forward compatibility with | not support validity checking, provides forward compatibility with | |||
newly registered tags and simple values without the requirement to | newly registered tags and simple values without the requirement to | |||
update the encoder at the same time as the calling application. (For | update the encoder at the same time as the calling application. (For | |||
this, the decoder's API needs the ability to mark unknown items so | this, the decoder's API needs the ability to mark unknown items so | |||
that the calling application can handle them in a manner appropriate | that the calling application can handle them in a manner appropriate | |||
for the program.) | for the program.) | |||
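A decoder API with this ability might wrap unrecognized tags in a distinct type that the application can test for. The following Python sketch assumes a hypothetical handler table keyed by tag number; UnknownTag and apply_tag are illustrative names, not part of any particular library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class UnknownTag:
    """Marks a tag the decoder did not recognize, keeping its content."""
    tag_number: int
    content: Any


def apply_tag(tag_number: int, content: Any,
              handlers: Dict[int, Callable[[Any], Any]]) -> Any:
    """Convert known tags; pass unknown ones through, clearly marked."""
    handler = handlers.get(tag_number)
    if handler is None:
        return UnknownTag(tag_number, content)
    return handler(content)
```

The application can then check isinstance(item, UnknownTag) and decide how to proceed, which keeps newly registered tags from breaking an older decoder.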
Since some of the processing needed for validity checking may have an | Since some of the processing needed for validity checking may have an | |||
skipping to change at line 1762 ¶ | skipping to change at line 1778 ¶ | |||
5.5. Numbers | 5.5. Numbers | |||
CBOR-based protocols should take into account that different language | CBOR-based protocols should take into account that different language | |||
environments pose different restrictions on the range and precision | environments pose different restrictions on the range and precision | |||
of numbers that are representable. For example, the basic JavaScript | of numbers that are representable. For example, the basic JavaScript | |||
number system treats all numbers as floating-point values, which may | number system treats all numbers as floating-point values, which may | |||
result in the silent loss of precision in decoding integers with more | result in the silent loss of precision in decoding integers with more | |||
than 53 significant bits. Another example is that, since CBOR keeps | than 53 significant bits. Another example is that, since CBOR keeps | |||
the sign bit for its integer representation in the major type, it has | the sign bit for its integer representation in the major type, it has | |||
one bit more for signed numbers of a certain length (e.g., | one bit more for signed numbers of a certain length (e.g., | |||
-2**64..2**64-1 for 1+8-byte integers) than the typical platform | -2^(64)..2^(64)-1 for 1+8-byte integers) than the typical platform | |||
signed integer representation of the same length (-2**63..2**63-1 for | signed integer representation of the same length (-2^(63)..2^(63)-1 | |||
8-byte int64_t). A protocol that uses numbers should define its | for 8-byte int64_t). A protocol that uses numbers should define its | |||
expectations on the handling of nontrivial numbers in decoders and | expectations on the handling of nontrivial numbers in decoders and | |||
receiving applications. | receiving applications. | |||
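The ranges mentioned above can be checked directly. The Python sketch below uses hypothetical helper names; exact_in_binary64 is intended for integers within the basic CBOR range and tests whether a value survives a round trip through a binary64 float, as happens in a JavaScript-style number system.

```python
def fits_cbor_basic_integer(value: int) -> bool:
    """True if value fits major type 0 or 1 (1+8-byte form at most)."""
    return -2**64 <= value <= 2**64 - 1


def fits_int64(value: int) -> bool:
    """True if value fits a typical 8-byte signed platform integer."""
    return -2**63 <= value <= 2**63 - 1


def exact_in_binary64(value: int) -> bool:
    """True if converting value to a binary64 float loses no precision."""
    return int(float(value)) == value
```

For example, 2**63 is a valid basic CBOR integer that does not fit int64_t, and 2**53 + 1 is the smallest positive integer that is not exact in binary64.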
A CBOR-based protocol that includes floating-point numbers can | A CBOR-based protocol that includes floating-point numbers can | |||
restrict which of the three formats (half-precision, single- | restrict which of the three formats (half-precision, single- | |||
precision, and double-precision) are to be supported. For an | precision, and double-precision) are to be supported. For an | |||
integer-only application, a protocol may want to completely exclude | integer-only application, a protocol may want to completely exclude | |||
the use of floating-point values. | the use of floating-point values. | |||
A CBOR-based protocol designed for compactness may want to exclude | A CBOR-based protocol designed for compactness may want to exclude | |||
skipping to change at line 1869 ¶ | skipping to change at line 1885 ¶ | |||
order in a map changes the semantics, except to specify that some | order in a map changes the semantics, except to specify that some | |||
orders are disallowed, for example, where they would not meet the | orders are disallowed, for example, where they would not meet the | |||
requirements of a deterministic encoding (Section 4.2). (Any | requirements of a deterministic encoding (Section 4.2). (Any | |||
secondary effects of map ordering such as on timing, cache usage, and | secondary effects of map ordering such as on timing, cache usage, and | |||
other potential side channels are not considered part of the | other potential side channels are not considered part of the | |||
semantics but may be enough reason on their own for a protocol to | semantics but may be enough reason on their own for a protocol to | |||
require a deterministic encoding format.) | require a deterministic encoding format.) | |||
Applications for constrained devices should consider using small | Applications for constrained devices should consider using small | |||
integers as keys if they have maps with a small number of frequently | integers as keys if they have maps with a small number of frequently | |||
used and identifiable keys; for instance, a set of 24 or fewer keys | used keys; for instance, a set of 24 or fewer keys can be encoded in | |||
can be encoded in a single byte as unsigned integers, up to 48 if | a single byte as unsigned integers, up to 48 if negative integers are | |||
negative integers are also used. Less frequently occurring keys can | also used. Less frequently occurring keys can then use integers with | |||
then use integers with longer encodings. | longer encodings. | |||
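The key counts above follow from the sizes of the integer encodings. As a rough sketch in Python (the function name is illustrative), the encoded length of an integer key under shortest-form encoding can be computed as follows.

```python
def encoded_integer_length(value: int) -> int:
    """Bytes needed for the shortest encoding of value (major type 0 or 1)."""
    n = value if value >= 0 else -1 - value  # argument carried in the head
    if n < 24:
        return 1  # argument fits in the initial byte
    if n < 2**8:
        return 2
    if n < 2**16:
        return 3
    if n < 2**32:
        return 5
    if n < 2**64:
        return 9
    raise ValueError("value needs a bignum")


# Keys 0..23 and -24..-1 each take a single byte: 48 keys in total.
single_byte_keys = [k for k in range(-30, 30) if encoded_integer_length(k) == 1]
```

Assigning the most frequently used keys to the range -24..23 therefore minimizes the per-entry overhead in a map.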
5.6.1. Equivalence of Keys | 5.6.1. Equivalence of Keys | |||
The specific data model that applies to a CBOR data item is used to | The specific data model that applies to a CBOR data item is used to | |||
determine whether keys occurring in maps are duplicates or distinct. | determine whether keys occurring in maps are duplicates or distinct. | |||
At the generic data model level, numerically equivalent integer and | At the generic data model level, numerically equivalent integer and | |||
floating-point values are distinct from each other, as they are from | floating-point values are distinct from each other, as they are from | |||
the various big numbers (Tags 2 to 5). Similarly, text strings are | the various big numbers (Tags 2 to 5). Similarly, text strings are | |||
distinct from byte strings, even if composed of the same bytes. A | distinct from byte strings, even if composed of the same bytes. A | |||
skipping to change at line 1920 ¶ | skipping to change at line 1936 ¶ | |||
decoder may deliver a decoded map to an application that needs to be | decoder may deliver a decoded map to an application that needs to be | |||
checked for duplicate map keys by that application (alternatively, | checked for duplicate map keys by that application (alternatively, | |||
the decoder may provide a programming interface to perform this | the decoder may provide a programming interface to perform this | |||
service for the application). Specific data models are not able to | service for the application). Specific data models are not able to | |||
distinguish values for map keys that are equal for this purpose at | distinguish values for map keys that are equal for this purpose at | |||
the generic data model level. | the generic data model level. | |||
5.7. Undefined Values | 5.7. Undefined Values | |||
In some CBOR-based protocols, the simple value (Section 3.3) of | In some CBOR-based protocols, the simple value (Section 3.3) of | |||
Undefined might be used by an encoder as a substitute for a data item | "undefined" might be used by an encoder as a substitute for a data | |||
with an encoding problem, in order to allow the rest of the enclosing | item with an encoding problem, in order to allow the rest of the | |||
data items to be encoded without harm. | enclosing data items to be encoded without harm. | |||
6. Converting Data between CBOR and JSON | 6. Converting Data between CBOR and JSON | |||
This section gives non-normative advice about converting between CBOR | This section gives non-normative advice about converting between CBOR | |||
and JSON. Implementations of converters MAY use whichever advice | and JSON. Implementations of converters MAY use whichever advice | |||
here they want. | here they want. | |||
It is worth noting that a JSON text is a sequence of characters, not | It is worth noting that a JSON text is a sequence of characters, not | |||
an encoded sequence of bytes, while a CBOR data item consists of | an encoded sequence of bytes, while a CBOR data item consists of | |||
bytes, not characters. | bytes, not characters. | |||
skipping to change at line 2018 ¶ | skipping to change at line 2034 ¶ | |||
All JSON values, once decoded, directly map into one or more CBOR | All JSON values, once decoded, directly map into one or more CBOR | |||
values. As with any kind of CBOR generation, decisions have to be | values. As with any kind of CBOR generation, decisions have to be | |||
made with respect to number representation. In a suggested | made with respect to number representation. In a suggested | |||
conversion: | conversion: | |||
* JSON numbers without fractional parts (integer numbers) are | * JSON numbers without fractional parts (integer numbers) are | |||
represented as integers (major types 0 and 1, possibly major type | represented as integers (major types 0 and 1, possibly major type | |||
6, tag number 2 and 3), choosing the shortest form; integers | 6, tag number 2 and 3), choosing the shortest form; integers | |||
longer than an implementation-defined threshold may instead be | longer than an implementation-defined threshold may instead be | |||
represented as floating-point values. The default range that is | represented as floating-point values. The default range that is | |||
represented as integer is -2**53+1..2**53-1 (fully exploiting the | represented as integer is -2^(53)+1..2^(53)-1 (fully exploiting | |||
range for exact integers in the binary64 representation often used | the range for exact integers in the binary64 representation often | |||
for decoding JSON [RFC7493]). A CBOR-based protocol, or a generic | used for decoding JSON [RFC7493]). A CBOR-based protocol, or a | |||
converter implementation, may choose -2**32..2**32-1 or | generic converter implementation, may choose -2^(32)..2^(32)-1 or | |||
-2**64..2**64-1 (fully using the integer ranges available in CBOR | -2^(64)..2^(64)-1 (fully using the integer ranges available in | |||
with uint32_t or uint64_t, respectively) or even -2**31..2**31-1 | CBOR with uint32_t or uint64_t, respectively) or even | |||
or -2**63..2**63-1 (using popular ranges for two's complement | -2^(31)..2^(31)-1 or -2^(63)..2^(63)-1 (using popular ranges for | |||
signed integers). (If the JSON was generated from a JavaScript | two's complement signed integers). (If the JSON was generated | |||
implementation, its precision is already limited to 53 bits | from a JavaScript implementation, its precision is already limited | |||
maximum.) | to 53 bits maximum.) | |||
* Numbers with fractional parts are represented as floating-point | * Numbers with fractional parts are represented as floating-point | |||
values, performing the decimal-to-binary conversion based on the | values, performing the decimal-to-binary conversion based on the | |||
precision provided by IEEE 754 binary64. The mathematical value | precision provided by IEEE 754 binary64. The mathematical value | |||
of the JSON number is converted to binary64 using the | of the JSON number is converted to binary64 using the | |||
roundTiesToEven procedure in Section 4.3.1 of [IEEE754]. Then, | roundTiesToEven procedure in Section 4.3.1 of [IEEE754]. Then, | |||
when encoding in CBOR, the preferred serialization uses the | when encoding in CBOR, the preferred serialization uses the | |||
shortest floating-point representation exactly representing this | shortest floating-point representation exactly representing this | |||
conversion result; for instance, 1.5 is represented in a 16-bit | conversion result; for instance, 1.5 is represented in a 16-bit | |||
floating-point value (not all implementations will be capable of | floating-point value (not all implementations will be capable of | |||
skipping to change at line 2134 ¶ | skipping to change at line 2150 ¶ | |||
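The suggested JSON-to-CBOR number conversion from Section 6.2 can be sketched roughly as follows. This is a minimal illustration, not a definitive converter: the names encode_head and json_number_to_cbor are hypothetical, a number counts as "without fractional part" by its mathematical value (so "1.0" is treated as an integer, which is an assumption), and integers outside the chosen range fall back to floating point rather than to bignums (tags 2 and 3).

```python
import struct
from decimal import Decimal

# Default integer range suggested in the text: -2^53+1 .. 2^53-1.
DEFAULT_INT_RANGE = (-(2**53) + 1, 2**53 - 1)


def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a major type and its argument in the shortest form (up to 64 bits)."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            head = bytes([(major_type << 5) | additional_info])
            return head + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def json_number_to_cbor(token: str, int_range=DEFAULT_INT_RANGE) -> bytes:
    """Convert one JSON number token to CBOR bytes (hypothetical helper)."""
    value = Decimal(token)
    low, high = int_range
    # Assumption: "integer" is decided by mathematical value, not by spelling.
    if value == value.to_integral_value() and low <= value <= high:
        n = int(value)
        return encode_head(0, n) if n >= 0 else encode_head(1, -1 - n)
    # Everything else is converted to binary64 first; float() in CPython
    # rounds a decimal string to the nearest binary64 value.
    f = float(token)
    # Preferred serialization: shortest float format that round-trips exactly.
    for fmt, initial_byte in ((">e", 0xF9), (">f", 0xFA)):
        try:
            packed = struct.pack(fmt, f)
        except OverflowError:
            continue  # value too large for this width; try the next one
        if struct.unpack(fmt, packed)[0] == f:
            return bytes([initial_byte]) + packed
    return bytes([0xFB]) + struct.pack(">d", f)
```

A protocol that prefers one of the other ranges named in the text can pass it explicitly, for example int_range=(-(2**63), 2**63 - 1).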
The human mind is sometimes drawn to filling in little perceived gaps | The human mind is sometimes drawn to filling in little perceived gaps | |||
to make something neat. We expect the remaining gaps in the | to make something neat. We expect the remaining gaps in the | |||
codepoint space for the additional information values to be an | codepoint space for the additional information values to be an | |||
attractor for new ideas, just because they are there. | attractor for new ideas, just because they are there. | |||
The present specification does not manage the additional information | The present specification does not manage the additional information | |||
codepoint space by an IANA registry. Instead, allocations out of | codepoint space by an IANA registry. Instead, allocations out of | |||
this space can only be done by updating this specification. | this space can only be done by updating this specification. | |||
For an additional information value of n >= 24, the size of the | For an additional information value of n >= 24, the size of the | |||
additional data typically is 2**(n-24) bytes. Therefore, additional | additional data typically is 2^(n-24) bytes. Therefore, additional | |||
information values 28 and 29 should be viewed as candidates for | information values 28 and 29 should be viewed as candidates for | |||
128-bit and 256-bit quantities, in case a need arises to add them to | 128-bit and 256-bit quantities, in case a need arises to add them to | |||
the protocol. Additional information value 30 is then the only | the protocol. Additional information value 30 is then the only | |||
additional information value available for general allocation, and | additional information value available for general allocation, and | |||
there should be a very good reason for allocating it before assigning | there should be a very good reason for allocating it before assigning | |||
it through an update of the present specification. | it through an update of the present specification. | |||
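The size rule above can be illustrated with a short sketch. The function names are hypothetical; argument_size covers only the additional information values that are defined today (0 to 27), while projected_bits merely applies the 2^(n-24) pattern to show why 28 and 29 are the natural candidates for 128-bit and 256-bit quantities.

```python
def argument_size(additional_info: int) -> int:
    """Bytes of additional data following the initial byte (values 0..27)."""
    if not 0 <= additional_info <= 31:
        raise ValueError("additional information is a 5-bit value")
    if additional_info < 24:
        return 0  # the argument is the additional information value itself
    if additional_info <= 27:
        return 2 ** (additional_info - 24)  # 1, 2, 4, or 8 bytes
    # 28..30 are reserved; 31 is used for indefinite lengths and "break".
    raise ValueError("no additional data size is defined for this value")


def projected_bits(additional_info: int) -> int:
    """Width in bits that the 2^(n-24) pattern would give for n >= 24."""
    if additional_info < 24:
        raise ValueError("pattern only applies to n >= 24")
    return 8 * 2 ** (additional_info - 24)


if __name__ == "__main__":
    for n in range(24, 30):
        print(n, projected_bits(n))
```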
8. Diagnostic Notation | 8. Diagnostic Notation | |||
CBOR is a binary interchange format. To facilitate documentation and | CBOR is a binary interchange format. To facilitate documentation and | |||
skipping to change at line 2237 ¶ | skipping to change at line 2253 ¶ | |||
or a text string (0x7fff) is meant and is therefore not used. The | or a text string (0x7fff) is meant and is therefore not used. The | |||
basic forms ''_ and ""_ can be used instead and are reserved for the | basic forms ''_ and ""_ can be used instead and are reserved for the | |||
case of no chunks only -- not as short forms for the (permitted, but | case of no chunks only -- not as short forms for the (permitted, but | |||
not really useful) encodings with only empty chunks, which need to be | not really useful) encodings with only empty chunks, which need to be | |||
notated as (_ ''), (_ ""), etc., to preserve the chunk structure. | notated as (_ ''), (_ ""), etc., to preserve the chunk structure. | |||
9. IANA Considerations | 9. IANA Considerations | |||
IANA has created two registries for new CBOR values. The registries | IANA has created two registries for new CBOR values. The registries | |||
are separate, that is, not under an umbrella registry, and follow the | are separate, that is, not under an umbrella registry, and follow the | |||
rules in [RFC8126]. IANA has also assigned a new media type and an | rules in [RFC8126]. IANA has also assigned a new media type, an | |||
associated CoAP Content-Format entry. | associated CoAP Content-Format entry, and a structured syntax suffix. | |||
9.1. Simple Values Registry | 9.1. CBOR Simple Values Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Simple Values" registry at [IANA.cbor-simple-values]. The initial | Simple Values" registry at [IANA.cbor-simple-values]. The initial | |||
values are shown in Table 4. | values are shown in Table 4. | |||
New entries in the range 0 to 19 are assigned by Standards Action | New entries in the range 0 to 19 are assigned by Standards Action | |||
[RFC8126]. It is suggested that IANA allocate values starting with | [RFC8126]. It is suggested that IANA allocate values starting with | |||
the number 16 in order to reserve the lower numbers for contiguous | the number 16 in order to reserve the lower numbers for contiguous | |||
blocks (if any). | blocks (if any). | |||
New entries in the range 32 to 255 are assigned by Specification | New entries in the range 32 to 255 are assigned by Specification | |||
Required. | Required. | |||
9.2. Tags Registry | 9.2. CBOR Tags Registry | |||
IANA has created the "Concise Binary Object Representation (CBOR) | IANA has created the "Concise Binary Object Representation (CBOR) | |||
Tags" registry at [IANA.cbor-tags]. The tags that were defined in | Tags" registry at [IANA.cbor-tags]. The tags that were defined in | |||
[RFC7049] are described in detail in Section 3.4, and other tags have | [RFC7049] are described in detail in Section 3.4, and other tags have | |||
already been defined since then. | already been defined since then. | |||
New entries in the range 0 to 23 ("1+0") are assigned by Standards | New entries in the range 0 to 23 ("1+0") are assigned by Standards | |||
Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | Action. New entries in the ranges 24 to 255 ("1+1") and 256 to 32767 | |||
(lower half of "1+2") are assigned by Specification Required. New | (lower half of "1+2") are assigned by Specification Required. New | |||
entries in the range 32768 to 18446744073709551615 (upper half of | entries in the range 32768 to 18446744073709551615 (upper half of | |||
skipping to change at line 2286 ¶ | skipping to change at line 2302 ¶ | |||
* Description of semantics (URL) -- This description is optional; | * Description of semantics (URL) -- This description is optional; | |||
the URL can point to something like an Internet-Draft or a web | the URL can point to something like an Internet-Draft or a web | |||
page. | page. | |||
Applicants exercising the First Come First Served range and making a | Applicants exercising the First Come First Served range and making a | |||
suggestion for a tag number that is not representable in 32 bits | suggestion for a tag number that is not representable in 32 bits | |||
(i.e., larger than 4294967295) should be aware that this could reduce | (i.e., larger than 4294967295) should be aware that this could reduce | |||
interoperability with implementations that do not support 64-bit | interoperability with implementations that do not support 64-bit | |||
numbers. | numbers. | |||
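As an illustration of the tag number ranges in this section, the following sketch maps a tag number to its registration policy and flags tag numbers that need 64-bit support. The function names are hypothetical. The policy for the upper range (32768 and above) is assumed to be First Come First Served, based on the reference to that range in the paragraph above.

```python
MAX_TAG_NUMBER = 18446744073709551615  # 2^64 - 1


def tag_registration_policy(tag_number: int) -> str:
    """Return the registration policy that applies to a new tag number."""
    if not 0 <= tag_number <= MAX_TAG_NUMBER:
        raise ValueError("tag number out of range")
    if tag_number <= 23:  # "1+0" encodings
        return "Standards Action"
    if tag_number <= 32767:  # "1+1" and lower half of "1+2"
        return "Specification Required"
    # Assumption: upper half of "1+2" and all longer encodings.
    return "First Come First Served"


def fits_in_32_bits(tag_number: int) -> bool:
    """True if the tag number can be handled without 64-bit support."""
    return 0 <= tag_number <= 4294967295


if __name__ == "__main__":
    for tag in (2, 24, 32767, 32768, 4294967296):
        print(tag, tag_registration_policy(tag), fits_in_32_bits(tag))
```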
9.3. Media Type | 9.3. Media Types Registry | |||
The Internet media type [RFC6838] for a single encoded CBOR data item | The Internet media type [RFC6838] ("MIME type") for a single encoded | |||
is "application/cbor" as defined in the "Media Types" registry | CBOR data item is "application/cbor" as defined in the "Media Types" | |||
[IANA.media-types]: | registry [IANA.media-types]: | |||
Type name: application | Type name: application | |||
Subtype name: cbor | Subtype name: cbor | |||
Required parameters: n/a | Required parameters: n/a | |||
Optional parameters: n/a | Optional parameters: n/a | |||
Encoding considerations: Binary | Encoding considerations: Binary | |||
skipping to change at line 2328 ¶ | skipping to change at line 2344 ¶ | |||
Area (art@ietf.org) | Area (art@ietf.org) | |||
Intended usage: COMMON | Intended usage: COMMON | |||
Restrictions on usage: none | Restrictions on usage: none | |||
Author: IETF CBOR Working Group (cbor@ietf.org) | Author: IETF CBOR Working Group (cbor@ietf.org) | |||
Change controller: The IESG (iesg@ietf.org) | Change controller: The IESG (iesg@ietf.org) | |||
9.4. CoAP Content-Format | 9.4. CoAP Content-Format Registry | |||
The CoAP Content-Format for CBOR has been registered in the "CoAP | The CoAP Content-Format for CBOR has been registered in the "CoAP | |||
Content-Formats" subregistry within the "Constrained RESTful | Content-Formats" subregistry within the "Constrained RESTful | |||
Environments (CoRE) Parameters" registry [IANA.core-parameters]: | Environments (CoRE) Parameters" registry [IANA.core-parameters]: | |||
Media Type: application/cbor | Media Type: application/cbor | |||
Encoding: - | Encoding: - | |||
ID: 60 | ID: 60 | |||
Reference: RFC 8949 | Reference: RFC 8949 | |||
9.5. The +cbor Structured Syntax Suffix Registration | 9.5. Structured Syntax Suffix Registry | |||
The structured syntax suffix [RFC6838] for media types based on a | The structured syntax suffix [RFC6838] for media types based on a | |||
single encoded CBOR data item is +cbor, which IANA has registered in | single encoded CBOR data item is +cbor, which IANA has registered in | |||
the "Structured Syntax Suffixes" registry [IANA.structured-suffix]: | the "Structured Syntax Suffixes" registry [IANA.structured-suffix]: | |||
Name: Concise Binary Object Representation (CBOR) | Name: Concise Binary Object Representation (CBOR) | |||
+suffix: +cbor | +suffix: +cbor | |||
References: RFC 8949 | References: RFC 8949 | |||
skipping to change at line 2524 ¶ | skipping to change at line 2540 ¶ | |||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[C] International Organization for Standardization, | [C] International Organization for Standardization, | |||
"Information technology - Programming languages - C", | "Information technology - Programming languages - C", | |||
Fourth Edition, ISO/IEC 9899:2018, June 2018, | Fourth Edition, ISO/IEC 9899:2018, June 2018, | |||
<https://www.iso.org/standard/74528.html>. | <https://www.iso.org/standard/74528.html>. | |||
[Cplusplus17] | [Cplusplus20] | |||
International Organization for Standardization, | International Organization for Standardization, | |||
"Programming languages - C++", Fifth Edition, ISO/ | "Programming languages - C++", Sixth Edition, ISO/IEC DIS | |||
IEC 14882:2017, December 2017, | 14882, ISO/IEC ISO/IEC JTC1 SC22 WG21 N 4860, March 2020, | |||
<https://www.iso.org/standard/68564.html>. | <https://isocpp.org/files/papers/N4860.pdf>. | |||
[IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE | |||
Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, | Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, | |||
<https://ieeexplore.ieee.org/document/8766229>. | <https://ieeexplore.ieee.org/document/8766229>. | |||
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
Extensions (MIME) Part One: Format of Internet Message | Extensions (MIME) Part One: Format of Internet Message | |||
Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996, | |||
<https://www.rfc-editor.org/info/rfc2045>. | <https://www.rfc-editor.org/info/rfc2045>. | |||
skipping to change at line 2577 ¶ | skipping to change at line 2593 ¶ | |||
RFC 8126, DOI 10.17487/RFC8126, June 2017, | RFC 8126, DOI 10.17487/RFC8126, June 2017, | |||
<https://www.rfc-editor.org/info/rfc8126>. | <https://www.rfc-editor.org/info/rfc8126>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
[TIME_T] The Open Group, "The Open Group Base Specifications", | [TIME_T] The Open Group, "The Open Group Base Specifications", | |||
Section 4.16, 'Seconds Since the Epoch', Issue 7, 2018 | Section 4.16, 'Seconds Since the Epoch', Issue 7, 2018 | |||
Edition, IEEE Std 1003.1, 2018, | Edition, IEEE Std 1003.1, 2018, | |||
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/ | <https://pubs.opengroup.org/onlinepubs/9699919799/ | |||
V1_chap04.html#tag_04_16>. | basedefs/V1_chap04.html#tag_04_16>. | |||
11.2. Informative References | 11.2. Informative References | |||
[ASN.1] International Telecommunication Union, "Information | [ASN.1] International Telecommunication Union, "Information | |||
Technology - ASN.1 encoding rules: Specification of Basic | Technology - ASN.1 encoding rules: Specification of Basic | |||
Encoding Rules (BER), Canonical Encoding Rules (CER) and | Encoding Rules (BER), Canonical Encoding Rules (CER) and | |||
Distinguished Encoding Rules (DER)", ITU-T Recommendation | Distinguished Encoding Rules (DER)", ITU-T Recommendation | |||
X.690, 1994. | X.690, 2015, | |||
<https://www.itu.int/rec/T-REC-X.690-201508-I/en>. | ||||
[BSON] Various, "BSON - Binary JSON", 2013, | [BSON] Various, "BSON - Binary JSON", <http://bsonspec.org/>. | |||
<http://bsonspec.org/>. | ||||
[CBOR-TAGS] | [CBOR-TAGS] | |||
Bormann, C., "Notable CBOR Tags", Work in Progress, | Bormann, C., "Notable CBOR Tags", Work in Progress, | |||
Internet-Draft, draft-bormann-cbor-notable-tags-02, 25 | Internet-Draft, draft-bormann-cbor-notable-tags-02, 25 | |||
June 2020, <https://tools.ietf.org/html/draft-bormann- | June 2020, <https://tools.ietf.org/html/draft-bormann- | |||
cbor-notable-tags-02>. | cbor-notable-tags-02>. | |||
[ECMA262] Ecma International, "ECMAScript 2018 Language | [ECMA262] Ecma International, "ECMAScript 2020 Language | |||
Specification", Standard ECMA-262, 9th Edition, June 2018, | Specification", Standard ECMA-262, 11th Edition, June | |||
<https://www.ecma-international.org/publications/files/ | 2020, <https://www.ecma- | |||
ECMA-ST/Ecma-262.pdf>. | international.org/publications/standards/Ecma-262.htm>. | |||
[Err3764] RFC Errata, Erratum ID 3764, RFC 7049, | [Err3764] RFC Errata, Erratum ID 3764, RFC 7049, | |||
<https://www.rfc-editor.org/errata/eid3764>. | <https://www.rfc-editor.org/errata/eid3764>. | |||
[Err3770] RFC Errata, Erratum ID 3770, RFC 7049, | [Err3770] RFC Errata, Erratum ID 3770, RFC 7049, | |||
<https://www.rfc-editor.org/errata/eid3770>. | <https://www.rfc-editor.org/errata/eid3770>. | |||
[Err4294] RFC Errata, Erratum ID 4294, RFC 7049, | [Err4294] RFC Errata, Erratum ID 4294, RFC 7049, | |||
<https://www.rfc-editor.org/errata/eid4294>. | <https://www.rfc-editor.org/errata/eid4294>. | |||
skipping to change at line 2653 ¶ | skipping to change at line 2669 ¶ | |||
[IANA.media-types] | [IANA.media-types] | |||
IANA, "Media Types", | IANA, "Media Types", | |||
<https://www.iana.org/assignments/media-types>. | <https://www.iana.org/assignments/media-types>. | |||
[IANA.structured-suffix] | [IANA.structured-suffix] | |||
IANA, "Structured Syntax Suffixes", | IANA, "Structured Syntax Suffixes", | |||
<https://www.iana.org/assignments/media-type-structured- | <https://www.iana.org/assignments/media-type-structured- | |||
suffix>. | suffix>. | |||
[MessagePack] | [MessagePack] | |||
Furuhashi, S., "MessagePack", 2013, | Furuhashi, S., "MessagePack", <https://msgpack.org/>. | |||
<https://msgpack.org/>. | ||||
[PCRE] Hazel, P., "PCRE - Perl Compatible Regular Expressions", | [PCRE] Hazel, P., "PCRE - Perl Compatible Regular Expressions", | |||
2018, <https://www.pcre.org/>. | <https://www.pcre.org/>. | |||
[RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | [RFC0713] Haverty, J., "MSDTP-Message Services Data Transmission | |||
Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | Protocol", RFC 713, DOI 10.17487/RFC0713, April 1976, | |||
<https://www.rfc-editor.org/info/rfc713>. | <https://www.rfc-editor.org/info/rfc713>. | |||
[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type | |||
Specifications and Registration Procedures", BCP 13, | Specifications and Registration Procedures", BCP 13, | |||
RFC 6838, DOI 10.17487/RFC6838, January 2013, | RFC 6838, DOI 10.17487/RFC6838, January 2013, | |||
<https://www.rfc-editor.org/info/rfc6838>. | <https://www.rfc-editor.org/info/rfc6838>. | |||
skipping to change at line 2718 ¶ | skipping to change at line 2733 ¶ | |||
<https://www.rfc-editor.org/info/rfc8746>. | <https://www.rfc-editor.org/info/rfc8746>. | |||
[SIPHASH_LNCS] | [SIPHASH_LNCS] | |||
Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | Aumasson, J. and D. Bernstein, "SipHash: A Fast Short- | |||
Input PRF", Progress in Cryptology - INDOCRYPT 2012, pp. | Input PRF", Progress in Cryptology - INDOCRYPT 2012, pp. | |||
489-508, DOI 10.1007/978-3-642-34931-7_28, 2012, | 489-508, DOI 10.1007/978-3-642-34931-7_28, 2012, | |||
<https://doi.org/10.1007/978-3-642-34931-7_28>. | <https://doi.org/10.1007/978-3-642-34931-7_28>. | |||
[SIPHASH_OPEN] | [SIPHASH_OPEN] | |||
Aumasson, J. and D.J. Bernstein, "SipHash: a fast short- | Aumasson, J. and D.J. Bernstein, "SipHash: a fast short- | |||
input PRF", <https://131002.net/siphash/siphash.pdf>. | input PRF", <https://www.aumasson.jp/siphash/siphash.pdf>. | |||
[YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | [YAML] Ben-Kiki, O., Evans, C., and I.d. Net, "YAML Ain't Markup | |||
Language (YAML[TM]) Version 1.2", 3rd Edition, October | Language (YAML[TM]) Version 1.2", 3rd Edition, October | |||
2009, <https://www.yaml.org/spec/1.2/spec.html>. | 2009, <https://www.yaml.org/spec/1.2/spec.html>. | |||
Appendix A. Examples of Encoded CBOR Data Items | Appendix A. Examples of Encoded CBOR Data Items | |||
The following table provides some CBOR-encoded values in hexadecimal | The following table provides some CBOR-encoded values in hexadecimal | |||
(right column), together with diagnostic notation for these values | (right column), together with diagnostic notation for these values | |||
(left column). Note that the string "\u00fc" is one form of | (left column). Note that the string "\u00fc" is one form of | |||
diagnostic notation for a UTF-8 string containing the single Unicode | diagnostic notation for a UTF-8 string containing the single Unicode | |||
character U+00FC (LATIN SMALL LETTER U WITH DIAERESIS, "ü"). | character U+00FC (LATIN SMALL LETTER U WITH DIAERESIS, "ü"). | |||
Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a | |||
single character U+6C34 U+000A U+0020 U+0020 U+0020 U+0020 U+0020 | single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, "水"), often | |||
U+0020 U+0020 U+0020 (CJK UNIFIED IDEOGRAPH-6C34, U+000a, SPACE, | ||||
SPACE, SPACE, SPACE, SPACE, SPACE, SPACE, SPACE, "水 "), often | ||||
representing "water", and "\ud800\udd51" is a UTF-8 string in | representing "water", and "\ud800\udd51" is a UTF-8 string in | |||
diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | diagnostic notation with a single character U+10151 (GREEK ACROPHONIC | |||
ATTIC FIFTY STATERS, "𐅑"). In the diagnostic notation provided for | ATTIC FIFTY STATERS, "𐅑"). (Note that all these single-character | |||
bignums, their intended numeric value is shown as a decimal number | strings could also be represented in native UTF-8 in diagnostic | |||
(such as 18446744073709551616) instead of a tagged byte string (such | notation, just not if an ASCII-only specification is required.) In | |||
as 2(h'010000000000000000')). | the diagnostic notation provided for bignums, their intended numeric | |||
value is shown as a decimal number (such as 18446744073709551616) | ||||
instead of a tagged byte string (such as 2(h'010000000000000000')). | ||||
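The relationship between the escaped forms above and the encoded bytes can be checked with a small sketch. The helper names are hypothetical, and encode_short_text_string handles only strings of at most 23 UTF-8 bytes, which is enough for the single-character examples discussed here.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair, as written in a \\u escape, into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_short_text_string(text: str) -> bytes:
    """Encode a text string (major type 3) whose UTF-8 form is 0..23 bytes."""
    data = text.encode("utf-8")
    if len(data) > 23:
        raise NotImplementedError("longer strings need a length argument")
    return bytes([0x60 | len(data)]) + data


if __name__ == "__main__":
    code_point = combine_surrogates(0xD800, 0xDD51)
    print(f"U+{code_point:04X}")
    for text in ("\u00fc", "\u6c34", chr(code_point)):
        print("0x" + encode_short_text_string(text).hex())
```

The initial byte carries the length in bytes of the UTF-8 encoding, not the number of characters, which is why the three single-character strings have different initial bytes.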
+==============================+====================================+ | +==============================+====================================+ | |||
|Diagnostic | Encoded | | |Diagnostic | Encoded | | |||
+==============================+====================================+ | +==============================+====================================+ | |||
|0 | 0x00 | | |0 | 0x00 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|1 | 0x01 | | |1 | 0x01 | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
|10 | 0x0a | | |10 | 0x0a | | |||
+------------------------------+------------------------------------+ | +------------------------------+------------------------------------+ | |||
skipping to change at line 2929 ¶ | skipping to change at line 2944 ¶ | |||
Appendix B. Jump Table for Initial Byte | Appendix B. Jump Table for Initial Byte | |||
For brevity, this jump table does not show initial bytes that are | For brevity, this jump table does not show initial bytes that are | |||
reserved for future extension. It also only shows a selection of the | reserved for future extension. It also only shows a selection of the | |||
initial bytes that can be used for optional features. (All unsigned | initial bytes that can be used for optional features. (All unsigned | |||
integers are in network byte order.) | integers are in network byte order.) | |||
+============+================================================+ | +============+================================================+ | |||
| Byte | Structure/Semantics | | | Byte | Structure/Semantics | | |||
+============+================================================+ | +============+================================================+ | |||
| 0x00..0x17 | Unsigned integer 0x00..0x17 (0..23) | | | 0x00..0x17 | unsigned integer 0x00..0x17 (0..23) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x18 | Unsigned integer (one-byte uint8_t follows) | | | 0x18 | unsigned integer (one-byte uint8_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x19 | Unsigned integer (two-byte uint16_t follows) | | | 0x19 | unsigned integer (two-byte uint16_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x1a | Unsigned integer (four-byte uint32_t follows) | | | 0x1a | unsigned integer (four-byte uint32_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x1b | Unsigned integer (eight-byte uint64_t follows) | | | 0x1b | unsigned integer (eight-byte uint64_t follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x20..0x37 | Negative integer -1-0x00..-1-0x17 (-1..-24) | | | 0x20..0x37 | negative integer -1-0x00..-1-0x17 (-1..-24) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x38 | Negative integer -1-n (one-byte uint8_t for n | | | 0x38 | negative integer -1-n (one-byte uint8_t for n | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x39 | Negative integer -1-n (two-byte uint16_t for n | | | 0x39 | negative integer -1-n (two-byte uint16_t for n | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x3a | Negative integer -1-n (four-byte uint32_t for | | | 0x3a | negative integer -1-n (four-byte uint32_t for | | |||
| | n follows) | | | | n follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x3b | Negative integer -1-n (eight-byte uint64_t for | | | 0x3b | negative integer -1-n (eight-byte uint64_t for | | |||
| | n follows) | | | | n follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x40..0x57 | byte string (0x00..0x17 bytes follow) | | | 0x40..0x57 | byte string (0x00..0x17 bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x58 | byte string (one-byte uint8_t for n, and then | | | 0x58 | byte string (one-byte uint8_t for n, and then | | |||
| | n bytes follow) | | | | n bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0x59 | byte string (two-byte uint16_t for n, and then | | | 0x59 | byte string (two-byte uint16_t for n, and then | | |||
| | n bytes follow) | | | | n bytes follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
skipping to change at line 3021 ¶ | skipping to change at line 3036 ¶ | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xba | map (four-byte uint32_t for n, and then n | | | 0xba | map (four-byte uint32_t for n, and then n | | |||
| | pairs of data items follow) | | | | pairs of data items follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xbb | map (eight-byte uint64_t for n, and then n | | | 0xbb | map (eight-byte uint64_t for n, and then n | | |||
| | pairs of data items follow) | | | | pairs of data items follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xbf | map, pairs of data items follow, terminated by | | | 0xbf | map, pairs of data items follow, terminated by | | |||
| | "break" | | | | "break" | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc0 | Text-based date/time (data item follows; see | | | 0xc0 | text-based date/time (data item follows; see | | |||
| | Section 3.4.1) | | | | Section 3.4.1) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc1 | Epoch-based date/time (data item follows; see | | | 0xc1 | epoch-based date/time (data item follows; see | | |||
| | Section 3.4.2) | | | | Section 3.4.2) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc2 | Positive bignum (data item "byte string" | | | 0xc2 | unsigned bignum (data item "byte string" | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc3 | Negative bignum (data item "byte string" | | | 0xc3 | negative bignum (data item "byte string" | | |||
| | follows) | | | | follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc4 | Decimal Fraction (data item "array" follows; | | | 0xc4 | decimal Fraction (data item "array" follows; | | |||
| | see Section 3.4.4) | | | | see Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc5 | Bigfloat (data item "array" follows; see | | | 0xc5 | bigfloat (data item "array" follows; see | | |||
| | Section 3.4.4) | | | | Section 3.4.4) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xc6..0xd4 | (tag) | | | 0xc6..0xd4 | (tag) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd5..0xd7 | Expected Conversion (data item follows; see | | | 0xd5..0xd7 | expected conversion (data item follows; see | | |||
| | Section 3.4.5.2) | | | | Section 3.4.5.2) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xd8..0xdb | (more tags; 1/2/4/8 bytes of tag number and | | | 0xd8..0xdb | (more tags; 1/2/4/8 bytes of tag number and | | |||
| | then a data item follow) | | | | then a data item follow) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xe0..0xf3 | (simple value) | | | 0xe0..0xf3 | (simple value) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf4 | False | | | 0xf4 | false | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf5 | True | | | 0xf5 | true | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf6 | Null | | | 0xf6 | null | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf7 | Undefined | | | 0xf7 | undefined | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf8 | (simple value, one byte follows) | | | 0xf8 | (simple value, one byte follows) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xf9 | Half-Precision Float (two-byte IEEE 754) | | | 0xf9 | half-precision float (two-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfa | Single-Precision Float (four-byte IEEE 754) | | | 0xfa | single-precision float (four-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xfb | Double-Precision Float (eight-byte IEEE 754) | | | 0xfb | double-precision float (eight-byte IEEE 754) | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
| 0xff | "break" stop code | | | 0xff | "break" stop code | | |||
+------------+------------------------------------------------+ | +------------+------------------------------------------------+ | |||
Table 7: Jump Table for Initial Byte | Table 7: Jump Table for Initial Byte | |||
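As an illustration of the rows shown above, the following is a minimal sketch (not part of either document version) of how a decoder might classify an initial byte in the 0xc0..0xff range. It assumes the usual CBOR split of the initial byte into a 3-bit major type and 5-bit additional information; the function name describe_initial_byte and the returned strings are hypothetical.

```python
def describe_initial_byte(b: int) -> str:
    """Classify a CBOR initial byte, with detail for major types 6 and 7."""
    if not 0 <= b <= 0xFF:
        raise ValueError("initial byte must be in 0..255")
    # High 3 bits: major type; low 5 bits: additional information.
    major, info = b >> 5, b & 0x1F
    if major == 6:  # tags, 0xc0..0xdf
        if info < 24:
            return f"tag {info}, data item follows"
        if info <= 27:
            # 24..27 select a 1/2/4/8-byte tag number before the data item.
            return f"tag, {1 << (info - 24)}-byte tag number follows"
        return "not well-formed"
    if major == 7:  # simple values and floats, 0xe0..0xff
        if info < 20:
            return f"simple value {info}"
        fixed = {
            20: "false",
            21: "true",
            22: "null",
            23: "undefined",
            24: "simple value, one byte follows",
            25: "half-precision float",
            26: "single-precision float",
            27: "double-precision float",
            31: '"break" stop code',
        }
        # 28..30 have no row in the jump table.
        return fixed.get(info, "not well-formed")
    # Major types 0..5 are covered by earlier rows of the table.
    return f"major type {major}, additional information {info}"
```

For example, describe_initial_byte(0xc4) reports tag 4 (the decimal fraction row), and describe_initial_byte(0xdb) reports a tag with an 8-byte tag number.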
Appendix C. Pseudocode | Appendix C. Pseudocode | |||
The well-formedness of a CBOR item can be checked by the pseudocode | The well-formedness of a CBOR item can be checked by the pseudocode | |||
in Figure 1. The data is well-formed if and only if: | in Figure 1. The data is well-formed if and only if: | |||
skipping to change at line 3504 ¶ | skipping to change at line 3519 ¶ | |||
by adding "Second value" to a comment to the last example in | by adding "Second value" to a comment to the last example in | |||
Section 3.2.2). | Section 3.2.2). | |||
Other clerical changes include: | Other clerical changes include: | |||
* the use of new xml2rfc functionality [RFC7991]; | * the use of new xml2rfc functionality [RFC7991]; | |||
* more explanation of the notation used; | * more explanation of the notation used; | |||
* the update of references, e.g., from RFC 4627 to [RFC8259], from | * the update of references, e.g., from RFC 4627 to [RFC8259], from | |||
CNN-TERMS to [RFC7228], and from the 5.1 edition to the 9th | CNN-TERMS to [RFC7228], and from the 5.1 edition to the 11th | |||
edition of [ECMA262]; the addition of a reference to [IEEE754] and | edition of [ECMA262]; the addition of a reference to [IEEE754] and | |||
importation of required definitions; and the addition of a | importation of required definitions; the addition of references to | |||
reference to [RFC8618] that further illustrates the discussion in | [C] and [Cplusplus20]; and the addition of a reference to | |||
Appendix E; | [RFC8618] that further illustrates the discussion in Appendix E; | |||
* in the discussion of diagnostic notation (Section 8), the | * in the discussion of diagnostic notation (Section 8), the | |||
"Extended Diagnostic Notation" (EDN) defined in [RFC8610] is now | "Extended Diagnostic Notation" (EDN) defined in [RFC8610] is now | |||
mentioned, the gap in representing NaN payloads is now | mentioned, the gap in representing NaN payloads is now | |||
highlighted, and an explanation of representing indefinite-length | highlighted, and an explanation of representing indefinite-length | |||
strings with no chunks has been added (Section 8.1); | strings with no chunks has been added (Section 8.1); | |||
* the addition of this appendix. | * the addition of this appendix. | |||
G.2. Changes in IANA Considerations | G.2. Changes in IANA Considerations | |||
skipping to change at line 3532 ¶ | skipping to change at line 3547 ¶ | |||
specification). References to the respective IANA registries were | specification). References to the respective IANA registries were | |||
added to the informative references. | added to the informative references. | |||
In the "Concise Binary Object Representation (CBOR) Tags" registry | In the "Concise Binary Object Representation (CBOR) Tags" registry | |||
[IANA.cbor-tags], tags in the space from 256 to 32767 (lower half of | [IANA.cbor-tags], tags in the space from 256 to 32767 (lower half of | |||
"1+2") are no longer assigned by First Come First Served; this range | "1+2") are no longer assigned by First Come First Served; this range | |||
is now Specification Required. | is now Specification Required. | |||
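The policy change described above can be sketched as a small lookup. This is an illustrative sketch only: the function name tag_registration_policy and the revised flag are hypothetical, and the policies for ranges other than 256 to 32767 (Standards Action for 0 to 23, Specification Required for 24 to 255, First Come First Served from 32768 upward) are assumptions taken from the registry rules rather than from the text shown here.

```python
def tag_registration_policy(tag: int, revised: bool = True) -> str:
    """Return the assumed registration policy for a CBOR tag number.

    revised=True models the registry after the change described above;
    revised=False models the earlier rules.
    """
    if not 0 <= tag <= 2**64 - 1:
        raise ValueError("tag number must fit in 64 bits")
    if tag <= 23:
        return "Standards Action"  # assumption, not stated in this excerpt
    if tag <= 255:
        return "Specification Required"  # assumption, not stated in this excerpt
    if tag <= 32767:
        # Lower half of the "1+2" space: the range whose policy changed.
        return "Specification Required" if revised else "First Come First Served"
    return "First Come First Served"  # assumption, not stated in this excerpt
```

Only the 256 to 32767 range gives a different answer depending on the revised flag, which matches the single change the text describes.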
G.3. Changes in Suggestions and Other Informational Components | G.3. Changes in Suggestions and Other Informational Components | |||
While revising the document beyond the addressing of the errata | While revising the document, beyond the addressing of the errata | |||
reports, the working group drew upon nearly seven years of experience | reports, the working group drew upon nearly seven years of experience | |||
with CBOR in a diverse set of applications. This led to a number of | with CBOR in a diverse set of applications. This led to a number of | |||
editorial changes, including adding tables for illustration, but also | editorial changes, including adding tables for illustration, but also | |||
emphasizing some aspects and de-emphasizing others. | emphasizing some aspects and de-emphasizing others. | |||
A significant addition is Section 2, which discusses the CBOR data | A significant addition is Section 2, which discusses the CBOR data | |||
model and its small variations involved in the processing of CBOR. | model and its small variations involved in the processing of CBOR. | |||
The introduction of terms for those variations (basic generic, | The introduction of terms for those variations (basic generic, | |||
extended generic, specific) enables more concise language in other | extended generic, specific) enables more concise language in other | |||
places of the document and also helps to clarify expectations of | places of the document and also helps to clarify expectations of | |||
End of changes. 85 change blocks. | ||||
219 lines changed or deleted | 235 lines changed or added | |||