rfc9285xml2.original.xml | rfc9285.xml | |||
---|---|---|---|---|
<?xml version="1.0" encoding="US-ASCII"?> | <?xml version='1.0' encoding='utf-8'?> | |||
<!DOCTYPE RFC SYSTEM "rfc2629.dtd" [ | <!DOCTYPE rfc [ | |||
<!ENTITY RFC4648 SYSTEM | <!ENTITY nbsp " "> | |||
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml"> | <!ENTITY zwsp "​"> | |||
<!ENTITY RFC8174 SYSTEM | <!ENTITY nbhy "‑"> | |||
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"> | <!ENTITY wj "⁠"> | |||
<!ENTITY RFC2119 SYSTEM | ||||
"http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"> | ||||
]> | ]> | |||
<?xml-stylesheet type='text/xsl' href='RFC2629.xslt' ?> | ||||
<?rfc compact="yes"?> | ||||
<?rfc toc="yes"?> | ||||
<?rfc symrefs="yes"?> | ||||
<?rfc sortrefs="yes"?> | ||||
<!-- Expand crefs and put them inline --> | ||||
<?rfc comments='yes' ?> | ||||
<?rfc inline='yes' ?> | ||||
<rfc | <rfc number="9285" docName="draft-faltstrom-base45-12" xmlns:xi="http://www.w3.o | |||
docName="draft-faltstrom-base45-12" | rg/2001/XInclude" ipr="trust200902" submissionType="IETF" category="info" obsole | |||
ipr="trust200902" | tes="" updates="" consensus="true" xml:lang="en" tocInclude="true" symRefs="true | |||
submissionType="IETF" | " sortRefs="true" version="3"> | |||
category="info"> | <!-- xml2rfc v2v3 conversion 3.12.10 --> | |||
<front> | <front> | |||
<title abbrev="Base45"> | <title abbrev="Base45"> | |||
The Base45 Data Encoding | The Base45 Data Encoding | |||
</title> | </title> | |||
<author fullname="Patrik Faltstrom" initials="P." surname="Faltstrom"> | <seriesInfo name="RFC" value="9285" /> | |||
<organization abbrev="Netnod">Netnod</organization> | <author fullname="Patrik Fältström" initials="P." surname="Fältström"> | |||
<organization>Netnod</organization> | ||||
<address> | <address> | |||
<email>paf@netnod.se</email> | <email>paf@netnod.se</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author fullname="Fredrik Ljunggren" initials="F." surname="Ljunggren"> | <author fullname="Fredrik Ljunggren" initials="F." surname="Ljunggren"> | |||
<organization abbrev="Kirei">Kirei</organization> | <organization>Kirei</organization> | |||
<address> | <address> | |||
<email>fredrik@kirei.se</email> | <email>fredrik@kirei.se</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<author fullname="Dirk-Willem van Gulik" initials="D." surname="van Gulik"> | <author fullname="Dirk-Willem van Gulik" initials="D.W." surname="van Gulik" | |||
<organization abbrev="Webweaving">Webweaving</organization> | > | |||
<organization>Webweaving</organization> | ||||
<address> | <address> | |||
<email>dirkx@webweaving.org</email> | <email>dirkx@webweaving.org</email> | |||
</address> | </address> | |||
</author> | </author> | |||
<date month="June" year="2022" day="16"/> | <date month="August" year="2022"/> | |||
<area>Operations</area> | ||||
<keyword>BASE45</keyword> | <keyword>BASE45</keyword> | |||
<abstract> | <abstract> | |||
<t> | <t> | |||
This document describes the Base45 encoding scheme which is | This document describes the Base45 encoding scheme, which is | |||
built upon the Base64, Base32 and Base16 encoding schemes. | built upon the Base64, Base32, and Base16 encoding schemes. | |||
</t> | </t> | |||
</abstract> | </abstract> | |||
</front> | </front> | |||
<middle> | <middle> | |||
<section anchor="intro" title="Introduction"> | <section anchor="intro" numbered="true" toc="default"> | |||
<name>Introduction</name> | ||||
<t> | <t> | |||
A QR-code is used to encode text as a graphical | A QR code is used to encode text as a graphical | |||
image. Depending on the characters used in the text various | image. Depending on the characters used in the text, various | |||
encoding options for a QR-code exist, e.g. Numeric, | encoding options for a QR code exist, e.g., Numeric, | |||
Alphanumeric and Byte mode. Even in Byte mode a typical | Alphanumeric, and Byte mode. Even in Byte mode, a typical | |||
QR-code reader tries to interpret a byte sequence as a UTF-8 | QR code reader tries to interpret a byte sequence as text encoded in UTF- | |||
or ISO/IEC 8859-1 encoded text. Thus, QR-codes cannot be used | 8 | |||
or ISO/IEC 8859-1. Thus, QR codes cannot be used | ||||
to encode arbitrary binary data directly. Such data has to be | to encode arbitrary binary data directly. Such data has to be | |||
converted into an appropriate text before that text could be | converted into an appropriate text before that text could be | |||
encoded as a QR-code. Compared to already established Base64, | encoded as a QR code. Compared to already established Base64, | |||
Base32 and Base16 encoding schemes, that are described in | Base32, and Base16 encoding schemes that are described in | |||
<xref target="RFC4648">RFC 4648</xref>, the Base45 scheme | <xref target="RFC4648" format="default"></xref>, the Base45 scheme | |||
described in this document offer a more compact QR-code | described in this document offers a more compact QR code | |||
encoding. | encoding. | |||
</t> | </t> | |||
<t> | <t> | |||
One important difference from those others and Base45 is the | One important difference from those others and Base45 is the | |||
key table and that the padding with '=' is not required. | key table and that the padding with '=' is not required. | |||
</t> | </t> | |||
</section> | </section> | |||
<section title="Conventions Used in This Document"> | <section numbered="true" toc="default"> | |||
<t> | <name>Conventions Used in This Document</name> | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL | <t> | |||
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT | The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQU | |||
RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be | IRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL | |||
interpreted as described in BCP 14 <xref target="RFC2119" /> | NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14> | |||
<xref target="RFC8174" /> when, and only when, they appear in | RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", | |||
all capitals, as shown here. | "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to | |||
</t> | be interpreted as | |||
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> | ||||
when, and only when, they appear in all capitals, as shown here. | ||||
</t> | ||||
</section> | </section> | |||
<section title="Interpretation of Encoded Data"> | <section numbered="true" toc="default"> | |||
<name>Interpretation of Encoded Data</name> | ||||
<t> | <t> | |||
Encoded data is to be interpreted as described in <xref | Encoded data is to be interpreted as described in <xref target="RFC4648" | |||
target="RFC4648">RFC 4648</xref> with the exception that a | format="default"></xref> with the exception that a | |||
different alphabet is selected. | different alphabet is selected. | |||
</t> | </t> | |||
</section> | </section> | |||
<section title="The Base45 Encoding"> | <section numbered="true" toc="default"> | |||
<name>The Base45 Encoding</name> | ||||
<t> | <t> | |||
QR codes have a limited ability to store binary data. In | QR codes have a limited ability to store binary data. In | |||
practice binary data have to be encoded in characters | practice, binary data have to be encoded in characters | |||
according to one of the modes already defined in the standard | according to one of the modes already defined in the standard | |||
for QR codes. The easiest mode to use in called Alphanumeric | for QR codes. The easiest mode to use in called Alphanumeric | |||
mode (see section 7.3.4 and Table 2 of <xref | mode (see Section 7.3.4 and Table 2 of <xref target="ISO18004" format="de | |||
target="ISO18004">ISO/IEC 18004:2015</xref>). Unfortunately | fault"></xref>. Unfortunately | |||
Alphanumeric mode uses 45 different characters which implies | Alphanumeric mode uses 45 different characters which implies | |||
neither Base32 nor Base64 are very effective encodings. | neither Base32 nor Base64 are very effective encodings. | |||
</t> | </t> | |||
<t> | <t> | |||
A 45-character subset of US-ASCII is used; the 45 characters | A 45-character subset of US-ASCII is used; the 45 characters | |||
usable in a QR code in Alphanumeric mode (see section 7.3.4 | usable in a QR code in Alphanumeric mode (see Section 7.3.4 | |||
and Table 2 of <xref target="ISO18004">ISO/IEC | and Table 2 of <xref target="ISO18004" format="default"></xref>). Base45 | |||
18004:2015</xref>). Base45 encodes 2 bytes in 3 characters, | encodes 2 bytes in 3 characters, | |||
compared to Base64, which encodes 3 bytes in 4 characters. | compared to Base64, which encodes 3 bytes in 4 characters. | |||
</t> | </t> | |||
<t> | <t> | |||
For encoding, two bytes [a, b] MUST be interpreted as a number | For encoding, two bytes [a, b] <bcp14>MUST</bcp14> be | |||
n in base 256, i.e. as an unsigned integer over 16 bits so | interpreted as a number n in base 256, i.e. as an unsigned | |||
that the number n = (a*256) + b. | integer over 16 bits so that the number n = (a * 256) + b. | |||
</t> | </t> | |||
<t> | <t> | |||
This number n is converted to base 45 [c, d, e] so that n = c | This number n is converted to base 45 [c, d, e] so that n = c | |||
+ (d*45) + (e*45*45). Note the order of c, d and e which are | + (d * 45) + (e * 45 * 45). Note the order of c, d and e which | |||
chosen so that the left-most [c] is the least significant. | are chosen so that the left-most [c] is the least significant. | |||
</t> | </t> | |||
<t> | <t> | |||
The values c, d and e are then looked up in Table 1 to produce | The values c, d and e are then looked up in <xref | |||
a three character string. The process is reversed when | target="table1"/> to produce a three character string. The | |||
decoding. | process is reversed when decoding. | |||
</t> | </t> | |||
<t> | <t> | |||
For encoding a single byte [a], it MUST be interpreted as a | For encoding a single byte [a], it <bcp14>MUST</bcp14> be | |||
base 256 number, i.e. as an unsigned integer over 8 bits. That | interpreted as a base 256 number, i.e. as an unsigned integer | |||
integer MUST be converted to base 45 [c d] so that a = c + | over 8 bits. That integer <bcp14>MUST</bcp14> be converted to | |||
(45*d). The values c and d are then looked up in Table 1 to | base 45 [c d] so that a = c + (45 * d). The values c and d are | |||
produce a two character string. | then looked up in <xref target="table1"/> to produce a | |||
two-character string. | ||||
</t> | </t> | |||
<t> | <t> | |||
A byte string [a b c d ... x y z] with arbitrary content and | A byte string [a b c d ... x y z] with arbitrary content and | |||
arbitrary length MUST be encoded as follows: From left to | arbitrary length <bcp14>MUST</bcp14> be encoded as follows: | |||
right pairs of bytes MUST be encoded as described above. If the | From left to right pairs of bytes <bcp14>MUST</bcp14> be | |||
number of bytes is even, then the encoded form is a string | encoded as described above. If the number of bytes is even, | |||
with a length which is evenly divisible by 3. If the number of | then the encoded form is a string with a length that is evenly | |||
bytes is odd, then the last (rightmost) byte MUST be encoded | divisible by 3. If the number of bytes is odd, then the last | |||
on two characters as described above. | (rightmost) byte <bcp14>MUST</bcp14> be encoded on two | |||
characters as described above. | ||||
</t> | </t> | |||
<t> | <t> | |||
For decoding a Base45 encoded string the inverse operations | For decoding a Base45 encoded string the inverse operations | |||
are performed. | are performed. | |||
</t> | </t> | |||
<section title="When to, and not to, use Base45"> | <section numbered="true" toc="default"> | |||
<t> | <name>When to Use and Not Use Base45</name> | |||
If binary data is to be stored in a QR-Code, the suggested | <t> | |||
If binary data is to be stored in a QR code, the suggested | ||||
mechanism is to use the Alphanumeric mode that uses 11 bits | mechanism is to use the Alphanumeric mode that uses 11 bits | |||
for 2 characters as defined in section 7.3.4 in <xref | for 2 characters as defined in Section 7.3.4 of <xref target="ISO18004" | |||
target="ISO18004">ISO/IEC 18004:2015</xref>. The ECI mode | format="default"></xref>. The Extended Channel Interpretation (ECI) mode | |||
indicator for this encoding is 0010. | indicator for this encoding is 0010. | |||
</t> | </t> | |||
<t> | <t> | |||
On the other hand if the data is to be sent via some other | On the other hand if the data is to be sent via some other | |||
transport, a transport encoding suitable for that transport | transport, a transport encoding suitable for that transport | |||
should be used instead of Base45. For example, it is not | should be used instead of Base45. For example, it is not | |||
recommended to first encode data in Base45 and then encode | recommended to first encode data in Base45 and then encode | |||
the resulting string in Base64 if the data is to be sent via | the resulting string in Base64 if the data is to be sent via | |||
email. Instead, the Base45 encoding should be removed, and | email. Instead, the Base45 encoding should be removed, and | |||
the data itself should be encoded in Base64. | the data itself should be encoded in Base64. | |||
</t> | </t> | |||
</section> | </section> | |||
<section title="The alphabet used in Base45"> | <section numbered="true" toc="default"> | |||
<t> | <name>The Alphabet Used in Base45</name> | |||
<t> | ||||
The Alphanumeric mode is defined to use 45 characters as specified | The Alphanumeric mode is defined to use 45 characters as specified | |||
in this alphabet. | in this alphabet. | |||
</t> | </t> | |||
<t> | ||||
<figure><artwork> | ||||
Table 1: The Base45 Alphabet | ||||
Value Encoding Value Encoding Value Encoding Value Encoding | <table anchor="table1"> | |||
00 0 12 C 24 O 36 Space | <name>The Base45 Alphabet</name> | |||
01 1 13 D 25 P 37 $ | <thead> | |||
02 2 14 E 26 Q 38 % | <tr> | |||
03 3 15 F 27 R 39 * | <th align="right">Value</th><th>Encoding</th> | |||
04 4 16 G 28 S 40 + | <th align="right">Value</th><th>Encoding</th> | |||
05 5 17 H 29 T 41 - | <th align="right">Value</th><th>Encoding</th> | |||
06 6 18 I 30 U 42 . | <th align="right">Value</th><th>Encoding</th> | |||
07 7 19 J 31 V 43 / | </tr> | |||
08 8 20 K 32 W 44 : | </thead> | |||
09 9 21 L 33 X | <tbody> | |||
10 A 22 M 34 Y | <tr> | |||
11 B 23 N 35 Z | <td align="right">00</td><td>0</td> | |||
</artwork></figure> | <td align="right">12</td><td>C</td> | |||
</t> | <td align="right">24</td><td>O</td> | |||
<td align="right">36</td><td>Space</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">01</td><td>1</td> | ||||
<td align="right">13</td><td>D</td> | ||||
<td align="right">25</td><td>P</td> | ||||
<td align="right">37</td><td>$</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">02</td><td>2</td> | ||||
<td align="right">14</td><td>E</td> | ||||
<td align="right">26</td><td>Q</td> | ||||
<td align="right">38</td><td>%</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">03</td><td>3</td> | ||||
<td align="right">15</td><td>F</td> | ||||
<td align="right">27</td><td>R</td> | ||||
<td align="right">39</td><td>*</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">04</td><td>4</td> | ||||
<td align="right">16</td><td>G</td> | ||||
<td align="right">28</td><td>S</td> | ||||
<td align="right">40</td><td>+</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">05</td><td>5</td> | ||||
<td align="right">17</td><td>H</td> | ||||
<td align="right">29</td><td>T</td> | ||||
<td align="right">41</td><td>-</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">06</td><td>6</td> | ||||
<td align="right">18</td><td>I</td> | ||||
<td align="right">30</td><td>U</td> | ||||
<td align="right">42</td><td>.</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">07</td><td>7</td> | ||||
<td align="right">19</td><td>J</td> | ||||
<td align="right">31</td><td>V</td> | ||||
<td align="right">43</td><td>/</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">08</td><td>8</td> | ||||
<td align="right">20</td><td>K</td> | ||||
<td align="right">32</td><td>W</td> | ||||
<td align="right">44</td><td>:</td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">09</td><td>9</td> | ||||
<td align="right">21</td><td>L</td> | ||||
<td align="right">33</td><td>X</td> | ||||
<td></td><td></td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">10</td><td>A</td> | ||||
<td align="right">22</td><td>M</td> | ||||
<td align="right">34</td><td>Y</td> | ||||
<td></td><td></td> | ||||
</tr> | ||||
<tr> | ||||
<td align="right">11</td><td>B</td> | ||||
<td align="right">23</td><td>N</td> | ||||
<td align="right">35</td><td>Z</td> | ||||
<td></td><td></td> | ||||
</tr> | ||||
</tbody> | ||||
</table> | ||||
</section> | </section> | |||
<section title="Encoding examples"> | <section numbered="true" toc="default"> | |||
<t> | <name>Encoding Examples</name> | |||
<t> | ||||
It should be noted that although the examples are all text, | It should be noted that although the examples are all text, | |||
Base45 is an encoding for binary data where each octet can | Base45 is an encoding for binary data where each octet can | |||
have any value 0-255. | have any value 0-255. | |||
</t> | </t> | |||
<t> | ||||
Encoding example 1: The string "AB" is the byte sequence [65 | <t>Encoding example 1:</t> | |||
66]. The 16 bit value is 65 * 256 + 66 = 16706. 16706 equals | <t indent="3"> | |||
11 + 45 * 11 + 45 * 45 * 8, so the sequence in base 45 is [11 | The string "AB" is the byte sequence [[65 66]]. | |||
11 8]. By looking up these values in the Table 1 we get the | ||||
encoded string "BB8". | If we look at all 16 bits, we get 65 * 256 + 66 = 16706. | |||
</t> | ||||
<t> | 16706 equals 11 + (11 * 45) + (8 * 45 * 45), so the sequence in base 45 is [11 | |||
Encoding example 2: The string "Hello!!" as ASCII is the | 11 8]. | |||
byte sequence [72 101 108 108 111 33 33]. If we look at each | ||||
16 bit value, it is [18533 27756 28449 33]. Note the 33 for | Referring to <xref target="table1"/>, we get the encoded string "BB8". | |||
the last byte. When looking at the values in base 45, we get | </t> | |||
[[38 6 9] [36 31 13] [9 2 14] [33 0]] where the last byte is | ||||
represented by two. The resulting string "%69 VD92EX0" is | <table> | |||
created by looking up these values in Table 1. It should be | <name>Example 1 in Detail</name> | |||
noted it includes a space. | <tbody> | |||
</t> | <tr> | |||
<t> | <td>AB</td> | |||
Encoding example 3: The string "base-45" as ASCII is the | <td>Initial string</td> | |||
byte sequence [98 97 115 101 45 52 53]. If we look at each | </tr> | |||
16 bit value, it is [25185 29541 11572 53]. Note the 53 for | <tr> | |||
the last byte. When looking at the values in base 45, we get | <td>[[65 66]]</td> | |||
[[30 19 12] [21 26 14] [7 32 5] [8 1]] where the last byte | <td>Decimal value</td> | |||
is represented by two. By looking up these values in the | </tr> | |||
Table 1 we get the encoded string "UJCLQE7W581". | <tr> | |||
</t> | <td>[16706]</td> | |||
<td>Value in base 16</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[11 11 8]</td> | ||||
<td>Value in base 45</td> | ||||
</tr> | ||||
<tr> | ||||
<td>BB8</td> | ||||
<td>Encoded string</td> | ||||
</tr> | ||||
</tbody> | ||||
</table> | ||||
<t>Encoding example 2:</t> | ||||
<t indent="3"> | ||||
The string "Hello!!" as ASCII is the byte sequence [[72 101] [108 108] [111 33 | ||||
] [33]]. | ||||
If we look at this 16 bits at a time, we get [18533 27756 28449 33]. Note the | ||||
33 for the last byte. | ||||
When looking at the values in base 45, we get [[38 6 9] [36 31 13] [9 2 14] | ||||
[33 0]], where the last byte is represented by two values. | ||||
The resulting string "%69 VD92EX0" is created by looking up these values in | ||||
<xref target="table1"/>. It should be noted it includes a space. | ||||
</t> | ||||
<table> | ||||
<name>Example 2 in Detail</name> | ||||
<tbody> | ||||
<tr> | ||||
<td>Hello!!</td> | ||||
<td>Initial string</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[72 101] [108 108] [111 33] [33]]</td> | ||||
<td>Decimal value</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[18533 27756 28449 33]</td> | ||||
<td>Value in base 16</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[38 6 9] [36 31 13] [9 2 14] [33 0]]</td> | ||||
<td>Value in base 45</td> | ||||
</tr> | ||||
<tr> | ||||
<td>%69 VD92EX0</td> | ||||
<td>Encoded string</td> | ||||
</tr> | ||||
</tbody> | ||||
</table> | ||||
<t>Encoding example 3:</t> | ||||
<t indent="3"> | ||||
The string "base-45" as ASCII is the byte sequence [[98 97] [115 101] | ||||
[45 52] [53]]. | ||||
If we look at this two bytes at a time, we get [25185 29541 11572 | ||||
53]. Note the 53 for the last byte. | ||||
When looking at the values in base 45, we get [[30 19 12] [21 26 14] | ||||
[7 32 5] [8 1]] where the last byte is represented by two values. | ||||
Referring to <xref target="table1"/>, we get the encoded string | ||||
"UJCLQE7W581". | ||||
</t> | ||||
<table> | ||||
<name>Example 3 in Detail</name> | ||||
<tbody> | ||||
<tr> | ||||
<td>base-45</td> | ||||
<td>Initial string</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[98 97] [115 101] [45 52] [53]]</td> | ||||
<td>Decimal value</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[25185 29541 11572 53]</td> | ||||
<td>Value in base 16</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[30 19 12] [21 26 14] [7 32 5] [8 1]]</td> | ||||
<td>Value in base 45</td> | ||||
</tr> | ||||
<tr> | ||||
<td>UJCLQE7W581</td> | ||||
<td>Encoded string</td> | ||||
</tr> | ||||
</tbody> | ||||
</table> | ||||
</section> | </section> | |||
<section title="Decoding examples"> | <section numbered="true" toc="default"> | |||
<t> | <name>Decoding Example</name> | |||
Decoding example 1: The string "QED8WEX0" represents, when | <t>Decoding example 1:</t> | |||
looked up in Table 1, the values [26 14 13 8 32 14 33 0]. We | <t indent="3"> | |||
arrange the numbers in chunks of three, except for the last | The string "QED8WEX0" represents, when looked up in Table 1, the | |||
one which can be two, and get [[26 14 13] [8 32 14] [33 0]]. | values [26 14 13 8 32 14 33 0]. | |||
In base 45 we get [26981 29798 33] where the bytes are [[105 | ||||
101] [116 102] [33]]. If we look at the ASCII values we get | We arrange the numbers in chunks of three, except for the last one | |||
the string "ietf!". | which can be two numbers, and get [[26 14 13] [8 32 14] [33 0]]. | |||
</t> | ||||
In base 45, we get [26981 29798 33] where the bytes are [[105 101] | ||||
[116 102] [33]]. | ||||
If we look at the ASCII values, we get the string "ietf!". | ||||
</t> | ||||
<table> | ||||
<name>Example 4 in Detail</name> | ||||
<tbody> | ||||
<tr> | ||||
<td>QED8WEX0</td> | ||||
<td>Initial string</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[26 14 13 8 32 14 33 0]</td> | ||||
<td>Looked up values</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[26 14 13] [8 32 14] [33 0]]</td> | ||||
<td>Groups of three</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[26981 29798 33]</td> | ||||
<td>Interpreted as base 45</td> | ||||
</tr> | ||||
<tr> | ||||
<td>[[105 101] [116 102] [33]]</td> | ||||
<td>Values in base 8</td> | ||||
</tr> | ||||
<tr> | ||||
<td>ietf!</td> | ||||
<td>Decoded string</td> | ||||
</tr> | ||||
</tbody> | ||||
</table> | ||||
</section> | </section> | |||
</section> | </section> | |||
<section title="IANA Considerations"> | <section numbered="true" toc="default"> | |||
<name>IANA Considerations</name> | ||||
<t> | <t> | |||
There are no considerations for IANA in this document. | This document has no IANA actions. | |||
</t> | </t> | |||
</section> | </section> | |||
<section title="Security Considerations"> | <section numbered="true" toc="default"> | |||
<name>Security Considerations</name> | ||||
<t> | <t> | |||
When implementing encoding and decoding it is important to be | When implementing encoding and decoding it is important to be | |||
very careful so that buffer overflow or similar does not | very careful so that buffer overflow or similar issues do | |||
occur. This of course includes the calculations in base 45 | not occur. This of course includes the calculations in base | |||
and lookup in the table of characters (Table 1). A decoder | 45 and lookup in the table of characters (<xref | |||
must also be robust regarding input, including proper handling | target="table1"/>). A decoder must also be robust regarding | |||
of any octet value 0-255, including the NUL character (ASCII | input, including proper handling of any octet value 0-255, | |||
0). | including the NUL character (ASCII 0). | |||
</t> | </t> | |||
<t> | <t> | |||
It should be noted that Base64 and some other encodings pad | It should be noted that Base64 and some other encodings pad | |||
the string so that the encoding starts with an aligned number | the string so that the encoding starts with an aligned number | |||
of characters while Base45 specifically avoids padding. Because of | of characters while Base45 specifically avoids padding. Because of | |||
this, special care has to be taken when odd number of octets | this, special care has to be taken when an odd number of octets | |||
are to be encoded. Similarly, care must be taken if the number | is to be encoded. Similarly, care must be taken if the number | |||
of characters to decode are not evenly divisible by 3. | of characters to decode are not evenly divisible by 3. | |||
</t> | </t> | |||
<t> | <t> | |||
Base encodings use a specific, reduced alphabet to encode | Base encodings use a specific, reduced alphabet to encode | |||
binary data. Non-alphabet characters could exist within | binary data. Non-alphabet characters could exist within | |||
base-encoded data, caused by data corruption or by | base-encoded data, caused by data corruption or by | |||
design. Non-alphabet characters may be exploited as a "covert | design. Non-alphabet characters may be exploited as a "covert | |||
channel", where non-protocol data can be sent for nefarious | channel", where non-protocol data can be sent for nefarious | |||
purposes. Non-alphabet characters might also be sent in order | purposes. Non-alphabet characters might also be sent in order | |||
to exploit implementation errors leading to, e.g., buffer | to exploit implementation errors leading to, for example, buffer | |||
overflow attacks. | overflow attacks. | |||
</t> | </t> | |||
<t> | <t> | |||
Implementations MUST reject any input that is not a valid | Implementations <bcp14>MUST</bcp14> reject any input that is | |||
encoding. For example, it MUST reject the input (encoded data) | not a valid encoding. For example, it <bcp14>MUST</bcp14> | |||
if it contains characters outside the base alphabet (in Table | reject the input (encoded data) if it contains characters | |||
1) when interpreting base-encoded data. | outside the base alphabet (in <xref target="table1"/>) when | |||
interpreting base-encoded data. | ||||
</t> | </t> | |||
<t> | <t> | |||
Even though a Base45 encoded string contains only characters | Even though a Base45-encoded string contains only characters | |||
from the alphabet in Table 1, cases like the following has to | from the alphabet in <xref target="table1"/>, cases like the | |||
be considered: The string "FGW" represents 65535 (FFFF in base | following have to be considered: The string "FGW" represents | |||
16), which is a valid encoding of 16 bits. A slightly | 65535 (FFFF in base 16), which is a valid encoding of 16 bits. | |||
different encoded string of the same length, "GGW", would | A slightly different encoded string of the same length, "GGW", | |||
represent 65536 (10000 in base 16), which is represented by | would represent 65536 (10000 in base 16), which is represented | |||
more than 16 bits. Implementations MUST also reject the | by more than 16 bits. Implementations <bcp14>MUST</bcp14> | |||
encoded data if it contains a triplet of characters which, | also reject the encoded data if it contains a triplet of | |||
when decoded, results in an unsigned integer which is greater | characters that, when decoded, results in an unsigned integer | |||
than 65535 (ffff in base 16). | that is greater than 65535 (FFFF in base 16). | |||
</t> | </t> | |||
<t> | <t> | |||
It should be noted that the resulting string after encoding to | It should be noted that the resulting string after encoding to | |||
Base45 might include non-URL-safe characters so if the URL | Base45 might include non-URL-safe characters so if the URL | |||
including the Base45 encoded data has to be URL safe, one | including the Base45 encoded data has to be URL-safe, one | |||
has to use %-encoding. | has to use percent-encoding. | |||
</t> | ||||
</section> | ||||
<section title="Acknowledgements"> | ||||
<t> | ||||
The authors thank Mark Adler, Anders Ahl, Alan Barrett, Sam | ||||
Spens Clason, Alfred Fiedler, Tomas Harreveld, Erik Hellman, | ||||
Joakim Jardenberg, Michael Joost, Erik Kline, Christian | ||||
Landgren, Anders Lowinger, Mans Nilsson, Jakob Schlyter, Peter | ||||
Teufl and Gaby Whitehead for the feedback. Also, everyone that | ||||
have been working with Base64 over a long period of years and | ||||
have proven the implementations are stable. | ||||
</t> | </t> | |||
</section> | </section> | |||
</middle> | </middle> | |||
<back> | <back> | |||
<references title='Normative References'> | <references> | |||
&RFC4648; | <name>Normative References</name> | |||
&RFC2119; | <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC | |||
&RFC8174; | .4648.xml"/> | |||
<reference anchor="ISO18004"> | <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC | |||
<front> | .2119.xml"/> | |||
<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC | ||||
.8174.xml"/> | ||||
<reference anchor="ISO18004" target="https://www.iso.org/standard/62021.ht | ||||
ml"> | ||||
<front> | ||||
<title> | <title> | |||
ISO/IEC 18004:2015 Information technology - Automatic | Information technology - Automatic identification and data capture | |||
identification and data capture techniques - QR Code bar | techniques - QR Code bar code symbology specification | |||
code symbology specification | </title> | |||
</title> | ||||
<author> | <author> | |||
<organization>ISO/IEC JTC 1/SC 31</organization> | <organization>ISO/IEC</organization> | |||
</author> | </author> | |||
<date month="February" year="2015" /> | <date month="February" year="2015"/> | |||
</front> | </front> | |||
<seriesInfo name="ISO/IEC 18004:2015" | <seriesInfo name="ISO/IEC" value="18004:2015"/> | |||
value="https://www.iso.org/standard/62021.html" /> | ||||
</reference> | </reference> | |||
</references> | </references> | |||
<section numbered="false" toc="default"> | ||||
<name>Acknowledgements</name> | ||||
<t> | ||||
The authors thank <contact fullname="Mark Adler"/>, <contact fullname="An | ||||
ders Ahl"/>, <contact fullname="Alan Barrett"/>, <contact fullname="Sam Spens Cl | ||||
ason"/>, <contact fullname="Alfred Fiedler"/>, <contact fullname="Tomas Harrevel | ||||
d"/>, <contact fullname="Erik Hellman"/>, <contact fullname="Joakim Jardenberg"/ | ||||
>, <contact fullname="Michael Joost"/>, <contact fullname="Erik Kline"/>, <conta | ||||
ct fullname="Christian Landgren"/>, <contact fullname="Anders Lowinger"/>, <cont | ||||
act fullname="Mans Nilsson"/>, <contact fullname="Jakob Schlyter"/>, <contact fu | ||||
llname="Peter Teufl"/>, and <contact fullname="Gaby Whitehead"/> for the feedbac | ||||
k. Also, everyone who has been working with Base64 over a long period of years a | ||||
nd has proven the implementations are stable. | ||||
</t> | ||||
</section> | ||||
</back> | </back> | |||
</rfc> | </rfc> | |||
End of changes. 51 change blocks. | ||||
203 lines changed or deleted | 413 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |