rfc9233.original | rfc9233.txt | |||
---|---|---|---|---|
Network Working Group P. Faltstrom | Internet Engineering Task Force (IETF) P. Fältström | |||
Internet-Draft Netnod | Request for Comments: 9233 Netnod | |||
Intended status: Standards Track February 13, 2022 | Category: Standards Track March 2022 | |||
Expires: August 17, 2022 | ISSN: 2070-1721 | |||
IDNA2008 and Unicode 12.0.0 | Internationalized Domain Names for Applications 2008 (IDNA2008) and | |||
draft-faltstrom-unicode12-07 | Unicode 12.0.0 | |||
Abstract | Abstract | |||
This document describes the changes between Unicode 6.0.0 and Unicode | This document describes the changes between Unicode 6.0.0 and Unicode | |||
12.0.0 in the context of IDNA2008. Some additions and changes have | 12.0.0 in the context of the current version of Internationalized | |||
been made in the Unicode Standard that affect the values produced by | Domain Names for Applications 2008 (IDNA2008). Some additions and | |||
the algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions | changes have been made in the Unicode Standard that affect the values | |||
to the algorithm for backward compatibility; however, this document | produced by the algorithm IDNA2008 specifies. IDNA2008 allows adding | |||
does not add any such exceptions. This document provides the | exceptions to the algorithm for backward compatibility; however, this | |||
necessary tables to IANA to make its database consistent with Unicode | document does not add any such exceptions. This document provides | |||
12.0.0. | the necessary tables to IANA to make its database consistent with | |||
| Unicode 12.0.0. | |||
To improve understanding, this document describes systems that are | To improve understanding, this document describes systems that are | |||
being used as alternatives to those that conform to IDNA2008. | being used as alternatives to those that conform to IDNA2008. | |||
TO BE REMOVED AT TIME OF PUBLICATION AS AN RFC: | ||||
This document is discussed on the i18n-discuss@ietf.org mailing list | ||||
of the IETF. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
| Internet Standards is available in Section 2 of RFC 7841. | |||
This Internet-Draft will expire on August 17, 2022. | Information about the current status of this document, any errata, | |||
| and how to provide feedback on it may be obtained at | |||
| https://www.rfc-editor.org/info/rfc9233. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2022 IETF Trust and the persons identified as the | Copyright (c) 2022 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Revised BSD License text as described in Section 4.e of the | |||
the Trust Legal Provisions and are provided without warranty as | Trust Legal Provisions and are provided without warranty as described | |||
described in the Simplified BSD License. | in the Revised BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
2. Background | 2. Background | |||
2.1. IDNA2008 Documents | 2.1. IDNA2008 Documents | |||
2.2. Additional important IDNA2008-related documents | 2.2. Additional Important IDNA2008-Related Documents | |||
2.3. Deployment | 2.3. Deployment | |||
3. Notable Changes Between Unicode 6.0.0 and 12.0.0 | 3. Notable Changes between Unicode 6.0.0 and 12.0.0 | |||
3.1. Changes between Unicode 6.0.0 and 7.0.0 | 3.1. Changes between Unicode 6.0.0 and 7.0.0 | |||
3.2. Changes between Unicode 7.0.0 and 10.0.0 | 3.2. Changes between Unicode 7.0.0 and 10.0.0 | |||
3.3. Changes between Unicode 10.0.0 and 11.0.0 | 3.3. Changes between Unicode 10.0.0 and 11.0.0 | |||
3.4. Changes between Unicode 11.0.0 and 12.0.0 | 3.4. Changes between Unicode 11.0.0 and 12.0.0 | |||
4. U+111C9 SHARADA SANDHI MARK | 4. U+111C9 SHARADA SANDHI MARK | |||
5. Conclusion | 5. Conclusion | |||
6. IANA Considerations | 6. IANA Considerations | |||
7. Security Considerations | 7. Security Considerations | |||
8. Acknowledgements | 8. References | |||
9. References | 8.1. Normative References | |||
9.1. Normative References | 8.2. Informative References | |||
9.2. Non-normative references | Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 | |||
Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 | Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 | |||
Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 | Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 | |||
Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 | Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 | |||
Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 | Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 | |||
Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 | Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0 | |||
Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0 | Acknowledgments | |||
Author's Address | Author's Address | |||
1. Introduction | 1. Introduction | |||
The current version of Internationalized Domain Names for | The current version of Internationalized Domain Names for | |||
Applications (IDNA) was initiated in 2008, and despite not being | Applications (IDNA) was initiated in 2008, and despite not being | |||
completed until 2010, is widely known as "IDNA2008". It is specified | completed until 2010, is widely known as "IDNA2008". It is specified | |||
in the series of documents listed in Section 2.1. The IDNA2008 | in the series of documents listed in Section 2.1. The IDNA2008 | |||
standard includes an algorithm by which a derived property value is | standard includes an algorithm by which a derived property value is | |||
calculated based on the properties defined from the Unicode Standard. | calculated based on the properties defined in the Unicode Standard. | |||
The derived property values that can be calculated are defined in RFC | The derived property values that can be calculated are defined in RFC | |||
5892 [RFC5892]. Below is a summary to aid in the reading of this | 5892 [RFC5892]. Below is a summary to aid in the reading of this | |||
document. For definition of the terms, please see RFC 5892 | document. For definition of the terms, please see RFC 5892 | |||
[RFC5892]. | [RFC5892]. | |||
o PROTOCOL VALID: Those that are allowed to be used in IDNs. Code | PROTOCOL VALID: Those that are allowed to be used in IDNs. Code | |||
points with this property value are permitted for general use in | points with this property value are permitted for general use in | |||
IDNs. However, that a label consists only of code points that | IDNs. However, the fact that a label consists only of code points | |||
have this property value does not imply that the label can be used | with this property value does not imply that the label can be used | |||
in DNS. The abbreviated term PVALID is used to refer to this | in DNS. The abbreviated term PVALID is used to refer to this | |||
value. | value. | |||
o CONTEXTUAL RULE REQUIRED: Some characteristics of the character, | CONTEXTUAL RULE REQUIRED: Some characteristics of the character, | |||
such as it being invisible in certain contexts or problematic in | such as it being invisible in certain contexts or problematic in | |||
others, require that it not be used in labels unless specific | others, require that it not be used in labels unless specific | |||
other characters or properties are present. The abbreviated term | other characters or properties are present. The abbreviated term | |||
CONTEXT is used to refer to this value. As explained in RFC 5892 | CONTEXT is used to refer to this value. As explained in RFC 5892 | |||
[RFC5892] CONTEXT is in turn divided into CONTEXTJ and CONTEXTO. | [RFC5892], CONTEXT is in turn divided into CONTEXTJ and CONTEXTO. | |||
o DISALLOWED: Those that should clearly not be included in IDNs. | DISALLOWED: Those that should clearly not be included in IDNs. Code | |||
Code points with this property value are not permitted in IDNs. | points with this property value are not permitted in IDNs. | |||
o UNASSIGNED: Those code points that are not designated (i.e., are | UNASSIGNED: Those code points that are not designated (i.e., are | |||
unassigned) in the Unicode Standard. | unassigned) in the Unicode Standard. | |||
When the Unicode Standard is updated, new code points are assigned | When the Unicode Standard is updated, new code points are assigned | |||
and already-assigned code points can have their property values | and already assigned code points can have their property values | |||
changed. | changed. | |||
o Assigning code points can create problems if the newly-assigned | * Assigning code points can create problems if the newly assigned | |||
code points are compositions of existing code points and because | code points are compositions of existing code points and the | |||
of that the normalization relationships associated with those code | normalization relationships associated with those code points | |||
points should have been changed. | should have been changed because of that. | |||
o Changing properties for already-assigned code points can create | * Changing properties for already assigned code points can create | |||
problems if the property change results in changes to the derived | problems if the property change results in changes to the derived | |||
property value. This might make an earlier allowed code point | property value. A previously allowed code point whose derived | |||
whose derived property value is PVALID to then not be allowed | property value is PVALID may now be prohibited if its derived | |||
anymore if its derived property value changes to DISALLOWED. The | property value changes to DISALLOWED. The problem can also happen | |||
problem can also happen the other way around: a code point that | the other way around: a code point that was not allowed (and thus | |||
was not allowed (and thus is prohibited) can suddenly end up being | was prohibited) can suddenly be allowed. | |||
allowed. | ||||
o Problems can also be created if the properties assigned to those | * Problems can also be created if the properties assigned to those | |||
code points are inconsistent with IDNA2008 assumptions about how | code points are inconsistent with IDNA2008 assumptions about how | |||
properties are assigned and/or about how code points with those | properties are assigned and/or about how code points with those | |||
properties are used or behave. | properties are used or behave. | |||
There were three incompatible changes in the Unicode standard between | There were three incompatible changes in the Unicode Standard between | |||
Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they | Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they | |||
are described in RFC 6452 [RFC6452]. The code points U+0CF1 and | are described in RFC 6452 [RFC6452]. The code points U+0CF1 and | |||
U+0CF2 had a derived property value change from DISALLOWED to PVALID, | U+0CF2 had a derived property value change from DISALLOWED to PVALID, | |||
and the code point U+19DA had a change in derived property value from | and the code point U+19DA had a change in derived property value from | |||
PVALID to DISALLOWED. These changes were examined in great detail, | PVALID to DISALLOWED. These changes were examined in great detail, | |||
but the IETF concluded that these changes to the Unicode standard did | but the IETF concluded that these changes to the Unicode Standard did | |||
not warrant an update to RFC 5892 [RFC5892]. | not warrant an update to RFC 5892 [RFC5892]. | |||
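The kind of change described above can be illustrated with a minimal Python sketch. It assumes two hypothetical dictionaries that map code points to derived property values for successive Unicode versions; only the three code points named in the preceding paragraph are included, and the function name `incompatible_changes` is an illustration, not part of IDNA2008 or of any library.

```python
# Hypothetical illustration: derived property values before and after a
# Unicode update, limited to the three code points named in the text.
UNICODE_5_2 = {0x0CF1: "DISALLOWED", 0x0CF2: "DISALLOWED", 0x19DA: "PVALID"}
UNICODE_6_0 = {0x0CF1: "PVALID", 0x0CF2: "PVALID", 0x19DA: "DISALLOWED"}


def incompatible_changes(old, new):
    """List (code point, old value, new value) for changes not starting at UNASSIGNED."""
    changes = []
    for cp in sorted(old):
        before = old[cp]
        after = new.get(cp, before)
        # A move away from UNASSIGNED is the ordinary effect of assigning a
        # new code point; any other change alters which labels are allowed.
        if before != after and before != "UNASSIGNED":
            changes.append((f"U+{cp:04X}", before, after))
    return changes


for entry in incompatible_changes(UNICODE_5_2, UNICODE_6_0):
    print(*entry)
```

A real comparison would derive both tables from the Unicode Character Database using the rules in RFC 5892; the sketch only shows how the two directions of change (newly allowed and newly prohibited) can be told apart from ordinary assignments.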
As described in Section 3, more incompatible changes have been made | As described in Section 3, more incompatible changes have been made | |||
to code points between Unicode 6.0.0 and Unicode 12.0.0 | to code points between Unicode 6.0.0 and Unicode 12.0.0 | |||
[Unicode-12.0.0]; however, the changes in the derived property values | [Unicode-12.0.0]; however, the changes in the derived property values | |||
do not result in exceptions (as defined in section 2.6 of RFC 5892 | do not result in exceptions (as defined in Section 2.6 of RFC 5892 | |||
[RFC5892]) being added to RFC 5892 [RFC5892]. | [RFC5892]) that would require an update to the "IDNA Contextual | |||
| Rules" registry (which would also be considered an update to RFC 5892 | |||
| [RFC5892]). | |||
Further, in 2015, the Internet Architecture Board (IAB) issued a | Further, in 2015, the Internet Architecture Board (IAB) issued a | |||
statement [IAB2005-1] that advised the community to avoid using any | statement [IAB2005-1] that advised the community to avoid using any | |||
of the potentially problematic code points and asked the IETF to | of the potentially problematic code points and asked the IETF to | |||
resolve the issues related to the code point ARABIC LETTER BEH WITH | resolve the issues related to the code point ARABIC LETTER BEH WITH | |||
HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0 | HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0 | |||
[Unicode-7.0.0]. In February of that year, the statement was revised | [Unicode-7.0.0]. In February of that year, the statement was revised | |||
[IAB2005-2] to focus on the latter request. More details about the | [IAB2005-2] to focus on the latter request. More details about the | |||
problem of code point sequences not normalizing as one might expect | problem of code point sequences not normalizing as one might expect | |||
appear in a draft that was part of the discussion [IDNA7]. | appear in a draft that was part of the discussion [IDNA7]. | |||
skipping to change at page 5, line 4 | skipping to change at line 179 | |||
may have similar issues. While the affected code points remain | may have similar issues. While the affected code points remain | |||
PVALID in this document, identification of the problem resulted in a | PVALID in this document, identification of the problem resulted in a | |||
clarification of the review process for new Unicode versions. That | clarification of the review process for new Unicode versions. That | |||
clarification, which reinforces the original review plan to capture | clarification, which reinforces the original review plan to capture | |||
issues like these, was published as RFC 8753 [RFC8753]. Any review | issues like these, was published as RFC 8753 [RFC8753]. Any review | |||
of Unicode versions after 12.0.0 should be made according to RFC 8753 | of Unicode versions after 12.0.0 should be made according to RFC 8753 | |||
[RFC8753]; an objective of this document is to ensure that a proper | [RFC8753]; an objective of this document is to ensure that a proper | |||
review of such versions after version 12.0.0 can be made. | review of such versions after version 12.0.0 can be made. | |||
2. Background | 2. Background | |||
2.1. IDNA2008 Documents | 2.1. IDNA2008 Documents | |||
IDNA2008 consists of the following documents. The documents in the | IDNA2008 consists of the following documents. The documents in the | |||
set have informal names. | set have informal names. | |||
o Internationalized Domain Names for Applications (IDNA): | * "Internationalized Domain Names for Applications (IDNA): | |||
Definitions and Document Framework [RFC5890], informally called | Definitions and Document Framework" [RFC5890], informally called | |||
"Defs" or "Definitions", contains definitions and other material | "Defs" or "Definitions", contains definitions and other material | |||
that are needed for understanding other documents in the set. | that are needed for understanding other documents in the set. | |||
o Internationalized Domain Names in Applications (IDNA): Protocol | * "Internationalized Domain Names in Applications (IDNA): Protocol" | |||
[RFC5891], informally called "Protocol", describes the core | [RFC5891], informally called "Protocol", describes the core | |||
IDNA2008 protocol and its operations. It needs to be interpreted | IDNA2008 protocol and its operations. It needs to be interpreted | |||
in combination with the Bidi document (described below). | in combination with the Bidi document (described below). RFC 5891 | |||
| [RFC5891] obsoletes RFC 3491 [RFC3491] and, in particular, the use | |||
| of the tables to which RFC 3491 [RFC3491] refers. | |||
o The Unicode Code Points and Internationalized Domain Names for | * "The Unicode Code Points and Internationalized Domain Names for | |||
Applications (IDNA) [RFC5892], informally called "Tables", lists | Applications (IDNA)" [RFC5892], informally called "Tables", lists | |||
the categories and rules that identify the code points allowed in | the categories and rules that identify the code points allowed in | |||
a label written in native character form (called a "U-label"), and | a label written in native character form (called a "U-label"), and | |||
is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments | is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments | |||
and additional rules unique to IDNA2008. The Unicode-based rules | and additional rules unique to IDNA2008. The Unicode-based rules | |||
in RFC 5892 are expected to be stable across Unicode updates and | in RFC 5892 are expected to be stable across Unicode updates and | |||
hence independent of Unicode versions. RFC 5892 [RFC5892] | hence independent of Unicode versions. | |||
obsoletes RFC 3491 [RFC3491], and in particular the use of the | ||||
tables to which RFC 3491 [RFC3491] refers. | ||||
o Right-to-Left Scripts for Internationalized Domain Names for | * "Right-to-Left Scripts for Internationalized Domain Names for | |||
Applications (IDNA) [RFC5893], informally called "Bidi", specifies | Applications (IDNA)" [RFC5893], informally called "Bidi", | |||
special rules for labels that contain characters that are written | specifies special rules for labels that contain characters that | |||
from right to left. | are written from right to left. | |||
o Internationalized Domain Names for Applications (IDNA): | * "Internationalized Domain Names for Applications (IDNA): | |||
Background, Explanation, and Rationale [RFC5894], informally | Background, Explanation, and Rationale" [RFC5894], informally | |||
called "Rationale", provides an overview of the protocol and | called "Rationale", provides an overview of the protocol and | |||
associated tables, and gives explanatory material and some | associated tables, and gives explanatory material and some | |||
rationale for the decisions that led to IDNA2008. It also | rationale for the decisions that led to IDNA2008. It also | |||
contains advice for DNS registry operators and others who use | contains advice for DNS registry operators and others who use | |||
Internationalized Domain Names (IDNs). | Internationalized Domain Names (IDNs). | |||
o Mapping Characters for Internationalized Domain Names in | * "Mapping Characters for Internationalized Domain Names in | |||
Applications (IDNA) 2008 [RFC5895], informally called "Mapping", | Applications (IDNA) 2008" [RFC5895], informally called "Mapping", | |||
discusses the issue of mapping characters into other characters | discusses the issue of mapping characters into other characters | |||
and provides guidance for doing so when that is appropriate. RFC | and provides guidance for doing so when that is appropriate. RFC | |||
5895 provides advice only and is not a required part of IDNA. | 5895 provides advice only and is not a required part of IDNA. | |||
2.2. Additional important IDNA2008-related documents | 2.2. Additional Important IDNA2008-Related Documents | |||
There are other documents important for the understanding and | There are other documents important for the understanding and | |||
functioning of IDNA2008, for example, the following document: | functioning of IDNA2008, for example, the following document: | |||
o The Unicode Code Points and Internationalized Domain Names for | * "The Unicode Code Points and Internationalized Domain Names for | |||
Applications (IDNA) - Unicode 6.0 [RFC6452] describes some changes | Applications (IDNA) - Unicode 6.0" [RFC6452] describes some | |||
made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in derived | changes made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in | |||
property value change for the code points U+0CF1, U+0CF2 and | derived property value changes for the code points U+0CF1, U+0CF2, | |||
U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID, | and U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID, | |||
while U+19DA changed from PVALID to DISALLOWED. The IETF | while U+19DA changed from PVALID to DISALLOWED. The IETF | |||
concluded that no update to RFC 5892 [RFC5892] was needed based on | concluded that no update to RFC 5892 [RFC5892] was needed based on | |||
the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result, | the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result, | |||
the derived property value remained aligned with the Unicode | the derived property value remained aligned with the Unicode | |||
Standard. Specifically, no exception was added. | Standard. Specifically, no exception was added. | |||
2.3. Deployment | 2.3. Deployment | |||
There are many variations on the general IDNA model in use in the | There are many variations on the general IDNA model in use in the | |||
various parts of the community. The following lists some of the | various parts of the community. The following lists some of the | |||
strategies that implementations that claim to be IDNA compliant are | strategies that implementations that claim to be IDNA compliant are | |||
known to use, but it should be noted the list is not complete: | known to use, but it should be noted the list is not complete: | |||
o IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491 | * IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491 | |||
[RFC3491]. Those specifications are dependent on case folding and | [RFC3491]. Those specifications are dependent on case folding, | |||
NFKC normalization and on tables that specify for each code point | Normalization Form KC (NFKC), and on tables that specify for each | |||
whether it is allowed to be used or not, with a distinction made | code point whether it is allowed to be used or not, with a | |||
between use for "stored strings" and "query strings". The tables | distinction made between use for "stored strings" and "query | |||
themselves are dependent on Unicode 3.2 [Unicode-3.2.0]. | strings". The tables themselves are dependent on Unicode 3.2 | |||
| [Unicode-3.2.0]. | |||
o A number of variations on IDNA2003, sometimes presented as | * A number of variations on IDNA2003, sometimes presented as | |||
"updated IDNA2003" or the like, which follow the principles of | "updated IDNA2003" or the like, which follow the principles of | |||
IDNA2003 as understood by the implementers but that use tables | IDNA2003 as understood by the implementers but that use tables | |||
that represent how the implementers believe Stringprep [RFC3454] | that represent how the implementers believe Stringprep [RFC3454] | |||
and Nameprep [RFC3491] would have evolved had the IETF not moved | and Nameprep [RFC3491] would have evolved had the IETF not moved | |||
in the direction of IDNA2008 instead. | in the direction of IDNA2008 instead. | |||
o A mix between IDNA2003 and IDNA2008 where code points assigned to | * A mix between IDNA2003 and IDNA2008 where code points assigned to | |||
Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property | Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property | |||
value calculated according to the algorithm specified in IDNA2008. | value calculated according to the algorithm specified in IDNA2008. | |||
o A mix between IDNA2003 and IDNA2008 according to the Unicode | * A mix between IDNA2003 and IDNA2008 according to the Unicode | |||
Technical Standard #46 [UTS-46]. Because that document specifies | Technical Standard #46 [UTS-46]. Because that document specifies | |||
different profiles, there are several variations that leave users | different profiles, there are several variations that leave users | |||
with no guarantee that two applications claiming conformance to | with no guarantee that two applications claiming conformance to | |||
UTS#46 will interoperate well with each other, much less with | UTS#46 will interoperate well with each other, much less with | |||
conforming IDNA2008 implementations. UTS#46 is ultimately based | conforming IDNA2008 implementations. UTS#46 is ultimately based | |||
on a normative table very much like the one used by Stringprep | on a normative table very much like the one used by Stringprep | |||
[RFC3454] but updated for each new version of Unicode. | [RFC3454] but updated for each new version of Unicode. | |||
o The (normative) IDNA2008 algorithm applied to whatever version of | * The (normative) IDNA2008 algorithm applied to whatever version of | |||
Unicode Standard exists in the operating system and/or libraries | Unicode Standard exists in the operating system and/or libraries | |||
used, independent of whatever version of tables appears in the | used, independent of whatever version of tables appears in the | |||
(non-normative) IANA database. | (non-normative) IANA database. | |||
In practice, the Unicode Consortium creates a maximum set of code | In practice, the Unicode Consortium creates a maximum set of code | |||
points by assigning code points in the Unicode Standard. The | points by assigning code points in the Unicode Standard. The | |||
IDNA2008 rules use the Unicode Standard to create a further subset of | IDNA2008 rules use the Unicode Standard to create a further subset of | |||
code points and context that are permitted in DNS labels associated | code points and context that are permitted in DNS labels associated | |||
with its PVALID, and CONTEXT (CONTEXTJ or CONTEXTO) derived property | with its PVALID and CONTEXT (CONTEXTJ or CONTEXTO) derived property | |||
values. DNS registries and other organizations that deal with IDNs | values. DNS registries and other organizations that deal with IDNs | |||
are supposed to create their own subsets from IDNA2008 for use by | are supposed to create their own subsets from IDNA2008 for use by | |||
those registries and organizations. | those registries and organizations. | |||
This progressive subsetting and narrowing of the repertoire of code | This progressive subsetting and narrowing of the repertoire of code | |||
points that can be used in labels is an implementation of the | points that can be used in labels is an implementation of the | |||
principles of being conservative when deciding what code points to | principles of being conservative when deciding what code points to | |||
include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912] | include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912] | |||
recommend to DNS registries and other organizations to be | recommend to DNS registries and other organizations to be | |||
conservative when creating their subsets, and to use the principle of | conservative when creating their subsets and to use the principle of | |||
creating subsets by inclusion. | creating subsets by inclusion. | |||
See also the Security Considerations section in this document. | See also Security Considerations (Section 7) in this document. | |||
3. Notable Changes Between Unicode 6.0.0 and 12.0.0 | 3. Notable Changes between Unicode 6.0.0 and 12.0.0 | |||
Among the changes between the Unicode versions, most code points that | Among the changes between the Unicode versions, most code points that | |||
change derived property value change from UNASSIGNED to PVALID or | change derived property value change from UNASSIGNED to PVALID or | |||
from UNASSIGNED to DISALLOWED. The interesting changes in derived | from UNASSIGNED to DISALLOWED. The interesting changes in derived | |||
property values include other changes. All changes between the major | property values include other changes. All changes between the major | |||
versions of Unicode can be found in Appendix A (6.0.0-7.0.0), | versions of Unicode can be found in Appendix A (6.0.0-7.0.0), | |||
Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D | Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D | |||
(9.0.0-10.0.0), Appendix E (10.0.0-11.0.0) and Appendix F | (9.0.0-10.0.0), Appendix E (10.0.0-11.0.0), and Appendix F | |||
(11.0.0-12.0.0). | (11.0.0-12.0.0). | |||
3.1. Changes between Unicode 6.0.0 and 7.0.0 | 3.1. Changes between Unicode 6.0.0 and 7.0.0 | |||
Change in number of characters in each category: | Change in number of characters in each category: | |||
PVALID changed from 97418 to 99867 (+2449) | * PVALID changed from 97418 to 99867 (+2449) | |||
UNASSIGNED changed from 865081 to 861509 (-3572) | * UNASSIGNED changed from 865081 to 861509 (-3572) | |||
CONTEXTJ did not change, at 2 | * CONTEXTJ did not change, at 2 | |||
CONTEXTO did not change, at 25 | * CONTEXTO did not change, at 25 | |||
DISALLOWED changed from 151586 to 152709 (+1123) | * DISALLOWED changed from 151586 to 152709 (+1123) | |||
TOTAL did not change, at 1114112 | * TOTAL did not change, at 1114112 | |||
There are no changes made to Unicode between version 6.0.0 and | There are no changes made to Unicode between version 6.0.0 and 7.0.0 | |||
7.0.0 that impact IDNA2008 calculation of the derived property | that impact IDNA2008 calculation of the derived property values. | |||
values. | ||||
The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL | The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL | |||
INHERENT AA both changed the general category from Cf (Format) to Mn | INHERENT AA both changed the General Category from Cf (Format) to Mn | |||
(Nonspacing_Mark), but that did not impact the calculation of the | (Nonspacing_Mark), but that did not impact the calculation of the | |||
derived property value which stayed at DISALLOWED. | derived property value which stayed at DISALLOWED. | |||
The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was | The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was | |||
introduced in Unicode 7.0.0. This was discussed extensively in the | introduced in Unicode 7.0.0. This was discussed extensively in the | |||
IETF, and by the IAB in their statement [IAB2005-1] requesting the | IETF and also by the IAB in their statement [IAB2005-1] requesting | |||
IETF to investigate the issue. Specifically, the IAB stated: | the IETF to investigate the issue. Specifically, the IAB stated: | |||
On the same precautionary principle, the IAB recommends that the | | On the same precautionary principle, the IAB recommends that the | |||
Internationalized Domain Names for Applications (IDNA) Parameters | | Internationalized Domain Names for Applications (IDNA) Parameters | |||
registry <https://www.iana.org/assignments/idna-tables/> not be | | registry <https://www.iana.org/assignments/idna-tables/> not be | |||
updated to Unicode 7.0.0 until the IETF has consensus on a | | updated to Unicode 7.0.0 until the IETF has consensus on a | |||
solution to this problem. | | solution to this problem. | |||
The discussion in the IETF concluded that although it is possible to | The discussion in the IETF concluded that although it is possible to | |||
create "the same" character in multiple ways, the issue with U+08A1 | create "the same" character in multiple ways, the issue with U+08A1 | |||
is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA | is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA | |||
ABOVE) can be represented with the sequence ARABIC LETTER BEH | ABOVE) can be represented with the sequence ARABIC LETTER BEH | |||
(U+0628) and ARABIC HAMZA ABOVE (U+0654). This identical to LATIN | (U+0628) and ARABIC HAMZA ABOVE (U+0654). This is identical to LATIN | |||
SMALL LETTER O WITH STROKE (U+00F8), which can be represented with | SMALL LETTER O WITH STROKE (U+00F8), which can be represented with | |||
the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING | the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING | |||
SHORT SOLIDUS OVERLAY (U+0337). | SHORT SOLIDUS OVERLAY (U+0337). | |||
Although the discussion about this specific code point resulted in | Although the discussion about this specific code point resulted in | |||
acceptance of the derived property value of PVALID, the underlying | acceptance of the derived property value of PVALID, the underlying | |||
problem with combining sequences is not understood fully. Therefore, | problem with combining sequences is not understood fully. Therefore, | |||
it cannot be claimed that this case can be extrapolated to other | it cannot be claimed that this case can be extrapolated to other | |||
situations and other code points. | situations and other code points. | |||
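The point about U+08A1 and U+00F8 can be checked with a short Python sketch. It assumes the standard-library `unicodedata` module, whose tables follow whichever Unicode version the running interpreter ships with; the helper name `nfc` and the variable names are illustrative only. The precomposed code points have no canonical decomposition, so NFC normalization does not unify them with the combining sequences, whereas a pair such as "o" followed by COMBINING DIAERESIS (U+0308) does compose.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


# Sequences discussed in the text and their single-code-point look-alikes.
beh_hamza_sequence = "\u0628\u0654"   # ARABIC LETTER BEH + ARABIC HAMZA ABOVE
beh_hamza_single = "\u08A1"           # ARABIC LETTER BEH WITH HAMZA ABOVE
o_stroke_sequence = "o\u0337"         # LATIN SMALL LETTER O + COMBINING SHORT SOLIDUS OVERLAY
o_stroke_single = "\u00F8"            # LATIN SMALL LETTER O WITH STROKE

# Contrast case: this sequence has a canonical composition (U+00F6).
o_diaeresis_sequence = "o\u0308"
o_diaeresis_single = "\u00F6"

for sequence, single in [
    (beh_hamza_sequence, beh_hamza_single),
    (o_stroke_sequence, o_stroke_single),
    (o_diaeresis_sequence, o_diaeresis_single),
]:
    unified = nfc(sequence) == nfc(single)
    print([f"U+{ord(c):04X}" for c in sequence], "unified by NFC:", unified)
```

Only the last pair is reported as unified, so two labels that may render identically can remain distinct strings after normalization.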
3.2. Changes between Unicode 7.0.0 and 10.0.0 | 3.2. Changes between Unicode 7.0.0 and 10.0.0 | |||
Change in number of characters in each category: | Change in number of characters in each category: | |||
Code points that changed derived property value: 0 | * Code points that changed derived property value: 0 | |||
PVALID changed from 99867 to 122411 (+22544) | * PVALID changed from 99867 to 122411 (+22544) | |||
UNASSIGNED changed from 861509 to 837775 (-23734) | ||||
CONTEXTJ did not change, at 2 | * UNASSIGNED changed from 861509 to 837775 (-23734) | |||
CONTEXTO did not change, at 25 | * CONTEXTJ did not change, at 2 | |||
DISALLOWED changed from 152709 to 153899 (+1190) | * CONTEXTO did not change, at 25 | |||
TOTAL did not change, at 1114112 | * DISALLOWED changed from 152709 to 153899 (+1190) | |||
There are no changes made to Unicode between version 7.0.0 and | * TOTAL did not change, at 1114112 | |||
10.0.0 that impact IDNA2008 calculation of the derived property | ||||
values. | There are no changes made to Unicode between version 7.0.0 and 10.0.0 | |||
that impact IDNA2008 calculation of the derived property values. | ||||
3.3. Changes between Unicode 10.0.0 and 11.0.0 | 3.3. Changes between Unicode 10.0.0 and 11.0.0 | |||
Change in number of characters in each category: | Change in number of characters in each category: | |||
Code points that changed derived property value: 1 | * Code points that changed derived property value: 1 | |||
PVALID changed from 122411 to 122734 (+323) | * PVALID changed from 122411 to 122734 (+323) | |||
UNASSIGNED changed from 837775 to 837091 (-684) | * UNASSIGNED changed from 837775 to 837091 (-684) | |||
CONTEXTJ did not change, at 2 | * CONTEXTJ did not change, at 2 | |||
CONTEXTO did not change, at 25 | * CONTEXTO did not change, at 25 | |||
DISALLOWED changed from 153899 to 154260 (+361) | * DISALLOWED changed from 153899 to 154260 (+361) | |||
TOTAL did not change, at 1114112 | * TOTAL did not change, at 1114112 | |||
Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF | * Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF | |||
had their General Properties changed from Lo to Ll, to reflect | had their General Category changed from Lo (Other_Letter) to Ll | |||
their status as the lowercase of new Georgian case pairs. Case | (Lowercase_Letter) to reflect their status as the lowercase of new | |||
mappings were also added. | Georgian case pairs. Case mappings were also added. | |||
SHARADA SANDHI MARK (U+111C9) was changed from Po to Mn, and from | * SHARADA SANDHI MARK (U+111C9) General Category was changed from Po | |||
bc=L to bc=NSM. | (Other_Punctuation) to Mn (Nonspacing_Mark), and the Bidi property | |||
was changed from L (Left to Right) to NSM (Nonspacing Mark). | ||||
The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and | * The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and | |||
ZANABZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to | ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to | |||
Mn. | Mn. | |||
SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi_M=N. | * SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi Mirrored | |||
to No. | ||||
These changes to the Unicode Standard have the following implications | These changes to the Unicode Standard have the following implications | |||
for these code points: | for these code points: | |||
o The newly assigned 684 characters are assigned a derived property | * The newly assigned 684 characters are assigned a derived property | |||
value as a result of applying the IDNA2008 algorithm. | value as a result of applying the IDNA2008 algorithm. | |||
o The Georgian letters in the ranges U+10D0..U+10FA and | * The Georgian letters in the ranges U+10D0..U+10FA and | |||
U+10FD..U+10FF existed before IDNA2008 was created. Applying the | U+10FD..U+10FF existed before IDNA2008 was created. Applying the | |||
IDNA2008 algorithm to the code points assigned the derived | IDNA2008 algorithm to the code points assigned the derived | |||
property value PVALID, and that value is unchanged even if the | property value PVALID, and that value is unchanged even if the | |||
underlying Unicode properties have changed. The newly encoded | underlying Unicode properties have changed. The newly encoded | |||
Mtavruli letters have general category "Lu" and are therefore | Mtavruli letters have General Category Lu (Uppercase_Letter) and | |||
DISALLOWED. | are therefore DISALLOWED. | |||
o The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0 | * The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0 | |||
[Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code | [Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code | |||
point assigned the derived property value DISALLOWED. The changes | point assigned the derived property value DISALLOWED. The changes | |||
in the underlying properties in the Unicode Standard Version | in the underlying properties in Unicode 11.0.0 [Unicode-11.0.0] | |||
11.0.0 [Unicode-11.0.0] caused the derived property value to | caused the derived property value to change to PVALID. | |||
change to PVALID. | ||||
o The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and | * The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and | |||
ZANABZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode | ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode | |||
10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the | 10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the | |||
code points assigned the derived property value PVALID, and that | code points assigned the derived property value PVALID, and that | |||
value is unchanged even if the underlying Unicode properties have | value is unchanged even if the underlying Unicode properties have | |||
changed. | changed. | |||
o SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was | * SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was | |||
created. Applying the IDNA2008 algorithm to the code point | created. Applying the IDNA2008 algorithm to the code point | |||
assigned the derived property value DISALLOWED, and that value is | assigned the derived property value DISALLOWED, and that value is | |||
unchanged even if the underlying Unicode properties have changed. | unchanged even if the underlying Unicode properties have changed. | |||
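The implications listed above follow from how the General Category feeds the derived property value. The following Python sketch is a deliberately simplified illustration, not the full algorithm of RFC 5892: it models only the Unassigned, Unstable, and LetterDigits rules and omits exceptions, contextual rules, ignorable properties and blocks, and noncharacters. The function name `simplified_derived_property` and its `stable` parameter (standing in for the NFKC/case-folding stability test) are assumptions made for the example.

```python
# General Category values that the LetterDigits rule of RFC 5892 accepts.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def simplified_derived_property(general_category: str, stable: bool) -> str:
    """Approximate a derived property value from two inputs.

    general_category: the two-letter Unicode General Category.
    stable: True if the code point is unchanged by NFKC plus case folding
            (hypothetical flag; a real implementation computes this).
    """
    if general_category == "Cn":
        return "UNASSIGNED"
    if not stable:
        return "DISALLOWED"
    if general_category in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


# Georgian letters: Lo -> Ll, both in LetterDigits, so the value stays PVALID.
print(simplified_derived_property("Lo", True), simplified_derived_property("Ll", True))

# Mtavruli letters: Lu and changed by case folding, hence DISALLOWED.
print(simplified_derived_property("Lu", False))

# U+111C9 SHARADA SANDHI MARK: Po -> Mn moves it into LetterDigits.
print(simplified_derived_property("Po", True), simplified_derived_property("Mn", True))

# U+11A07 and U+11A08: Mc -> Mn, both in LetterDigits, so no change.
print(simplified_derived_property("Mc", True), simplified_derived_property("Mn", True))
```

Under these assumptions, only the Po to Mn change crosses the LetterDigits boundary, which matches the single changed code point reported for Unicode 11.0.0.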
3.4. Changes between Unicode 11.0.0 and 12.0.0 | 3.4. Changes between Unicode 11.0.0 and 12.0.0 | |||
Change in number of characters in each category: | Change in number of characters in each category: | |||
Code points that changed derived property value: 0 | * Code points that changed derived property value: 0 | |||
PVALID changed from 122734 to 123006 (+272) | * PVALID changed from 122734 to 123006 (+272) | |||
UNASSIGNED changed from 837091 to 836537 (-554) | * UNASSIGNED changed from 837091 to 836537 (-554) | |||
CONTEXTJ did not change, at 2 | * CONTEXTJ did not change, at 2 | |||
CONTEXTO did not change, at 25 | * CONTEXTO did not change, at 25 | |||
DISALLOWED changed from 154260 to 154542 (+282) | * DISALLOWED changed from 154260 to 154542 (+282) | |||
TOTAL did not change, at 1114112 | * TOTAL did not change, at 1114112 | |||
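The per-category counts quoted in Sections 3.1 through 3.4 can be cross-checked with a few lines of Python. This is only an arithmetic sanity check over the figures given above; the names `COUNTS`, `TOTAL`, and `deltas` are illustrative. Every version should sum to 1114112 code points (U+0000..U+10FFFF), so the deltas between two versions should sum to zero.

```python
# Counts per derived property value, copied from the lists above.
COUNTS = {
    "6.0.0": {"PVALID": 97418, "UNASSIGNED": 865081, "CONTEXTJ": 2,
              "CONTEXTO": 25, "DISALLOWED": 151586},
    "7.0.0": {"PVALID": 99867, "UNASSIGNED": 861509, "CONTEXTJ": 2,
              "CONTEXTO": 25, "DISALLOWED": 152709},
    "10.0.0": {"PVALID": 122411, "UNASSIGNED": 837775, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 153899},
    "11.0.0": {"PVALID": 122734, "UNASSIGNED": 837091, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154260},
    "12.0.0": {"PVALID": 123006, "UNASSIGNED": 836537, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154542},
}

TOTAL = 0x110000  # number of code points in U+0000..U+10FFFF


def deltas(old: str, new: str) -> dict:
    """Return the change in each category count between two versions."""
    return {key: COUNTS[new][key] - COUNTS[old][key] for key in COUNTS[old]}


versions = list(COUNTS)
for version in versions:
    assert sum(COUNTS[version].values()) == TOTAL, version

for old, new in zip(versions, versions[1:]):
    change = deltas(old, new)
    assert sum(change.values()) == 0, (old, new)
    print(f"{old} -> {new}: {change}")
```

Each step shows UNASSIGNED shrinking by exactly the number of code points gained by PVALID and DISALLOWED together.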
4. U+111C9 SHARADA SANDHI MARK | 4. U+111C9 SHARADA SANDHI MARK | |||
As one can see in Section 3, an incompatible property change was made | As one can see in Section 3, an incompatible property change was made | |||
between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9. | between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9. | |||
Its derived property value thus changed from DISALLOWED to PVALID. | Its derived property value thus changed from DISALLOWED to PVALID. | |||
In situations like these, IDNA2008 allow for addition of rules to RFC | In situations like these, IDNA2008 allows for addition of rules to | |||
5892 [RFC5892] section 2.7. If the code point is accepted, it might | RFC 5892 [RFC5892], Section 2.7. If the code point is accepted, it | |||
still be rejected if validated by software based on older versions of | might still be rejected if validated by software based on versions of | |||
Unicode than 12.0.0. As the character is rarely used outside the | Unicode older than 12.0.0. As the character is rarely used outside | |||
group of Sharada specialists, and used in some records for indicating | the group of Sharada specialists but is used in some records for | |||
sandhi breaks, the conclusion is that it could either be added as an | indicating sandhi breaks, the conclusion was that it could either be | |||
exception or allowed to change its property value, as the use of the | added as an exception or allowed to change its property value. As | |||
code point is limited outside a special community. As including an | including an exception would require implementation changes to | |||
exception would require implementation changes in deployed | deployments of IDNA2008, the IETF has decided not to add a | |||
implementations of IDNA2008, the IETF has decided to not add a | BackwardCompatible rule to IDNA2008 (i.e., Section 2.7 of RFC 5892 | |||
BackwardCompatible rule to IDNA2008 (i.e. Section 2.7 of RFC 5892 | [RFC5892]) for this code point. This also ensures all sandhi marks | |||
[RFC5892] for this code point. This also ensures all sandhi marks | are treated equally. | |||
being treated in an equal way. | ||||
5. Conclusion | 5. Conclusion | |||
As described in Section 3 and Section 4, changes have been made to | As described in Sections 3 and 4, changes have been made to Unicode | |||
Unicode between version 6.0.0 and 12.0.0. Some changes to specific | between version 6.0.0 and 12.0.0. Some changes to specific | |||
characters changed their derived property value, whereas other | characters changed their derived property value, whereas other | |||
changes did not. Given the deployment considerations described in | changes did not. Given the deployment considerations described in | |||
Section 2.3 and changes in the Unicode Standard described in | Section 2.3 and changes in the Unicode Standard described in Sections | |||
Section 3 and Section 4, including implications to normalization, the | 3 and 4, including implications to normalization, the conclusion is | |||
conclusion is to not add any exception rules to IDNA2008. | not to add any exception rules to IDNA2008. | |||
This document addresses only changes to Unicode between version 6.0.0 | This document addresses only changes to Unicode between version 6.0.0 | |||
and version 12.0.0. Changes in future Unicode versions might result | and version 12.0.0. Changes in future Unicode versions might result | |||
in the conclusion that exception rules need to be added to IDNA2008 | in the conclusion that exception rules need to be added to IDNA2008 | |||
after the review process explained in RFC 8753 [RFC8753]. Separately | after the review process explained in RFC 8753 [RFC8753]. Separately | |||
from any changes in Unicode, the IETF might conclude that updates to | from any changes in Unicode, the IETF might conclude that updates to | |||
RFC 5892 [RFC5892] or other IDNA2008 documents might become | RFC 5892 [RFC5892] or other IDNA2008 documents might become | |||
necessary; such updates might include changes to the algorithm | necessary; such updates might include changes to the algorithm | |||
specified in IDNA2008 as well as additional rules, categories, or | specified in IDNA2008 as well as additional rules, categories, or | |||
other forms of tuning, like the clarifications in RFC 8753 [RFC8753]. | other forms of tuning, like the clarifications in RFC 8753 [RFC8753]. | |||
6. IANA Considerations | 6. IANA Considerations | |||
IANA is requested to update the IDNA Parameters registry [IANA-IDNA] | IANA updated the "IDNA Rules and Derived Property Values" [IANA-IDNA] | |||
of derived property values, after the expert reviewer validates that | registry after the expert reviewer validated that the derived | |||
the derived property values are calculated correctly. | property values were calculated correctly. | |||
7. Security Considerations | 7. Security Considerations | |||
This document makes recommendations regarding the use of the IDNA2008 | This document makes recommendations regarding the use of the IDNA2008 | |||
algorithm for calculation of derived property values, based on | algorithm for calculation of derived property values, based on | |||
Unicode version 12.0.0. This recommendation does not say anything | Unicode version 12.0.0. This recommendation does not say anything | |||
about what recommendations to make for future versions of the Unicode | about what recommendations to make for future versions of the Unicode | |||
Standard. | Standard. | |||
Not following these recommendations can lead to various security | Not following these recommendations can lead to various security | |||
issues. Specifically, allowing confusable characters may lead to | issues. Specifically, allowing confusable characters may lead to | |||
various phishing attacks, as described in the Security Considerations | various phishing attacks, as described in the Security Considerations | |||
sections in the documents listed in Section 2.1. | sections in the documents listed in Section 2.1. | |||
8. Acknowledgements | 8. References | |||
Thanks to Harald Alvestrand, Marc Blanchet, Martin Duerst, Asmus | ||||
Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter | ||||
Saint-Andre, Michel Suignard, Andrew Sullivan and Suzanne Woolf for | ||||
input to this document. | ||||
9. References | ||||
9.1. Normative References | 8.1. Normative References | |||
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | |||
Profile for Internationalized Domain Names (IDN)", | Profile for Internationalized Domain Names (IDN)", | |||
RFC 3491, DOI 10.17487/RFC3491, March 2003, | RFC 3491, DOI 10.17487/RFC3491, March 2003, | |||
<https://www.rfc-editor.org/info/rfc3491>. | <https://www.rfc-editor.org/info/rfc3491>. | |||
[RFC5890] Klensin, J., "Internationalized Domain Names for | [RFC5890] Klensin, J., "Internationalized Domain Names for | |||
Applications (IDNA): Definitions and Document Framework", | Applications (IDNA): Definitions and Document Framework", | |||
RFC 5890, DOI 10.17487/RFC5890, August 2010, | RFC 5890, DOI 10.17487/RFC5890, August 2010, | |||
<https://www.rfc-editor.org/info/rfc5890>. | <https://www.rfc-editor.org/info/rfc5890>. | |||
skipping to change at page 13, line 10 ¶ | skipping to change at line 557 ¶ | |||
[RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts | [RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts | |||
for Internationalized Domain Names for Applications | for Internationalized Domain Names for Applications | |||
(IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010, | (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010, | |||
<https://www.rfc-editor.org/info/rfc5893>. | <https://www.rfc-editor.org/info/rfc5893>. | |||
[RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code | [RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code | |||
Points and Internationalized Domain Names for Applications | Points and Internationalized Domain Names for Applications | |||
(IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452, | (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452, | |||
November 2011, <https://www.rfc-editor.org/info/rfc6452>. | November 2011, <https://www.rfc-editor.org/info/rfc6452>. | |||
9.2. Non-normative references | 8.2. Informative References | |||
[IAB2005-1] | [IAB2005-1] | |||
Internet Architecture Board, "IAB Statement on Identifiers | Internet Architecture Board, "IAB Statement on Identifiers | |||
and Unicode 7.0.0", IAB Statement on Identifiers and | and Unicode 7.0.0", 27 January 2015, | |||
Unicode 7.0.0 | ||||
<https://www.iab.org/documents/correspondence-reports- | <https://www.iab.org/documents/correspondence-reports- | |||
documents/2015-2/iab-statement-on-identifiers-and-unicode- | documents/2015-2/iab-statement-on-identifiers-and-unicode- | |||
7-0-0/archive/>, January 2015. | 7-0-0/archive/>. | |||
[IAB2005-2] | [IAB2005-2] | |||
Internet Architecture Board, "IAB Statement on Identifiers | Internet Architecture Board, "IAB Statement on Identifiers | |||
and Unicode 7.0.0", IAB Statement on Identifiers and | and Unicode 7.0.0", 11 February 2015, | |||
Unicode 7.0.0 | ||||
<https://www.iab.org/documents/correspondence-reports- | <https://www.iab.org/documents/correspondence-reports- | |||
documents/2015-2/iab-statement-on-identifiers-and-unicode- | documents/2015-2/iab-statement-on-identifiers-and-unicode- | |||
7-0-0/>, February 2015. | 7-0-0/>. | |||
[IANA-IDNA] | [IANA-IDNA] | |||
IANA, "IDNA Rules and Derived Property Values", IDNA Rules | IANA, "IDNA Rules and Derived Property Values", February | |||
and Derived Property Values | 2022, | |||
<https://www.iana.org/assignments/idna-tables-6.0.0/idna- | <https://www.iana.org/assignments/idna-tables-12.0.0/>. | |||
tables-6.0.0.xhtml>, April 2020. | ||||
[IDNA7] Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0 | [IDNA7] Klensin, J. C. and P. Faltstrom, "IDNA Update for Unicode | |||
and Later Versions", draft-klensin-idna-5892upd-unicode70 | 7.0 and Later Versions", Work in Progress, Internet-Draft, | |||
<https://datatracker.ietf.org/doc/draft-klensin-idna- | draft-klensin-idna-5892upd-unicode70-05, 8 October 2017, | |||
5892upd-unicode70/>, October 2017. | <https://datatracker.ietf.org/doc/html/draft-klensin-idna- | |||
5892upd-unicode70-05>. | ||||
[RFC3454] Hoffman, P. and M. Blanchet, "Preparation of | [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of | |||
Internationalized Strings ("stringprep")", RFC 3454, | Internationalized Strings ("stringprep")", RFC 3454, | |||
DOI 10.17487/RFC3454, December 2002, | DOI 10.17487/RFC3454, December 2002, | |||
<https://www.rfc-editor.org/info/rfc3454>. | <https://www.rfc-editor.org/info/rfc3454>. | |||
[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, | [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, | |||
"Internationalizing Domain Names in Applications (IDNA)", | "Internationalizing Domain Names in Applications (IDNA)", | |||
RFC 3490, DOI 10.17487/RFC3490, March 2003, | RFC 3490, DOI 10.17487/RFC3490, March 2003, | |||
<https://www.rfc-editor.org/info/rfc3490>. | <https://www.rfc-editor.org/info/rfc3490>. | |||
skipping to change at page 14, line 15 ¶ | skipping to change at line 609 ¶ | |||
[RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for | [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for | |||
Internationalized Domain Names in Applications (IDNA) | Internationalized Domain Names in Applications (IDNA) | |||
2008", RFC 5895, DOI 10.17487/RFC5895, September 2010, | 2008", RFC 5895, DOI 10.17487/RFC5895, September 2010, | |||
<https://www.rfc-editor.org/info/rfc5895>. | <https://www.rfc-editor.org/info/rfc5895>. | |||
[RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman, | [RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman, | |||
"Principles for Unicode Code Point Inclusion in Labels in | "Principles for Unicode Code Point Inclusion in Labels in | |||
the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013, | the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013, | |||
<https://www.rfc-editor.org/info/rfc6912>. | <https://www.rfc-editor.org/info/rfc6912>. | |||
[RFC8753] Klensin, J. and P. Faeltstroem, "Internationalized Domain | [RFC8753] Klensin, J. and P. Fältström, "Internationalized Domain | |||
Names for Applications (IDNA) Review for New Unicode | Names for Applications (IDNA) Review for New Unicode | |||
Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020, | Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020, | |||
<https://www.rfc-editor.org/info/rfc8753>. | <https://www.rfc-editor.org/info/rfc8753>. | |||
[SAC-084] The Security and Stability Advisory Committee, "SAC084", | [SAC-084] The Security and Stability Advisory Committee, "SAC084", | |||
SSAC Comments on Guidelines for the Extended Process | SSAC Comments on Guidelines for the Extended Process | |||
Similarity Review Panel for the IDN ccTLD Fast Track | Similarity Review Panel for the IDN ccTLD Fast Track | |||
Process <https://www.icann.org/en/system/files/files/sac- | Process, August 2016, | |||
084-en.pdf>, August 2016. | <https://www.icann.org/en/system/files/files/sac- | |||
084-en.pdf>. | ||||
[Unicode-3.2.0] | [Unicode-3.2.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
3.2.0", The Unicode Standard, Version 3.2.0 ISBN | 3.2.0", Mountain View: The Unicode Consortium, | |||
0-201-61633-5, March 2002. | ISBN 0-201-61633-5, March 2002, | |||
<https://www.unicode.org/versions/Unicode3.2.0/>. | ||||
[Unicode-5.2.0] | [Unicode-5.2.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
5.2.0", The Unicode Standard, Version 5.2.0 ISBN | 5.2.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-00-9, October 2009. | ISBN 978-1-936213-00-9, October 2009, | |||
<https://www.unicode.org/versions/Unicode5.2.0/>. | ||||
[Unicode-6.0.0] | [Unicode-6.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
6.0.0", The Unicode Standard, Version 6.0.0 ISBN | 6.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-01-6, October 2011. | ISBN 978-1-936213-01-6, October 2011, | |||
<https://www.unicode.org/versions/Unicode6.0.0/>. | ||||
[Unicode-7.0.0] | [Unicode-7.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
7.0.0", The Unicode Standard, Version 7.0.0 ISBN | 7.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-09-2, June 2014. | ISBN 978-1-936213-09-2, June 2014, | |||
<https://www.unicode.org/versions/Unicode7.0.0/>. | ||||
[Unicode-8.0.0] | [Unicode-8.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
8.0.0", The Unicode Standard, Version 8.0.0 ISBN | 8.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-10-8, June 2015. | ISBN 978-1-936213-10-8, June 2015, | |||
<https://www.unicode.org/versions/Unicode8.0.0/>. | ||||
[Unicode-10.0.0] | [Unicode-10.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
10.0.0", The Unicode Standard, Version 10.0.0 ISBN | 10.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-16-0, June 2017. | ISBN 978-1-936213-16-0, June 2017, | |||
<https://www.unicode.org/versions/Unicode10.0.0/>. | ||||
[Unicode-11.0.0] | [Unicode-11.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
11.0.0", The Unicode Standard, Version 11.0.0 ISBN | 11.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-19-1, June 2018. | ISBN 978-1-936213-19-1, June 2018, | |||
<https://www.unicode.org/versions/Unicode11.0.0/>. | ||||
[Unicode-12.0.0] | [Unicode-12.0.0] | |||
The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
12.0.0", The Unicode Standard, Version 12.0.0 ISBN | 12.0.0", Mountain View: The Unicode Consortium, | |||
978-1-936213-22-1, March 2019. | ISBN 978-1-936213-22-1, March 2019, | |||
<https://www.unicode.org/versions/Unicode12.0.0/>. | ||||
[UTS-46] The Unicode Consortium, "Unicode Technical Standard #46, | [UTS-46] The Unicode Consortium, "Unicode Technical Standard #46, | |||
Version 12.0.0", UNICODE IDNA COMPATIBILITY | Version 12.0.0", UNICODE IDNA COMPATIBILITY PROCESSING, | |||
PROCESSING <https://www.unicode.org/reports/tr46/>, March | March 2019, | |||
2019. | <https://www.unicode.org/reports/tr46/tr46-23.html>. | |||
Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 | Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0 | |||
Changes from derived property value UNASSIGNED to either PVALID or | Changes from derived property value UNASSIGNED to either PVALID or | |||
DISALLOWED. | DISALLOWED. | |||
037F ; DISALLOWED # GREEK CAPITAL LETTER YOT | 037F ; DISALLOWED # GREEK CAPITAL LETTER YOT | |||
0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK | 0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK | |||
0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK | 0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK | |||
052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE | 052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE | |||
skipping to change at page 29, line 15 ¶ | skipping to change at line 1306 ¶ | |||
1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE | 1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE | |||
1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG | 1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG | |||
1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE | 1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE | |||
1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON | 1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON | |||
1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP | 1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP | |||
1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS | 1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS | |||
1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE | 1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE | |||
1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE | 1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE | |||
1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO | 1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO | |||
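The appendix entries use a simple line format: a code point or a `first..last` range in hexadecimal, a semicolon, the derived property value, and a `#` comment holding the character name or names. A minimal Python sketch of a parser for one such entry follows. It assumes the input is a single table line without the side-by-side column separators, and the names `Entry` and `parse_entry` are hypothetical.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    first: int   # first code point of the range
    last: int    # last code point (equal to first for a single code point)
    value: str   # derived property value, e.g. "PVALID"
    name: str    # character name, or "FIRST..LAST" names for a range


_ENTRY = re.compile(
    r"^\s*([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?"  # code point or range
    r"\s*;\s*(\w+)"                                        # derived property value
    r"\s*#\s*(.*?)\s*$"                                    # name comment
)


def parse_entry(line: str) -> Entry:
    """Parse one appendix table line into an Entry."""
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a table entry: {line!r}")
    first = int(match.group(1), 16)
    last = int(match.group(2), 16) if match.group(2) else first
    return Entry(first, last, match.group(3), match.group(4))


single = parse_entry("0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK")
ranged = parse_entry("1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO")
print(single)
print(ranged, "covers", ranged.last - ranged.first + 1, "code points")
```

Summing `last - first + 1` over all entries with a given value is one way to reproduce the category deltas listed in Section 3.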
Acknowledgments | ||||
Thanks to Harald Alvestrand, Marc Blanchet, Martin Dürst, Asmus | ||||
Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter | ||||
Saint-Andre, Michel Suignard, Andrew Sullivan, and Suzanne Woolf for | ||||
input to this document. | ||||
Author's Address | Author's Address | |||
Patrik Faltstrom | Patrik Fältström | |||
Netnod | Netnod | |||
Email: paf@netnod.se | Email: paf@netnod.se | |||