<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" number="8753" docName="draft-klensin-idna-unicode-review-05" ipr="trust200902" category="std" consensus="true" updates="5892" obsoletes="" submissionType="IETF" xml:lang="en" tocInclude="true" tocDepth="3" symRefs="true" sortRefs="false" version="3">
<front>
<title abbrev="IDNA Unicode Reviews">Internationalized Domain Names for Applications (IDNA) Review for New Unicode Versions</title>
<seriesInfo name="RFC" value="8753"/>
<author fullname="John C Klensin" initials="J." surname="Klensin">
<organization/>
<address>
<postal>
<street>1770 Massachusetts Ave, Ste 322</street>
<city>Cambridge</city>
<region>MA</region>
<code>02140</code>
<country>United States of America</country>
</postal>
<phone>+1 617 245 1457</phone>
<email>john-ietf@jck.com</email>
</address>
</author>
<author fullname="Patrik Fältström" initials="P." surname="Fältström">
<organization>Netnod</organization>
<address>
<postal>
<street>Greta Garbos Väg 13</street>
<city>Solna</city>
<code>169 40</code>
<country>Sweden</country>
</postal>
<phone>+46 70 6059051</phone>
<email>paf@netnod.se</email>
</address>
</author>
<date month="April" year="2020"/>
<area>ART</area>
<keyword>IDNA2008</keyword>
<keyword>IDN</keyword>
<keyword>Unicode Algorithmic Review</keyword>
<keyword>Unicode Code Point Review</keyword>
<keyword>IDNA Designated Expert</keyword>
<abstract>
<t>The standards for Internationalized Domain Names in Applications (IDNA) require a review of each new version of Unicode to determine
whether incompatibilities with prior versions or other issues exist and, where appropriate, to allow the IETF to decide on the trade-offs between compatibility with prior IDNA versions and compatibility with Unicode going forward. That requirement, and its relationship to tables maintained by IANA, has caused significant confusion in the past. This document makes adjustments to the review procedure based on experience and updates IDNA, specifically RFC 5892, to reflect those changes and to clarify the various relationships involved. It also makes other minor adjustments to align that document with experience.</t>
</abstract>
</front>
<middle>
<section numbered="true" toc="default">
<name>Introduction</name>
<t>The standards for Internationalized Domain Names in Applications (IDNA) require a review of each new version of Unicode to determine whether incompatibilities with prior versions or other issues exist and, where appropriate, to allow the IETF to decide on the trade-offs between compatibility with prior IDNA versions and compatibility with Unicode <xref target="Unicode" format="default"/> going forward. That requirement, and its relationship to tables maintained by IANA, has caused significant confusion in the past (see <xref target="ReviewModel" format="default"/> and <xref target="IDNA-Assumptions" format="default"/> for additional discussion of the question of appropriate decisions and the history of these reviews). This document makes adjustments to the review procedure based on nearly a decade of experience and updates IDNA, specifically the document that specifies the relationship between Unicode code points and IDNA derived properties <xref target="RFC5892" format="default"/>, to reflect those changes and to clarify the various relationships involved.
</t>
<t>This specification does not change the requirement that registries at all levels of the DNS tree take responsibility for the labels they insert in the DNS, a level of responsibility that requires allowing only a subset of the code points and strings allowed by the IDNA protocol itself. That requirement is discussed in more detail in a companion document <xref target="I-D.klensin-idna-rfc5891bis" format="default"/>.</t>
<t>Terminology note: In this document, "IDNA" refers to the current version as described in <xref target="RFC5890" format="default">RFC 5890</xref> and subsequent documents and sometimes known as "IDNA2008". Distinctions between it and the earlier version are explicit only where they are necessary for understanding the relationships involved, e.g., in <xref target="History" format="default"/>.</t>
</section>
<section anchor="History" numbered="true" toc="default">
<name>Brief History of IDNA Versions, the Review Requirement, and RFC 5892</name>
<t>The original, now-obsolete, version of IDNA, commonly known as "IDNA2003" <xref target="RFC3490" format="default"/> <xref target="RFC3491" format="default"/>, was defined in terms of a profile of a collection of IETF-specific tables <xref target="RFC3454" format="default"/> that specified the usability of each Unicode code point with IDNA. Because the tables themselves were normative, they were intrinsically tied to a particular version of Unicode. As Unicode evolved, the IDNA2003 standard would have required the creation of a new profile for each new version of Unicode, or the tables would have fallen further and further behind.</t>
<t>When IDNA2003 was superseded by the current version, known as IDNA2008 <xref target="RFC5890" format="default"/>, a different strategy, one that was property-based rather than table-based, was adopted for a number of reasons, of which the reliance on normative tables was not dominant <xref target="RFC4690" format="default"/>. In the IDNA2008 model, the use of normative tables was replaced by a set of procedures and rules that operated on Unicode properties <xref target="Unicode-properties" format="default"/> and a few internal definitions to determine the category and status, and hence an IDNA-specific "derived property", for any given code point. Those rules are, in principle, independent of Unicode versions. They can be applied to any version of Unicode, at least from approximately version 5.0 forward, to yield an appropriate set of derived properties. However, the working group that defined IDNA2008 recognized that not all of the Unicode properties were completely stable and that, because the criteria for new code points and property assignment used by the Unicode Consortium might not precisely align with the needs of IDNA, there were possibilities of incompatible changes to the derived property values. More specifically, there could be changes that would make previously disallowed labels valid, previously valid labels disallowed, or that would be disruptive to IDNA's defining rule structure. Consequently, IDNA2008 provided for an expert review of each new version of Unicode with the possibility of providing exceptions to the rules for particular new code points, code points whose properties had changed, and newly discovered issues with the IDNA2008 collection of rules. When problems were identified, the reviewer was expected to notify the IESG.
The assumption was that the IETF would review the situation and modify IDNA2008 as needed, most likely by adding exceptions to preserve backward compatibility (see <xref target="AlgorReview" format="default"/>).</t>
<t>For the convenience of the community, IDNA2008 also provided that IANA would maintain copies of calculated tables resulting from each review, showing the derived properties for each code point. Those tables were expected to be helpful, especially to those without the facilities to easily compute derived properties themselves. Experience with the community and those tables has shown that they have been confused with the normative tables of IDNA2003: the IDNA2008 tables published by IANA have never been normative, and statements about IDNA2008 being out of date with regard to some Unicode version because the IANA tables have not been updated are incorrect or meaningless.</t>
</section>
<section anchor="ReviewModel" numbered="true" toc="default">
<name>The Review Model</name>
<t>While the text has sometimes been interpreted differently, IDNA2008 actually calls for two types of review when a new Unicode version is introduced. One is an algorithmic comparison of the set of derived properties calculated from the new version of Unicode to the derived properties calculated from the previous one to determine whether incompatible changes have occurred. The other is a review of newly assigned code points to determine whether any of them require special treatment (e.g., assignment of what IDNA2008 calls contextual rules) and whether any of them violate any of the assumptions underlying the IDNA2008 derived property calculations. Any of the cases of either review might require either per-code point exceptions or other adjustments to the rules for deriving properties that are part of RFC 5892.
The subsections below provide a revised specification for the review procedure.</t>
<t>Unless the IESG or the designated expert team concludes that there are special problems or unusual circumstances, these reviews will be performed only for major Unicode versions (those numbered NN.0, e.g., 12.0) and not for minor updates (e.g., 12.1).</t>
<t>As can be seen in the detailed descriptions in the following subsections, proper review will require a team of experts that has both broad and specific skills in reviewing Unicode characters and their properties in relation to both the written standards and operational needs. The IESG will need to appoint experts who can draw on the broader community to obtain the necessary skills for particular situations. See the IANA Considerations (<xref target="IANA" format="default"/>) for details.</t>
<section anchor="AlgorReview" numbered="true" toc="default">
<name>Review Model Part I: Algorithmic Comparison</name>
<t>Section <xref target="RFC5892" section="5.1" sectionFormat="bare" format="default"/> of RFC 5892 is the description of the process for creating the initial IANA tables. It is noteworthy that, while it can be read as strongly implying new reviews and new tables for versions of Unicode after 5.2, it does not explicitly specify those reviews or, e.g., the timetable for completing them. It also indicates that incompatibilities are to be "flagged for the IESG" but does not specify exactly what the IESG is to do about them and when. For reasons related to the other type of review and discussed below, only one review was completed, documented <xref target="RFC6452" format="default"/>, and a set of corresponding new tables installed.
That review, which was for Unicode 6.0, found only three incompatibilities; the consensus was to ignore them (not create exceptions in IDNA2008) and to remain consistent with computations based on current (Unicode 6.0) properties rather than preserving backward compatibility within IDNA. The 2018 review (for Unicode 11.0 and versions in between it and 6.0) <xref target="I-D.faltstrom-unicode12" format="default"/> also concluded that Unicode compatibility, rather than IDNA backward compatibility, should be maintained. That decision was partially driven by the long period between reviews and the concern that table calculations by others in the interim could result in unexpected incompatibilities if derived property definitions were then changed. See <xref target="IDNA-Assumptions" format="default"/> for further discussion of these preferences.</t>
</section>
<section anchor="CodePointReview" numbered="true" toc="default">
<name>Review Model Part II: New Code Point Analysis</name>
<t>The second type of review, which is not clearly explained in RFC 5892, is intended to identify cases in which newly added or recently discovered problematic code points violate the design assumptions of IDNA, to identify defects in those assumptions, or to identify inconsistencies (from an IDNA perspective) with Unicode commitments about assignment, properties, and stability of newly added code points. One example of this type of review was the discovery of new code points after Unicode 7.0 that were potentially visually equivalent, in the same script, to previously available code point sequences <xref target="IAB-Unicode7-2015" format="default"/> <xref target="I-D.klensin-idna-5892upd-unicode70" format="default"/>.</t>
<t>Because multiple perspectives on Unicode and writing systems are required, this review will not be successful unless it is done by a team. Finding one all-knowing expert is improbable, and a single expert is unlikely to produce an adequate analysis. Rather than any single expert being the sole source of analysis, the designated expert (DE) team needs to understand that there will always be gaps in their knowledge, to know what they don't know, and to work to find the expertise that each review requires. It is also important that the DE team maintains close contact with the Area Directors (ADs) and that the ADs remain aware of the team's changing needs, examining and adjusting the team's membership over time, with periodic reexamination at least annually. It should also be recognized that, if this review identifies a problem, that problem is likely to be complex and/or involve multiple trade-offs. Actions to deal with it are likely to be disruptive (although perhaps not to large communities of users), or to leave security risks (opportunities for attacks and inadvertent confusion as expected matches do not occur), or to cause excessive reliance on registries understanding and taking responsibility for what they are registering <xref target="RFC5894" format="default"/> <xref target="I-D.klensin-idna-rfc5891bis" format="default"/>.
The latter, while a requirement of IDNA, has often not worked out well in the past.</t>
<t>Because resolution of problems identified by this part of the review may take some time even if that resolution is to add additional contextual rules or to disallow one or more code points, there will be cases in which it will be appropriate to publish the results of the algorithmic review and to provide IANA with corresponding tables, with warnings about code points whose status is uncertain until there is IETF consensus about how to proceed. The affected code points should be considered unsafe and identified as "under review" in the IANA tables until final derived properties are assigned.</t>
</section>
</section>
<section anchor="IDNA-Assumptions" numbered="true" toc="default">
<name>IDNA Assumptions and Current Practice</name>
<t>At the time the IDNA2008 documents were written, the assumption was that, if new versions of Unicode introduced incompatible changes, the Standard would be updated to preserve backward compatibility for users of IDNA. For most purposes, this would be done by adding to the table of exceptions associated with Rule G <xref target="RFC5892a" format="default"/>.</t>
<t>This has not been the practice in the reviews completed subsequent to Unicode 5.2, as discussed in <xref target="ReviewModel" format="default"/>. Incompatibilities were identified in Unicode 6.0 <xref target="RFC6452" format="default"/> and in the cumulative review leading to tables for Unicode 11.0 <xref target="I-D.faltstrom-unicode11" format="default"/>.
In all of those cases, the decision was made to maintain compatibility with Unicode properties rather than with prior versions of IDNA.</t>
<t>If an algorithmic review detects changes in Unicode after version 12.0 that would break compatibility with derived properties associated with prior versions of Unicode or changes that would preserve compatibility within IDNA at the cost of departing from current Unicode specifications, those changes must be captured in documents expected to be published as Standards Track RFCs so that the IETF can review those changes and maintain a historical record.</t>
<t>The community has now made decisions and updated tables for Unicode 6.0 <xref target="RFC6452" format="default"/>, done catch-up work between it and Unicode 11.0 <xref target="I-D.faltstrom-unicode11" format="default"/>, and completed the review and tables for Unicode 12.0 <xref target="I-D.faltstrom-unicode12" format="default"/>. The decisions made in those cases were driven by preserving consistency with Unicode and Unicode property changes for reasons most clearly explained by the IAB <xref target="IAB-Unicode-2018" format="default"/>.
These actions were not only at variance with the language in RFC 5892 but were also inconsistent with commitments to the registry and user communities to ensure that IDN labels that were once valid under IDNA2008 would remain valid, and previously invalid labels would remain invalid, except for those labels that were invalid because they contained unassigned code points.</t>
<t>This document restores and clarifies that original language and intent: absent extremely strong evidence on a per-code point basis that preserving the validity status of possible existing (or prohibited) labels would cause significant harm, Unicode changes that would affect IDNA derived properties are to be reflected in IDNA exceptions that preserve the status of those labels. There is one partial exception to this principle. If the new code point analysis (see <xref target="CodePointReview" format="default"/>) concludes that some code points or collections of code points should be further analyzed, those code points, and labels including them, should be considered unsafe and used only with extreme caution because the conclusions of the analysis may change their derived property values and status.</t>
</section>
<section anchor="IANATables" numbered="true" toc="default">
<name>Derived Tables Published by IANA</name>
<t>As discussed above, RFC 5892 specified that derived property tables be provided via an IANA registry. Perhaps because most IANA registries are considered normative and authoritative, that registry has been the source of considerable confusion, including the incorrect assumption that the absence of published tables for versions of Unicode later than 6.0 meant that IDNA could not be used with later versions.
That position was raised in multiple ways, not all of them consistent, especially in the ICANN context <xref target="ICANN-LGR-SLA" format="default"/>.</t>
<t>If the changes specified in this document are not successful in significantly mitigating the confusion about the status of the tables published by IANA, serious consideration should be given to eliminating those tables entirely.</t>
</section>
<section anchor="ApplyErratum" numbered="true" toc="default">
<name>Editorial Clarification to RFC 5892</name>
<t>This section updates RFC 5892 to provide fixes for known applicable errata and omissions. In particular, verified RFC Editor Erratum 3312 <xref target="Err3312" format="default"/> provides a clarification to Appendix <xref target="RFC5892" section="A" sectionFormat="bare" format="default"/> and <xref target="RFC5892" section="A.1" sectionFormat="bare" format="default"/> in RFC 5892. That clarification is incorporated below.</t>
<ol spacing="normal" type="1">
<li>
<t>In Appendix <xref target="RFC5892" section="A" sectionFormat="bare" format="default"/>, add a new paragraph after the paragraph that begins "The code point...". The new paragraph should read:</t>
<blockquote>For the rule to be evaluated to True for the label, it <bcp14>MUST</bcp14> be evaluated separately for every occurrence of the code point in the label; each of those evaluations must result in True.</blockquote>
</li>
<li>
<t>In Appendix <xref target="RFC5892" section="A.1" sectionFormat="bare" format="default"/>, replace the "Rule Set" by</t>
<sourcecode type="pseudocode"><![CDATA[
Rule Set:
  False;
  If Canonical_Combining_Class(Before(cp)) .eq.  Virama Then True;
  If cp .eq. \u200C And
         RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
    (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
]]></sourcecode>
</li>
</ol>
</section>
<section anchor="Acknowledgements" numbered="true" toc="default">
<name>Acknowledgements</name>
<t>This document was inspired by extensive discussions within the I18N Directorate of the IETF Applications and Real Time (ART) area in the first quarter of 2019 about sorting out the reviews for Unicode 11.0 and 12.0. Careful reviews by Joel Halpern and text suggestions from Barry Leiba resulted in some clarifications.</t>
<t>Thanks to Christopher Wood for catching some editorial errors that persisted until rather late in the draft's life cycle and to Benjamin Kaduk for catching and raising a number of questions during Last Call. Some of the issues they raised have been reflected in the document; others did not appear to be desirable modifications after further discussion, but the questions were definitely worth raising and discussing.</t>
</section>
<section anchor="IANA" numbered="true" toc="default">
<name>IANA Considerations</name>
<t>For the algorithmic review described in <xref target="AlgorReview" format="default"/>, the IESG is to appoint a designated expert <xref target="RFC8126" format="default"/> with appropriate expertise to conduct the review and to supply derived property tables to IANA. As provided in <xref target="RFC8126" section="5.2" sectionFormat="of">the Guidelines for Writing IANA Considerations</xref>, the designated expert is expected to consult additional sources of expertise as needed.
For the code point review, the expertise will be supplied by an IESG-designated expert team as discussed in <xref target="CodePointReview" format="default"/> and <xref target="ExpertRationale" format="default"/>. In both cases, the experts should draw on the expertise of other members of the community as needed. In particular, and especially if there is no overlap of the people holding the various roles, coordination with the IAB-appointed liaison to the Unicode Consortium will be essential to mitigate possible errors due to confusion.</t>
<t>As discussed in <xref target="IANATables" format="default"/>, IANA has modified the IDNA tables collection <xref target="IANA-IDNA-Tables" format="default"/> by identifying them clearly as non-normative, so that a "current" or "correct" version of those tables is not implied, and by pointing to this document for an explanation. IANA has published tables supplied by the IETF for all Unicode versions through 11.0, retaining all older versions and making them available. Newer tables will be constructed as specified in this document and then made available by IANA. IANA has changed the title of that registry from "IDNA Parameters", which is misleading, to "IDNA Rules and Derived Property Values".
</t>
<t>The "Note" in that registry says:</t>
<blockquote>
<t>IDNA does not require that applications and libraries, either for registration/storage or lookup, support any particular version of Unicode. Instead, they are required to use derived property values based on calculations associated with whatever version of Unicode they are using elsewhere in the application or library. For the convenience of application and library developers and others, the IETF has supplied, and IANA maintains, derived property tables for several versions of Unicode as listed below. It should be stressed that these are not normative in that, in principle, an application can do its own calculations and these tables can change as IETF understanding evolves. By contrast, the list of code points requiring contextual rules and the associated rules are normative and should be treated as updates to the list in RFC 5892.</t>
</blockquote>
<t>As long as the intent is preserved, the text of that note may be changed in the future at IANA's discretion.</t>
<t>IANA's attention is called to the introduction, in <xref target="CodePointReview" format="default"/>, of a temporary "under review" category to the PVALID, DISALLOWED, etc., entries in the tables.</t>
</section>
<section anchor="Security" numbered="true" toc="default">
<name>Security Considerations</name>
<t>Applying the procedures described in this document and understanding the clarifications it provides should reduce confusion about IDNA requirements. Because past confusion has provided opportunities for bad behavior, the effect of these changes should improve Internet security to at least some small extent.
</t> <t> Because of the preference to keep the derived property value stable (as specified in RFC 5892 and discussed in <xref target="IDNA-Assumptions" format="default"/>), the algorithm used to calculate those derived properties does change as explained in <xref target="ReviewModel" format="default"/>. If these changes are not taken into account, the derived property value will change, and the implications might have negative consequences, in some cases with security implications. For example, changes in the calculated derived property value for a code point from either DISALLOWED to PVALID or from PVALID to DISALLOWED can cause changes in label interpretation that would be visible and confusing to end users and might enable attacks. </t> </section> </middle><!-- *****BACK MATTER ***** --><back> <displayreference target="I-D.klensin-idna-5892upd-unicode70" to="IDNA-Unicode7"/> <displayreference target="I-D.faltstrom-unicode12" to="IDNA-Unicode12"/> <displayreference target="I-D.faltstrom-unicode11" to="IDNA-Unicode11"/> <displayreference target="I-D.klensin-idna-rfc5891bis" to="RegRestr"/> <references> <name>References</name> 
<references> <name>Normative References</name> <reference anchor="IANA-IDNA-Tables" target="https://www.iana.org/assignments/idna-tables"> <front> <title>IDNA Rules and Derived Property Values</title> <author> <organization>IANA</organization> </author> </front> </reference> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5892.xml"/> <reference anchor="RFC5892a" target="https://www.rfc-editor.org/rfc/rfc5892.txt"> <front> <title>The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)</title> <author initials="P." surname="Faltstrom" role="editor"> <organization/> <address/> </author> <date year="2010" month="August"/> </front> <seriesInfo name="RFC" value="5892"/> <refcontent>Section 2.7</refcontent> </reference> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8126.xml"/> <reference anchor="Unicode" target="http://www.unicode.org/versions/latest/"> <front> <title>The Unicode Standard (Current Version)</title> <author> <organization>The Unicode Consortium</organization> </author> </front> <annotation>The link given will always access the current version of the Unicode Standard, independent of its version number or date.</annotation> </reference> <reference anchor="Unicode-properties" target="https://www.unicode.org/versions/Unicode11.0.0/"> <front> <title>The Unicode Standard Version 11.0</title> <author> <organization>The Unicode Consortium</organization> </author> <date year="2018"/> </front> <refcontent>Section 3.5</refcontent> </reference> </references> <references> <name>Informative References</name> <reference anchor="Err3312" quote-title="false" target="https://www.rfc-editor.org/errata/eid3312"> <front> <title>Erratum ID 3312</title> <author> <organization>RFC Errata</organization> </author> </front> <refcontent>RFC 5892</refcontent> </reference> <reference anchor="IAB-Unicode-2018" target="https://www.iab.org/documents/correspondence-reports-documents/2018-2/iab-statement-on-identifiers-and-unicode/"> <front> <title>IAB Statement on Identifiers and Unicode</title> <author> <organization>Internet Architecture Board (IAB)</organization> </author> <date year="2018" month="March" day="15"/> </front> </reference> <reference anchor="IAB-Unicode7-2015" target="https://www.iab.org/documents/correspondence-reports-documents/2015-2/iab-statement-on-identifiers-and-unicode-7-0-0/"> <front> <title>IAB Statement on Identifiers and Unicode 7.0.0</title> <author> <organization>Internet Architecture Board (IAB)</organization> </author> <date year="2015" month="February" day="11"/> </front> </reference> <reference anchor="ICANN-LGR-SLA" target="https://www.icann.org/public-comments/proposed-iana-sla-lgr-idn-tables-2019-06-10-en"> <front> <title>Proposed IANA SLAs for Publishing LGRs/IDN Tables</title> <author> <organization>Internet Corporation for Assigned Names and Numbers (ICANN)</organization> </author> <date year="2019" month="June" day="10"/> </front><annotation>Captured 2019-06-12. In public comment until 2019-07-26. 
</annotation></reference> <xi:include href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.klensin-idna-5892upd-unicode70.xml"/> <xi:include href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.faltstrom-unicode11.xml"/> <xi:include href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.faltstrom-unicode12.xml"/> <xi:include href="https://www.rfc-editor.org/refs/bibxml3/reference.I-D.klensin-idna-rfc5891bis.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.1766.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3282.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3454.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3490.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3491.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3629.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.4690.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5646.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5890.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.5894.xml"/> <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.6452.xml"/> </references> </references><!-- Sections below here become Appendices. 
--><section numbered="true" toc="default"> <name>Summary of Changes to RFC 5892</name> <t> Other than the editorial correction specified in <xref target="ApplyErratum" format="default"/>, all of the changes in this document are concerned with the reviews for new versions of Unicode and with the IANA Considerations in <xref target="RFC5892" section="5" sectionFormat="of" format="default"/>, particularly <xref target="RFC5892" section="5.1" sectionFormat="of" format="default"/>. Whether the changes are substantive or merely clarifications may be somewhat in the eye of the beholder, so the list below should not be assumed to be comprehensive. At a very high level, this document clarifies that two types of review were intended and separates them for clarity. This document also restores the original (but so far unobserved) default for actions when code point derived properties change. For this reason, this document effectively replaces <xref target="RFC5892" section="5.1" sectionFormat="of" format="default"/> and adds or changes some text so that the replacement makes better sense. 
</t> <t> Changes or clarifications that may be considered important include:</t> <ul spacing="normal"> <li>Separated the new Unicode version review into two explicit parts and provided for different review methods and, potentially, asynchronous outcomes.</li> <li> Specified a DE team, not a single designated expert, for the code point review.</li> <li> Eliminated the de facto requirement for the (formerly single) designated expert to be the same person as the IAB's liaison to the Unicode Consortium, but called out the importance of coordination.</li> <li> Created the "Status" field in the IANA tables to inform the community about specific potentially problematic code points. This change creates the ability to add information about such code points before IETF review is completed instead of having the review process hold up the use of the new Unicode version. 
</li> <li> In part because Unicode is now on a regular one-year cycle rather than producing major and minor versions as needed, to avoid overloading the IETF's internationalization resources, and to avoid generating and storing IANA tables for trivial changes (e.g., the single new code point in Unicode 12.1), the review procedure is applied only to major versions of Unicode unless exceptional circumstances arise and are identified.</li> </ul> </section> <section anchor="ExpertRationale" numbered="true" toc="default"> <name>Background and Rationale for Expert Review Procedure for New Code Point Analysis</name> <t> The expert review procedure for new code point analysis described in <xref target="CodePointReview" format="default"/> is somewhat unusual compared to the examples presented in the Guidelines for Writing IANA Considerations <xref target="RFC8126" format="default"/>. This appendix explains that choice and provides the background for it.</t> <t>Development of specifications to support use of languages and writing systems other than English (and Latin script) -- so-called "internationalization" or "i18n" -- has always been problematic in the IETF, especially when requirements go beyond simple coding of characters (e.g., <xref target="RFC3629" format="default">RFC 3629</xref>) or simple identification of languages (e.g., <xref target="RFC3282" format="default">RFC 3282</xref> and the earlier <xref target="RFC1766" format="default">RFC 1766</xref>). A good deal of specialized knowledge is required, knowledge that comes from multiple fields and that requires multiple perspectives. 
The work is not obviously more complex than routing, especially if one assumes that routing work requires a solid foundation in graph theory or network optimization, or than security and cryptography, but people working in those areas are drawn to the IETF and people from the fields that bear on internationalization typically are not.</t> <t>As a result, we have often thought we understood a problem, generated a specification or set of specifications, but then have been surprised by unanticipated (by the IETF) issues. We then needed to tune and often revise our specification. The language tag work that started with RFC 1766 is a good example of this: broader considerations and requirements led to later work and a much more complex and finer-grained system <xref target="RFC5646" format="default"/>.</t> <t> Work on IDNs further increased the difficulties because many of the decisions that led to the current version of IDNA require understanding the DNS, its constraints, and, to at least some extent, the commercial market of domain names, including various ICANN efforts.</t> <t> The net result of these factors is that it is extremely unlikely that the IESG will ever find a designated expert whose knowledge and understanding will include everything that is required. </t> <t> Consequently, <xref target="IANA" format="default"/> and other discussions in this document specify a DE team that is expected to have the broad perspective, expertise, and access to information and community in order to review new Unicode versions and to make consensus recommendations that will serve the Internet well. 
While we anticipate that the team will have one or more leaders, the structure of the team differs from the suggestions given in <xref target="RFC8126" section="5.2" sectionFormat="of">the Guidelines for Writing IANA Considerations</xref> since neither the team's formation nor its consultation is left to the discretion of the designated expert, nor is the designated expert solely accountable to the community. A team that contains multiple perspectives is required, the team members are accountable as a group, and any nontrivial recommendations require team consensus. This also differs from the common practice in the IETF of "review teams" from which a single member is selected to perform a review: the principle for these reviews is team effort.</t> </section> <section anchor="Acknowledgments" numbered="false" toc="default"> <name>Acknowledgments</name> <t>This document was inspired by extensive discussions within the I18N Directorate of the IETF Applications and Real-Time (ART) area in the first quarter of 2019 about sorting out the reviews for Unicode 11.0 and 12.0. 
Careful reviews by <contact fullname="Joel Halpern"/> and text suggestions from <contact fullname="Barry Leiba"/> resulted in some clarifications.</t> <t>Thanks to <contact fullname="Christopher Wood"/> for catching some editorial errors that persisted until rather late in the document's life cycle and to <contact fullname="Benjamin Kaduk"/> for catching and raising a number of questions during Last Call. Some of the issues they raised have been reflected in the document; others did not appear to be desirable modifications after further discussion, but the questions were definitely worth raising and discussing.</t> </section> </back> </rfc>