<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-google-self-published-geofeeds-09" ipr="trust200902" obsoletes="" updates="" submissionType="independent" xml:lang="en" tocInclude="true" tocDepth="4" sortRefs="true" symRefs="true" version="3" number="8805">
<!-- xml2rfc v2v3 conversion 2.41.0 -->
<front>
<title abbrev="Self-Published IP Geofeeds">A Format for Self-Published IP Geolocation Feeds</title>
<seriesInfo name="RFC" value="8805"/>
<author fullname="Erik Kline" initials="E." surname="Kline">
<organization>Loon LLC</organization>
<address> <postal> <street>1600 Amphitheatre Parkway</street> <city>Mountain View</city> <region>CA</region> <code>94043</code> <country>United States of America</country> </postal> <email>ek@loon.com</email> </address>
</author>
<author fullname="Krzysztof Duleba" initials="K." surname="Duleba">
<organization>Google</organization>
<address> <postal> <street>1600 Amphitheatre Parkway</street> <city>Mountain View</city> <region>CA</region> <code>94043</code> <country>United States of America</country> </postal> <email>kduleba@google.com</email> </address>
</author>
<author fullname="Zoltan Szamonek" initials="Z."
surname="Szamonek">
<organization>Google Switzerland GmbH</organization>
<address> <postal> <street>Brandschenkestrasse 110</street> <code>8002</code> <city>Zürich</city> <country>Switzerland</country> </postal> <email>zszami@google.com</email> </address>
</author>
<author fullname="Stefan Moser" initials="S." surname="Moser">
<organization>Google Switzerland GmbH</organization>
<address> <postal> <street>Brandschenkestrasse 110</street> <code>8002</code> <city>Zürich</city> <country>Switzerland</country> </postal> <email>smoser@google.com</email> </address>
</author>
<author fullname="Warren Kumari" initials="W." surname="Kumari">
<organization>Google</organization>
<address> <postal> <street>1600 Amphitheatre Parkway</street> <city>Mountain View</city> <region>CA</region> <code>94043</code> <country>United States of America</country> </postal> <email>warren@kumari.net</email> </address>
</author>
<date month="July" year="2020"/>
<abstract>
<t>This document records a format whereby a network operator can publish a mapping of IP address prefixes to simplified geolocation information, colloquially termed a "geolocation feed". Interested parties can poll and parse these feeds to update or merge with other geolocation data sources and procedures. This format intentionally only allows specifying coarse-level location.</t>
<t>Some technical organizations operating networks that move from one conference location to the next have already experimentally published small geolocation feeds.</t>
<t>This document describes a currently deployed format.
At least one consumer (Google) has incorporated these feeds into a geolocation data pipeline, and a significant number of ISPs are using it to inform them where their prefixes should be geolocated.</t>
</abstract>
</front>
<middle>
<section numbered="true" toc="default">
<name>Introduction</name>
<section numbered="true" toc="default">
<name>Motivation</name>
<t>Providers of services over the Internet have grown to depend on best-effort geolocation information to improve the user experience. Locality information can aid in directing traffic to the nearest serving location, inferring likely native language, and providing additional context for services involving search queries.</t>
<t>When an ISP, for example, changes the location where an IP prefix is deployed, services that make use of geolocation information may begin to suffer degraded performance. This can lead to customer complaints, possibly to the ISP directly. Dissemination of correct geolocation data is complicated by the lack of any centralized means to coordinate and communicate geolocation information to all interested consumers of the data.</t>
<t>This document records a format whereby a network operator (an ISP, an enterprise, or any organization that deems the geolocation of its IP prefixes to be of concern) can publish a mapping of IP address prefixes to simplified geolocation information, colloquially termed a "geolocation feed". Interested parties can poll and parse these feeds to update or merge with other geolocation data sources and procedures.</t>
<t>This document describes a currently deployed format.
At least one consumer (Google) has incorporated these feeds into a geolocation data pipeline, and a significant number of ISPs are using it to inform them where their prefixes should be geolocated.</t>
</section>
<section numbered="true" toc="default">
<name>Requirements Notation</name>
<t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>", "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they appear in all capitals, as shown here.</t>
<t>As this is an informational document about a data format and set of operational practices presently in use, requirements notation captures the design goals of the authors and implementors.</t>
</section>
<section numbered="true" toc="default">
<name>Assumptions about Publication</name>
<t>This document describes both a format and a mechanism for publishing data, with the assumption that the network operator to whom operational responsibility has been delegated for any published data wishes it to be public. Any privacy risk is bounded by the format, and feed publishers <bcp14>MAY</bcp14> omit prefixes or any location field associated with a given prefix to further protect privacy (see <xref target="spec" format="default"/> for details about which fields exactly may be omitted).
Feed publishers assume the responsibility of determining which data should be made public.</t>
<t>This document does not incorporate a mechanism to communicate acceptable use policies for self-published data. Publication itself is inferred as a desire by the publisher for the data to be usefully consumed, similar to the publication of information like host names, cryptographic keys, and Sender Policy Framework (SPF) records <xref target="RFC7208" format="default"/> in the DNS.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>Self-Published IP Geolocation Feeds</name>
<t>The format described here was developed to address the need of network operators to rapidly and usefully share geolocation information changes. Originally, there arose a specific case where regional operators found it desirable to publish location changes rather than wait for geolocation algorithms to "learn" about them. Later, technical conferences that frequently use the same network prefixes advertised from different conference locations experimented by publishing geolocation feeds updated in advance of network location changes in order to better serve conference attendees.</t>
<t>At its simplest, the mechanism consists of a network operator publishing a file (the "geolocation feed") that contains several text entries, one per line. Each entry is keyed by a unique (within the feed) IP prefix (or single IP address) followed by a sequence of network locality attributes to be ascribed to the given prefix.</t>
<section anchor="spec" numbered="true" toc="default">
<name>Specification</name>
<t>For operational simplicity, every feed should contain data about all IP addresses the provider wants to publish.
Alternatives, like publishing only entries for IP addresses whose geolocation data has changed or differs from current observed geolocation behavior "at large", are likely to be too operationally complex.</t>
<t>Feeds <bcp14>MUST</bcp14> use UTF-8 <xref target="RFC3629" format="default"/> character encoding. Lines are delimited by a line break (CRLF) (as specified in <xref target="RFC4180" format="default"/>), and blank lines are ignored. Text from a '#' character to the end of the current line is treated as a comment only and is similarly ignored (note that this does not strictly follow <xref target="RFC4180" format="default"/>, which has no support for comments).</t>
<t>Feed lines that are not comments <bcp14>MUST</bcp14> be formatted as comma-separated values (CSV), as described in <xref target="RFC4180" format="default"/>. Each feed entry is a text line of the form:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
ip_prefix,alpha2code,region,city,postal_code
]]></artwork>
<t>The IP prefix field is <bcp14>REQUIRED</bcp14>; all others are <bcp14>OPTIONAL</bcp14> (can be empty), though the requisite minimum number of commas <bcp14>SHOULD</bcp14> be present.</t>
<section numbered="true" toc="default">
<name>Geolocation Feed Individual Entry Fields</name>
<section numbered="true" toc="default">
<name>IP Prefix</name>
<t><bcp14>REQUIRED</bcp14>: Each IP prefix field <bcp14>MUST</bcp14> be either a single IP address or an IP prefix in Classless Inter-Domain Routing (CIDR) notation in conformance with <xref target="RFC4632" sectionFormat="of" section="3.1"/> for IPv4 or <xref target="RFC4291" sectionFormat="of" section="2.3"/> for IPv6.</t>
<t>Examples include "192.0.2.1" and "192.0.2.0/24" for IPv4 and "2001:db8::1" and "2001:db8::/32" for IPv6.</t>
</section>
<section numbered="true" toc="default">
<name>Alpha2code (Previously: 'country')</name>
<t><bcp14>OPTIONAL</bcp14>: The alpha2code field, if non-empty, <bcp14>MUST</bcp14> be a 2-letter ISO country code conforming to ISO 3166-1 alpha 2 <xref target="ISO.3166.1alpha2" format="default"/>. Parsers <bcp14>SHOULD</bcp14> treat this field case-insensitively.</t>
<t>Earlier versions of this document called this field "country", and it may still be referred to as such in existing tools/interfaces.</t>
<t>Parsers <bcp14>MAY</bcp14> additionally support other 2-letter codes outside the ISO 3166-1 alpha 2 codes, such as the 2-letter codes from the "Exceptionally reserved codes" <xref target="ISO-GLOSSARY" format="default"/> set (e.g., "UK" or "EU").</t>
<t>Examples include "US" for the United States, "JP" for Japan, and "PL" for Poland.</t>
</section>
<section numbered="true" toc="default">
<name>Region</name>
<t><bcp14>OPTIONAL</bcp14>: The region field, if non-empty, <bcp14>MUST</bcp14> be an ISO region code conforming to ISO 3166-2 <xref target="ISO.3166.2" format="default"/>.
Parsers <bcp14>SHOULD</bcp14> treat this field case-insensitively.</t>
<t>Examples include "ID-RI" for the Riau province of Indonesia and "NG-RI" for the Rivers province in Nigeria.</t>
</section>
<section numbered="true" toc="default">
<name>City</name>
<t><bcp14>OPTIONAL</bcp14>: The city field, if non-empty, <bcp14>SHOULD</bcp14> be free UTF-8 text, excluding the comma (',') character.</t>
<t>Examples include "Dublin", "New York", and "São Paulo" (specifically "S" followed by 0xc3, 0xa3, and "o Paulo").</t>
</section>
<section anchor="postal" numbered="true" toc="default">
<name>Postal Code</name>
<t><bcp14>OPTIONAL</bcp14>, DEPRECATED: The postal code field, if non-empty, <bcp14>SHOULD</bcp14> be free UTF-8 text, excluding the comma (',') character. The use of this field is deprecated; consumers of feeds should be able to parse feeds containing these fields, but new feeds <bcp14>SHOULD NOT</bcp14> include this field due to the granularity of this information. See <xref target="privacy" format="default"/> for additional discussion.</t>
<t>Examples include "106-6126" (in Minato ward, Tokyo, Japan).</t>
</section>
</section>
<section numbered="true" toc="default">
<name>Prefixes with No Geolocation Information</name>
<t>Feed publishers may indicate that some IP prefixes should not have any associated geolocation information. It may be that some prefixes under their administrative control are reserved, not yet allocated or deployed, or in the process of being redeployed elsewhere and existing geolocation information can, from the perspective of the publisher, safely be discarded.</t>
<t>This special case can be indicated by explicitly leaving blank all fields that specify any degree of geolocation information.
For example:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
192.0.2.0/24,,,,
2001:db8:1::/48,,,,
2001:db8:2::/48,,,,
]]></artwork>
<t>Historically, the user-assigned alpha2code identifier of "ZZ" has been used for this same purpose. This is not necessarily preferred, and no specific interpretation of any of the other user-assigned alpha2code codes is currently defined.</t>
</section>
<section numbered="true" toc="default">
<name>Additional Parsing Requirements</name>
<t>Feed entries that do not have an IP address or prefix field or have an IP address or prefix field that fails to parse correctly <bcp14>MUST</bcp14> be discarded.</t>
<t>While publishers <bcp14>SHOULD</bcp14> follow <xref target="RFC5952" format="default"/> for IPv6 prefix fields, consumers <bcp14>MUST</bcp14> nevertheless accept all valid string representations.</t>
<t>Duplicate IP address or prefix entries <bcp14>MUST</bcp14> be considered an error, and consumer implementations <bcp14>SHOULD</bcp14> log the repeated entries for further administrative review. Publishers <bcp14>SHOULD</bcp14> take measures to ensure there is one and only one entry per IP address and prefix.</t>
<t>Multiple entries that constitute nested prefixes are permitted. Consumers <bcp14>SHOULD</bcp14> consider the entry with the longest matching prefix (i.e., the "most specific") to be the best matching entry for a given IP address.</t>
<t>Feed entries with non-empty optional fields that fail to parse, either in part or in full, <bcp14>SHOULD</bcp14> be discarded. It is <bcp14>RECOMMENDED</bcp14> that they also be logged for further administrative review.</t>
<t>For compatibility with future additional fields, a parser <bcp14>MUST</bcp14> ignore any fields beyond those it expects.
The data from fields that are expected and that parse successfully <bcp14>MUST</bcp14> still be considered valid. Per <xref target="future_work" format="default"/>, no extensions to this format are in use nor are any anticipated.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>Examples</name>
<t>Example entries using different IP address formats and describing locations at alpha2code ("country code"), region, and city granularity level, respectively:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
192.0.2.0/25,US,US-AL,,
192.0.2.5,US,US-AL,Alabaster,
192.0.2.128/25,PL,PL-MZ,,
2001:db8::/32,PL,,,
2001:db8:cafe::/48,PL,PL-MZ,,
]]></artwork>
<t>The IETF network publishes geolocation information for the meeting prefixes, and generally just comments out the last meeting information and appends the new meeting information. The IETF feed <xref target="GEO_IETF" format="default"/>, at the time of this writing, contains:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
# IETF106 (Singapore) - November 2019 - Singapore, SG
130.129.0.0/16,SG,SG-01,Singapore,
2001:df8::/32,SG,SG-01,Singapore,
31.133.128.0/18,SG,SG-01,Singapore,
31.130.224.0/20,SG,SG-01,Singapore,
2001:67c:1230::/46,SG,SG-01,Singapore,
2001:67c:370::/48,SG,SG-01,Singapore,
]]></artwork>
<t>Experimentally, RIPE has published geolocation information for their conference network prefixes, which change location in accordance with each new event.
<xref target="GEO_RIPE_NCC" format="default"/>, at the time of writing, contains:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
193.0.24.0/21,NL,NL-ZH,Rotterdam,
2001:67c:64::/48,NL,NL-ZH,Rotterdam,
]]></artwork>
<t>Similarly, ICANN has published geolocation information for their portable conference network prefixes. <xref target="GEO_ICANN" format="default"/>, at the time of writing, contains:</t>
<artwork name="" type="" align="left" alt=""><![CDATA[
199.91.192.0/21,MA,MA-07,Marrakech
2620:f:8000::/48,MA,MA-07,Marrakech
]]></artwork>
<t>A longer example is the <xref target="GEO_Google" format="default"/> Google Corp Geofeed, which lists the geolocation information for Google corporate offices.</t>
<t>At the time of writing, Google processes approximately 400 feeds comprising more than 750,000 IPv4 and IPv6 prefixes.</t>
</section>
</section>
<section anchor="consumers" numbered="true" toc="default">
<name>Consuming Self-Published IP Geolocation Feeds</name>
<t>Consumers <bcp14>MAY</bcp14> treat published feed data as a hint only and <bcp14>MAY</bcp14> choose to prefer other sources of geolocation information for any given IP prefix. Regardless of a consumer's stance with respect to a given published feed, there are some points of note for sensibly and effectively consuming published feeds.</t>
<section anchor="integrity" numbered="true" toc="default">
<name>Feed Integrity</name>
<t>The integrity of published information <bcp14>SHOULD</bcp14> be protected by securing the means of publication, for example, by using HTTP over TLS <xref target="RFC2818" format="default"/>.
Whenever possible, consumers <bcp14>SHOULD</bcp14> prefer retrieving geolocation feeds in a manner that guarantees integrity of the feed.</t>
</section>
<section anchor="authority" numbered="true" toc="default">
<name>Verification of Authority</name>
<t>Consumers of self-published IP geolocation feeds <bcp14>SHOULD</bcp14> perform some form of verification that the publisher is in fact authoritative for the addresses in the feed. The actual means of verification is likely dependent upon the way in which the feed is discovered. Ad hoc shared URIs, for example, will likely require an ad hoc verification process. Future automated means of feed discovery <bcp14>SHOULD</bcp14> have an accompanying automated means of verification.</t>
<t>A consumer should only trust geolocation information for IP addresses or prefixes for which the publisher has been verified as administratively authoritative. All other geolocation feed entries should be ignored and logged for further administrative review.</t>
</section>
<section anchor="accuracy" numbered="true" toc="default">
<name>Verification of Accuracy</name>
<t>Errors and inaccuracies may occur at many levels, and publication and consumption of geolocation data are no exceptions. To the extent practical, consumers <bcp14>SHOULD</bcp14> take steps to verify the accuracy of published locality.
Verification methodology, resolution of discrepancies, and preference for alternative sources of data are left to the discretion of the feed consumer.</t>
<t>Consumers <bcp14>SHOULD</bcp14> decide on discrepancy thresholds and <bcp14>SHOULD</bcp14> flag, for administrative review, feed entries that exceed set thresholds.</t>
</section>
<section numbered="true" toc="default">
<name>Refreshing Feed Information</name>
<t>As a publisher can change geolocation data at any time and without notification, consumers <bcp14>SHOULD</bcp14> implement mechanisms to periodically refresh local copies of feed data. In the absence of any other refresh timing information, consumers <bcp14>SHOULD</bcp14> refresh feeds no less often than weekly and no more often than is likely to cause issues for the publisher.</t>
<t>For feeds available via HTTPS (or HTTP), the publisher <bcp14>MAY</bcp14> communicate refresh timing information by means of the standard HTTP expiration model (<xref target="RFC7234"/>). Specifically, publishers can include either an Expires header (<xref target="RFC7234" sectionFormat="of" section="5.3"/>) or a Cache-Control header (<xref target="RFC7234" sectionFormat="of" section="5.2"/>) specifying the max-age.
Where practical, consumers <bcp14>SHOULD</bcp14> refresh feed information before the expiry time is reached.</t>
</section>
</section>
<section anchor="privacy" numbered="true" toc="default">
<name>Privacy Considerations</name>
<t>Publishers of geolocation feeds are advised to have fully considered any and all privacy implications of the disclosure of such information for the users of the described networks prior to publication. A thorough comprehension of the security considerations (<xref target="RFC6772" sectionFormat="of" section="13"/>) of a chosen geolocation policy is highly recommended, including an understanding of some of the limitations of information obscurity (<xref target="RFC6772" sectionFormat="of" section="13.5"/>) (see also <xref target="RFC6772" format="default"/>).</t>
<t>As noted in <xref target="spec" format="default"/>, each location field in an entry is optional, in order to support expressing only the level of specificity that the publisher has deemed acceptable. There is no requirement that the level of specificity be consistent across all entries within a feed. In particular, the Postal Code field (<xref target="postal" format="default"/>) can provide very specific geolocation, sometimes within a building.
Such specific Postal Code values <bcp14>MUST NOT</bcp14> be published in geofeeds without the express consent of the parties being located.</t>
<t>Operators who publish geolocation information are strongly encouraged to inform affected users/customers of this fact and of the potential privacy-related consequences and trade-offs.</t>
</section>
<section numbered="true" toc="default">
<name>Relation to Other Work</name>
<t>While not originally done in conjunction with the GEOPRIV Working Group <xref target="GEOPRIV" format="default"/>, Richard Barnes observed that this work is nevertheless consistent with that which the group has defined, both for address format and for privacy. The data elements in geolocation feeds are equivalent to the following XML structure (<xref target="RFC5139" format="default"/> <xref target="W3C.REC-xml-20081126" format="default"/>):</t>
<sourcecode type="xml"><![CDATA[
<civicAddress>
  <country>country</country>
  <A1>region</A1>
  <A2>city</A2>
  <PC>postal_code</PC>
</civicAddress>]]></sourcecode>
<t>Providing geolocation information to this granularity is equivalent to the following privacy policy (see
the definition of the 'building' level of disclosure in <xref target="RFC6772" sectionFormat="of" section="6.5.1"/>):</t>
<sourcecode type="xml"><![CDATA[
<ruleset>
  <rule>
    <conditions/>
    <actions/>
    <transformations>
      <provide-location profile="civic-transformation">
        <provide-civic>building</provide-civic>
      </provide-location>
    </transformations>
  </rule>
</ruleset>]]></sourcecode>
</section>
<section anchor="Security" numbered="true" toc="default">
<name>Security Considerations</name>
<t>As there is no true security in the obscurity of the location of any given IP address, self-publication of this data fundamentally opens no new attack vectors. For publishers, self-published data may increase the ease with which such location data might be exploited (it can, for example, make easy the discovery of prefixes populated with customers as distinct from prefixes not generally in use).</t>
<t>For consumers, feed retrieval processes may receive input from potentially hostile sources (e.g., in the event of hijacked traffic). As such, proper input validation and defense measures <bcp14>MUST</bcp14> be taken (see the discussion in <xref target="integrity" format="default"/>).</t>
<t>Similarly, consumers who do not perform sufficient verification of published data bear the same risks as from other forms of geolocation configuration errors (see the discussion in Sections <xref target="authority" format="counter"/> and <xref target="accuracy" format="counter"/>).</t>
<t>Validation of a feed's contents includes verifying that the publisher is authoritative for the IP prefixes included in the feed. Failure to verify IP prefix authority would, for example, allow ISP Bob to make geolocation statements about IP space held by ISP Alice.
At this time, only out-of-band verification methods are implemented (i.e., an ISP's feed may be verified against publicly available IP allocation data).</t>
</section>
<section anchor="future_work" numbered="true" toc="default">
<name>Planned Future Work</name>
<t>In order to more flexibly support future extensions, use of a more expressive feed format has been suggested. Use of JavaScript Object Notation (JSON) <xref target="RFC8259" format="default"/>, specifically, has been discussed. However, at the time of writing, no such specification nor implementation exists. Nevertheless, work on extensions is deferred until a more suitable format has been selected.</t>
<t>The authors are planning on writing a document describing such a new format. This document describes a currently deployed and used format. Given the extremely limited extensibility of the present format, no extensions to it are anticipated. Extensibility requirements are instead expected to be integral to the development of a new format.</t>
</section>
<section numbered="true" toc="default">
<name>Finding Self-Published IP Geolocation Feeds</name>
<t>The issue of finding, and later verifying, geolocation feeds is not formally specified in this document. At this time, only ad hoc feed discovery and verification has a modicum of established practice (see below); discussion of other mechanisms has been removed for clarity.</t>
<section numbered="true" toc="default">
<name>Ad Hoc 'Well-Known' URIs</name>
<t>To date, geolocation feeds have been shared informally in the form of HTTPS URIs exchanged in email threads.
Three example URIs (<xref target="GEO_IETF" format="default"/>, <xref target="GEO_RIPE_NCC" format="default"/>, and <xref target="GEO_ICANN" format="default"/>) describe networks that change locations periodically, the operators and operational practices of which are well known within their respective technical communities.</t>
<t>The contents of the feeds are verified by a similarly ad hoc process, including:</t>
<ul spacing="normal">
<li>personal knowledge of the parties involved in the exchange and</li>
<li>comparison of feed-advertised prefixes with the BGP-advertised prefixes of Autonomous System Numbers known to be operated by the publishers.</li>
</ul>
<t>Ad hoc mechanisms, while useful for early experimentation by producers and consumers, are unlikely to be adequate for long-term, widespread use by multiple parties. Future versions of any such self-published geolocation feed mechanism <bcp14>SHOULD</bcp14> address scalability concerns by defining a means for automated discovery and verification of operational authority of advertised prefixes.</t>
</section>
<section numbered="true" toc="default">
<name>Other Mechanisms</name>
<t>Previous versions of this document referenced use of the WHOIS service <xref target="RFC3912" format="default"/> operated by Regional Internet Registries (RIRs), as well as possible DNS-based schemes to discover and validate geofeeds.
To the authors' knowledge, support for such mechanisms has never been implemented, and this speculative text has been removed to avoid ambiguity.</t>
</section>
</section>
<section numbered="true" toc="default">
<name>IANA Considerations</name>
<t>This document has no IANA actions.</t>
</section>
<section numbered="false" toc="default">
<name>Acknowledgements</name>
<t>The authors would like to express their gratitude to reviewers and early implementers, including but not limited to Mikael Abrahamsson, Andrew Alston, Ray Bellis, John Bond, Alissa Cooper, Andras Erdei, Stephen Farrell, Marco Hogewoning, Mike Joseph, Maciej Kuzniar, George Michaelson, Menno Schepers, Justyna Sidorska, Pim van Pelt, and Bjoern A. Zeeb.</t>
<t>Richard L. Barnes and Andy Newton in particular contributed substantial review, text, and advice.</t>
</section>
</middle>
<back>
<references>
<name>References</name>
<references>
<name>Normative References</name>
<reference anchor="ISO.3166.1alpha2" target="http://www.iso.org/iso/home/standards/country_codes/iso-3166-1_decoding_table.htm">
<front> <title>ISO 3166-1 decoding table</title> <author> <organization>ISO</organization> </author> </front>
</reference>
<xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7234.xml"/>
<xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3629.xml"/>
<xi:include
href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4180.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4291.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.4632.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5952.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/> <reference anchor="ISO.3166.2" target="http://www.iso.org/iso/home/standards/country_codes.htm#2012_iso3166-2"> <front> <title>ISO 3166-2:2007</title> <author> <organization>ISO</organization> </author> </front> </reference> <reference anchor="W3C.REC-xml-20081126" target="http://www.w3.org/TR/2008/REC-xml-20081126" quoteTitle="true" derivedAnchor="W3C.REC-xml-20081126"> <front> <title>Extensible Markup Language (XML) 1.0 (Fifth Edition)</title> <author initials="T." surname="Bray" fullname="Tim Bray"> <organization showOnFrontPage="true"/> </author> <author initials="J." surname="Paoli" fullname="Jean Paoli"> <organization showOnFrontPage="true"/> </author> <author initials="M." surname="Sperberg-McQueen" fullname="Michael Sperberg-McQueen"> <organization showOnFrontPage="true"/> </author> <author initials="E." surname="Maler" fullname="Eve Maler"> <organization showOnFrontPage="true"/> </author> <author initials="F."
surname="Yergeau" fullname="François Yergeau"> <organization showOnFrontPage="true"/> </author> <date month="November" year="2008"/> </front> <seriesInfo name="World Wide Web Consortium Recommendation" value="REC-xml-20081126"/> <format type="HTML" target="http://www.w3.org/TR/2008/REC-xml-20081126"/> </reference> </references> <references> <name>Informative References</name> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.2818.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.3912.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.7208.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.8259.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.5139.xml"/> <xi:include href="https://xml2rfc.tools.ietf.org/public/rfc/bibxml/reference.RFC.6772.xml"/> <reference anchor="GEO_IETF" target="https://noc.ietf.org/geo/google.csv"> <front> <title>IETF Meeting Network Geolocation Data</title> <author initials="W." surname="Kumari" fullname="Warren Kumari"> <organization/> </author> </front> </reference> <reference anchor="GEO_RIPE_NCC" target="https://meetings.ripe.net/geo/google.csv"> <front> <title>RIPE NCC Meeting Geolocation Data</title> <author fullname="Menno Schepers" initials="M."
surname="Schepers"> <organization abbrev="RIPE NCC">Réseaux IP Européens Network Coordination Centre</organization> </author> </front> </reference> <reference anchor="GEO_ICANN" target="https://meeting-services.icann.org/geo/google.csv"> <front> <title>ICANN Meeting Geolocation Data</title> <author> <organization>ICANN</organization> </author> </front> </reference> <reference anchor="GEO_Google" target="https://www.gstatic.com/geofeed/corp_external"> <front> <title>Google Corp Geofeed</title> <author> <organization>Google, LLC</organization> </author> </front> </reference> <reference anchor="GEOPRIV" target="http://datatracker.ietf.org/wg/geopriv/"> <front> <title>Geographic Location/Privacy (geopriv)</title> <author> <organization>IETF</organization> </author> <date/> </front> </reference> <reference anchor="IPADDR_PY" target="http://code.google.com/p/ipaddr-py/"> <front> <title>Google's Python IP address manipulation library</title> <author fullname="Mike Shields" initials="M." surname="Shields"> <organization abbrev="Google">Google Inc.</organization> </author> <author fullname="Peter Moody" initials="P." surname="Moody"> <organization abbrev="Google">Google Inc.</organization> </author> <date/> </front> </reference> <reference anchor="ISO-GLOSSARY" target="https://www.iso.org/glossary-for-iso-3166.html"> <front> <title>Glossary for ISO 3166</title> <author> <organization>ISO</organization> </author> <date/> </front> </reference> </references> </references> <section numbered="true" toc="default"> <name>Sample Python Validation Code</name> <t>Included here is a simple format validator in Python for self-published ipgeo feeds.
This tool reads CSV data in the self-published ipgeo feed format from the standard input and performs basic validation. It is intended for use by feed publishers before launching a feed. Note that this validator does not verify the uniqueness of every IP prefix entry within the feed as a whole but only verifies the syntax of each single line from within the feed. A complete validator <bcp14>MUST</bcp14> also ensure IP prefix uniqueness.</t>
<t>The main source file "ipgeo_feed_validator.py" follows. It requires use of the open source ipaddr Python library for IP address and CIDR parsing and validation <xref target="IPADDR_PY" format="default"/>.</t>
<sourcecode name="" type="python" markers="true"><![CDATA[
#!/usr/bin/python
#
# Copyright (c) 2012 IETF Trust and the persons identified as
# authors of the code. All rights reserved. Redistribution and use
# in source and binary forms, with or without modification, is
# permitted pursuant to, and subject to the license terms contained
# in, the Simplified BSD License set forth in Section 4.c of the
# IETF Trust's Legal Provisions Relating to IETF
# Documents (http://trustee.ietf.org/license-info).

"""Simple format validator for self-published ipgeo feeds.

This tool reads CSV data in the self-published ipgeo feed format
from the standard input and performs basic validation. It is
intended for use by feed publishers before launching a feed.
"""

import csv
import ipaddr
import re
import sys


class IPGeoFeedValidator(object):
  def __init__(self):
    self.prefixes = {}
    self.line_number = 0
    self.output_log = {}
    self.SetOutputStream(sys.stderr)

  def Validate(self, feed):
    """Check validity of an IPGeo feed.

    Args:
      feed: iterable with feed lines
    """

    for line in feed:
      self._ValidateLine(line)

  def SetOutputStream(self, logfile):
    """Controls where the output messages go (STDERR by default).

    Use None to disable logging.

    Args:
      logfile: a file object (e.g., sys.stdout) or None.
    """
    self.output_stream = logfile

  def CountErrors(self, severity):
    """How many ERRORs or WARNINGs were generated."""
    return len(self.output_log.get(severity, []))

  ############################################################
  def _ValidateLine(self, line):
    line = line.rstrip('\r\n')
    self.line_number += 1
    self.line = line.split('#')[0]
    self.is_correct_line = True

    if self._ShouldIgnoreLine(line):
      return

    fields = [field for field in csv.reader([line])][0]

    self._ValidateFields(fields)
    self._FlushOutputStream()

  def _ShouldIgnoreLine(self, line):
    line = line.strip()
    if line.startswith('#'):
      return True
    return len(line) == 0

  ############################################################
  def _ValidateFields(self, fields):
    assert(len(fields) > 0)

    is_correct = self._IsIPAddressOrPrefixCorrect(fields[0])

    if len(fields) > 1:
      if not self._IsAlpha2CodeCorrect(fields[1]):
        is_correct = False

    if len(fields) > 2 and not self._IsRegionCodeCorrect(fields[2]):
      is_correct = False

    if len(fields) != 5:
      self._ReportWarning('5 fields were expected (got %d).'
                          % len(fields))

  ############################################################
  def _IsIPAddressOrPrefixCorrect(self, field):
    if '/' in field:
      return self._IsCIDRCorrect(field)
    return self._IsIPAddressCorrect(field)

  def _IsCIDRCorrect(self, cidr):
    try:
      ipprefix = ipaddr.IPNetwork(cidr)
      if ipprefix.network._ip != ipprefix._ip:
        self._ReportError('Incorrect IP Network.')
        return False
      if ipprefix.is_private:
        self._ReportError('IP Address must not be private.')
        return False
    except:
      self._ReportError('Incorrect IP Network.')
      return False
    return True

  def _IsIPAddressCorrect(self, ipaddress):
    try:
      ip = ipaddr.IPAddress(ipaddress)
    except:
      self._ReportError('Incorrect IP Address.')
      return False
    if ip.is_private:
      self._ReportError('IP Address must not be private.')
      return False
    return True

  ############################################################
  def _IsAlpha2CodeCorrect(self, alpha2code):
    if len(alpha2code) == 0:
      return True
    if len(alpha2code) != 2 or not alpha2code.isalpha():
      self._ReportError(
          'Alpha 2 code must be in the ISO 3166-1 alpha 2 format.')
      return False
    return True

  def _IsRegionCodeCorrect(self, region_code):
    if len(region_code) == 0:
      return True
    if '-' not in region_code:
      self._ReportError(
          'Region code must be in the ISO 3166-2 format.')
      return False

    parts = region_code.split('-')
    if not self._IsAlpha2CodeCorrect(parts[0]):
      return False
    return True

  ############################################################
  def _ReportError(self, message):
    self._ReportWithSeverity('ERROR', message)

  def _ReportWarning(self, message):
    self._ReportWithSeverity('WARNING', message)

  def _ReportWithSeverity(self, severity, message):
    self.is_correct_line = False
    output_line = '%s: %s\n' % (severity, message)

    if severity not in self.output_log:
      self.output_log[severity] = []
    self.output_log[severity].append(output_line)

    if self.output_stream is not None:
      self.output_stream.write(output_line)

  def _FlushOutputStream(self):
    if self.is_correct_line:
      return
    if self.output_stream is None:
      return

    self.output_stream.write('line %d: %s\n\n'
                             % (self.line_number, self.line))


############################################################
def main():
  feed_validator = IPGeoFeedValidator()
  feed_validator.Validate(sys.stdin)

  if feed_validator.CountErrors('ERROR'):
    sys.exit(1)

if __name__ == '__main__':
  main()
]]></sourcecode>
<t>A unit test file, "ipgeo_feed_validator_test.py", is provided as well. It provides basic test coverage of the code above, though it does not test correct handling of non-ASCII UTF-8 strings.</t>
<sourcecode name="" type="python" markers="true"><![CDATA[
#!/usr/bin/python
#
# Copyright (c) 2012 IETF Trust and the persons identified as
# authors of the code. All rights reserved. Redistribution and use
# in source and binary forms, with or without modification, is
# permitted pursuant to, and subject to the license terms contained
# in, the Simplified BSD License set forth in Section 4.c of the
# IETF Trust's Legal Provisions Relating to IETF
# Documents (http://trustee.ietf.org/license-info).
import sys
from ipgeo_feed_validator import IPGeoFeedValidator


class IPGeoFeedValidatorTest(object):
  def __init__(self):
    self.validator = IPGeoFeedValidator()
    self.validator.SetOutputStream(None)
    self.successes = 0
    self.failures = 0

  def Run(self):
    # Each call gives: feed line, expected ERROR count,
    # expected WARNING count.
    self.TestFeedLine('# asdf', 0, 0)
    self.TestFeedLine(' ', 0, 0)
    self.TestFeedLine('', 0, 0)

    self.TestFeedLine('asdf', 1, 1)
    self.TestFeedLine('asdf,US,,,', 1, 0)
    self.TestFeedLine('aaaa::,US,,,', 0, 0)
    self.TestFeedLine('zzzz::,US', 1, 1)
    self.TestFeedLine(',US,,,', 1, 0)
    self.TestFeedLine('55.66.77', 1, 1)
    self.TestFeedLine('55.66.77.888', 1, 1)
    self.TestFeedLine('55.66.77.asdf', 1, 1)

    self.TestFeedLine('2001:db8:cafe::/48,PL,PL-MZ,,02-784', 0, 0)
    self.TestFeedLine('2001:db8:cafe::/48', 0, 1)

    self.TestFeedLine('55.66.77.88,PL', 0, 1)
    self.TestFeedLine('55.66.77.88,PL,,,', 0, 0)
    self.TestFeedLine('55.66.77.88,,,,', 0, 0)
    self.TestFeedLine('55.66.77.88,ZZ,,,', 0, 0)
    self.TestFeedLine('55.66.77.88,US,,,', 0, 0)
    self.TestFeedLine('55.66.77.88,USA,,,', 1, 0)
    self.TestFeedLine('55.66.77.88,99,,,', 1, 0)

    self.TestFeedLine('55.66.77.88,US,US-CA,,', 0, 0)
    self.TestFeedLine('55.66.77.88,US,USA-CA,,', 1, 0)
    self.TestFeedLine('55.66.77.88,USA,USA-CA,,', 2, 0)

    self.TestFeedLine('55.66.77.88,US,US-CA,Mountain View,', 0, 0)
    self.TestFeedLine('55.66.77.88,US,US-CA,Mountain View,94043',
                      0, 0)
    self.TestFeedLine('55.66.77.88,US,US-CA,Mountain View,94043,'
                      '1600 Ampthitheatre Parkway', 0, 1)

    self.TestFeedLine('55.66.77.0/24,US,,,', 0, 0)
    self.TestFeedLine('55.66.77.88/24,US,,,', 1, 0)
    self.TestFeedLine('55.66.77.88/32,US,,,', 0, 0)
    self.TestFeedLine('55.66.77/24,US,,,', 1, 0)
    self.TestFeedLine('55.66.77.0/35,US,,,', 1, 0)

    self.TestFeedLine('172.15.30.1,US,,,', 0, 0)
    self.TestFeedLine('172.28.30.1,US,,,', 1, 0)
    self.TestFeedLine('192.167.100.1,US,,,', 0, 0)
    self.TestFeedLine('192.168.100.1,US,,,', 1, 0)
    self.TestFeedLine('10.0.5.9,US,,,', 1, 0)
    self.TestFeedLine('10.0.5.0/24,US,,,', 1, 0)
    self.TestFeedLine('fc00::/48,PL,,,', 1, 0)
    self.TestFeedLine('fe00::/48,PL,,,', 0, 0)

    print ('%d tests passed, %d failed'
           % (self.successes, self.failures))

  def IsOutputLogCorrectAtSeverity(self, severity,
                                   expected_msg_count):
    msg_count = self.validator.CountErrors(severity)

    if msg_count != expected_msg_count:
      print ('TEST FAILED: %s\nexpected %d %s[s], observed %d\n%s\n'
             % (self.validator.line, expected_msg_count, severity,
                msg_count,
                str(self.validator.output_log[severity])))
      return False
    return True

  def IsOutputLogCorrect(self, new_errors, new_warnings):
    retval = True

    if not self.IsOutputLogCorrectAtSeverity('ERROR', new_errors):
      retval = False
    if not self.IsOutputLogCorrectAtSeverity('WARNING',
                                             new_warnings):
      retval = False

    return retval

  def TestFeedLine(self, line, error_count, warning_count):
    self.validator.output_log['WARNING'] = []
    self.validator.output_log['ERROR'] = []
    self.validator._ValidateLine(line)

    if not self.IsOutputLogCorrect(error_count, warning_count):
      self.failures += 1
      return False

    self.successes += 1
    return True


if __name__ == '__main__':
  IPGeoFeedValidatorTest().Run()
]]></sourcecode> </section> <section numbered="false" toc="default"> <name>Acknowledgements</name> <t>The authors would like to express their gratitude to reviewers and early implementors, including but not limited to <contact fullname="Mikael Abrahamsson"/>, <contact fullname="Andrew Alston"/>, <contact fullname="Ray Bellis"/>, <contact fullname="John Bond"/>, <contact fullname="Alissa Cooper"/>, <contact fullname="Andras Erdei"/>, <contact fullname="Stephen Farrell"/>, <contact fullname="Marco Hogewoning"/>, <contact fullname="Mike Joseph"/>, <contact fullname="Maciej Kuzniar"/>, <contact fullname="George Michaelson"/>, <contact fullname="Menno Schepers"/>, <contact fullname="Justyna Sidorska"/>, <contact fullname="Pim van Pelt"/>, and <contact fullname="Bjoern A. Zeeb"/>.</t> <t>In particular, <contact fullname="Richard L.
Barnes"/> and <contact fullname="Andy Newton"/> contributed substantial review, text, and advice.</t> </section> </back> </rfc>