rfc9638.original.xml   rfc9638.xml 
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?> <!-- Required for schema
validation and schema-aware editing --> <!-- draft submitted in xml v3 -->
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations
in XML processors, including most browsers -->
<!DOCTYPE rfc [ <!DOCTYPE rfc [
<!ENTITY filename "draft-ietf-nvo3-encap-12"> <!ENTITY nbsp "&#160;">
<!ENTITY nbsp "&#160;"> <!ENTITY zwsp "&#8203;">
<!ENTITY zwsp "&#8203;"> <!ENTITY nbhy "&#8209;">
<!ENTITY nbhy "&#8209;"> <!ENTITY wj "&#8288;">
<!ENTITY wj "&#8288;">
]> ]>
<!-- If further character entities are required then they should be
added to the DOCTYPE above. Use of an external entity file is not
recommended. -->
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<rfc <rfc xmlns:xi="http://www.w3.org/2001/XInclude" category="info" docName="draft-ietf-nvo3-encap-12" number="9638" consensus="true" obsoletes="" tocInclude="true" ipr="trust200902" updates="" submissionType="IETF" xml:lang="en" version="3" symRefs="true" sortRefs="true">
xmlns:xi="http://www.w3.org/2001/XInclude"
category="info"
docName="&filename;"
ipr="trust200902"
updates=""
submissionType="IETF"
xml:lang="en"
version="3">
<!--
* docName should be the name of your draft * category should be
one of std, bcp, info, exp, historic * ipr should be one of
trust200902, noModificationTrust200902, noDerivativesTrust200902,
pre5378Trust200902 * updates can be an RFC number as NNNN *
obsoletes can be an RFC number as NNNN
<!-- ____________________FRONT_MATTER____________________ -->
<front> <front>
<title abbrev="NVO3 Encapsulation Considerations">Network <title abbrev="NVO3 Encapsulation Considerations">Network
Virtualization Overlays (NVO3) Encapsulation Considerations</title> Virtualization over Layer 3 (NVO3) Encapsulation Considerations</title>
<!-- The abbreviated title is required if the full title is <seriesInfo name="RFC" value="9638"/>
longer than 39 characters --> <author initials="S." surname="Boutros" fullname="Sami Boutros" role="editor">
<seriesInfo name="Internet-Draft"
value="&filename;"/>
<author initials="S." surname="Boutros"
fullname="Sami Boutros" role="editor">
<organization>Ciena Corporation</organization> <organization>Ciena Corporation</organization>
<address> <address>
<postal> <postal>
<country>USA</country> <country>United States of America</country>
</postal> </postal>
<email>sboutros@ciena.com</email> <email>sboutros@ciena.com</email>
</address> </address>
</author> </author>
<author fullname="Donald E. Eastlake 3rd" initials="D." <author fullname="Donald E. Eastlake 3rd" initials="D." surname="Eastlake 3rd" role="editor">
surname="Eastlake" role="editor"> <organization>Independent</organization>
<organization>Futurewei Technologies</organization>
<address> <address>
<postal> <postal>
<street>2386 Panoramic Circle</street> <street>2386 Panoramic Circle</street>
<city>Apopka</city> <city>Apopka</city>
<region>Florida</region> <region>FL</region>
<code>32703</code> <code>32703</code>
<country>USA</country> <country>United States of America</country>
</postal> </postal>
<phone>+1-508-333-2270</phone> <phone>+1-508-333-2270</phone>
<email>d3e3e3@gmail.com</email> <email>d3e3e3@gmail.com</email>
</address> </address>
</author> </author>
<date year="2024" month="2" day="19"/> <date year="2024" month="September"/>
<area>Routing</area>
<workgroup>NVO3 Working Group</workgroup>
<!-- "Internet Engineering Task Force" is fine for individual
submissions. If this element is not present, the default is
"Network Working Group", which is used by the RFC Editor as a
nod to the history of the RFC Series. -->
<keyword></keyword> <area>RTG</area>
<!-- Multiple keywords are allowed. Keywords are incorporated <workgroup>nvo3</workgroup>
into HTML output files for use by search engines. -->
<abstract> <abstract>
<t>The IETF Network Virtualization Overlays (NVO3) Working Group <t>The IETF Network Virtualization Overlays (NVO3) Working Group
developed considerations for a common encapsulation that addresses developed considerations for a common encapsulation that addresses
various network virtualization overlay technical concerns. This various network virtualization overlay technical concerns.
document provides a record, for the benefit of the IETF community, This document provides a record, for the benefit of the IETF community,
of the considerations arrived at starting from the output of an NVO3 of the considerations arrived at by the NVO3 Working Group starting from
encapsulation design team. These considerations may be helpful with the output of the NVO3 encapsulation Design Team. These considerations
future deliberations by working groups over the choice of may be helpful with future deliberations by working groups over the choice of
encapsulation formats.</t> encapsulation formats.</t>
<t>There are implications of having different encapsulations in real <t>There are implications of having different encapsulations in real
environments consisting of both software and hardware environments consisting of both software and hardware
implementations and within and spanning multiple data centers. For implementations and within and spanning multiple data centers. For
example, OAM functions such as path MTU discovery become challenging example, Operations, Administration, and Maintenance (OAM) functions such as path MTU discovery become challenging
with multiple encapsulations along the data path.</t> with multiple encapsulations along the data path.</t>
<t>Based on these considerations, the Working Group determined that <t>Based on these considerations, the NVO3 Working Group determined that
Geneve with a few modifications as the common encapsulation. This Generic Network Virtualization Encapsulation (Geneve) with a few modifications
document provides more details, particularly in Section 7.</t> is the common encapsulation. This document provides more details, particularly
in <xref target="Recommendations"/>.</t>
</abstract> </abstract>
</front> </front>
<!-- ____________________MIDDLE_MATTER____________________ -->
<middle> <middle>
<section>
<section> <!-- 1. -->
<name>Introduction</name> <name>Introduction</name>
<t>The NVO3 Working Group is chartered to gather requirements and <t>The NVO3 Working Group is chartered to gather requirements and
develop solutions for network virtualization data planes based on develop solutions for network virtualization data planes based on
encapsulation of virtual network traffic over an IP-based underlay encapsulation of virtual network traffic over an IP-based underlay
data plane. Requirements include due consideration for OAM and data plane. Requirements include due consideration for OAM and
security. Based on these requirements the WG was to select, extend, security. Based on these requirements, the WG was to select, extend,
and/or develop one or more data plane encapsulation format(s).</t> and/or develop one or more data plane encapsulation formats.</t>
<t>This led to WG drafts and an RFC describing three encapsulations as <t>This led to WG Internet-Drafts and an RFC describing three encapsulations as
follows:</t> follows:</t>
<ul> <ul>
<li><xref target="RFC8926"/> Geneve: Generic Network Virtualization <li>"Geneve: Generic Network Virtualization Encapsulation" <xref target="RFC8926"/></li>
Encapsulation</li> <li>"Generic UDP Encapsulation" <xref target="I-D.ietf-intarea-gue"/></li>
<li><xref target="ietf_intarea_gue"/> Generic UDP Encapsulation</li> <li>"Generic Protocol Extension for VXLAN (VXLAN-GPE)" <xref target="I-D.ietf-nvo3-vxlan-gpe"/></li>
<li><xref target="nvo3_vxlan_gpe"/> Generic Protocol Extension for
VXLAN (VXLAN-GPE)</li>
</ul> </ul>
<t>Discussion on the list and in face-to-face meetings identified a <t>Discussion on the list and in face-to-face meetings identified a
number of technical problems with each of these encapsulations. number of technical problems with each of these encapsulations.
Furthermore, there was clear consensus at the 96th IETF meeting in Furthermore, there was a clear consensus at the 96th IETF meeting in
Berlin that, to maximize interoperability, the working group should Berlin that the working group should progress only one data plane encapsulation,
progress only one data plane encapsulation. In order to overcome a to maximize interoperability. In order to overcome a
deadlock on the encapsulation decision, the WG consensus was to form a deadlock on the encapsulation decision, the WG consensus was to form a
Design Team <xref target="RFC2418"/> to resolve this issue and provide Design Team <xref target="RFC2418"/> to resolve this issue and provide
initial considerations.</t> initial considerations.</t>
</section> </section>
<section> <!-- 2. --> <section>
<name>Design Team and Working Group Process</name> <name>Design Team and Working Group Process</name>
<t>The Design Team was to select one of the proposed encapsulations <t>The Design Team was to select one of the proposed encapsulations and
and enhance it to address the technical concerns. The simple enhance it to address the technical concerns. The goals were simple evolution of
evolution of deployed networks as well as applicability to all deployed networks as well as applicability to all locations in the NVO3
locations in the NVO3 architecture were goals. The Design Team was architecture. The Design Team was to specifically select a design that allows for future extensibility but is not burdensome on hardware implementations. The selected design also needed to operate well with the Internet Control Message Protocol (ICMP) and
to specifically avoid selecting a design that is burdensome on
hardware implementations but should allow future extensibility. The
selected design also needed to operate well with ICMP and in Equal
Cost Multi-Path (ECMP) environments. If further extensibility is
required, then it should be done in such a manner that it does not in Equal-Cost Multipath (ECMP) environments. If further extensibility is
require the consent of an entity outside of the IETF.</t> required, then it should be done in such a manner that it does not require the
consent of an entity outside of the IETF.</t>
<t>The output of the Design Team was then prcoessed through the <t>The output of the Design Team was then processed through the
working group resulting in working group consensus for this working group, resulting in a working group consensus for this
document.</t> document.</t>
</section> </section>
<section> <!-- 3. --> <section>
<name>Terminology</name> <name>Terminology</name>
<t>
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>",
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>",
"MAY", and "OPTIONAL" in this document are to be interpreted as "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>",
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
when, and only when, they appear in all capitals, as shown here.</t> "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be
interpreted as described in BCP&nbsp;14 <xref target="RFC2119"/> <xref
target="RFC8174"/> when, and only when, they appear in all capitals, as
shown here.
</t>
</section> </section>
<section> <!-- 4. --> <section>
<name>Abbreviations and Acronyms</name> <name>Abbreviations, Acronyms, and Definitions</name>
<t>The following abbreviations and acronyms are used in this <t>The following abbreviations and acronyms are used in this
document:</t> document:</t>
<dl>
<dl> <dt>ACL:</dt><dd>Access Control List</dd>
<dt>ACL </dt><dd>- Access Control List</dd> <dt>ECMP:</dt><dd>Equal-Cost Multipath</dd>
<dt>EVPN:</dt><dd>Ethernet VPN <xref target="RFC8365"/></dd>
<dt>DT</dt><dd>- NVO3 encapsulation Design Team</dd> <dt>Geneve:</dt><dd>Generic Network Virtualization Encapsulation <xref target="RFC8926"/></dd>
<dt>ECMP</dt><dd>- Equal Cost Multi-Path</dd> <dt>GPE:</dt><dd>Generic Protocol Extension</dd>
<dt>EVPN</dt><dd>- Ethernet VPN <xref target="RFC8365"/></dd> <dt>GUE:</dt><dd>Generic UDP Encapsulation <xref target="I-D.ietf-intarea-gue"/></dd>
<dt>Geneve</dt><dd>- Generic Network Virtualization Encapsulation <xref <dt>HMAC:</dt><dd>Hash-Based Message Authentication Code <xref target="RFC2104"/></dd>
target="RFC8926"/></dd> <dt>IEEE:</dt><dd>Institute of Electrical and Electronics Engineers (<eref
brackets="angle" target="https://www.ieee.org/"/>)</dd>
<dt>GPE </dt><dd>- Generic Protocol Extension</dd> <dt>NIC:</dt><dd>Network Interface Card (refers to network interface
hardware that is not necessarily a discrete "card")</dd>
<dt>GUE </dt><dd>- Generic UDP Encapsulation <xref <dt>NSH:</dt><dd>Network Service Header <xref target="RFC8300"/></dd>
target="ietf_intarea_gue"/></dd> <dt>NVA:</dt><dd>Network Virtualization Authority</dd>
<dt>NVE:</dt><dd>Network Virtual Edge (refers to an NVE device)</dd>
<dt>HMAC</dt><dd>- Hash based keyed Message Authentication Code <xref <dt>NVO3:</dt><dd>Network Virtualization over Layer 3</dd>
target="RFC2104"/></dd> <dt>OAM:</dt><dd>Operations, Administration, and Maintenance <xref target="RFC6291"/></dd>
<dt>IEEE</dt><dd>- Institute of Electrical and Electronics Engineers <dt>PWE3:</dt><dd>Pseudowire Emulation Edge-to-Edge</dd>
(www.ieee.org)</dd> <dt>TCAM:</dt><dd>Ternary Content-Addressable Memory</dd>
<dt>TLV:</dt><dd>Type-Length-Value</dd>
<dt>NIC</dt><dd>- Network Interface Card (refers to network interface <dt>Transit device:</dt><dd>Refers to underlay network devices between NVEs.</dd>
hardware which is not necessarily a discrete "card")</dd> <dt>UUID:</dt><dd>Universally Unique Identifier</dd>
<dt>NSH </dt><dd>- Network Service Header <xref target="RFC8300"/></dd> <dt>VNI:</dt><dd>Virtual Network Identifier</dd>
<dt>NVA </dt><dd>- Network Virtualization Authority</dd> <dt>VXLAN:</dt><dd>Virtual eXtensible Local Area Network <xref target="RFC7348"/></dd>
</dl>
<dt>NVE </dt><dd>- Network Virtual Edge (device)</dd>
<dt>NVO3</dt><dd>- Network Virtualization Overlays over Layer 3</dd>
<dt>OAM </dt><dd>- Operations, Administration, and Maintenance <xref target="RFC6291"/></dd>
<dt>PWE3</dt><dd>- Pseudowire Emulation Edge to Edge</dd>
<dt>TCAM</dt><dd>- Ternary Content-Addressable Memory</dd>
<dt>TLV </dt><dd>- Type, Length, and Value</dd>
<dt>Transit device</dt><dd>- Underlay network devices between NVE(s).</dd>
<dt>UUID</dt><dd>- Universally Unique Identifier</dd>
<dt>VNI </dt><dd>- Virtual Network Identifier</dd>
<dt>VXLAN</dt><dd>- Virtual eXtensible LAN <xref target="RFC7348"/></dd>
</dl>
</section> </section>
<section> <!-- 5. --> <section>
<name>Encapsulation Issues and Background</name> <name>Encapsulation Issues and Background</name>
<t>The following subsections describe issues with current <t>The following subsections describe issues with current
encapsulations as discussed by the NVO3 WG. Numerous extensions and encapsulations as discussed by the NVO3 WG. Numerous extensions and
options have been designed for GUE and Geneve which may help resolve options have been designed for GUE and Geneve that may help resolve
some of these issues but have not yet been validated by the WG.</t> some of these issues, but these have not yet been validated by the WG.</t>
<t>Also included are diagrams and information on the candidate <t>Also included are diagrams and information on the candidate
encapsulations. These are mostly copied from other documents. Since encapsulations. These are mostly copied from other documents. Since
each protocol is assumed to be sent over UDP, an initial UDP Header each protocol is assumed to be sent over UDP, an initial UDP header
is shown which would be preceded by an IPv4 or IPv6 Header.</t> is shown that would be preceded by an IPv4 or IPv6 header.</t>
<section>
<section> <!-- 5.1 -->
<name>Geneve</name> <name>Geneve</name>
<t>The Geneve packet format, taken from <xref target="RFC8926"/>, is shown in <t>The Geneve packet format, taken from <xref target="RFC8926"/>, is shown in
<xref target="GeneveHeader"/> below.</t> <xref target="GeneveHeader"/> below.</t>
<figure anchor="GeneveHeader"> <figure anchor="GeneveHeader">
<name>Geneve Header</name> <name>Geneve Header</name>
<artwork type="ascii-art" align="center"> <artwork type="ascii-art" align="center"><![CDATA[
<![CDATA[
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
Outer UDP Header: Outer UDP Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Dest Port = 6081 Geneve | | Source Port | Dest Port = 6081 Geneve |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| UDP Length | UDP Checksum | | UDP Length | UDP Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Geneve Header: Geneve Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver| Opt Len |O|C| Rsvd. | Protocol Type | |Ver| Opt Len |O|C| Rsvd. | Protocol Type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Virtual Network Identifier (VNI) | Reserved | | Virtual Network Identifier (VNI) | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
~ Variable-Length Options ~ ~ Variable-Length Options ~
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]> ]]></artwork>
</artwork>
</figure> </figure>
<t>The type of payload being carried is indicated by an Ethertype <t>The type of payload being carried is indicated by an Ethertype
<xref target="RFC7042"/> in the Protocol Type field in the Geneve <xref target="RFC9542"/> in the Protocol Type field in the Geneve
Header; Ethernet itself is represented by Ethertype 0x6558. See header; Ethernet itself is represented by Ethertype 0x6558. See
<xref target="RFC8926"/> for details concerning UDP header <xref target="RFC8926"/> for details concerning UDP header
fields. The O bit indicates an OAM packet. The C bit is the fields. The O bit indicates an OAM packet. The Geneve C bit is the
"Critical" bit which means that the options must be processed or the "Critical" bit, which means that the options must be processed or the
packet discarded.</t> packet discarded.</t>
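The fixed Geneve header described above can be decoded with a few bit operations. The following is a minimal sketch, not a reference implementation. It assumes the field widths drawn in the Geneve Header figure (2-bit Ver, 6-bit Opt Len, O and C flag bits, 16-bit Protocol Type, 24-bit VNI) and, following RFC 8926, that Opt Len counts multiples of 4 bytes. The names `GeneveHeader` and `parse_geneve_header` are illustrative only.

```python
import struct
from typing import NamedTuple


class GeneveHeader(NamedTuple):
    """Illustrative container for the fixed 8-byte Geneve header."""
    version: int
    opt_len_bytes: int   # length of the variable-length options, in bytes
    oam: bool            # O bit
    critical: bool       # C bit
    protocol_type: int   # Ethertype of the payload
    vni: int             # 24-bit Virtual Network Identifier


def parse_geneve_header(data: bytes) -> GeneveHeader:
    """Decode the fixed part of a Geneve header (bytes following the UDP header)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for the fixed Geneve header")
    first, second, protocol_type, vni_and_reserved = struct.unpack("!BBHI", data[:8])
    return GeneveHeader(
        version=first >> 6,
        opt_len_bytes=(first & 0x3F) * 4,  # assumption: Opt Len is in 4-byte units
        oam=bool(second & 0x80),
        critical=bool(second & 0x40),
        protocol_type=protocol_type,
        vni=vni_and_reserved >> 8,         # drop the trailing 8 reserved bits
    )
```

A receiver that sees `critical` set but cannot process the options would, per the text above, discard the packet.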
<t>Issues with Geneve <xref target="RFC8926"/> are as follows:</t> <t>Issues with Geneve <xref target="RFC8926"/> are as follows:</t>
<ul> <ul>
<li>Can't be implemented cost-effectively in all use cases because <li>Geneve can't be implemented cost-effectively in all use cases because
variable length header and order of the TLVs makes it costly (in the variable-length header and order of the TLVs make it costly (in
terms of number of gates) to implement in hardware.</li> terms of number of gates) to implement in hardware.</li>
<li>Header doesn't fit into largest commonly available parse <li>The header doesn't fit into the largest commonly available parse
buffer (256 bytes in NIC). Cannot justify doubling buffer size buffer (256 bytes in a NIC). Thus, doubling the buffer size can't be
unless it is mandatory for hardware to process additional option justified unless it is mandatory for hardware to process additional option
fields.</li> fields.</li>
</ul> </ul>
<t>Selection of Geneve despite these issues may be the result of the <t>The selection of Geneve despite these issues may be the result of the
Geneve design effort assuming that the Geneve header would typically Geneve design effort, assuming that the Geneve header would typically
be delivered to a server and parsed in software.</t> be delivered to a server and parsed in software.</t>
</section> </section>
<section> <!-- 5.2 --> <section>
<name>Generic UDP Encapsulation (GUE)</name> <name>Generic UDP Encapsulation (GUE)</name>
<figure anchor="GUEHeader"> <figure anchor="GUEHeader">
<name>GUE Header</name> <name>GUE Header</name>
<artwork type="ascii-art" align="center"> <artwork type="ascii-art" align="center"><![CDATA[
<![CDATA[
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
UDP Header: UDP Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source port | Dest port = 6080 GUE | | Source Port | Dest Port = 6080 GUE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| UDP Length | Checksum | | UDP Length | UDP Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
GUE Header: GUE Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0 |C| Hlen | Proto/ctype | Flags | | 0 |C| Hlen | Proto/ctype | Flags |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
~ Extensions Fields (optional) ~ ~ Extensions Fields (optional) ~
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]> ]]></artwork>
</artwork>
</figure> </figure>
<t>The type of payload being carried is indicated by an IANA Internet <t>The type of payload being carried is indicated by an IANA protocol number in
protocol number in the Proto/ctype field. The C bit indicates a the Proto/ctype field. The GUE C bit (Control bit) indicates a
Control packet.</t> control packet.</t>
<t>Issues with GUE <xref target="ietf_intarea_gue"/> are as <t>Issues with GUE <xref target="I-D.ietf-intarea-gue"/> are as
follows:</t> follows:</t>
<ul> <ul>
<li>There were a significant number of objections to GUE related to <li>There were a significant number of objections to GUE related to
the complexity of implementation in hardware, similar to those noted the complexity of its implementation in hardware, similar to those noted
for Geneve above, such as the variable length and for Geneve above, such as the variable length and
possible high maximum length of the header.</li> possible high maximum length of the header.</li>
</ul> </ul>
</section> </section>
<section> <!-- 5.3 --> <section>
<name>Generic Protocol Extension (GPE) for VXLAN</name> <name>Generic Protocol Extension (GPE) for VXLAN</name>
<figure anchor="GPEHeader"> <figure anchor="GPEHeader">
<name>GPE Header</name> <name>GPE Header</name>
<artwork type="ascii-art" align="center"> <artwork type="ascii-art" align="center"><![CDATA[
<![CDATA[
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
Outer UDP Header: Outer UDP Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Dest Port = 4790 GPE | | Source Port | Dest Port = 4790 GPE |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| UDP Length | UDP Checksum | | UDP Length | UDP Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
VXLAN-GPE Header VXLAN-GPE Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|R|Ver|I|P|B|O| Reserved | Next Protocol | |R|R|Ver|I|P|B|O| Reserved | Next Protocol |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VXLAN Network Identifier (VNI) | Reserved | | Virtual Network Identifier (VNI) | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]> ]]></artwork>
</artwork>
</figure> </figure>
<t>The type of payload being carried is indicated by the Next Protocol <t>The type of payload being carried is indicated by the Next Protocol
field using a VXLAN-GPE-specific registry. The I bit indicates that field using a registry specific to VXLAN-GPE. The I bit indicates that
the VNI is valid. The P bit indicates that the Next Protocol field is the VNI is valid. The P bit indicates that the Next Protocol field is
valid. The B bit indicates the packet is an ingress replicated valid. The B bit indicates that the packet is an ingress replicated
Broadcast, Unknown Unicast, or Multicast packet. The O bit indicates Broadcast, Unknown Unicast, or Multicast packet. The O bit indicates
an OAM packet.</t> an OAM packet.</t>
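The flag bits described above can be read from the first byte of the VXLAN-GPE header. This is a minimal sketch that assumes the bit positions shown in the GPE Header figure (R, R, Ver, I, P, B, O from most to least significant bit), a 16-bit reserved field, an 8-bit Next Protocol field, and a 24-bit VNI. The function name `parse_vxlan_gpe_header` and the dictionary keys are illustrative only.

```python
import struct


def parse_vxlan_gpe_header(data: bytes) -> dict:
    """Decode the fixed 8-byte VXLAN-GPE header (bytes following the UDP header)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for the VXLAN-GPE header")
    flags, _reserved, next_protocol, vni_and_reserved = struct.unpack("!BHBI", data[:8])
    return {
        "version": (flags >> 4) & 0x3,
        "vni_valid": bool(flags & 0x08),            # I bit
        "next_protocol_valid": bool(flags & 0x04),  # P bit
        "bum": bool(flags & 0x02),                  # B bit
        "oam": bool(flags & 0x01),                  # O bit
        "next_protocol": next_protocol,
        "vni": vni_and_reserved >> 8,               # drop the trailing 8 reserved bits
    }
```

Because the header carries no length field and no option space, the decoder is a fixed-offset read; this is the simplicity that the issues listed below trade against extensibility.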
<t>Issues with VXLAN-GPE <xref target="nvo3_vxlan_gpe"/> are as <t>Issues with VXLAN-GPE <xref target="I-D.ietf-nvo3-vxlan-gpe"/> are as
follows:</t> follows:</t>
<ul> <ul>
<li>GPE is not day-1 backwards compatible with VXLAN <xref <li>GPE is not day one backwards compatible with VXLAN <xref
target="RFC7348"/>. Although the frame format is similar, it uses a target="RFC7348"/>. Although the frame format is similar, it uses a
different UDP port, so would require changes to existing different UDP port, so it would require changes to existing
implementations even if the rest of the GPE frame were the implementations even if the rest of the GPE frame were the
same.</li> same.</li>
<li>GPE is insufficiently extensible. It adds a Next Protocol field <li>GPE is insufficiently extensible. It adds a Next Protocol field
and some flag bits to the VXLAN header but is not otherwise and some flag bits to the VXLAN header but is not otherwise
extensible.</li> extensible.</li>
<li>Security, e.g., of the VNI, as discussed in <xref <li>As discussed in <xref target="SecExt"/>, security (e.g., of the VNI) has
target="SecExt"/>, has not been addressed by GPE. Although a shim not been addressed by GPE. Although a shim header could be added for
header could be added for security and to support other extensions, security and to support other extensions, this has not been defined
this has not been defined yet. More study would be needed to yet. More study would be needed to understand the implication of such a shim
understand the implication of such a shim on offloading in on offloading in NICs.</li>
NICs.</li>
</ul> </ul>
</section> </section>
</section> </section>
<section> <!-- 6. --> <section anchor="CommonEncapsulationConsiderations">
<name>Common Encapsulation Considerations</name> <name>Common Encapsulation Considerations</name>
<section> <!-- 6.1 --> <section>
<name>Current Encapsulations</name> <name>Current Encapsulations</name>
<t>Appendix A includes a detailed comparison between the three <t><xref target="EncapsulationComparison"/> includes a detailed comparison between the three
proposed encapsulations. The comparison indicates several common proposed encapsulations. The comparison indicates several common
properties but also three major differences among the properties but also three major differences among the
encapsulations:</t> encapsulations:</t>
<ul> <ul>
<li>Extensibility: Geneve and GUE were defined with built-in <li>Extensibility: Geneve and GUE were defined with built-in
extensibility, while VXLAN-GPE is not inherently extensible. Note extensibility, while VXLAN-GPE is not inherently extensible. Note
that any of the three encapsulations can be extended using the that any of the three encapsulations can be extended using the
Network Service Header (NSH <xref target="RFC8300"/>).</li> Network Service Header (NSH) <xref target="RFC8300"/>.</li>
<li>Extension method: Geneve is extensible using Type/Length/Value <li>Extension method: Geneve is extensible using Type-Length-Value
(TLV) fields, while GUE uses a small set of possible extensions, and (TLV) fields, while GUE uses a small set of possible extensions and
a set of flags that indicate which extensions are present.</li> a set of flags that indicate which extensions are present.</li>
<li>Length field: Geneve and GUE include a Length field, indicating <li>Length field: Geneve and GUE include a Length field, indicating
the length of the encapsulation header, while VXLAN-GPE does not the length of the encapsulation header, while VXLAN-GPE does not
include such a field. Thus it may be harder to skip the encapsulation include such a field. Thus, it may be harder to skip the encapsulation
header with VXLAN-GPE.</li> header with VXLAN-GPE.</li>
</ul> </ul>
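The practical effect of the Length field can be sketched as follows: a device that does not understand any extension can still locate the payload. The sketch assumes an 8-byte fixed Geneve header whose Opt Len counts 4-byte units (per RFC 8926) and a 4-byte fixed GUE header whose Hlen counts 4-byte units of extension fields (an assumption based on the GUE draft). The function names are illustrative only.

```python
def geneve_payload_offset(header: bytes) -> int:
    """Offset of the payload from the start of the Geneve header."""
    opt_len_words = header[0] & 0x3F   # low 6 bits of the first byte
    return 8 + opt_len_words * 4


def gue_payload_offset(header: bytes) -> int:
    """Offset of the payload from the start of the GUE header."""
    hlen_words = header[0] & 0x1F      # low 5 bits of the first byte
    return 4 + hlen_words * 4
```

VXLAN-GPE has no equivalent field, so a device can skip only the fixed 8-byte header; anything added after it, such as a shim, cannot be skipped without understanding it.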
</section> </section>
<section> <!-- 6.2 --> <section anchor="ExtensionsUseCases">
<name>Useful Extensions Use Cases</name> <name>Useful Extensions Use Cases</name>
<t>Non-vendor specific extensions, such as TLVs, MUST follow the <t>Extensions that are not vendor-specific, such as TLVs, <bcp14>MUST</bcp14> follow the
standardization process. The following use cases for extensions show standardization process. The following use cases for extensions show
that there is a strong requirement to support variable length that there is a strong requirement to support variable-length
extensions with possible different subtypes.</t> extensions with possible different subtypes.</t>
<section> <!--6.2.1 --> <section>
<name>Telemetry Extensions</name> <name>Telemetry Extensions</name>
<t>In several scenarios it is beneficial to make information about the <t>In several scenarios, it is beneficial to make information available to the
path a packet took through the network or through a network device as operator about the path a packet took through the network or through a network
well as associated telemetry information available to the device as well as information about associated telemetry.</t>
operator.</t>
<t>This includes not only tasks like debugging, troubleshooting, and <t>This includes not only tasks like debugging, troubleshooting, and
network planning and optimization but also policy or service level network planning and optimization but also policy or service level
agreement compliance checks.</t> agreement compliance checks.</t>
<t>Packet scheduling algorithms, especially for balancing traffic <t>Packet scheduling algorithms, especially for balancing traffic
across equal cost paths or links, often leverage information contained across equal-cost paths or links, often leverage information contained
within the packet, such as protocol number, IP address, or MAC within the packet, such as protocol number, IP address, or Media
address. Probe packets would thus either need to be sent between the Access Control (MAC) address. Thus, probe packets would need to be either
exact same endpoints with the exact same parameters, or probe packets sent between the
would need to be artificially constructed as "fake" packets and exact same endpoints with the exact same parameters or artificially constructed
as "fake" packets and
inserted along the path. Both approaches are often not feasible from inserted along the path. Both approaches are often not feasible from
an operational perspective because access to the end-system is not an operational perspective because access to the end system is not
feasible or the diversity of parameters and associated probe packets feasible or the diversity of parameters and associated probe packets
to be created is simply too large. An extension providing an in-band to be created is simply too large. An extension providing an in-band
telemetry mechanism <xref target="RFC9197"/> is an alternative in telemetry mechanism <xref target="RFC9197"/> is an alternative in
those cases.</t> those cases.</t>
</section> </section>
<section anchor="SecExt"> <!-- 6.2.2 --> <section anchor="SecExt">
<name>Security/Integrity Extensions</name> <name>Security/Integrity Extensions</name>
<t>Since the currently proposed NVO3 encapsulations do not protect <t>Since the currently proposed NVO3 encapsulations do not protect
their headers, a single bit corruption in the VNI field could deliver their headers, a single bit corruption in the VNI field could deliver
a packet to the wrong tenant. Extension headers are needed to use any a packet to the wrong tenant. Extension headers are needed to use any
sophisticated security.</t> sophisticated security.</t>
<t>The possibility of VNI spoofing with an NVO3 protocol is <t>The possibility of VNI spoofing with an NVO3 protocol is
exacerbated by using UDP. Systems typically have no restrictions on exacerbated by using UDP. Systems typically have no restrictions on
applications being able to send to any UDP port so an unprivileged applications being able to send to any UDP port, so an unprivileged
application can trivially spoof VXLAN <xref target="RFC7348"/> packets application can trivially spoof VXLAN <xref target="RFC7348"/> packets,
for instance, including using arbitrary VNIs.</t> using arbitrary VNIs, for instance.</t>
<t>One can envision support of an HMAC-like Message Authentication <t>One can envision support of an HMAC-like Message Authentication
Code (MAC) <xref target="RFC2104"/> in an NVO3 extension to Code (MAC) <xref target="RFC2104"/> in an NVO3 extension to
authenticate the header and the outer IP addresses, thereby preventing authenticate the header and the outer IP addresses, thereby preventing
attackers from injecting packets with spoofed VNIs.</t> attackers from injecting packets with spoofed VNIs.</t>
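<t>As a rough illustration of the HMAC idea above, the following Python sketch computes a tag over the outer IP addresses and the NVO3 header bytes and checks it on receipt. The function names, the choice of HMAC-SHA-256, the way the key is shared, and the 8-byte header used in testing are all assumptions made for this example; none of the NVO3 encapsulations define such an extension.</t>

```python
import hashlib
import hmac
import ipaddress


def header_tag(key: bytes, src_ip: str, dst_ip: str, nvo3_header: bytes) -> bytes:
    """Return an HMAC-SHA-256 tag over the outer IP addresses and NVO3 header.

    Hypothetical helper: the covered fields and their order are assumptions.
    """
    covered = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + nvo3_header
    )
    return hmac.new(key, covered, hashlib.sha256).digest()


def verify_header(key: bytes, src_ip: str, dst_ip: str,
                  nvo3_header: bytes, tag: bytes) -> bool:
    """Check a received tag in constant time."""
    expected = header_tag(key, src_ip, dst_ip, nvo3_header)
    return hmac.compare_digest(expected, tag)
```

<t>A receiver that shares the key can recompute the tag; a packet whose VNI was altered or spoofed without knowledge of the key fails the check.</t>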
<t>Another aspect of security is payload security. Essentially this <t>Another aspect of security is payload security. Essentially, this
makes packets that look like the following:</t> makes packets that look like the following:</t>
<sourcecode> <artwork><![CDATA[
IP|UDP|NVO3 Encap|DTLS/IPsec-ESP Extension|payload. IP|UDP|NVO3 Encap|DTLS/IPsec-ESP Extension|payload.
</sourcecode> ]]></artwork>
<t>This is desirable since we still have the UDP header for ECMP, the <t>This is desirable because:</t>
NVO3 header is in plain text so it can be read by network elements, <ul>
and different security or other payload transforms can be supported on <li>we still have the UDP header for ECMP,</li>
a single UDP port (we don't need a separate UDP port for DTLS/IPsec <li>the NVO3 header is in plain text so it can be read by network elements, and</li>
<xref target="RFC9147"/>/<xref target="RFC6071"/>).</t>
<li>different security or other payload transforms can be supported on
a single UDP port (we don't need a separate UDP port for DTLS/IPsec; see <xref target="RFC9147"/>
and <xref target="RFC6071"/>, respectively).</li>
</ul>
</section> </section>
<section> <!-- 6.2.3 --> <section>
<name>Group Based Policy</name> <name>Group-Based Policy</name>
<t>Another use case would be to carry the Group Based Policy (GBP) <t>Another use case would be to carry the Group-Based Policy (GBP)
source group information within an NVO3 header extension in a similar source group information within an NVO3 header extension in a similar
manner as has been implemented for VXLAN <xref target="VXLANgroup"/>. manner as has been implemented for VXLAN <xref target="I-D.smith-vxlan-group-policy"/>.
This allows various forms of policy such as access control and QoS to This allows various forms of policy such as access control and QoS to
be applied between abstract groups rather than coupled to specific be applied between abstract groups rather than coupled to specific
endpoint addresses.</t> endpoint addresses.</t>
</section> </section>
</section> </section>
<section> <!-- 6.3 -->
<name>Hardware Considerations</name>
<section anchor="HardwareConsiderations">
<name>Hardware Considerations</name>
<t>Hardware restrictions should be taken into consideration along with <t>Hardware restrictions should be taken into consideration along with
future hardware enhancements that may provide more flexible metadata future hardware enhancements that may provide more flexible metadata (MD)
processing. However, the set of options that need to and will be processing. However, the set of options that need to and will be
implemented in hardware will be a subset of what is implemented in implemented in hardware will be a subset of what is implemented in
software, since software NVEs are likely to grow features, and hence software. This is because software NVEs are likely to grow features, and hence
option support, at a more rapid rate.</t> option support, at a more rapid rate.</t>
<t>It is hard to predict which options will be implemented in which <t>It is hard to predict which options will be implemented in which
piece of hardware and when. That depends on whether the hardware will piece of hardware and when. That depends on whether the hardware will
be in the form of</t> be in the form of:</t>
<ul> <ul>
<li>a NIC providing increasing offload capabilities to software <li>a NIC providing increasing offload capabilities to software
NVEs,</li> NVEs, or</li>
<li>or a switch chip being used as an NVE gateway towards <li>a switch chip being used as an NVE gateway towards
non-NVO3 parts of the network,</li> non-NVO3 parts of the network, or even</li>
<li>or even a transit device that participates in the NVO3 <li>a transit device that participates in the NVO3
dataplane, e.g., for OAM purposes.</li> data plane, e.g., for OAM purposes.</li>
</ul> </ul>
<t>A result of this is that it doesn't look useful to prescribe some <t>A result of this is that it doesn't look useful to prescribe some
order of the options so that the ones that are likely to be implemented order to the options so that the ones that are likely to be implemented
in hardware come first; we can't decide such an order when we define in hardware come first. We can't decide such an order when we define
the options, however a control plane can enforce such an order for the options; however, a control plane can enforce such an order for
some hardware implementation.</t> some hardware implementations.</t>
<t>We do know that hardware needs to initially be able to efficiently <t>We do know that hardware initially needs to be able to efficiently
skip over the NVO3 header to find the inner payload. That is needed skip over the NVO3 header to find the inner payload. That is needed
both for NICs implementing various TCP offload mechanisms and for both for NICs implementing various TCP offload mechanisms and for
transit devices and NVEs applying policy or ACLs to the inner transit devices and NVEs applying policy or ACLs to the inner
payload.</t> payload.</t>
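<t>To make the "skip over the header" step concrete, here is a minimal Python sketch. It assumes a Geneve-style layout (RFC 8926): an 8-byte base header whose first byte holds a 2-bit version followed by a 6-bit options length counted in 4-byte words. The function name is hypothetical, and other encapsulations would need a different computation.</t>

```python
BASE_HEADER_LEN = 8  # assumed fixed base header size in bytes (Geneve-style)


def inner_payload_offset(nvo3_header: bytes) -> int:
    """Return the offset of the inner payload from the start of the NVO3 header.

    Only the first byte is inspected; no option is parsed.
    """
    if BASE_HEADER_LEN > len(nvo3_header):
        raise ValueError("truncated base header")
    opt_len_words = nvo3_header[0] % 64  # keep the low 6 bits of the first byte
    return BASE_HEADER_LEN + 4 * opt_len_words
```

<t>Because the offset depends on a single length field, a NIC or transit device can locate the inner payload without understanding any individual option.</t>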
</section> </section>
<section> <!-- 6.4 --> <section>
<name>Extension Size</name> <name>Extension Size</name>
<t>Extension header length has a significant impact on hardware and <t>Extension header length has a significant impact on hardware and
software implementations. A maximum total header length that is too software implementations. A maximum total header length that is too
small will unnecessarily constrain software flexibility. A maximum small will unnecessarily constrain software flexibility. A maximum
total header length that is too large will place a nontrivial cost on total header length that is too large will place a nontrivial cost on
hardware implementations. Thus, the DT recommends that there be a hardware implementations. Thus, the DT recommends that there be a
minimum and maximum total available extension header length specified. minimum and maximum total available extension header length specified.
The maximum total header length is determined by the size of the bit The maximum total header length is determined by the size of the bit
field allocated for the total extension header length field. The risk field allocated for the total extension header length field. The risk
with this approach is that it may be difficult to extend the total with this approach is that it may be difficult to extend the total
header size in the future. The minimum total header length is header size in the future. The minimum total header length is
determined by a requirement in the specifications that all determined by a requirement in the specifications that all
implementations must meet. The risk with this approach is that all implementations must meet. The risk with this approach is that all
implementations will only implement support for the minimum total implementations will only implement support for the minimum total
header length which would then become the de facto maximum total header length, which would then become the de facto maximum total
header length.</t> header length.</t>
<t>The recommended minimum total available header length is 64 <t>The recommended minimum total available header length is 64
bytes.</t> bytes.</t>
<t>The size of an extension header should always be 4 byte <t>The size of an extension header should always be 4-byte
aligned.</t> aligned.</t>
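<t>The alignment and minimum-length recommendations can be sketched in a few lines of Python. The helper names below are hypothetical, and zero-filled padding is an assumption of this example rather than something the text specifies.</t>

```python
MIN_TOTAL_EXT_LEN = 64  # recommended minimum total available length, in bytes


def padded_len(length: int) -> int:
    """Round an extension length up to the next multiple of 4 bytes."""
    return (length + 3) // 4 * 4


def pad_extension(ext: bytes) -> bytes:
    """Append zero bytes so the extension ends on a 4-byte boundary."""
    return ext + bytes(padded_len(len(ext)) - len(ext))


def fits_minimum(extensions: list) -> bool:
    """True if the padded extensions fit in the recommended minimum length."""
    return MIN_TOTAL_EXT_LEN >= sum(padded_len(len(e)) for e in extensions)
```

<t>A sender that stays within the recommended minimum can expect every compliant receiver to handle the resulting header.</t>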
<t>The maximum length of a single option should be large enough to <t>The maximum length of a single option should be large enough to
meet the different extension use case requirements, e.g., in-band meet the different extension use case requirements, e.g., for in-band
telemetry and future use.</t> telemetry and future use.</t>
</section> </section>
<section> <!-- 6.5 -->
<name>Ordering of Extension Headers</name>
<section>
<name>Ordering of Extension Headers</name>
<t>To support hardware nodes at the target NVE or at a transit device <t>To support hardware nodes at the target NVE or at a transit device
that can process one or a few extension headers in TCAM, a control that can process one or a few extension headers in TCAM, a control
plane in such a deployment can signal a capability to ensure a plane in such a deployment could signal a capability to ensure that a
specific extension header will always appear in a specific order, for specific extension header will always appear in a specific order, for
example the first one in the packet.</t> example, that such a specific extension header appear first in the packet.</t>
<t>The order of the extension headers should be hardware friendly for <t>The order of the extension headers should be hardware friendly for
both the sender and the receiver and possibly some transit devices both the sender and the receiver and possibly some transit devices
also. This may requre that the extension headers and their order be as well. This may require that the extension headers and their order be
dynamically determined based on the hardware of those devices.</t> determined dynamically based on the hardware of those devices.</t>
<t>Transit devices don't participate in control plane communication <t>Transit devices don't participate in control plane communication
between the end points and are not required to process the extension between the endpoints and are not required to process the extension
headers; however, if they do, they may need to process only a small headers; however, if they do, they may need to process only a small
subset of the extension headers that will be consumed by target subset of the extension headers that will be consumed by target
NVEs.</t> NVEs.</t>
</section> </section>
<section> <!-- 6.6 -->
<section>
<name>TLV versus Bit Fields</name> <name>TLV versus Bit Fields</name>
<t>If there is a well-known initial set of options that are likely to <t>If there is a well-known initial set of options that is likely to
be implemented in software and in hardware, it can be efficient to use be implemented in software and in hardware, it can be efficient to use
the bit fields approach to indicate the presence of extensions as in the bit fields approach to indicate the presence of extensions as in
GUE. However, as described in section 6.3, if options are added over GUE. However, as described in <xref target="HardwareConsiderations"/>, if options are added over
time and different subsets of options are likely to be implemented in time and different subsets of options are likely to be implemented in
different pieces of hardware, then it would be hard for the IETF to different pieces of hardware, then it would be hard for the IETF to
specify which options should get the early bit fields. TLVs are a lot specify which options should get the early bit fields. TLVs are a lot
more flexible, which avoids the need to determine the relative more flexible, which avoids the need to determine the relative
importance of different options. However, general TLVs of arbitrary importance of different options. However, general TLVs of arbitrary
order, size, and repetition are difficult to implement in hardware. A order, size, and repetition are difficult to implement in hardware. A
middle ground is to use TLVs with restrictions on their size and middle ground is to use TLVs with restrictions on their size and
alignment, observing that individual TLVs can have a fixed length, and alignment, observing that individual TLVs can have a fixed length, and
to support via the control plane a method such that an NVE will only to support via the control plane a method such that an NVE will only
receive options that it needs and implements. The control plane receive options that it needs and implements. The control plane
approach can potentially be used to control the order of the TLVs sent approach can potentially be used to control the order of the TLVs sent
to a particular NVE. Note that transit devices are not likely to to a particular NVE. Note that transit devices are not likely to
participate in the control plane; hence, to the extent that they need participate in the control plane; hence, to the extent that they need
to participate in option processing, some other method must be to participate in option processing, some other method must be
used. Transit devices would have issues with future GUE bit fields used. Transit devices would have issues with future GUE bit fields
being defined for future options as well.</t> being defined for future options as well.</t>
<t>A benefit of TLVs from a hardware perspective is that they are self <t>A benefit of TLVs from a hardware perspective is that they are self describing,
describing, i.e., all the information is in the TLV. In a bit field
i.e., all the information is in the TLV. In a bit field
approach, the hardware needs to look up the bit to determine the approach, the hardware needs to look up the bit to determine the
length of the data associated with the bit through some separate length of the data associated with the bit through some separate
table, which would add hardware complexity.</t> table, which would add hardware complexity.</t>
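<t>The self-describing property can be illustrated with a short Python sketch that walks an option area without any per-type table. The layout used here (a 4-byte option header with a 1-byte type, a 1-byte value length in 4-byte words, and 2 reserved bytes) is a hypothetical one chosen for the example, not the format of any particular encapsulation.</t>

```python
def walk_tlvs(options: bytes):
    """Yield (type, value) pairs from a hypothetical TLV option area.

    Unknown types need no special handling: each TLV carries its own
    length, so the parser can always step to the next one.
    """
    offset = 0
    while len(options) > offset:
        if offset + 4 > len(options):
            raise ValueError("truncated option header")
        opt_type = options[offset]
        value_len = options[offset + 1] * 4  # length is in 4-byte words
        end = offset + 4 + value_len
        if end > len(options):
            raise ValueError("option value runs past the option area")
        yield opt_type, options[offset + 4:end]
        offset = end
```

<t>With a bit field approach, the equivalent loop would need a separate table mapping each flag bit to the length of its data.</t>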
<t>There are use cases where multiple modules of software are running <t>There are use cases where multiple modules of software are running
on an NVE. These can be modules such as a diagnostic module by one on an NVE. These can be modules such as a diagnostic module by one
vendor that does packet sampling and another module from a different vendor that does packet sampling and another module from a different
vendor that implements a firewall. Using a TLV format, it is easier vendor that implements a firewall. Using a TLV format, it is easier
to have different software modules process different TLVs, which could to have different software modules process different TLVs without conflicting with
be standard extensions or vendor specific extensions defined by the each other. Such TLVs could be standard extensions or vendor-specific extensions.
different vendors, without conflicting with each other. This can help This can help
with hardware modularity as well. There are some implementations with with hardware modularity as well. There are some implementations with
options that allows different software modules, like MAC learning and options that allow different software modules, like MAC learning and
security, to process different options.</t> security, to process different options.</t>
</section> </section>
<section> <!-- 6.7 -->
<section>
<name>Control Plane Considerations</name> <name>Control Plane Considerations</name>
<t>Given that we want to allow considerable flexibility and <t>Given that we want to allow considerable flexibility and
extensibility, e.g., for software NVEs, yet be able to support extensibility (e.g., for software NVEs), yet want to be able to support
important extensions in less flexible contexts such as hardware NVEs, important extensions in less flexible contexts such as hardware NVEs,
it is useful to consider the control plane. By control plane in this it is useful to consider the control plane. By control plane in this
section we mean both protocols, such as EVPN <xref target="RFC8365"/> section we mean protocols, such as EVPN <xref target="RFC8365"/>
and others, and deployment specific configuration.</t> and others, and deployment-specific configurations.</t>
<t>If each NVE can express in the control plane that it only supports <t>If each NVE can express in the control plane that it only supports
certain extensions (which could be a single extension, or a few), and certain extensions (which could be a single extension, or a few), and
the source NVEs only include supported extensions in the NVO3 packets, the source NVEs only include supported extensions in the NVO3 packets,
then the target NVE can both use a simpler parser (e.g., a TCAM might then the target NVE can use a simpler parser (e.g., a TCAM might
be usable to look for a single NVO3 extension) and the depth of the be usable to look for a single NVO3 extension) and the depth of the
inner payload in the NVO3 packet will be minimized. Furthermore, if inner payload in the NVO3 packet will be minimized. Furthermore, if
the target NVE cares about a few extensions and can express in the the target NVE cares about a few extensions and can express in the
control plane the desired order of those extensions in the NVO3 control plane the desired order of those extensions in the NVO3
packets, then the deployment can provide useful functionality with packets, then the deployment can provide useful functionality with
simplified hardware requirements for the target NVE.</t> simplified hardware requirements for the target NVE.</t>
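<t>A minimal Python sketch of the source NVE side of this idea follows. It assumes that each target NVE has advertised, through some control plane mechanism not defined here, an ordered list of the extensions it supports; the function name and the extension labels are hypothetical.</t>

```python
def options_for_target(wanted: set, advertised_order: list) -> list:
    """Return the extensions to place in a packet for one target NVE.

    Only extensions the target advertised are kept, and they are emitted
    in the order the target asked for.
    """
    return [opt for opt in advertised_order if opt in wanted]
```

<t>For example, if the source would like to send telemetry, group policy, and an integrity tag, but the target advertised only the integrity tag followed by telemetry, the packet carries just those two, in that order.</t>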
<t>Transit devices that are not aware of the NVO3 extensions somewhat <t>Transit devices that are not aware of the NVO3 extensions somewhat
benefit from such an approach, since the inner payload is less deep in benefit from such an approach, since the inner payload is less deep in
the packet if no extraneous extension headers are included in the the packet if no extraneous extension headers are included in the
packet. In general, a transit device is not likely to participate in packet. In general, a transit device is not likely to participate in
the NVO3 control plane. However, configuration mechanisms can take the NVO3 control plane. However, configuration mechanisms can take
into account limitations of the transit devices used in particular into account limitations of the transit devices used in particular
deployments.</t> deployments.</t>
<t>Note that with this approach different NVEs could desire different <t>Note that with this approach, different NVEs could desire different
extensions or sets of extensions, which means that the source NVE extensions or sets of extensions, which means that the source NVE
needs to be able to place different sets of extensions in different needs to be able to place different sets of extensions in different
NVO3 packets, and perhaps in different order. It also assumes that NVO3 packets, and perhaps in a different order. It also assumes that
underlay multicast or replication servers are not used together with underlay multicast or replication servers are not used together with
NVO3 extension headers.</t> NVO3 extension headers.</t>
<t>There is a need to consider mandatory extensions versus optional <t>There is a need to consider mandatory extensions versus optional
extensions. Mandatory extensions require the receiver to drop the extensions. Mandatory extensions require the receiver to drop the
packet if the extension is unknown. A control plane mechanism can packet if the extension is unknown. A control plane mechanism can
prevent the need for dropping unknown extensions, since they would not prevent the need for dropping unknown extensions, since they would not
be included to target NVEs that do not support them.</t> be included to target NVEs that do not support them.</t>
<t>The control planes defined today need to add the ability to <t>The control planes defined today need to add the ability to
describe the different encapsulations. Thus, perhaps EVPN <xref describe the different encapsulations. Thus, perhaps EVPN <xref
target="RFC8365"/> and any other control plane protocol that the IETF target="RFC8365"/> and any other control plane protocol that the IETF
defines should have a way to indicate the supported NVO3 extensions defines should have a way to indicate the supported NVO3 extensions
and their order, for each of the encapsulations supported.</t> and their order for each of the encapsulations supported.</t>
<t>Developing a separate draft on guidance for option processing and <t>Developing a separate document on guidance for option processing and
control plane participation should be considered. This should provide control plane participation should be considered. This should provide
examples/guidance on range of usage models and deployments scenarios examples and guidance on the range of usage models and deployment scenarios
for specific options and ordering that are relevant for that specific for specific options. It should also provide examples of option ordering that are
relevant for that specific
deployment. This includes endpoints and middleboxes that are using the
options. Having the control plane negotiate the constraints is the options. Having the control plane negotiate the constraints is the
most appropriate and flexible way to address these requirements.</t> most appropriate and flexible way to address these requirements.</t>
</section> </section>
<section> <!-- 6.8 -->
<name>Split NVE</name>
<section>
<name>Split NVE</name>
<t>If there is a need for hosts to send and receive options in a split <t>If there is a need for hosts to send and receive options in a split
NVE case <xref target="RFC8394"/>, this is possible using any of the NVE case <xref target="RFC8394"/>, this is possible using any of the
existing extensible encapsulations (Geneve, GUE, GPE+NSH) by defining existing extensible encapsulations (GPE with NSH, GUE, or Geneve) by defining
a way to carry those over other transports. NSH can already be used a way to carry those over other transports. An NSH can already be used
over different transports.</t> over different transports.</t>
<t>If this is needed with other encapsulations it can be done by <t>If this is needed with other encapsulations, it can be done by
defining an Ethertype so that it can be carried over Ethernet and defining an Ethertype so that it can be carried over Ethernet and
<xref target="IEEE802.1Q"/>.</t> IEEE Std 802.1Q <xref target="IEEE802.1Q"/>.</t>
<t>If there is a need to carry other encapsulations over MPLS, it <t>If there is a need to carry other encapsulations over MPLS, it
would require an EVPN control plane to signal that other encapsulation would require an EVPN control plane to signal that other encapsulation
header + options will be present in front of the L2 packet. The VNI headers and options will be present in front of the Layer 2 (L2) packet. The VNI
can be ignored in the header, and the MPLS label will be the one used can be ignored in the header, and the MPLS label will be the one used
to identify the EVPN L2 instance.</t> to identify the EVPN L2 instance.</t>
</section> </section>
<section anchor="LargerVNI"> <!-- 6.9 -->
<section anchor="LargerVNI">
<name>Larger VNI Considerations</name> <name>Larger VNI Considerations</name>
<t>Whether we should make the VNI 32-bits or larger was one of the <t>Whether we should make the VNI 32 bits or larger was one of the
topics considered. The benefit of a 24-bit VNI would be to avoid topics considered. The benefit of a 24-bit VNI would be to avoid
unnecessary changes with existing proposals and implementations that unnecessary changes with existing proposals and implementations that
are almost all, if not all, using 24-bit VNI. If we need a larger are almost all, if not all, using a 24-bit VNI. If we need a larger
VNI, perhaps for a telemetry case, an extension can be used to support VNI, perhaps for a telemetry case, an extension can be used to support
that. </t> that. </t>
</section> </section>
</section> </section>
<section> <!-- 7. -->
<section anchor="Recommendations">
<name>Recommendations</name> <name>Recommendations</name>
<t>The Design Team (DT) reported that Geneve was most suitable as a <t>The Design Team reported that Geneve was most suitable as a
starting point for a proposed standard for network virtualization, for starting point for a proposed standard for network virtualization, for
the reasons given below. This conclusion was supported by the reasons given below. This conclusion was supported by
the NVO3 Working Group.</t> the NVO3 Working Group.</t>
<ol> <ol>
<li>On whether VNI should be in the base header or in an extension <li>On whether the VNI should be in the base header or in an extension
header and whether it should be a 24-bit or 32-bit field (see <xref header and whether it should be a 24-bit or 32-bit field (see <xref
target="LargerVNI"/>), it was agreed that VNI is critical target="LargerVNI"/>), it was agreed that the VNI is critical
information for network virtualization and MUST be present in all information for network virtualization and <bcp14>MUST</bcp14> be present in all
packets. It was also agreed that a 24-bit VNI, which is supported packets. It was also agreed that a 24-bit VNI, which is supported
by Geneve, matches the existing widely used encapsulation formats, by Geneve, matches the existing widely used encapsulation formats,
i.e., VXLAN <xref target="RFC7348"/> and NVGRE <xref i.e., VXLAN <xref target="RFC7348"/> and Network Virtualization Using Generic Routing Encapsulation (NVGRE) <xref
target="RFC7637"/>, and hence is more suitable to use going target="RFC7637"/>, and hence is more suitable to use going
forward.</li> forward.</li>
<li>The Geneve header has the total options length which allows <li>The Geneve header has the total options length, which allows
skipping over the options for NIC offload operations and will allow skipping over the options for NIC offload operations and
transit devices to view flow information in the inner payload.</li> transit devices to view flow information in the inner payload.</li>
<li>The option of using NSH <xref target="RFC8300"/> with VXLAN-GPE <li>The option of using an NSH <xref target="RFC8300"/> with VXLAN-GPE
was considered but given that NSH is targeted at service chaining was considered, but given that an NSH is targeted at service chaining
and contains service chaining information, it is less suitable for and contains service chaining information, it is less suitable for
the network virtualization use case. The other downside for the network virtualization use case. The other downside of
VXLAN-GPE was lack of a header length in VXLAN-GPE, which makes VXLAN-GPE was the lack of a header length in VXLAN-GPE, which makes
skipping over the headers to process inner payload more difficult. A skipping over the headers to process inner payloads more difficult. A
Total Option Length is present in Geneve. It is not possible to total options length is present in Geneve. It is not possible to
skip any options in the middle with VXLAN-GPE. In principle a split skip any options in the middle with VXLAN-GPE. In principle, a split
between a base header and a header with options is interesting between a base header and a header with options is interesting
(whether that options header is NSH or some new header without ties (whether that options header is an NSH or some new header without ties
to a service path). Whether it would make sense to either use NSH to a service path). Whether it would make sense to either use an NSH
for this, or define a new NVO3 options header was explored. for this or define a new NVO3 options header was explored.
However, this makes it slightly harder to find the inner payload However, this makes it slightly harder to find the inner payload
since the length field is not in the NVO3 header itself. Thus, one since the Length field is not in the NVO3 header itself. Thus, one
more field would have to be extracted to compute the start of the more field would have to be extracted to compute the start of the
inner payload. Also, if the experience with IPv6 extension headers inner payload. Also, if the experience with IPv6 extension headers
is a guide, there would be a risk that key pieces of hardware might is a guide, there would be a risk that key pieces of hardware might
not implement the options header, resulting in future calls to not implement the options header, resulting in future calls to
deprecate its use. Making the options part of the base NVO3 header deprecate its use. Making the options part of the base NVO3 header
has less of those issues. Even though the implementation of any has less of those issues. Even though the implementation of any
particular option can not be predicted ahead of time, the option particular option can't be predicted ahead of time, the option
mechanism and ability to skip the options is likely to be broadly mechanism and ability to skip the options is likely to be broadly
implemented.</li> implemented.</li>
<li>The TLV style and bit field style of extension were compared. It <li>The TLV style and bit field style of extension mechanisms were compared. It
was deemed that parsing either TLVs or bit fields is expensive and,
while bit fields may be simpler to parse, it is also more was deemed that parsing either TLVs or bit fields is expensive, and
restrictive and requires guessing which extensions will be widely while bit fields may be simpler to parse, they are also more
implemented so they can get early bit assignments. Given that half restrictive and require guessing which extensions will be widely
implemented in order to get early bit assignments. Given that half
the bits are already assigned in GUE, a widely deployed extension the bits are already assigned in GUE, a widely deployed extension
may appear in a flag extension, and this will require extra may appear in a flag extension, and this will require extra
processing, to dig the flag from the flag extension and then look processing to dig the flag from the flag extension and then look
for the extension itself. Also bit fields are not flexible enough for the extension itself. Also, bit fields are not flexible enough
to address the requirements from OAM, Telemetry, and security to address the requirements from OAM, telemetry, and security
extensions, for variable length option and different subtypes of the extensions for variable-length options and different subtypes of the
same option. While TLVs are more flexible, a control plane can same option. While TLVs are more flexible, a control plane can
restrict the number of option TLVs as well as the order and size of restrict the number of option TLVs as well as the order and size of
the TLVs to limit this flexibility and make the TLVs simpler for a the TLVs to limit this flexibility and make the TLVs simpler for a
dataplane implementation to handle.</li> data plane implementation to handle.</li>
<li>The multi-vendor NVE case was briefly discussed, as was the need <li>The multi-vendor NVE case was briefly discussed, as was the need
to allow vendors to put their own extensions in the NVE header. to allow vendors to put their own extensions in the NVE header.
This is possible with TLVs.</li> This is possible with TLVs.</li>
<li>It was agreed that the C (Critical) bit in Geneve is <li>It was agreed that the C bit (Critical bit) in Geneve is
helpful. This bit indicates that the header includes options which helpful. This bit indicates that the header includes options that
must be parsed or the packet discarded. It allows a receiver NVE to must be parsed, or else the packet must be discarded. The bit allows a receiver NVE to
easily decide whether to process options or not, for example a UUID easily decide whether or not to process options (such as a UUID-based packet trace) and decide how an optional extension can be ignored. Thus, a Critical bit
based packet trace, and how an optional extension such as that can makes it easy for the NVE to skip over the options not marked with such a bit.
be ignored by a receiver NVE and thus make it easy for NVE to skip Thus, the C bit should remain as defined in
over the options. Thus, the C bit should remain as defined in
Geneve.</li> Geneve.</li>
<li>There are already some extensions that are being discussed (see <li>There are already some extensions of varying sizes that are being discussed (see
section 6.2) of varying sizes. By using Geneve options it is <xref target="ExtensionsUseCases"/>). By using Geneve options, it is
possible to get in-band parameters like switch id, ingress port, possible to get in-band parameters like switch id, ingress port,
egress port, internal delay, and queue size using TLV extensions for egress port, internal delay, and queue size using TLV extensions for
telemetry purpose from switches. It is also possible to add telemetry purposes from switches. It is also possible to add
security extension TLVs like HMAC <xref target="RFC2104"/> and security extension TLVs like HMAC <xref target="RFC2104"/> and
DTLS/IPsec <xref target="RFC9147"/>/<xref target="RFC6071"/> to DTLS/IPsec (see <xref target="RFC9147"/> and <xref target="RFC6071"/>, respectively) to
authenticate the Geneve packet header and secure the Geneve packet authenticate the Geneve packet header and secure the Geneve packet
payload by software or hardware tunnel endpoints. A Group Based payload by software or hardware tunnel endpoints. A Group-Based
Policy extension TLV can be carried as well.</li> Policy extension TLV can be carried as well.</li>
<li>There are already implementations of Geneve options deployed in <li>There are already implementations of Geneve options deployed in
production networks. There is as well new hardware supporting production networks. There is new hardware supporting
Geneve TLV parsing. In addition, an In-band Telemetry <xref Geneve TLV parsing as well. In addition, an In-band Telemetry (INT) specification <xref
target="INT"/> specification is being developed by P4.org that target="INT"/> is being developed by P4.org that
illustrates the option of INT meta data carried over Geneve. illustrates the option of INT metadata carried over Geneve. Open Virtual Network (OVN) and Open vSwitch (OVS) <xref target="OVN"/> have also defined one or more option TLVs
OVN/OVS <xref target="OVN"/> have also defined some option TLV(s)
for Geneve.</li> for Geneve.</li>
<li>Usage requirements (see Section 6) have been addressed while <li>Usage requirements (see <xref
considering the requirements and implementations in general target="CommonEncapsulationConsiderations"/>) have been addressed while also
including software and hardware.</li> considering requirements and implementations in general (including those for
software and hardware).</li>
</ol> </ol>
<t>There seems to be interest in standardizing some well-known secure <t>There seems to be interest in standardizing some well-known secure
option TLVs to secure the header and payload to guarantee option TLVs to secure the header and payload to guarantee
encapsulation header integrity and tenant data privacy. The working encapsulation header integrity and tenant data privacy. The working
group should consider standardizing such option(s).</t> group should consider standardizing such option(s).</t>
<t>The following enhancements to Geneve are recommended to make it <t>The following enhancements to Geneve are recommended to make it
more suitable to hardware and yet provide flexibility for more suitable to hardware and yet provide flexibility for
software:</t> software:</t>
<ul> <ul>
<li>The following sort of text is recommended: while TLVs are more <li>The following sort of text is recommended in Geneve documents: while TLVs are more
flexible, a control plane can restrict the number of option TLVs as flexible, a control plane can restrict the number of option TLVs as
well the order and size of the TLVs to make it simpler for a data well as the order and size of the TLVs to make it simpler for a data
plane implementation in software or hardware to handle. For plane implementation in software or hardware to handle. For
example, there may be some critical information such as a secure example, there may be some critical information such as a secure
hash that must be processed in a certain order at lowest hash that must be processed in a certain order at lowest
latency.</li> latency.</li>
<li>A control plane can negotiate a subset of option TLVs and <li>A control plane can negotiate a subset of option TLVs and
certain TLV ordering, as well as limiting the total number of option certain TLV ordering, as well as limiting the total number of option
TLVs present in the packet, for example, to allow for hardware TLVs present in the packet, for example, to allow for hardware
capable of processing fewer options. Hence, the control plane needs capable of processing fewer options. Hence, the control plane needs
to have the ability to describe the supported TLVs subset and their to have the ability to describe the supported TLVs subset and their
order.</li> order.</li>
<li>The Geneve documents should specify that the subset and order of <li>The Geneve documents should specify that the subset and order of
option TLVs SHOULD be configurable for each remote NVE in the option TLVs <bcp14>SHOULD</bcp14> be configurable for each remote NVE in the
absence of a protocol control plane.</li> absence of a protocol control plane.</li>
<li>Geneve should follow fragmentation recommendations in overlay <li>Geneve should follow fragmentation recommendations in overlay services
services like PWE3 and the L2/L3 VPN recommendations to guarantee like PWE3 and the L2/L3 VPN recommendations to guarantee larger MTUs for the
larger MTU for the tunnel overhead (<xref target="RFC3985"/> Section tunnel overhead (<xref target="RFC3985" sectionFormat="comma"
5.3).</li> section="5.3"/>).</li>
<li>Geneve should provide a recommendation for critical bit <li>The Geneve documents should provide a recommendation for C bit (Critical bit)
processing - text could specify how critical bits can be used with processing. This text could specify how critical bits can be used with
control plane specifying the critical options.</li> control planes and specify the critical options.</li>
<li>Given that there is a telemetry option use case for a length of <li>Given that there is a telemetry option use case for a length of
256 bytes, it is recommended that Geneve increase the Single TLV 256 bytes, it is recommended that Geneve increase the single TLV
option length to 256.</li> option length to 256.</li>
<li>Geneve address requirements for OAM considerations for alternate <li>Geneve address requirements for OAM considerations for alternate
marking and for performance measurements that need a 2 bit field in marking and for performance measurements that need a 2-bit field in
the header should be considered and the need for the current OAM bit the header should be considered and the need for the current OAM bit
in the Geneve Header clarified.</li> in the Geneve header should be clarified.</li>
<li>The WG should work on security options for Geneve.</li> <li>The WG should work on security options for Geneve.</li>
</ul> </ul>
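The control-plane restriction recommended above (a negotiated subset of option TLVs, a fixed order, and a cap on the number of options) can be illustrated with a small check. This is only a sketch under stated assumptions: the function name `conforms`, the representation of an option as an `(option_class, option_type)` pair, and the rule that each permitted option appears at most once are all hypothetical and are not taken from the Geneve documents.

```python
from typing import List, Tuple

# Hypothetical representation: an option is identified by (class, type).
OptionKey = Tuple[int, int]

def conforms(received: List[OptionKey],
             allowed_order: List[OptionKey],
             max_options: int) -> bool:
    """Check a packet's option list against a control-plane policy.

    The policy (an assumption for illustration) is: at most `max_options`
    options, each drawn from `allowed_order`, appearing in the same
    relative order as in `allowed_order`, and each at most once.
    """
    if len(received) > max_options:
        return False
    pos = 0  # next position in allowed_order that may still match
    for opt in received:
        # Advance through the permitted order until this option is found.
        while pos < len(allowed_order) and allowed_order[pos] != opt:
            pos += 1
        if pos == len(allowed_order):
            return False  # unknown option, out of order, or repeated
        pos += 1
    return True
```

A hardware data plane that can only handle a few options in a known order could reject, or send to a slow path, any packet for which such a check fails.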
</section> </section>
<section anchor="Acknowledgements"> <!-- 8. -->
<name>Acknowledgements</name>
<t>The authors would like to thank Tom Herbert for providing the <section>
motivation for the Security/Integrity extension, and for his valuable
comments, T. Sridhar for his valuable comments and feedback, Anoop
Ghanwani for his extensive comments, and Ignas Bagdonas.</t>
</section>
<section> <!-- 9. -->
<name>Security Considerations</name> <name>Security Considerations</name>
<t>This document does not introduce any additional security <t>This document does not introduce any additional security constraints;
constraints; however, <xref target="SecExt"/> discusess however, <xref target="SecExt"/> discusses security/integrity extensions and
security/integrity extensions and this document suggests, in Section this document suggests, in <xref target="Recommendations"/>, that the NVO3 WG
7, that the the nvo3 WG work on security options for Geneve.</t> work on security options for Geneve.</t>
</section> <!-- end Security Considerations --> </section>
<section anchor="IANA"> <section anchor="IANA">
<name>IANA Considerations</name> <name>IANA Considerations</name>
<t>This document requires no IANA actions.</t> <t>This document has no IANA actions.</t>
</section> </section>
</middle> </middle>
<!-- ____________________BACK_MATTER____________________ -->
<back> <back>
<references> <displayreference target="I-D.ietf-intarea-gue-extensions" to="GUE-EXTENSIONS"/>
<name>Normative References</name> <displayreference target="I-D.ietf-intarea-gue" to="GUE"/>
<displayreference target="I-D.ietf-nvo3-vxlan-gpe" to="VXLAN-GPE"/>
<displayreference target="I-D.smith-vxlan-group-policy" to="VXLAN-GROUP"/>
<displayreference target="I-D.hy-nvo3-gue-4-nvo" to="GUE-ENCAPSULATION"/>
<xi:include <references>
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.2119.xml"/> <name>References</name>
<xi:include <references>
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8174.xml"/> <name>Normative References</name>
</references> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<references> </references>
<references>
<name>Informative References</name> <name>Informative References</name>
<reference anchor="ietf_gue_extensions" <!-- [I-D.ietf-intarea-gue-extensions] IESG state: Expired as of 09/16/24 -->
target="https://datatracker.ietf.org/doc/draft-ietf-intarea-gue-extensions/"> <xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-intarea-gue-extensions.xml"/>
<front>
<title>Extensions for Generic UDP Encapsulation</title>
<author initials="T." surname="Herbert"/>
<author initials="L." surname="Yong"/>
<author initials="F." surname="Templin"/>
<date year="2019" month="March" day="8"/>
</front>
<seriesInfo name="work in" value="progress"/>
</reference>
<reference anchor="ietf_intarea_gue" <!-- [I-D.ietf-intarea-gue] IESG state: Expired as of 09/16/24 -->
target=""> <xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.ietf-intarea-gue.xml"/>
<front>
<title>Generic UDP Encapsulation</title>
<author initials="T." surname="Herbert"/>
<author initials="L." surname="Yong"/>
<author initials="O." surname="Zia"/>
<date year="2019" month="October" day="26"/>
</front>
<seriesInfo name="work in" value="progress"/>
</reference>
<reference anchor="IEEE802.1Q"> <reference anchor="IEEE802.1Q">
<front> <front>
<title>Bridges and Bridged Networks</title> <title>IEEE Standard for Local and Metropolitan Area Networks--Bridges and Bridged Networks</title>
<author initials="IEEE" surname="802.1 WG" <author>
fullname="IEEE 802.1 Working Group">
<organization>Institute for Electrical and Electronic <organization>IEEE</organization>
Engineers</organization>
</author> </author>
<date year="2014" month="November" day="3"/> <date year="2022" month="December"/>
</front> </front>
<seriesInfo name="IEEE Std" value="802.1Q-2014"/> <seriesInfo name="IEEE Std" value="802.1Q-2022"/>
<seriesInfo name="DOI" value="10.1109/IEEESTD.2022.10004498"/>
</reference> </reference>
<reference anchor="INT" <reference anchor="INT" target="https://p4.org/p4-spec/docs/INT_v2_1.pdf">
target="https://p4.org/p4-spec/docs/INT_v2_1.pdf">
<front> <front>
<title>In-band Network Telemetry (INT) Dataplane <title>In-band Network Telemetry (INT) Dataplane Specification</title>
Specification</title> <author>
<author fullname="P4.org"/> <organization>P4.org Applications Working Group</organization>
</author>
<date year="2020" month="November"/> <date year="2020" month="November"/>
</front> </front>
</reference> </reference>
<reference anchor="nvo3_vxlan_gpe" <!-- [nvo3_vxlan_gpe] [I-D.ietf-nvo3-vxlan-gpe] IESG state: Expired as of 09/16/24. Entered the long way to display editor roles-->
target="https://datatracker.ietf.org/doc/draft-ietf-nvo3-vxlan-gpe/"> <reference anchor="I-D.ietf-nvo3-vxlan-gpe" target="https://datatracker.ietf.org/doc/html/draft-ietf-nvo3-vxlan-gpe-13">
<front> <front>
<title>Generic Protocol Extension for VXLAN (VXLAN-GPE)</title> <title>Generic Protocol Extension for VXLAN (VXLAN-GPE)</title>
<author initials="F." surname="Maino"/> <author fullname="Fabio Maino" initials="F." surname="Maino" role="editor">
<author initials="L." surname="Kreeger"/> <organization>Cisco Systems</organization>
<author initials="U." surname="Elzur"/> </author>
<date year="2023" month="November" day="04"/> <author fullname="Larry Kreeger" initials="L." surname="Kreeger" role="editor">
<organization>Arrcus</organization>
</author>
<author fullname="Uri Elzur" initials="U." surname="Elzur" role="editor">
<organization>Intel</organization>
</author>
<date day="4" month="November" year="2023"/>
</front> </front>
<seriesInfo name="work in" value="progress"/> <seriesInfo name="Internet-Draft" value="draft-ietf-nvo3-vxlan-gpe-13"/>
</reference> </reference>
<reference anchor="OVN" <reference anchor="OVN" target="https://www.openvswitch.org/">
target="https://www.openvswitch.org/">
<front> <front>
<title></title> <title>Open vSwitch</title>
<author fullname="Open Virtual Network"/> <author>
<organization>Linux Foundation</organization>
</author>
</front> </front>
</reference> </reference>
<xi:include <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2104.xml"
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.2104.xml"/> />
<xi:include <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2418.xml"
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.2418.xml"/> />
<xi:include <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3985.xml"
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.3985.xml"/> />
<xi:include <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6071.xml"
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.6071.xml"/> />
<xi:include <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6291.xml"
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.6291.xml"/> />
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7042.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7348.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7637.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8300.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8365.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8394.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8926.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.9147.xml"/>
<xi:include
href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.9197.xml"/>
<reference anchor="VXLANgroup" <!--Note: RFC 7042 was obsoleted by RFC 9542-->
target="https://datatracker.ietf.org/doc/html/draft-smith-vxlan-group-policy-05"> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9542.xml"/>
<front> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7348.xml"
<title>VXLAN Group Policy Option</title> />
<author initials="M." surname="Smith"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7637.xml"
<author initials="L." surname="Kreeger"/> />
<date year="2018" month="October" day="22"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8300.xml"
</front> />
<seriesInfo name="work in" value="progress"/> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8365.xml"
</reference> />
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8394.xml"
/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8926.xml"
/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9147.xml"
/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9197.xml"
/>
<!-- [I-D.smith-vxlan-group-policy] IESG state: Expired as of 09/16/24-->
<xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.smith-vxlan-group-policy.xml"/>
<!--[draft-hy-nvo3-gue-4-nvo-04] Added during AUTH48. IESG state: Expired as of 09/16/24-->
<xi:include href="https://bib.ietf.org/public/rfc/bibxml3/reference.I-D.hy-nvo3-gue-4-nvo.xml"/>
</references>
</references> </references>
<section> <!-- Appendix A --> <section anchor="EncapsulationComparison">
<name>Encapsulation Comparison</name> <name>Encapsulation Comparison</name>
<section> <!-- A.1 --> <section>
<name>Overview</name> <name>Overview</name>
<t>This section presents a comparison of the three NVO3 <t>This section presents a comparison of the three NVO3
encapsulation proposals, Geneve <xref target="RFC8926"/>, GUE encapsulation proposals: Geneve <xref target="RFC8926"/>, GUE
<xref target="ietf_intarea_gue"/>, and VXLAN-GPE <xref <xref target="I-D.ietf-intarea-gue"/>, and VXLAN-GPE <xref
target="nvo3_vxlan_gpe"/>. The three encapsulations use an outer target="I-D.ietf-nvo3-vxlan-gpe"/>. The three encapsulations use an outer
UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet header, UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet header,
while GUE uses a 4-octet header. In addition to the base header, while GUE uses a 4-octet header. In addition to the base header,
optional extensions may be included in the encapsulation, as optional extensions may be included in the encapsulation, as
discussed in Section A.2 below.</t> discussed in <xref target="Extensibility"/> below.</t>
</section> </section>
<section> <!-- A.2 --> <section anchor="Extensibility">
<name>Extensibility</name> <name>Extensibility</name>
<section> <!-- A.2.1 --> <section>
<name>Native Extensibility Support</name> <name>Innate Extensibility Support</name>
<t>The Geneve and GUE encapsulations both enable optional headers to <t>The Geneve and GUE encapsulations both enable optional headers to
be incorporated at the end of the base encapsulation header.</t> be incorporated at the end of the base encapsulation header.</t>
<t>VXLAN-GPE does not provide native support for header extensions. <t>VXLAN-GPE does not provide innate support for header extensions.
However, as discussed in <xref target="nvo3_vxlan_gpe"/>, However, as discussed in <xref target="I-D.ietf-nvo3-vxlan-gpe"/>,
extensibility can be attained to some extent if the Network Service extensibility can be attained to some extent if the Network Service
Header (NSH) <xref target="RFC8300"/> is used immediately following Header (NSH) <xref target="RFC8300"/> is used immediately following
the VXLAN-GPE header. NSH supports either a fixed-size extension (MD the VXLAN-GPE header. The NSH supports either a fixed-size extension (MD
Type 1), or a variable-size TLV-based extension (MD Type 2). Note Type 1) or a variable-size TLV-based extension (MD Type 2). Note
that NSH-over-VXLAN-GPE implies an additional overhead of the 8-octets that NSH-over-VXLAN-GPE implies an additional overhead of the 8-octet
NSH header, in addition to the VXLAN-GPE header.</t> NSH, in addition to the VXLAN-GPE header.</t>
</section> </section>
<section> <!-- A.2.2 --> <section>
<name>Extension Parsing</name> <name>Extension Parsing</name>
<t>The Geneve Variable Length Options are defined as Type/Length/Value <t>The Geneve variable-length options are defined as Type-Length-Value
(TLV) extensions. Similarly, VXLAN-GPE, when using NSH, can include (TLV) extensions. Similarly, VXLAN-GPE, when using an NSH, can include
NSH TLV-based extensions. In contrast, GUE defines a small set of NSH TLV-based extensions. In contrast, GUE defines a small set of
possible extension fields (proposed in <xref possible extension fields (proposed in <xref
target="ietf_gue_extensions"/>), and a set of flags in the GUE header target="I-D.ietf-intarea-gue-extensions"/> and <xref
target="I-D.hy-nvo3-gue-4-nvo"/>), and a set of flags in the GUE header
that indicate for each extension type whether it is present or that indicate for each extension type whether it is present or
not.</t> not.</t>
<t>TLV-based extensions, as defined in Geneve, provide the flexibility <t>TLV-based extensions, as defined in Geneve, provide the flexibility
for a large number of possible extension types. Similar behavior can for a large number of possible extension types. Similar behavior can
be supported in NSH-over-VXLAN-GPE when using MD Type 2. The be supported in NSH-over-VXLAN-GPE when using MD Type 2. The
flag-based approach taken in GUE strives to simplify implementations flag-based approach taken in GUE strives to simplify implementations
by defining a small number of possible extensions used in a fixed by defining a small number of possible extensions used in a fixed
order.</t> order.</t>
<t>The Geneve and GUE headers both include a length field, defining <t>The Geneve and GUE headers both include a Length field that defines
the total length of the encapsulation, including the optional the total length of the encapsulation, including the optional
extensions. This length field simplifies the parsing by transit extensions. This Length field simplifies the parsing by transit
devices that skip the encapsulation header without parsing its devices that skip the encapsulation header without parsing its
extensions.</t> extensions.</t>
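The role of the Length field and of TLV parsing can be sketched for Geneve as follows. This is a minimal illustration, assuming the RFC 8926 layout (a 6-bit Opt Len in the first octet and a 5-bit option Length, both counted in 4-octet units); the function names are hypothetical, and the example does not cover GUE or NSH.

```python
import struct

GENEVE_BASE_LEN = 8  # fixed Geneve base header, in octets

def geneve_header_len(packet: bytes) -> int:
    """Total Geneve header length (base header plus options), in octets.

    A transit device can use this value alone to skip the encapsulation
    header without looking at any individual option.
    """
    if len(packet) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve base header")
    # First octet: 2-bit Version, then 6-bit Opt Len in 4-octet multiples.
    opt_len_words = packet[0] & 0x3F
    total = GENEVE_BASE_LEN + 4 * opt_len_words
    if len(packet) < total:
        raise ValueError("options extend past end of packet")
    return total

def iter_options(packet: bytes):
    """Yield (option_class, option_type, data) for each option TLV."""
    end = geneve_header_len(packet)
    offset = GENEVE_BASE_LEN
    while offset < end:
        # Option header: 16-bit class, 8-bit type, 3 reserved bits + 5-bit length.
        opt_class, opt_type, flags_len = struct.unpack_from("!HBB", packet, offset)
        data_start = offset + 4
        data_len = 4 * (flags_len & 0x1F)
        if data_start + data_len > end:
            raise ValueError("option data extends past Opt Len")
        yield opt_class, opt_type, packet[data_start:data_start + data_len]
        offset = data_start + data_len
```

With a 6-bit Opt Len, the largest header this sketch can report is 8 + 4 * 63 = 260 octets.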
</section> </section>
<section> <!-- A.2.3 --> <section>
<name>Critical Extensions</name> <name>Critical Extensions</name>
<t>The Geneve encapsulation header includes the 'C' field, which <t>The Geneve encapsulation header includes the C field, which
indicates whether the current Geneve header includes critical options, indicates whether the current Geneve header includes critical options,
that is to say, options which must be parsed by the target NVE. If that is to say, options which must be parsed by the target NVE. If
the endpoint is not able to process a critical option, the packet is the endpoint is not able to process a critical option, the packet is
discarded.</t> discarded.</t>
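The receive-side behavior described above can be sketched as follows. This is an illustration under assumptions rather than a normative procedure: it relies on the RFC 8926 bit positions (the C flag as 0x40 in the second octet of the base header, and the high-order bit of an option's Type octet marking that option as critical), and the function name, the "accept"/"drop" return values, and the `supported` set of `(class, type)` pairs are hypothetical.

```python
import struct

C_BIT = 0x40         # 'C' flag in the second octet of the Geneve base header
OPT_CRITICAL = 0x80  # high-order bit of an option's Type octet

def receive_decision(packet: bytes, supported: set) -> str:
    """Return "accept" or "drop" for a packet arriving at an endpoint.

    `supported` holds the (option_class, option_type) pairs this
    hypothetical endpoint knows how to process.
    """
    if len(packet) < 8:
        return "drop"
    end = 8 + 4 * (packet[0] & 0x3F)  # base header plus Opt Len words
    if len(packet) < end:
        return "drop"
    if not packet[1] & C_BIT:
        # No critical options present: the options may be skipped unparsed.
        return "accept"
    offset = 8
    while offset < end:
        opt_class, opt_type, flags_len = struct.unpack_from("!HBB", packet, offset)
        if opt_type & OPT_CRITICAL and (opt_class, opt_type) not in supported:
            return "drop"  # critical option that cannot be processed
        offset += 4 + 4 * (flags_len & 0x1F)
    # An option that overruns Opt Len leaves offset past end: malformed.
    return "accept" if offset == end else "drop"
```

The check on the C bit comes first so that an endpoint can avoid walking the option list at all when no critical option is present.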
</section> </section>
<section> <!-- A.2.4 --> <section>
<name>Maximal Header Length</name> <name>Maximal Header Length</name>
<t>The maximal header length in Geneve, including options, is 260 <t>The maximal header length in Geneve, including options, is 260
octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE
uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is
used, yielding an encapsulation header of up to 264 octets.</t> used, yielding an encapsulation header of up to 264 octets.</t>
</section> </section>
</section> </section>
<section> <!-- A.3 --> <section>
<name>Encapsulation Header</name> <name>Encapsulation Header</name>
<section> <!-- A.3.1 --> <section>
<name>Virtual Network Identifier (VNI)</name> <name>Virtual Network Identifier (VNI)</name>
<t>The Geneve and VXLAN-GPE headers both include a 24-bit VNI field. <t>The Geneve and VXLAN-GPE headers both include a 24-bit VNI field.
GUE, on the other hand, enables the use of a 32-bit field called VNID; GUE, on the other hand, enables the use of a 32-bit field called VNID;
this field is not included in the GUE header, but was defined as an this field is not included in the GUE header but was defined as an
optional extension in <xref target="ietf_gue_extensions"/>.</t> optional extension in <xref target="I-D.hy-nvo3-gue-4-nvo"/>.</t>
<t>The VXLAN-GPE header includes the 'I' bit, indicating that the VNI <t>The VXLAN-GPE header includes the I bit, indicating that the VNI
field is valid in the current header. A similar indicator is defined field is valid in the current header. A similar indicator is defined
as a flag in the GUE header <xref target="ietf_gue_extensions"/>.</t> as a flag in the GUE header <xref target="I-D.ietf-intarea-gue-extensions"/>.</t>
</section> </section>
<section> <!-- A.3.2 --> <section>
<name>Next Protocol</name> <name>Next Protocol</name>
<t>All three encapsulation headers include a field that specifies the <t>All three encapsulation headers include a field that specifies the
type of the next protocol header, which resides after the NVO3 type of the next protocol header, which resides after the NVO3
encapsulation header. The Geneve header includes a 16-bit field that encapsulation header. The Geneve header includes a 16-bit field that
uses the IEEE Ethertype convention. GUE uses an 8-bit field, which uses the IEEE Ethertype convention. GUE uses an 8-bit field, which
uses the IANA Internet protocol numbering. The VXLAN-GPE header uses the IANA protocol numbering. The VXLAN-GPE header
incorporates an 8-bit Next Protocol field, using a VXLAN-GPE-specific incorporates an 8-bit Next Protocol field, using a registry specific to VXLAN-GPE, defined in <xref target="I-D.ietf-nvo3-vxlan-gpe"/>.</t>
registry, defined in <xref target="nvo3_vxlan_gpe"/>.</t>
<t>The VXLAN-GPE header also includes the 'P' bit, which explicitly <t>The VXLAN-GPE header also includes the P bit, which explicitly
indicates whether the Next Protocol field is present in the current indicates whether the Next Protocol field is present in the current
header.</t> header.</t>
</section> </section>
<section> <!-- A.3.3 --> <section> <!-- A.3.3 -->
<name>Other Header Fields</name> <name>Other Header Fields</name>
<t>The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates <t>The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates
whether the current packet is an OAM packet. The GUE header includes whether the current packet is an OAM packet. The GUE header includes
a similar field, but uses different terminology; the GUE 'C-bit' a similar field but uses different terminology; the GUE C bit (Control bit)
specifies whether the current packet is a control packet. Note that specifies whether the current packet is a control packet. Note that
the GUE control bit can potentially be used in a large set of the GUE C bit can potentially be used in a large set of
protocols that are not OAM protocols. However, the control packet protocols that are not OAM protocols. However, the control packet
examples discussed in <xref target="ietf_intarea_gue"/> are examples discussed in <xref target="I-D.ietf-intarea-gue"/> are
OAM-related.</t> related to OAM.</t>
<t>Each of the three NVO3 encapsulation headers includes a 2-bit <t>Each of the three NVO3 encapsulation headers includes a 2-bit
Version field, which is currently defined to be zero.</t> Version field, which is currently defined to be zero.</t>
<t>The Geneve and VXLAN-GPE headers include reserved fields; 14 bits <t>The Geneve and VXLAN-GPE headers include reserved fields; 14 bits
in the Geneve header, and 27 bits in the VXLAN-GPE header are in the Geneve header and 27 bits in the VXLAN-GPE header are
reserved.</t> reserved.</t>
</section> </section>
</section> </section>
<section> <!-- A.4 --> <section>
<name>Comparison Summary</name> <name>Comparison Summary</name>
<t>The following table summarizes the comparison between the three <t>The following table summarizes the comparison between the three
NVO3 encapsulations. In some cases a plus sign ("+") or minus sign NVO3 encapsulations. In some cases, a plus sign ("+") or minus sign
("-") is used to indicate that the header is stronger or weaker in an ("-") is used to indicate that the header is stronger or weaker in an
area respectively.</t> area, respectively.</t>
<figure anchor="ComparisonChart">
<name>NVO3 Encapsulations Comparison</name>
<artwork type="ascii-art" align="center">
<![CDATA[
+----------------+----------------+----------------+----------------+
| | Geneve | GUE | VXLAN-GPE |
+----------------+----------------+----------------+----------------+
| Outer transport| UDP/IP | UDP/IP | UDP/IP |
| UDP Port Number| 6081 | 6080 | 4790 |
+----------------+----------------+----------------+----------------+
| Base header | 8 octets | 4 octets | 8 octets |
| length | | | (16 octets |
| | | | using NSH) |
+----------------+----------------+----------------+----------------+
| Extensibility |Variable length |Extension fields| No native ext- |
| | options | | ensibility. |
| | | | Might use NSH. |
+----------------+----------------+----------------+----------------+
| Extension | TLV-based | Flag-based | TLV-based |
| parsing method | | |(using NSH with |
| | | | MD Type 2) |
+----------------+----------------+----------------+----------------+
| Extension | Variable | Fixed | Variable |
| order | | | (using NSH) |
+----------------+----------------+----------------+----------------+
| Length field | + | + | - |
+----------------+----------------+----------------+----------------+
| Max Header | 260 octets | 128 octets | 8 octets |
| Length | | |(264 using NSH) |
+----------------+----------------+----------------+----------------+
| Critical exte- | + | - | - |
| nsion bit | | | |
+----------------+----------------+----------------+----------------+
| VNI field size | 24 bits | 32 bits | 24 bits |
| | | (extension) | |
+----------------+----------------+----------------+----------------+
| Next protocol | 16 bits | 8 bits | 8 bits |
| field | Ethertype | Internet prot- | New registry |
| | registry | ocol registry | |
+----------------+----------------+----------------+----------------+
| Next protocol | - | - | + |
| indicator | | | |
+----------------+----------------+----------------+----------------+
| OAM / control | OAM bit | Control bit | OAM bit |
| field | | | |
+----------------+----------------+----------------+----------------+
| Version field | 2 bits | 2 bits | 2 bits |
+----------------+----------------+----------------+----------------+
| Reserved bits | 14 bits | none | 27 bits |
+----------------+----------------+----------------+----------------+
]]>
</artwork>
</figure>
<table anchor="EncapsulationsComparisonTable" align="center">
<name>Encapsulations Comparison</name>
<thead>
<tr>
<th></th>
<th>Geneve</th>
<th>GUE</th>
<th>VXLAN-GPE</th>
</tr>
</thead>
<tbody>
<tr>
<td>Outer transport UDP Port Number</td>
<td>UDP/IP 6081</td>
<td>UDP/IP 6080</td>
<td>UDP/IP 4790</td>
</tr>
<tr>
<td>Base header length</td>
<td>8 octets</td>
<td>4 octets</td>
<td>8 octets (16 octets using an NSH)</td>
</tr>
<tr>
<td>Extensibility</td>
<td>Variable-length options</td>
<td>Extension fields</td>
<td>No innate extensibility. Might use an NSH.</td>
</tr>
<tr>
<td>Extension parsing method</td>
<td>TLV-based</td>
<td>Flag-based</td>
<td>TLV-based (using an NSH with MD Type 2)</td>
</tr>
<tr>
<td>Extension order</td>
<td>Variable</td>
<td>Fixed</td>
<td>Variable (using an NSH)</td>
</tr>
<tr>
<td>Length field</td>
<td>+</td>
<td>+</td>
<td>-</td>
</tr>
<tr>
<td>Max header length</td>
<td>260 octets</td>
<td>128 octets</td>
<td>8 octets (264 using an NSH)</td>
</tr>
<tr>
<td>Critical extension bit</td>
<td>+</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>VNI field size</td>
<td>24 bits</td>
<td>32 bits (extension)</td>
<td>24 bits</td>
</tr>
<tr>
<td>Next Protocol field</td>
<td>16 bits Ethertype registry</td>
<td>8 bits Internet protocol registry</td>
<td>8 bits New registry</td>
</tr>
<tr>
<td>Next protocol indicator</td>
<td>-</td>
<td>-</td>
<td>+</td>
</tr>
<tr>
<td>OAM / Control field</td>
<td>OAM bit</td>
<td>Control bit</td>
<td>OAM bit</td>
</tr>
<tr>
<td>Version field</td>
<td>2 bits</td>
<td>2 bits</td>
<td>2 bits</td>
</tr>
<tr>
<td>Reserved bits</td>
<td>14 bits</td>
<td>none</td>
<td>27 bits</td>
</tr>
</tbody>
</table>
</section> </section>
</section> </section>
<section anchor="Acknowledgements" numbered="false">
<name>Acknowledgements</name>
<t>The authors would like to thank <contact fullname="Tom Herbert"/> for
providing the motivation for the security/integrity extension and for his
valuable comments; <contact fullname="T. Sridhar"/> for his valuable comments
and feedback; <contact fullname="Anoop Ghanwani"/> for his extensive comments;
and <contact fullname="Ignas Bagdonas"/>.</t>
</section>
<section anchor="Contributors" numbered="false"> <section anchor="Contributors" numbered="false">
<name>Contributors</name> <name>Contributors</name>
<t>The following co-authors have contributed to this document:.</t> <t>The following coauthors have contributed to this document:</t>
<contact fullname="Ilango Ganga"> <contact fullname="Ilango Ganga">
<organization>Intel</organization> <organization>Intel</organization>
<address> <address>
<email>ilango.s.ganga@intel.com</email> <email>ilango.s.ganga@intel.com</email>
</address> </address>
</contact> </contact>
<contact fullname="Pankaj Garg"> <contact fullname="Pankaj Garg">
<organization>Microsoft</organization> <organization>Microsoft</organization>
<address> <address>
<email> pankajg@microsoft.com</email> <email> pankajg@microsoft.com</email>
</address> </address>
</contact> </contact>
<contact fullname="Rajeev Manur"> <contact fullname="Rajeev Manur">
<organization>Broadcom</organization> <organization>Broadcom</organization>
<address> <address>
<email>rajeev.manur@broadcom.com</email> <email>rajeev.manur@broadcom.com</email>
</address> </address>
</contact> </contact>
<contact fullname="Tal Mizrahi"> <contact fullname="Tal Mizrahi">
<organization>Huawei</organization> <organization>Huawei</organization>
<address> <address>
<email>tal.mizrahi.phd@gmail.com</email> <email>tal.mizrahi.phd@gmail.com</email>
</address> </address>
</contact> </contact>
<contact fullname="David Mozes"> <contact fullname="David Mozes">
<address> <address>
<email>mosesster@gmail.com</email> <email>mosesster@gmail.com</email>
</address> </address>
</contact> </contact>
<contact fullname="Erik Nordmark"> <contact fullname="Erik Nordmark">
<organization>ZEDEDA</organization> <organization>ZEDEDA</organization>
<address> <address>
<email>nordmark@sonic.net</email> <email>nordmark@sonic.net</email>
</address> </address>
</contact> </contact>
<contact fullname="Michael Smith"> <contact fullname="Michael Smith">
<organization>Cisco</organization> <organization>Cisco</organization>
<address> <address>
<email>michsmit@cisco.com</email> <email>michsmit@cisco.com</email>
</address> </address>
</contact> </contact>
<contact fullname="Sam Aldrin"> <contact fullname="Sam Aldrin">
<organization>Google</organization> <organization>Google</organization>
<address> <address>
<email>aldrin.ietf@gmail.com</email> <email>aldrin.ietf@gmail.com</email>
</address> </address>
</contact> </contact>
</section> </section>
</back> </back>
</rfc> </rfc>