rfc9611xml2.original.xml   rfc9611.xml 
<?xml version="1.0" encoding="US-ASCII"?> <?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY RFC2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC <!DOCTYPE rfc [
.2119.xml"> <!ENTITY nbsp "&#160;">
<!ENTITY RFC2367 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC <!ENTITY zwsp "&#8203;">
.2367.xml"> <!ENTITY nbhy "&#8209;">
<!ENTITY RFC4034 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC <!ENTITY wj "&#8288;">
.4034.xml">
<!ENTITY RFC4301 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.4301.xml">
<!ENTITY RFC5890 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.5890.xml">
<!ENTITY RFC6698 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.6698.xml">
<!ENTITY RFC7296 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.7296.xml">
<!ENTITY RFC7942 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.7942.xml">
<!ENTITY RFC8174 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC
.8174.xml">
]> ]>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc strict="yes" ?> <rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" updates="" obs
<?rfc toc="yes"?> oletes="" category="std" docName="draft-ietf-ipsecme-multi-sa-performance-09" nu
<?rfc tocdepth="4"?> mber="9611" consensus="true" submissionType="IETF" tocInclude="true" tocDepth="4
<?rfc symrefs="yes"?> " symRefs="true" sortRefs="true" version="3">
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="no" ?>
<rfc ipr="trust200902" updates="" obsoletes="" category="std" docName="draft-iet
f-ipsecme-multi-sa-performance-09">
<front> <front>
<title>IKEv2 support for per-resource Child SAs</title> <title abbrev="IKEv2 Support for Per-Resource Child SAs">Internet Key Exchan
ge Protocol Version 2 (IKEv2) Support for Per&nbhy;Resource Child Security Assoc
iations (SAs)</title>
<seriesInfo name="RFC" value="9611"/>
<author fullname="Antony Antony" initials="A." surname="Antony"> <author fullname="Antony Antony" initials="A." surname="Antony">
<organization abbrev="secunet">secunet Security Networks AG</organization> <organization abbrev="secunet">secunet Security Networks AG</organization>
<address> <address>
<email>antony.antony@secunet.com</email> <email>antony.antony@secunet.com</email>
</address> </address>
</author> </author>
<author initials="T." surname="Brunner" fullname="Tobias Brunner"> <author initials="T." surname="Brunner" fullname="Tobias Brunner">
<organization abbrev="codelabs">codelabs GmbH</organization> <organization abbrev="codelabs">codelabs GmbH</organization>
<address> <address>
<email>tobias@codelabs.ch</email> <email>tobias@codelabs.ch</email>
skipping to change at line 48 skipping to change at line 39
<address> <address>
<email>steffen.klassert@secunet.com</email> <email>steffen.klassert@secunet.com</email>
</address> </address>
</author> </author>
<author initials="P." surname="Wouters" fullname="Paul Wouters"> <author initials="P." surname="Wouters" fullname="Paul Wouters">
<organization>Aiven</organization> <organization>Aiven</organization>
<address> <address>
<email>paul.wouters@aiven.io</email> <email>paul.wouters@aiven.io</email>
</address> </address>
</author> </author>
<date/> <date month="July" year="2024"/>
<area>General</area> <area>SEC</area>
<workgroup>Network</workgroup> <workgroup>ipsecme</workgroup>
<keyword>IKEv2</keyword> <keyword>IKEv2</keyword>
<keyword>IPsec</keyword> <keyword>IPsec</keyword>
<abstract> <abstract>
<t> <t>
This document defines one Notify Message Status Types and one Notify Mess In order to increase the bandwidth of IPsec traffic between peers,
age this document defines one Notify Message Status Types and one Notify
Error Types payload for the Internet Key Exchange Protocol Version 2 (IKE Message Error Types payload for the Internet Key Exchange Protocol
v2) Version 2 (IKEv2) to support the negotiation of multiple Child
to support the negotiation of multiple Child Security Associations (SAs) Security Associations (SAs) with the same Traffic Selectors used on
with different resources, such as CPUs.
the same Traffic Selectors used on different resources, such as CPUs, to
increase bandwidth of IPsec traffic between peers.
</t> </t>
<t> <t>
The SA_RESOURCE_INFO notification is used to convey information that the The SA_RESOURCE_INFO notification is used to convey information that the
negotiated Child SA and subsequent new Child SAs with the same Traffic Selectors negotiated Child SA and subsequent new Child SAs with the same Traffic Selectors
are a logical group of Child SAs where most or all of the Child SAs are are a logical group of Child SAs where most or all of the Child SAs are
bound to a specific resource, such as a specific CPU. The TS_MAX_QUEUE bound to a specific resource, such as a specific CPU. The TS_MAX_QUEUE
notify conveys that the peer is unwilling to create more additional Child notify conveys that the peer is unwilling to create more additional Child
SAs for this particular negotiated Traffic Selector combination. SAs for this particular negotiated Traffic Selector combination.
</t> </t>
<t> <t>
Using multiple Child SAs with the same Traffic Selectors has the benefit Using multiple Child SAs with the same Traffic Selectors has the benefit
that each resource holding the Child SA has its own Sequence Number Counter, that each resource holding the Child SA has its own Sequence Number Counter,
ensuring that CPUs don't have to synchronize their cryptographic state or disable ensuring that CPUs don't have to synchronize their cryptographic state or disable
their packet replay protection. their packet replay protection.
</t> </t>
</abstract> </abstract>
</front> </front>
<middle> <middle>
<section title="Introduction"> <section numbered="true" toc="default">
<name>Introduction</name>
<t> <t>
Most IPsec implementations are currently limited to using one Most IPsec implementations are currently limited to using one
hardware queue or a single CPU resource for a Child SA. Running hardware queue or a single CPU resource for a Child SA.
Running
packet stream encryption in parallel can be done, but there is a bottleneck of packet stream encryption in parallel can be done, but there is a bottleneck of
different parts of the hardware locking or waiting to get their different parts of the hardware locking or waiting to get their
sequence number assigned for the packet it is encrypting. The sequence number assigned for the packet being encrypted. The
result is that a machine with many such resources is limited to result is that a machine with many such resources is limited to
only using one of these resources per Child SA. This severely using only one of these resources per Child SA. This severely
limits the throughput that can be attained. For example, at the limits the throughput that can be attained. For example, at the
time of writing, an unencrypted link of 10Gbps or more is commonly time of writing, an unencrypted link of 10 Gbps or more is commonly
reduced to 2-5Gbps when IPsec is used to encrypt the link using reduced to 2-5 Gbps when IPsec is used to encrypt the link using
AES-GCM. By using the implementation specified in this document, AES-GCM. By using the implementation specified in this document,
aggregate throughput increased from 5Gbps using 1 CPU to 40-60 aggregate throughput increased from 5Gbps using 1 CPU to 40-60
Gbps using 25-30 CPUs. Gbps using 25-30 CPUs.
</t> </t>
<t> <t>
While this could be (partially) mitigated by setting up multiple While this could be (partially) mitigated by setting up multiple
narrowed Child SAs, for example using Populate From Packet (PFP) narrowed Child SAs (for example, using Populate From Packet (PFP)
as specified in IPsec Architecture <xref target="RFC4301"/>, this as specified in IPsec architecture <xref target="RFC4301" format="default
"/>), this
IPsec feature would cause too many Child SAs (one per network flow) IPsec feature would cause too many Child SAs (one per network flow)
or too few Child SAs (one network flow used on multiple CPUs). PFP is or too few Child SAs (one network flow used on multiple CPUs). PFP is
also not widely implemented. also not widely implemented.
</t> </t>
<t> <t>
To make better use of multiple network queues and CPUs, it can To make better use of multiple network queues and CPUs, it can
be beneficial to negotiate and install multiple Child be beneficial to negotiate and install multiple Child
SAs with identical Traffic Selectors. IKEv2 <xref target="RFC7296"/> SAs with identical Traffic Selectors. IKEv2 <xref target="RFC7296" format="default"/>
already allows installing multiple Child SAs with identical Traffic already allows installing multiple Child SAs with identical Traffic
Selectors, but it offers no method to indicate that the additional Child Selectors, but it offers no method to indicate that the additional Child
SA is being requested for performance increase reasons and is restricted SA is being requested for performance increase reasons and is restricted
to some resource (queue or CPU). to some resource (queue or CPU).
</t> </t>
<t> <t>
When an IKEv2 peer is receiving more additional Child SA's for a single When an IKEv2 peer is receiving more additional Child SAs for a single
set of Traffic Selectors than it is willing to create, it can return an set of Traffic Selectors than it is willing to create, it can return an
error notify of TS_MAX_QUEUE. error notify of TS_MAX_QUEUE.
</t> </t>
<section title="Requirements Language"> <section numbered="true" toc="default">
<name>Requirements Language</name>
<t> <t>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL NOT</bcp14>
"OPTIONAL" in this document are to be interpreted as described in BCP ",
14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>",
when, they appear in all capitals, as shown here. "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
</t> "<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to
be
interpreted as described in BCP&nbsp;14 <xref target="RFC2119"/> <xref
target="RFC8174"/> when, and only when, they appear in all capitals, as
shown here.
</t>
</section> </section>
<section title="Terminology"> <section numbered="true" toc="default">
<t>This document uses the following terms defined in IKEv2 <xref target=" <name>Terminology</name>
RFC7296"/>: <t>This document uses the following terms defined in IKEv2 <xref target=
Notification Data, Traffic Selectors (TS), TSi/TSr, Child SA, "RFC7296" format="default"/>:
Configuration Payload (CP), IKE SA, CREATE_CHILD_SA and NO_ADDITIONAL_ Notification Data, Traffic Selector (TS), Traffic Selector initiator (
SAS. TSi), Traffic Selector responder (TSr), Child SA,
</t> Configuration Payload (CP), IKE SA, CREATE_CHILD_SA, and NO_ADDITIONAL
<t>This document also uses the following terms defined in <xref target="R _SAS.
FC4301"/>: </t>
SPD, SA. <t>This document also uses the following terms defined in <xref target="
</t> RFC4301" format="default"/>:
Security Policy Database (SPD), SA.
</t>
</section> </section>
</section> </section>
<section title="Performance bottlenecks" anchor="performance"> <section anchor="performance" numbered="true" toc="default">
<name>Performance Bottlenecks</name>
<t> <t>
There are several pragmatic reasons why most implementations must restrict a There are several pragmatic reasons why most implementations must restrict a
Child Security Association (SA) to a single specific hardware resource. A Child Security Association (SA) to a single specific hardware resource. A
primary limitation arises from the challenges associated with sharing primary limitation arises from the challenges associated with sharing
cryptographic states, counters, and sequence numbers among multiple CPUs. When cryptographic states, counters, and sequence numbers among multiple CPUs. When
these CPUs attempt to simultaneously utilize shared states, it becomes these CPUs attempt to simultaneously utilize shared states, it becomes
impractical to do so without incurring a significant performance penalty. It is impractical to do so without incurring a significant performance penalty. It is
necessary to negotiate and establish multiple Child Security Associations (SAs) necessary to negotiate and establish multiple Child SAs
with identical Traffic Selector initiator (TSi) and Traffic Selector responder with identical Traffic Selector initiator (TSi) and Traffic Selector responder
(TSr) on a per-resource basis." (TSr) on a per-resource basis.
</t> </t>
</section> </section>
<section title="Negotiation of CPU specific Child SAs" anchor="neg_cpu"> <section anchor="neg_cpu" numbered="true" toc="default">
<name>Negotiation of Resource-Specific Child SAs</name>
<t> <t>
An initial IKEv2 exchange is used to setup an IKE SA and the An initial IKEv2 exchange is used to set up an IKE SA and the
initial Child SA. If multiple Child SAs with the same Traffic initial Child SA. If multiple Child SAs with the same Traffic
Selectors that are bound to a single resource are desired, the Selectors that are bound to a single resource are desired, the
initiator will add the SA_RESOURCE_INFO notify payload to the initiator will add the SA_RESOURCE_INFO notify payload to the
Exchange negotiating the Child SA (e.g. IKE_AUTH or CREATE_CHILD_SA). Exchange negotiating the Child SA (e.g., IKE_AUTH or CREATE_CHILD_SA).
If this initial Child SA will be tied to a specific resource, it If this initial Child SA will be tied to a specific resource, it
MAY indicate this by including an identifier in the Notification <bcp14>MAY</bcp14> indicate this by including an identifier in the Notifi
Data. A responder that is willing to have multiple Child SAs cation
Data.
<!--[rfed] Should "Notify Data" be "Notification Data" here?
There are zero instance of "Notify Data" in RFCs, and
"Notification Data" is used in the preceding sentence.
Also, may the article "a" be removed?
Original:
A responder that
is willing to have multiple Child SAs for the same Traffic Selectors
will respond by also adding the SA_RESOURCE_INFO notify payload in
which it MAY add a non-zero Notify Data.
Perhaps:
A responder that
is willing to have multiple Child SAs for the same Traffic Selectors
will respond by also adding the SA_RESOURCE_INFO notify payload in
which it MAY add non-zero Notification Data.
Similarly, should "notify data" be updated in the Security Considerations sectio
n?
Original: The notify data SHOULD NOT be an identifier ...
Perhaps: The Notification Data SHOULD NOT be an identifier ...
-->
A responder that is willing to have multiple Child SAs
for the same Traffic Selectors will respond by also adding the for the same Traffic Selectors will respond by also adding the
SA_RESOURCE_INFO notify payload in which it MAY add a non-zero SA_RESOURCE_INFO notify payload in which it <bcp14>MAY</bcp14> add a non-
Notify Data. zero
Notification Data.
</t> </t>
<t> <t>
Additional resource-specific Child SAs are negotiated as regular Child Additional resource-specific Child SAs are negotiated as regular Child
SAs using the CREATE_CHILD_SA exchange and are similarly identified by an SAs using the CREATE_CHILD_SA exchange and are similarly identified by an
accompanying SA_RESOURCE_INFO notification.</t> accompanying SA_RESOURCE_INFO notification.</t>
<t> <t>
Upon installation, each resource-specific Child SA is Upon installation, each resource-specific Child SA is
associated with an additional local selector, such as the CPU. associated with an additional local selector, such as the CPU.
These resource-specific Child SAs MUST be negotiated with identical These resource-specific Child SAs <bcp14>MUST</bcp14> be negotiated with identical
Child SA properties that were negotiated for the initial Child Child SA properties that were negotiated for the initial Child
SA. This includes cryptographic algorithms, Traffic Selectors, SA. This includes cryptographic algorithms, Traffic Selectors,
Mode (e.g. transport mode), compression usage, etc. However, each Mode (e.g., transport mode), compression usage, etc. However, each
Child SA does have its own keying material that is individually derived Child SA does have its own keying material that is individually derived
according to the regular IKEv2 process. The SA_RESOURCE_INFO according to the regular IKEv2 process. The SA_RESOURCE_INFO
notify payload MAY be empty or MAY contain some identifying data. notify payload <bcp14>MAY</bcp14> be empty or <bcp14>MAY</bcp14> contain
This identifying data SHOULD be a unique identifier within all some identifying data.
the Child SAs with the same TS payloads and the peer MUST only This identifying data <bcp14>SHOULD</bcp14> be a unique identifier within
all
the Child SAs with the same TS payloads, and the peer <bcp14>MUST</bcp14>
only
use it for debugging purposes. use it for debugging purposes.
</t> </t>
<t> <t>
Additional Child SAs can be started on-demand or can be started Additional Child SAs can be started on demand or can be started
all at once. Peers may also delete specific per-resource Child all at once. Peers may also delete specific per-resource Child
SAs if they deem the associated resource to be idle. SAs if they deem the associated resource to be idle.
</t> </t>
<t> <t>
During the CREATE_CHILD_SA rekey for the Child SA, the During the CREATE_CHILD_SA rekey for the Child SA, the
SA_RESOURCE_INFO notification MAY be included, but regardless of SA_RESOURCE_INFO notification <bcp14>MAY</bcp14> be included, but regardless of
whether or not it is included, the rekeyed Child SA should be bound whether or not it is included, the rekeyed Child SA should be bound
to the same resource(s) as the Child SA that is being rekeyed. to the same resource(s) as the Child SA that is being rekeyed.
</t> </t>
</section> </section>
<section title="Implementation Considerations" anchor="impl_consider"> <section anchor="impl_consider" numbered="true" toc="default">
<name>Implementation Considerations</name>
<t> <t>
There are various considerations that an implementation can There are various considerations that an implementation can
use to determine the best procedure to install multiple Child SAs. use to determine the best procedure to install multiple Child SAs.
</t> </t>
<t> <t>
A simple procedure could be to install one additional Child SA A simple procedure could be to install one additional Child SA
on each CPU. An implementation can ensure that one Child SA can be on each CPU. An implementation can ensure that one Child SA can be
used by all CPUs, so that while negotiating a new per-CPU Child SA, used by all CPUs, so that while negotiating a new per-CPU Child SA,
which typically takes 1 RTT delay, the CPU with no CPU-specific which typically takes 1 RTT delay, the CPU with no CPU-specific
Child SA can still encrypt its packets using the Child SA that is Child SA can still encrypt its packets using the Child SA that is
available for all CPUs. Alternatively, if an implementation finds available for all CPUs. Alternatively, if an implementation finds
it needs to encrypt a packet but the current CPU does not have it needs to encrypt a packet but the current CPU does not have
the resources to encrypt this packet, it can relay that packet the resources to encrypt this packet, it can relay that packet
to a specific CPU that does have the capability to encrypt the to a specific CPU that does have the capability to encrypt the
packet, although this will come with a performance penalty. packet, although this will come with a performance penalty.
</t> </t>
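<t>
The fallback described above (a CPU without its own Child SA keeps using the
Child SA that is available to all CPUs) can be sketched as a small lookup
table. This is only an illustration under stated assumptions: the class
OutboundSaTable, its methods, and the use of an integer SPI to stand for an
installed outbound SA are hypothetical and are not defined by this document.
</t>

```python
from typing import Dict


class OutboundSaTable:
    """Hypothetical outbound SA table for one TSi/TSr pair.

    Holds one Child SA usable by every CPU plus optional per-CPU Child SAs.
    SAs are represented by their SPI only (an assumption for brevity).
    """

    def __init__(self, shared_spi: int) -> None:
        self.shared_spi = shared_spi          # Child SA available to all CPUs
        self.per_cpu: Dict[int, int] = {}     # CPU id -> SPI of its own Child SA

    def install(self, cpu: int, spi: int) -> None:
        """Record a newly negotiated per-CPU Child SA."""
        self.per_cpu[cpu] = spi

    def select(self, cpu: int) -> int:
        """Pick the SA for a packet being sent on `cpu`.

        Falls back to the shared Child SA while no per-CPU SA exists,
        for example during the round trip needed to negotiate one.
        """
        return self.per_cpu.get(cpu, self.shared_spi)


table = OutboundSaTable(shared_spi=0x1000)
table.install(cpu=2, spi=0x2002)
```

<t>
With this table, a packet sent on CPU 2 uses the CPU-specific SA, while a
packet sent on any CPU that has no SA of its own still goes out on the shared
one.
</t>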
<t> <t>
Performing per-CPU Child SA negotiations can result in both peers Performing per-CPU Child SA negotiations can result in both peers
initiating additional Child SAs at once. This is especially likely initiating additional Child SAs simultaneously. This is especially likely
if per-CPU Child SAs are triggered by individual SADB_ACQUIRE if per-CPU Child SAs are triggered by individual SADB_ACQUIRE messages
<xref target="RFC2367"/> messages. Responders should install the <xref target="RFC2367" format="default"/>. Responders should install the
additional Child SA on a CPU with the least amount of additional additional Child SA on a CPU with the least amount of additional
Child SAs for this TSi/TSr pair. Child SAs for this TSi/TSr pair.
</t> </t>
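<t>
A minimal sketch of the responder-side placement rule above, assuming the
responder keeps a per-CPU count of the additional Child SAs already installed
for one TSi/TSr pair. The function name pick_least_loaded_cpu and the
tie-break on the lowest CPU id are illustrative assumptions, not part of this
document.
</t>

```python
from collections import Counter
from typing import Iterable


def pick_least_loaded_cpu(cpus: Iterable[int], installed: Counter) -> int:
    """Return the CPU holding the fewest additional Child SAs for this TS pair.

    `installed` maps CPU id -> number of additional Child SAs on that CPU;
    a Counter yields 0 for CPUs that have none yet. Ties go to the lowest
    CPU id (an arbitrary but deterministic choice).
    """
    return min(cpus, key=lambda cpu: (installed[cpu], cpu))


installed = Counter({0: 2, 1: 1, 2: 1})
target = pick_least_loaded_cpu([0, 1, 2, 3], installed)  # CPU 3 has none yet
installed[target] += 1
```

<t>
When both peers initiate at the same time, applying this rule to each incoming
request spreads the resulting Child SAs across CPUs instead of stacking them
on one.
</t>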
<t> <t>
When the number of queue or CPU resources are different between the When the number of queue or CPU resources are different between the
peers, the peer with the least amount of resources may decide to peers, the peer with the least amount of resources may decide to
not install a second outbound Child SA for the same resource as not install a second outbound Child SA for the same resource, as
it will never use it to send traffic. However, it must install it will never use it to send traffic. However, it must install
all inbound Child SAs as it has committed to receiving traffic all inbound Child SAs because it has committed to receiving traffic
on these negotiated Child SAs. on these negotiated Child SAs.
</t> </t>
<t> <t>
If per-CPU packet trigger (e.g. SADB_ACQUIRE) messages are implemented If per-CPU packet trigger (e.g., SADB_ACQUIRE) messages are implemented
(see <xref target="Operations"/>), (see <xref target="Operations" format="default"/>),
the Traffic Selector (TSi) entry containing the information of the the Traffic Selector (TSi) entry containing the information of the
trigger packet should be included in the TS set similarly to trigger packet should be included in the TS set similarly to
regular Child SAs as specified in IKEv2 <xref target="RFC7296"/> Section regular Child SAs as specified in IKEv2 <xref target="RFC7296" format="de
2.9. fault" section="2.9"
sectionFormat="comma"/>.
Based on the trigger TSi entry, an implementation can select the most Based on the trigger TSi entry, an implementation can select the most
optimal target CPU to install the additional Child SA on. For example, optimal target CPU to install the additional Child SA on. For example,
if the trigger packet was for a TCP destination to port 25 (SMTP), it if the trigger packet was for a TCP destination to port 25 (SMTP), it
might be able to install the Child SA on the CPU that is also running might be able to install the Child SA on the CPU that is also running
the mail server process. Trigger packet Traffic Selectors are the mail server process. Trigger packet Traffic Selectors are
documented in IKEv2 <xref target="RFC7296"/> Section 2.9. documented in IKEv2 <xref target="RFC7296" format="default" section="2.9" sectionFormat="comma"/>.
</t> </t>
<t> <t>
As per IKEv2, rekeying a Child SA SHOULD use the same (or As per IKEv2, rekeying a Child SA <bcp14>SHOULD</bcp14> use the same (or
wider) Traffic Selectors to ensure that the new Child SA covers wider) Traffic Selectors to ensure that the new Child SA covers
everything that the rekeyed Child SA covers. This includes everything that the rekeyed Child SA covers.
Traffic Selectors negotiated via Configuration Payloads (CP) This includes
such as INTERNAL_IP4_ADDRESS which may use the original wide TS Traffic Selectors negotiated via Configuration Payloads
such as INTERNAL_IP4_ADDRESS, which may use the original wide TS
set or use the narrowed TS set. set or use the narrowed TS set.
</t> </t>
</section> </section>
<section title="Payload Format" anchor="payload_formats"> <section anchor="payload_formats" numbered="true" toc="default">
<name>Payload Format</name>
<t> <t>
The Notify Payload format is defined in IKEv2 <xref target="RFC7296"/> The Notify Payload format is defined in IKEv2 <xref target="RFC7296" form
section 3.10, and is copied here for convenience. at="default" section="3.10" sectionFormat="comma"/>, and is copied here for conv
enience.
</t> </t>
<t> <t>
All multi-octet fields representing integers are laid out in big All multi-octet fields representing integers are laid out in big
endian order (also known as "most significant byte first", or endian order (also known as "most significant byte first", or
"network byte order"). "network byte order").
</t> </t>
<section title="SA_RESOURCE_INFO Notify Message Status Type payload" ancho <section anchor="payload_info_cpu" numbered="true" toc="default">
r="payload_info_cpu"> <name>SA_RESOURCE_INFO Notify Message Status Type Payload</name>
<figure align="center"> <artwork align="left" name="" type="" alt=""><![CDATA[
<artwork align="left"><![CDATA[
1 2 3 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-----------------------------+-------------------------------+ +-------------------------------+-------------------------------+
! Next Payload !C! RESERVED ! Payload Length ! | Next Payload |C| RESERVED | Payload Length |
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Protocol ID ! SPI Size ! Notify Message Type ! | Protocol ID | SPI Size | Notify Message Type |
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! ! | |
~ Resource Identifier (optional) ~ ~ Resource Identifier (optional) ~
! ! | |
+-------------------------------+-------------------------------+ +-------------------------------+-------------------------------+
]]></artwork> ]]></artwork>
</figure>
<t> <dl>
<list style="symbols"> <dt>(C)ritical flag -</dt><dd><bcp14>MUST</bcp14> be 0.</dd>
<t>Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.</t> <dt>Protocol ID (1 octet) -</dt><dd><bcp14>MUST</bcp14> be 0. <bcp14>M
<t>SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.</t> UST</bcp14> be ignored if not 0.</dd>
<t>Notify Status Message Type value (2 octets) - set to [TBD1].</t> <dt>SPI Size (1 octet) -</dt><dd><bcp14>MUST</bcp14> be 0. <bcp14>MUST
<t>Resource Identifier (optional). This opaque data may be set to co </bcp14> be ignored if not 0.</dd>
nvey the local identity of the resource.</t> <dt>Notify Status Message Type value (2 octets) -</dt><dd>set to 16444
</list> .</dd>
</t> <dt>Resource Identifier (optional) -</dt><dd>This opaque data may be s
et to convey the local identity of the resource.</dd>
</dl>
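<t>
The layout above can be exercised with a short encoder and decoder. This is a
sketch only: the function names are hypothetical, the value 16444 is the one
given in the field list above, and treating every octet after the 8-octet
header as the Resource Identifier is an assumption that follows from ignoring
the Protocol ID and SPI Size fields.
</t>

```python
import struct

SA_RESOURCE_INFO = 16444  # Notify Message Status Type value listed above

# Next Payload, C/RESERVED, Payload Length, Protocol ID, SPI Size, Notify Type
_HEADER = struct.Struct("!BBHBBH")  # network byte order, 8 octets


def build_sa_resource_info(next_payload: int, resource_id: bytes = b"") -> bytes:
    """Encode the payload; Critical flag, Protocol ID and SPI Size are all 0."""
    length = _HEADER.size + len(resource_id)  # Payload Length covers the header
    return _HEADER.pack(next_payload, 0, length, 0, 0, SA_RESOURCE_INFO) + resource_id


def parse_sa_resource_info(buf: bytes) -> bytes:
    """Return the (possibly empty) Resource Identifier from an encoded payload."""
    _next, _flags, length, _proto, _spi_size, ntype = _HEADER.unpack_from(buf)
    if ntype != SA_RESOURCE_INFO:
        raise ValueError("not an SA_RESOURCE_INFO notify")
    if length < _HEADER.size or length > len(buf):
        raise ValueError("bad Payload Length")
    # Protocol ID and SPI Size are ignored even if non-zero.
    return buf[_HEADER.size:length]


payload = build_sa_resource_info(next_payload=0, resource_id=b"cpu3")
```

<t>
An empty Resource Identifier yields an 8-octet payload, which matches the case
where the notify carries no identifying data.
</t>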
</section> </section>
<section title="TS_MAX_QUEUE Notify Message Error Type Payload" anchor="pa <section anchor="payload_max_q" numbered="true" toc="default">
yload_max_q"> <name>TS_MAX_QUEUE Notify Message Error Type Payload</name>
<figure align="center"> <artwork align="left" name="" type="" alt=""><![CDATA[
<artwork align="left"><![CDATA[
1 2 3 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Next Payload !C! RESERVED ! Payload Length ! | Next Payload |C| RESERVED | Payload Length |
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Protocol ID ! SPI Size ! Notify Message Type ! | Protocol ID | SPI Size | Notify Message Type |
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
]]></artwork> ]]></artwork>
</figure> <dl>
<t> <dt>(C)ritical flag -</dt><dd><bcp14>MUST</bcp14> be 0.</dd>
<list style="symbols"> <dt>Protocol ID (1 octet) -</dt><dd><bcp14>MUST</bcp14> be 0. <bcp14>M
<t>Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.</t> UST</bcp14> be ignored if not 0.</dd>
<t>SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.</t> <dt>SPI Size (1 octet) -</dt><dd><bcp14>MUST</bcp14> be 0. <bcp14>MUST
<t>Notify Message Error Type (2 octets) - set to [TBD2]</t> </bcp14> be ignored if not 0.</dd>
<dt>Notify Message Error Type (2 octets) -</dt><dd>set to 48.</dd>
</dl>
<t>There is no data associated with this Notify type.</t> <t>There is no data associated with this Notify type.</t>
</list>
</t>
</section> </section>
</section> </section>
<section anchor="Operations" title="Operational Considerations"> <section anchor="Operations" numbered="true" toc="default">
<t> <name>Operational Considerations</name>
Implementations supporting per-CPU SAs SHOULD extend their local <t>
Implementations supporting per-CPU SAs <bcp14>SHOULD</bcp14> extend their
local
SPD selector, and the mechanism of on-demand negotiation that is SPD selector, and the mechanism of on-demand negotiation that is
triggered by traffic to include a CPU (or queue) identifier in triggered by traffic to include a CPU (or queue) identifier in
their packet trigger (e.g. SADB_ACQUIRE) message from the SPD to their packet trigger (e.g., SADB_ACQUIRE) message from the SPD to
the IKE daemon. An implementation which does not support the IKE daemon. An implementation that does not support
receiving per-CPU packet trigger messages MAY initiate all its Child receiving per-CPU packet trigger messages <bcp14>MAY</bcp14> initiate all
its Child
SAs immediately upon receiving the (only) packet trigger message it SAs immediately upon receiving the (only) packet trigger message it
will receive from the IPsec stack. Such implementations also need will receive from the IPsec stack. Such an implementation also needs
to be careful when receiving a Delete Notify request for a per-CPU to be careful when receiving a Delete Notify request for a per-CPU
Child SA, as it has no method to detect when it should bring up such Child SA, as it has no method to detect when it should bring up such
a per-CPU Child SA again later. And bringing the deleted per-CPU a per-CPU Child SA again later. Also, bringing the deleted per-CPU
Child SA up again immediately after receiving the Delete Notify Child SA up again immediately after receiving the Delete Notify
might cause an infinite loop between the peers. Another issue of might cause an infinite loop between the peers. Another issue with
not bringing up all its per-CPU Child SAs is that if the peer acts not bringing up all its per-CPU Child SAs is that if the peer acts
similarly, the two peers might end up with only the first Child similarly, the two peers might end up with only the first Child
SA without ever activating any per-CPU Child SAs. It is therefor SA without ever activating any per-CPU Child SAs. It is therefore
RECOMMENDED to implement per-CPU packet trigger messages. <bcp14>RECOMMENDED</bcp14> to implement per-CPU packet trigger messages.
</t> </t>
<t> <t>
Peers SHOULD be flexible with the maximum number of Child SAs they Peers <bcp14>SHOULD</bcp14> be flexible with the maximum number of Child S
allow for a given TSi/TSr combination to account for corner cases. For As they
allow for a given TSi/TSr combination in order to account for corner cases
. For
example, during Child SA rekeying, there might be a large number example, during Child SA rekeying, there might be a large number
of additional Child SAs created before the old Child SAs are torn of additional Child SAs created before the old Child SAs are torn
down. Similarly, when using on-demand Child SAs, both ends could down. Similarly, when using on-demand Child SAs, both ends could
trigger multiple Child SA requests as the initial packet causing trigger multiple Child SA requests as the initial packet causing
the Child SA negotiation might have been transported to the peer the Child SA negotiation might have been transported to the peer
via the first Child SA where its reply packet might also trigger an via the first Child SA, where its reply packet might also trigger an
on-demand Child SA negotiation to start. As additional Child SAs on-demand Child SA negotiation to start. As additional Child SAs
consume little additional resources, allowing at the very least double consume little additional resources, allowing at the very least double
the number of available CPUs is RECOMMENDED. An implementation MAY allow the number of available CPUs is <bcp14>RECOMMENDED</bcp14>. An implementation <bcp14>MAY</bcp14> allow
unlimited additional Child SAs and only limit this number based on its unlimited additional Child SAs and only limit this number based on its
generic resource protection strategies that are used to require COOKIES generic resource protection strategies that are used to require COOKIES
or refuse new IKE or Child SA negotiations. Although having a very large or refuse new IKE or Child SA negotiations. Although having a very large
number (e.g. hundreds or thousands) of SAs may slow down per-packet SAD lookup. number (e.g., hundreds or thousands) of SAs may slow down per-packet SAD lookup.
</t> </t>
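The sizing guidance in the paragraph above can be illustrated with a minimal sketch. It is not part of the specification: the function name is hypothetical, and using Python's `os.cpu_count()` as the source of the CPU count (with a fallback to 1) is an assumption made only for illustration.

```python
import os


def recommended_min_child_sa_limit(cpu_count=None):
    """Smallest per-TSi/TSr Child SA limit that follows the
    'at the very least double the number of available CPUs' guidance."""
    if cpu_count is None:
        # os.cpu_count() may return None; falling back to 1 is an assumption.
        cpu_count = os.cpu_count() or 1
    if not cpu_count >= 1:
        raise ValueError("cpu_count must be a positive integer")
    return 2 * cpu_count
```

An implementation is free to allow more than this floor, or to apply no fixed limit and rely on its generic resource protection strategies instead.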
<t> <t>
Implementations might support dynamically moving a per-CPU Child Implementations might support dynamically moving a per-CPU Child
SAs from one CPU to another CPU. If this method is supported, SA from one CPU to another CPU. If this method is supported,
implementations must be careful to move both the inbound and outbound implementations must be careful to move both the inbound and outbound
SAs. If the IPsec endpoint is a gateway, it can move the inbound SA SAs. If the IPsec endpoint is a gateway, it can move the inbound SA
and outbound SA independently of each other. It is likely that and outbound SA independently of each other. It is likely that
for a gateway, IPsec traffic would be asymmetric. If the IPsec for a gateway, IPsec traffic would be asymmetric. If the IPsec
endpoint is the same host responsible for generating the traffic, endpoint is the same host responsible for generating the traffic,
the inbound and outbound SAs SHOULD remain as a pair on the same CPU. the inbound and outbound SAs <bcp14>SHOULD</bcp14> remain as a pair on the same CPU.
If a host previously skipped installing an outbound SA because it If a host previously skipped installing an outbound SA because it
would be an unused duplicate outbound SA, it will have to create would be an unused duplicate outbound SA, it will have to create
and add the previously skipped outbound SA to the SAD with the new and add the previously skipped outbound SA to the SAD with the new
CPU ID. The inbound SA may not have CPU ID in the SAD. Adding the CPU ID. The inbound SA may not have a CPU ID in the SAD. Adding the
outbound SA to the SAD requires access to the key material, whereas outbound SA to the SAD requires access to the key material, whereas
for updating the CPU selector on an existing outbound SAs access updating the CPU selector on an existing outbound SAs might not require access
to key material might not be needed. To support this, the IKE to key material. To support this, the IKE
software might have to hold on to the key material longer than it software might have to hold on to the key material longer than it
normally would, as it might actively attempt to destroy key material normally would, as it might actively attempt to destroy key material
from memory that the IKE daemon no longer needs access to. from memory that the IKE daemon no longer needs access to.
</t> </t>
<t> <t>
An implementation that does not accept any further resource specific An implementation that does not accept any further resource-specific
Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because this Child SAs <bcp14>MUST NOT</bcp14> return the NO_ADDITIONAL_SAS error because
can be interpreted by the peer that no other Child SAs with different it
TSi/TSr are allowed either. Instead, it MUST return TS_MAX_QUEUE. could be misinterpreted by the peer to mean that no other Child SA with
</t> a different TSi and/or TSr is allowed either. Instead, it <bcp14>MUST</bcp14>
return TS_MAX_QUEUE.
</t>
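The error-selection rule above can be sketched as a small admission check. This is an illustration under stated assumptions, not the behavior of any particular IKE daemon: the class and method names are hypothetical, traffic selectors are modelled as plain strings, and the limit is passed in by the caller. The value 48 for TS_MAX_QUEUE is the one registered by this document; 35 for NO_ADDITIONAL_SAS comes from RFC 7296.

```python
from collections import Counter

NO_ADDITIONAL_SAS = 35  # RFC 7296 error type; MUST NOT be returned in this case
TS_MAX_QUEUE = 48       # error type registered by this document


class ChildSaLimiter:
    """Hypothetical per-TSi/TSr admission check for additional Child SAs."""

    def __init__(self, max_per_selector):
        self.max_per_selector = max_per_selector
        self.installed = Counter()  # (tsi, tsr) -> number of installed Child SAs

    def admit(self, tsi, tsr):
        """Return None when the request is accepted, otherwise the
        Notify error type to send back to the peer."""
        key = (tsi, tsr)
        if self.installed[key] >= self.max_per_selector:
            # Only this TSi/TSr combination is exhausted, so signal
            # TS_MAX_QUEUE rather than NO_ADDITIONAL_SAS.
            return TS_MAX_QUEUE
        self.installed[key] += 1
        return None
```

Because the counter is keyed by the TSi/TSr pair, a request for a different selector combination is still admitted after one combination has reached its limit, which is the distinction the peer needs to be able to draw.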
</section> </section>
<section anchor="Security" title="Security Considerations"> <section anchor="Security" numbered="true" toc="default">
<t> <name>Security Considerations</name>
<t>
Similar to how an implementation should limit the number of Similar to how an implementation should limit the number of
half-open SAs to limit the impact of a denial of service attack, half-open SAs to limit the impact of a denial-of-service attack,
it is RECOMMENDED that an implementation limits the maximum number of additional it is <bcp14>RECOMMENDED</bcp14> that an implementation limits the maximum number of additional
Child SAs allowed per unique TSi/TSr. Child SAs allowed per unique TSi/TSr.
</t> </t>
<t> <t>
Using multiple resource specific child SAs makes sense for Using multiple resource-specific child SAs makes sense for
high volume IPsec connections on IPsec gateway machines where the high-volume IPsec connections on IPsec gateway machines where the
administrator has a trust relationship with the peer's administrator has a trust relationship with the peer's
administrator and abuse is unlikely and easily escalated to resolve. administrator and abuse is unlikely and easily escalated to resolve.
</t> </t>
<t> <t>
This trust relationship is usually not present for the Remote This trust relationship is usually not present for the deployments of remote
Access VPN type deployments, and allowing per-CPU Child SA's access VPNs, and allowing per-CPU Child SAs
is NOT RECOMMENDED in these scenarios. Therefore, it is also NOT is <bcp14>NOT RECOMMENDED</bcp14> in these scenarios. Therefore, it is also <bcp14>NOT
RECOMMENDED to allow per-CPU Child SAs per default. RECOMMENDED</bcp14> to allow per-CPU Child SAs by default.
</t> </t>
<t> <t>
The SA_RESOURCE_INFO notify contains an optional data payload that The SA_RESOURCE_INFO notify contains an optional data payload that
can be used by the peer to identify the Child SA belonging to a can be used by the peer to identify the Child SA belonging to a
specific resource. The notify data SHOULD NOT be an identifier that specific resource. Notification data <bcp14>SHOULD NOT</bcp14> be an identifier that
can be used to gain information about the hardware. For example, can be used to gain information about the hardware. For example,
using the CPU number itself as identifier might give an attacker using the CPU number itself as the identifier might give an attacker
knowledge which packets are handled by which CPU ID and it might knowledge of which packets are handled by which CPU ID, and it might
optimize a brute force attack against the system. optimize a brute-force attack against the system.
</t> </t>
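One way to avoid exposing CPU numbers in the notify data is to hand out random opaque identifiers and keep the mapping local. The sketch below is an assumption-laden illustration, not a method prescribed by the text: the function name, the 8-byte identifier length, and the use of Python's `secrets` module are all choices made here for the example.

```python
import secrets


def assign_opaque_resource_ids(cpu_ids, id_len=8):
    """Map each local CPU ID to a random opaque byte string that can be
    carried as SA_RESOURCE_INFO notify data without revealing the CPU number."""
    mapping = {}
    used = set()
    for cpu in cpu_ids:
        token = secrets.token_bytes(id_len)
        while token in used:  # regenerate on the (unlikely) collision
            token = secrets.token_bytes(id_len)
        used.add(token)
        mapping[cpu] = token
    return mapping
```

The mapping stays on the local host; the peer only ever sees the opaque byte strings and can still use them to tell per-resource Child SAs apart.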
</section> </section>
<section title="Implementation Status" anchor="impl_status"> <section anchor="IANA" numbered="true" toc="default">
<t> <name>IANA Considerations</name>
[Note to RFC Editor: Please remove this section and the reference to
<xref target="RFC7942"/> before publication.]
</t>
<t>
This section records the status of known implementations of the
protocol defined by this specification at the time of posting of
this Internet-Draft, and is based on a proposal described in
<xref target="RFC7942"/>. The description of implementations in this
section is intended to assist the IETF in its decision processes
in progressing drafts to RFCs. Please note that the listing of
any individual implementation here does not imply endorsement
by the IETF. Furthermore, no effort has been spent to verify the
information presented here that was supplied by IETF contributors.
This is not intended as, and must not be construed to be, a catalog
of available implementations or their features. Readers are advised
to note that other implementations may exist.
</t>
<t>
According to <xref target="RFC7942"/>, "this will allow reviewers
and working groups to assign due consideration to documents that
have the benefit of running code, which may serve as evidence of
valuable experimentation and feedback that have made the implemented
protocols more mature. It is up to the individual working groups
to use this information as they see fit".
</t>
<t> <t>
Authors are requested to add a note to the RFC Editor at the IANA has registered one new value in the "IKEv2 Notify Message Status Types" registry.
top of this section, advising the Editor to remove the entire </t>
section before publication, as well as the reference to <xref target="RFC7942"/>. <table anchor="iana_requests_i">
</t> <thead>
<section anchor="section.impl-status.xfrm" title="Linux XFRM"> <tr>
<t> <th>Value</th>
<list style="hanging"> <th>Notify Message Status Type</th>
<t hangText="Organization: ">Linux kernel XFRM</t> <th>Reference</th>
<t hangText="Name: ">XFRM-PCPU-v7 https://git.kernel.org/pub/scm/linux/kernel/git/klassert/linux-stk.git/log/?h=xfrm-pcpu-v7</t> </tr>
</thead>
<t hangText="Description: "> An initial Kernel IPsec implementation <tbody>
of the per-CPU method.</t> <tr>
<t hangText="Level of maturity: ">Alpha</t> <td>16444</td>
<t hangText="Coverage: "> <td>SA_RESOURCE_INFO</td>
Implements a general Child SA and per-CPU Child SAs. It only supports <td>RFC 9611</td>
the NETLINK API. The PFKEYv2 API is not supported.</t> </tr>
</tbody>
<t hangText="Licensing: ">GPLv2</t> </table>
<t hangText="Implementation experience: "> The Linux XFRM
implementation added two additional attributes to support per-CPU SAs.
There is a new attribute XFRMA_SA_PCPU, u32, for the SAD entry.
This attribute should present on the outgoing SA, per-CPU Child SAs,
starting from 0. This attribute MUST NOT be present on the first
XFRM SA. It is used by the kernel only for the outgoing traffic,
(clear to encrypted).
The incoming SAs do not need XFRMA_SA_PCPU attribute. XFRM stack can not
use CPU id on the incoming SA. The kernel internally sets the value to
0xFFFFFF for the incoming SA and the initial Child SA that can be used by
any CPU.
However, one may add XFRMA_SA_PCPU to the incoming per-CPU SA to steer
the ESP flow, to a specific Q or CPU e.g ethtool ntuple configuration.
The SPD entry has new flag XFRM_POLICY_CPU_ACQUIRE.
It should be set only on the "out" policy. The flag should
be disabled when the policy is a trap policy, without SPD entries.
After a successful negotiation of SA_RESOURCE_INFO, while adding the
first Child SA, the SPD entry can be updated with the
XFRM_POLICY_CPU_ACQUIRE flag.
When XFRM_POLICY_CPU_ACQUIRE is set, the XFRM_MSG_ACQUIRE generated
will include the XFRMA_SA_PCPU attribute.
</t>
<t hangText="Contact: ">Steffen Klassert steffen.klassert@secunet.com</t>
</list>
</t>
</section>
<section anchor="section.impl-status.libreswan" title="Libreswan">
<t>
<list style="hanging">
<t hangText="Organization: ">The Libreswan Project</t>
<t hangText="Name: ">pcpu-3 https://libreswan.org/wiki/XFRM_pCPU</t>
<t hangText="Description: ">
An initial IKE implementation of the per-CPU method.</t>
<t hangText="Level of maturity: ">Alpha</t>
<t hangText="Coverage: ">
implements combining a regular (all-CPUs) Child SA and per-CPU additional Child SAs</t>
<t hangText="Licensing: ">GPLv2</t>
<t hangText="Implementation experience: ">TBD</t>
<t hangText="Contact: ">Libreswan Development: swan-dev@libreswan.org</t>
</list>
</t>
</section>
<section anchor="section.impl-status.strongswan" title="strongSwan">
<t>
<list style="hanging">
<t hangText="Organization: ">The StrongSwan Project</t>
<t hangText="Name: ">StrongSwan https://github.com/strongswan/strongswan/tree/per-cpu-sas-poc/</t>
<t hangText="Description: ">
An initial IKE implementation of the per-CPU method.</t>
<t hangText="Level of maturity: ">Alpha</t>
<t hangText="Coverage: "> implements combining a regular (all-CPUs)
Child SA and per-CPU additional Child SAs</t>
<t hangText="Licensing: ">GPLv2</t>
<t hangText="Implementation experience: ">
StrongSwan use private space values for notifications
SA_RESOURCE_INFO (40970).
</t>
<t hangText="Contact: ">Tobias Brunner tobias@strongswan.org</t>
</list>
</t>
</section>
<section anchor="section.impl-status.iproute2" title="iproute2">
<t>
<list style="hanging">
<t hangText="Organization: ">The iproute2 Project</t>
<t hangText="Name: "> iproute2 https://github.com/antonyantony/iproute2/tree/pcpu-v1</t>
<t hangText="Description: ">Implemented the per-CPU attributes for the
"ip xfrm" command.</t>
<t hangText="Level of maturity: ">Alpha</t>
<t hangText="Licensing: ">GPLv2</t>
<t hangText="Implementation experience: ">TBD</t>
<t hangText="Contact: ">Antony Antony antony.antony@secunet.com</t>
</list>
</t>
</section>
</section>
<section anchor="IANA" title="IANA Considerations">
<t> <t>
This document defines one new registration for the IANA "IKEv2 Notify Message Status Types" registry. IANA has registered one new value in the "IKEv2 Notify Message Error Types" registry.
</t> </t>
<figure align="center" anchor="iana_requests_i"> <table anchor="iana_requests_e">
<artwork align="left"><![CDATA[ <thead>
Value Notify Message Status Type Reference <tr>
----- ------------------------------ --------------- <th>Value</th>
[TBD1] SA_RESOURCE_INFO [this document] <th>Notify Message Error Type</th>
]]></artwork> <th>Reference</th>
</figure> </tr>
<t> </thead>
This document defines one new registration for the IANA "IKEv2 Notify Message Error Types" registry. <tbody>
<tr>
</t> <td>48</td>
<figure align="center" anchor="iana_requests_e"> <td>TS_MAX_QUEUE</td>
<artwork align="left"><![CDATA[ <td>RFC 9611</td>
Value Notify Message Error Type Reference </tr>
----- ------------------------------ --------------- </tbody>
[TBD2] TS_MAX_QUEUE [this document] </table>
]]></artwork>
</figure>
</section>
<section title="Acknowledgements" anchor="acks">
<t>The following people provided reviews and valuable feedback: Roman Danyliw, Warren Kumari
Tero Kivinen, Murray Kucherawy, John Scudder, Valery Smyslov, Gunter van
de Velde and Eric Vyncke.
</t>
</section> </section>
</middle> </middle>
<back> <back>
<references title="Normative References"> <references>
&RFC2119; <name>References</name>
&RFC7296; <references>
&RFC8174; <name>Normative References</name>
</references> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<references title="Informative References"> <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7296.xml"/>
&RFC2367; <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
&RFC4301;
&RFC7942;
</references>
<references>
<name>Informative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2367.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4301.xml"/>
</references>
</references> </references>
<section anchor="acks" numbered="false" toc="default">
<name>Acknowledgements</name>
<t>The following people provided reviews and valuable feedback:
<contact fullname="Roman Danyliw" />, <contact fullname="Warren Kumari" />,
<contact fullname="Tero Kivinen" />, <contact fullname="Murray Kucherawy" />,
<contact fullname="John Scudder" />, <contact fullname="Valery Smyslov" />,
<contact fullname="Gunter van de Velde" />, and <contact fullname="Éric Vyncke" />.
</t>
</section>
</back> </back>
</rfc> </rfc>
 End of changes. 78 change blocks. 
371 lines changed or deleted 314 lines changed or added

This html diff was produced by rfcdiff 1.48.