rfc9611.original | rfc9611.txt | |||
---|---|---|---|---|
Network A. Antony | Internet Engineering Task Force (IETF) A. Antony | |||
Internet-Draft secunet | Request for Comments: 9611 secunet | |||
Intended status: Standards Track T. Brunner | Category: Standards Track T. Brunner | |||
Expires: 3 November 2024 codelabs | ISSN: 2070-1721 codelabs | |||
S. Klassert | S. Klassert | |||
secunet | secunet | |||
P. Wouters | P. Wouters | |||
Aiven | Aiven | |||
2 May 2024 | July 2024 | |||
IKEv2 support for per-resource Child SAs | Internet Key Exchange Protocol Version 2 (IKEv2) Support for | |||
draft-ietf-ipsecme-multi-sa-performance-09 | Per-Resource Child Security Associations (SAs) | |||
Abstract | Abstract | |||
This document defines one Notify Message Status Types and one Notify | In order to increase the bandwidth of IPsec traffic between peers, | |||
this document defines one Notify Message Status Types and one Notify | ||||
Message Error Types payload for the Internet Key Exchange Protocol | Message Error Types payload for the Internet Key Exchange Protocol | |||
Version 2 (IKEv2) to support the negotiation of multiple Child | Version 2 (IKEv2) to support the negotiation of multiple Child | |||
Security Associations (SAs) with the same Traffic Selectors used on | Security Associations (SAs) with the same Traffic Selectors used on | |||
different resources, such as CPUs, to increase bandwidth of IPsec | different resources, such as CPUs. | |||
traffic between peers. | ||||
The SA_RESOURCE_INFO notification is used to convey information that | The SA_RESOURCE_INFO notification is used to convey information that | |||
the negotiated Child SA and subsequent new Child SAs with the same | the negotiated Child SA and subsequent new Child SAs with the same | |||
Traffic Selectors are a logical group of Child SAs where most or all | Traffic Selectors are a logical group of Child SAs where most or all | |||
of the Child SAs are bound to a specific resource, such as a specific | of the Child SAs are bound to a specific resource, such as a specific | |||
CPU. The TS_MAX_QUEUE notify conveys that the peer is unwilling to | CPU. The TS_MAX_QUEUE notify conveys that the peer is unwilling to | |||
create more additional Child SAs for this particular negotiated | create more additional Child SAs for this particular negotiated | |||
Traffic Selector combination. | Traffic Selector combination. | |||
Using multiple Child SAs with the same Traffic Selectors has the | Using multiple Child SAs with the same Traffic Selectors has the | |||
benefit that each resource holding the Child SA has its own Sequence | benefit that each resource holding the Child SA has its own Sequence | |||
Number Counter, ensuring that CPUs don't have to synchronize their | Number Counter, ensuring that CPUs don't have to synchronize their | |||
cryptographic state or disable their packet replay protection. | cryptographic state or disable their packet replay protection. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This is an Internet Standards Track document. | |||
provisions of BCP 78 and BCP 79. | ||||
Internet-Drafts are working documents of the Internet Engineering | ||||
Task Force (IETF). Note that other groups may also distribute | ||||
working documents as Internet-Drafts. The list of current Internet- | ||||
Drafts is at https://datatracker.ietf.org/drafts/current/. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | This document is a product of the Internet Engineering Task Force | |||
and may be updated, replaced, or obsoleted by other documents at any | (IETF). It represents the consensus of the IETF community. It has | |||
time. It is inappropriate to use Internet-Drafts as reference | received public review and has been approved for publication by the | |||
material or to cite them other than as "work in progress." | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | ||||
This Internet-Draft will expire on 3 November 2024. | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | ||||
https://www.rfc-editor.org/info/rfc9611. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2024 IETF Trust and the persons identified as the | Copyright (c) 2024 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents (https://trustee.ietf.org/ | Provisions Relating to IETF Documents | |||
license-info) in effect on the date of publication of this document. | (https://trustee.ietf.org/license-info) in effect on the date of | |||
Please review these documents carefully, as they describe your rights | publication of this document. Please review these documents | |||
and restrictions with respect to this document. Code Components | carefully, as they describe your rights and restrictions with respect | |||
extracted from this document must include Revised BSD License text as | to this document. Code Components extracted from this document must | |||
described in Section 4.e of the Trust Legal Provisions and are | include Revised BSD License text as described in Section 4.e of the | |||
provided without warranty as described in the Revised BSD License. | Trust Legal Provisions and are provided without warranty as described | |||
in the Revised BSD License. | ||||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction | |||
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 3 | 1.1. Requirements Language | |||
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Terminology | |||
2. Performance bottlenecks . . . . . . . . . . . . . . . . . . . 4 | 2. Performance Bottlenecks | |||
3. Negotiation of CPU specific Child SAs . . . . . . . . . . . . 4 | 3. Negotiation of Resource-Specific Child SAs | |||
4. Implementation Considerations . . . . . . . . . . . . . . . . 5 | 4. Implementation Considerations | |||
5. Payload Format . . . . . . . . . . . . . . . . . . . . . . . 6 | 5. Payload Format | |||
5.1. SA_RESOURCE_INFO Notify Message Status Type payload . . . 6 | 5.1. SA_RESOURCE_INFO Notify Message Status Type Payload | |||
5.2. TS_MAX_QUEUE Notify Message Error Type Payload . . . . . 7 | 5.2. TS_MAX_QUEUE Notify Message Error Type Payload | |||
6. Operational Considerations . . . . . . . . . . . . . . . . . 7 | 6. Operational Considerations | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 7. Security Considerations | |||
8. Implementation Status . . . . . . . . . . . . . . . . . . . . 9 | 8. IANA Considerations | |||
8.1. Linux XFRM . . . . . . . . . . . . . . . . . . . . . . . 9 | 9. References | |||
8.2. Libreswan . . . . . . . . . . . . . . . . . . . . . . . . 10 | 9.1. Normative References | |||
8.3. strongSwan . . . . . . . . . . . . . . . . . . . . . . . 11 | 9.2. Informative References | |||
8.4. iproute2 . . . . . . . . . . . . . . . . . . . . . . . . 11 | Acknowledgements | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 | Authors' Addresses | |||
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
11.1. Normative References . . . . . . . . . . . . . . . . . . 12 | ||||
11.2. Informative References . . . . . . . . . . . . . . . . . 12 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 | ||||
1. Introduction | 1. Introduction | |||
Most IPsec implementations are currently limited to using one | Most IPsec implementations are currently limited to using one | |||
hardware queue or a single CPU resource for a Child SA. Running | hardware queue or a single CPU resource for a Child SA. Running | |||
packet stream encryption in parallel can be done, but there is a | packet stream encryption in parallel can be done, but there is a | |||
bottleneck of different parts of the hardware locking or waiting to | bottleneck of different parts of the hardware locking or waiting to | |||
get their sequence number assigned for the packet it is encrypting. | get their sequence number assigned for the packet being encrypted. | |||
The result is that a machine with many such resources is limited to | The result is that a machine with many such resources is limited to | |||
only using one of these resources per Child SA. This severely limits | using only one of these resources per Child SA. This severely limits | |||
the throughput that can be attained. For example, at the time of | the throughput that can be attained. For example, at the time of | |||
writing, an unencrypted link of 10Gbps or more is commonly reduced to | writing, an unencrypted link of 10 Gbps or more is commonly reduced | |||
2-5Gbps when IPsec is used to encrypt the link using AES-GCM. By | to 2-5 Gbps when IPsec is used to encrypt the link using AES-GCM. By | |||
using the implementation specified in this document, aggregate | using the implementation specified in this document, aggregate | |||
throughput increased from 5Gbps using 1 CPU to 40-60 Gbps using 25-30 | throughput increased from 5Gbps using 1 CPU to 40-60 Gbps using 25-30 | |||
CPUs. | CPUs. | |||
While this could be (partially) mitigated by setting up multiple | While this could be (partially) mitigated by setting up multiple | |||
narrowed Child SAs, for example using Populate From Packet (PFP) as | narrowed Child SAs (for example, using Populate From Packet (PFP) as | |||
specified in IPsec Architecture [RFC4301], this IPsec feature would | specified in IPsec architecture [RFC4301]), this IPsec feature would | |||
cause too many Child SAs (one per network flow) or too few Child SAs | cause too many Child SAs (one per network flow) or too few Child SAs | |||
(one network flow used on multiple CPUs). PFP is also not widely | (one network flow used on multiple CPUs). PFP is also not widely | |||
implemented. | implemented. | |||
To make better use of multiple network queues and CPUs, it can be | To make better use of multiple network queues and CPUs, it can be | |||
beneficial to negotiate and install multiple Child SAs with identical | beneficial to negotiate and install multiple Child SAs with identical | |||
Traffic Selectors. IKEv2 [RFC7296] already allows installing | Traffic Selectors. IKEv2 [RFC7296] already allows installing | |||
multiple Child SAs with identical Traffic Selectors, but it offers no | multiple Child SAs with identical Traffic Selectors, but it offers no | |||
method to indicate that the additional Child SA is being requested | method to indicate that the additional Child SA is being requested | |||
for performance increase reasons and is restricted to some resource | for performance increase reasons and is restricted to some resource | |||
(queue or CPU). | (queue or CPU). | |||
When an IKEv2 peer is receiving more additional Child SA's for a | When an IKEv2 peer is receiving more additional Child SAs for a | |||
single set of Traffic Selectors than it is willing to create, it can | single set of Traffic Selectors than it is willing to create, it can | |||
return an error notify of TS_MAX_QUEUE. | return an error notify of TS_MAX_QUEUE. | |||
1.1. Requirements Language | 1.1. Requirements Language | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
"OPTIONAL" in this document are to be interpreted as described in BCP | "OPTIONAL" in this document are to be interpreted as described in | |||
14 [RFC2119] [RFC8174] when, and only when, they appear in all | BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all | |||
capitals, as shown here. | capitals, as shown here. | |||
1.2. Terminology | 1.2. Terminology | |||
This document uses the following terms defined in IKEv2 [RFC7296]: | This document uses the following terms defined in IKEv2 [RFC7296]: | |||
Notification Data, Traffic Selectors (TS), TSi/TSr, Child SA, | Notification Data, Traffic Selector (TS), Traffic Selector initiator | |||
Configuration Payload (CP), IKE SA, CREATE_CHILD_SA and | (TSi), Traffic Selector responder (TSr), Child SA, Configuration | |||
NO_ADDITIONAL_SAS. | Payload (CP), IKE SA, CREATE_CHILD_SA, and NO_ADDITIONAL_SAS. | |||
This document also uses the following terms defined in [RFC4301]: | This document also uses the following terms defined in [RFC4301]: | |||
SPD, SA. | Security Policy Database (SPD), SA. | |||
2. Performance bottlenecks | 2. Performance Bottlenecks | |||
There are several pragmatic reasons why most implementations must | There are several pragmatic reasons why most implementations must | |||
restrict a Child Security Association (SA) to a single specific | restrict a Child Security Association (SA) to a single specific | |||
hardware resource. A primary limitation arises from the challenges | hardware resource. A primary limitation arises from the challenges | |||
associated with sharing cryptographic states, counters, and sequence | associated with sharing cryptographic states, counters, and sequence | |||
numbers among multiple CPUs. When these CPUs attempt to | numbers among multiple CPUs. When these CPUs attempt to | |||
simultaneously utilize shared states, it becomes impractical to do so | simultaneously utilize shared states, it becomes impractical to do so | |||
without incurring a significant performance penalty. It is necessary | without incurring a significant performance penalty. It is necessary | |||
to negotiate and establish multiple Child Security Associations (SAs) | to negotiate and establish multiple Child SAs with identical Traffic | |||
with identical Traffic Selector initiator (TSi) and Traffic Selector | Selector initiator (TSi) and Traffic Selector responder (TSr) on a | |||
responder (TSr) on a per-resource basis." | per-resource basis. | |||
3. Negotiation of CPU specific Child SAs | 3. Negotiation of Resource-Specific Child SAs | |||
An initial IKEv2 exchange is used to setup an IKE SA and the initial | An initial IKEv2 exchange is used to set up an IKE SA and the initial | |||
Child SA. If multiple Child SAs with the same Traffic Selectors that | Child SA. If multiple Child SAs with the same Traffic Selectors that | |||
are bound to a single resource are desired, the initiator will add | are bound to a single resource are desired, the initiator will add | |||
the SA_RESOURCE_INFO notify payload to the Exchange negotiating the | the SA_RESOURCE_INFO notify payload to the Exchange negotiating the | |||
Child SA (e.g. IKE_AUTH or CREATE_CHILD_SA). If this initial Child | Child SA (e.g., IKE_AUTH or CREATE_CHILD_SA). If this initial Child | |||
SA will be tied to a specific resource, it MAY indicate this by | SA will be tied to a specific resource, it MAY indicate this by | |||
including an identifier in the Notification Data. A responder that | including an identifier in the Notification Data. A responder that | |||
is willing to have multiple Child SAs for the same Traffic Selectors | is willing to have multiple Child SAs for the same Traffic Selectors | |||
will respond by also adding the SA_RESOURCE_INFO notify payload in | will respond by also adding the SA_RESOURCE_INFO notify payload in | |||
which it MAY add a non-zero Notify Data. | which it MAY add a non-zero Notification Data. | |||
Additional resource-specific Child SAs are negotiated as regular | Additional resource-specific Child SAs are negotiated as regular | |||
Child SAs using the CREATE_CHILD_SA exchange and are similarly | Child SAs using the CREATE_CHILD_SA exchange and are similarly | |||
identified by an accompanying SA_RESOURCE_INFO notification. | identified by an accompanying SA_RESOURCE_INFO notification. | |||
Upon installation, each resource-specific Child SA is associated with | Upon installation, each resource-specific Child SA is associated with | |||
an additional local selector, such as the CPU. These resource- | an additional local selector, such as the CPU. These resource- | |||
specific Child SAs MUST be negotiated with identical Child SA | specific Child SAs MUST be negotiated with identical Child SA | |||
properties that were negotiated for the initial Child SA. This | properties that were negotiated for the initial Child SA. This | |||
includes cryptographic algorithms, Traffic Selectors, Mode (e.g. | includes cryptographic algorithms, Traffic Selectors, Mode (e.g., | |||
transport mode), compression usage, etc. However, each Child SA does | transport mode), compression usage, etc. However, each Child SA does | |||
have its own keying material that is individually derived according | have its own keying material that is individually derived according | |||
to the regular IKEv2 process. The SA_RESOURCE_INFO notify payload | to the regular IKEv2 process. The SA_RESOURCE_INFO notify payload | |||
MAY be empty or MAY contain some identifying data. This identifying | MAY be empty or MAY contain some identifying data. This identifying | |||
data SHOULD be a unique identifier within all the Child SAs with the | data SHOULD be a unique identifier within all the Child SAs with the | |||
same TS payloads and the peer MUST only use it for debugging | same TS payloads, and the peer MUST only use it for debugging | |||
purposes. | purposes. | |||
Additional Child SAs can be started on-demand or can be started all | Additional Child SAs can be started on demand or can be started all | |||
at once. Peers may also delete specific per-resource Child SAs if | at once. Peers may also delete specific per-resource Child SAs if | |||
they deem the associated resource to be idle. | they deem the associated resource to be idle. | |||
During the CREATE_CHILD_SA rekey for the Child SA, the | During the CREATE_CHILD_SA rekey for the Child SA, the | |||
SA_RESOURCE_INFO notification MAY be included, but regardless of | SA_RESOURCE_INFO notification MAY be included, but regardless of | |||
whether or not it is included, the rekeyed Child SA should be bound | whether or not it is included, the rekeyed Child SA should be bound | |||
to the same resource(s) as the Child SA that is being rekeyed. | to the same resource(s) as the Child SA that is being rekeyed. | |||
4. Implementation Considerations | 4. Implementation Considerations | |||
skipping to change at page 5, line 35 | skipping to change at line 208 | |||
by all CPUs, so that while negotiating a new per-CPU Child SA, which | by all CPUs, so that while negotiating a new per-CPU Child SA, which | |||
typically takes 1 RTT delay, the CPU with no CPU-specific Child SA | typically takes 1 RTT delay, the CPU with no CPU-specific Child SA | |||
can still encrypt its packets using the Child SA that is available | can still encrypt its packets using the Child SA that is available | |||
for all CPUs. Alternatively, if an implementation finds it needs to | for all CPUs. Alternatively, if an implementation finds it needs to | |||
encrypt a packet but the current CPU does not have the resources to | encrypt a packet but the current CPU does not have the resources to | |||
encrypt this packet, it can relay that packet to a specific CPU that | encrypt this packet, it can relay that packet to a specific CPU that | |||
does have the capability to encrypt the packet, although this will | does have the capability to encrypt the packet, although this will | |||
come with a performance penalty. | come with a performance penalty. | |||
Performing per-CPU Child SA negotiations can result in both peers | Performing per-CPU Child SA negotiations can result in both peers | |||
initiating additional Child SAs at once. This is especially likely | initiating additional Child SAs simultaneously. This is especially | |||
if per-CPU Child SAs are triggered by individual SADB_ACQUIRE | likely if per-CPU Child SAs are triggered by individual SADB_ACQUIRE | |||
[RFC2367] messages. Responders should install the additional Child | messages [RFC2367]. Responders should install the additional Child | |||
SA on a CPU with the least amount of additional Child SAs for this | SA on a CPU with the least amount of additional Child SAs for this | |||
TSi/TSr pair. | TSi/TSr pair. | |||
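
The placement rule in the paragraph above (install the additional Child SA on the CPU that currently holds the fewest additional Child SAs for the TSi/TSr pair) can be sketched as follows. This is an illustrative sketch only: the function name `pick_cpu`, the dictionary bookkeeping, and the lowest-id tie-break are assumptions, not something the document specifies.

```python
def pick_cpu(sa_count_by_cpu):
    """Return the CPU id holding the fewest additional Child SAs.

    sa_count_by_cpu maps a CPU id to the number of additional Child SAs
    already installed on that CPU for one TSi/TSr pair (hypothetical
    bookkeeping kept by the IKE daemon). Ties go to the lowest CPU id,
    which is an arbitrary choice made for this sketch.
    """
    if not sa_count_by_cpu:
        raise ValueError("no CPUs available")
    # Iterating over sorted ids makes min() return the lowest id on a tie.
    return min(sorted(sa_count_by_cpu), key=lambda cpu: sa_count_by_cpu[cpu])


# Example: CPU 1 has no additional Child SA yet, so it is selected.
target = pick_cpu({0: 2, 1: 0, 2: 1})
```

A real implementation would likely also weigh the trigger packet information described in the following paragraphs.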
When the number of queue or CPU resources are different between the | When the number of queue or CPU resources are different between the | |||
peers, the peer with the least amount of resources may decide to not | peers, the peer with the least amount of resources may decide to not | |||
install a second outbound Child SA for the same resource as it will | install a second outbound Child SA for the same resource, as it will | |||
never use it to send traffic. However, it must install all inbound | never use it to send traffic. However, it must install all inbound | |||
Child SAs as it has committed to receiving traffic on these | Child SAs because it has committed to receiving traffic on these | |||
negotiated Child SAs. | negotiated Child SAs. | |||
If per-CPU packet trigger (e.g. SADB_ACQUIRE) messages are | If per-CPU packet trigger (e.g., SADB_ACQUIRE) messages are | |||
implemented (see Section 6), the Traffic Selector (TSi) entry | implemented (see Section 6), the Traffic Selector (TSi) entry | |||
containing the information of the trigger packet should be included | containing the information of the trigger packet should be included | |||
in the TS set similarly to regular Child SAs as specified in IKEv2 | in the TS set similarly to regular Child SAs as specified in IKEv2 | |||
[RFC7296] Section 2.9. Based on the trigger TSi entry, an | [RFC7296], Section 2.9. Based on the trigger TSi entry, an | |||
implementation can select the most optimal target CPU to install the | implementation can select the most optimal target CPU to install the | |||
additional Child SA on. For example, if the trigger packet was for a | additional Child SA on. For example, if the trigger packet was for a | |||
TCP destination to port 25 (SMTP), it might be able to install the | TCP destination to port 25 (SMTP), it might be able to install the | |||
Child SA on the CPU that is also running the mail server process. | Child SA on the CPU that is also running the mail server process. | |||
Trigger packet Traffic Selectors are documented in IKEv2 [RFC7296] | Trigger packet Traffic Selectors are documented in IKEv2 [RFC7296], | |||
Section 2.9. | Section 2.9. | |||
As per IKEv2, rekeying a Child SA SHOULD use the same (or wider) | As per IKEv2, rekeying a Child SA SHOULD use the same (or wider) | |||
Traffic Selectors to ensure that the new Child SA covers everything | Traffic Selectors to ensure that the new Child SA covers everything | |||
that the rekeyed Child SA covers. This includes Traffic Selectors | that the rekeyed Child SA covers. This includes Traffic Selectors | |||
negotiated via Configuration Payloads (CP) such as | negotiated via Configuration Payloads such as INTERNAL_IP4_ADDRESS, | |||
INTERNAL_IP4_ADDRESS which may use the original wide TS set or use | which may use the original wide TS set or use the narrowed TS set. | |||
the narrowed TS set. | ||||
5. Payload Format | 5. Payload Format | |||
The Notify Payload format is defined in IKEv2 [RFC7296] section 3.10, | The Notify Payload format is defined in IKEv2 [RFC7296], | |||
and is copied here for convenience. | Section 3.10, and is copied here for convenience. | |||
All multi-octet fields representing integers are laid out in big | All multi-octet fields representing integers are laid out in big | |||
endian order (also known as "most significant byte first", or | endian order (also known as "most significant byte first", or | |||
"network byte order"). | "network byte order"). | |||
5.1. SA_RESOURCE_INFO Notify Message Status Type payload | 5.1. SA_RESOURCE_INFO Notify Message Status Type Payload | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-----------------------------+-------------------------------+ | +-------------------------------+-------------------------------+ | |||
! Next Payload !C! RESERVED ! Payload Length ! | | Next Payload |C| RESERVED | Payload Length | | |||
+---------------+---------------+-------------------------------+ | +---------------+---------------+-------------------------------+ | |||
! Protocol ID ! SPI Size ! Notify Message Type ! | | Protocol ID | SPI Size | Notify Message Type | | |||
+---------------+---------------+-------------------------------+ | +---------------+---------------+-------------------------------+ | |||
! ! | | | | |||
~ Resource Identifier (optional) ~ | ~ Resource Identifier (optional) ~ | |||
! ! | | | | |||
+-------------------------------+-------------------------------+ | +-------------------------------+-------------------------------+ | |||
* Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. | (C)ritical flag - MUST be 0. | |||
* SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. | Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. | |||
* Notify Status Message Type value (2 octets) - set to [TBD1]. | SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. | |||
* Resource Identifier (optional). This opaque data may be set to | Notify Status Message Type value (2 octets) - set to 16444. | |||
Resource Identifier (optional) - This opaque data may be set to | ||||
convey the local identity of the resource. | convey the local identity of the resource. | |||
5.2. TS_MAX_QUEUE Notify Message Error Type Payload | 5.2. TS_MAX_QUEUE Notify Message Error Type Payload | |||
1 2 3 | 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+---------------+---------------+-------------------------------+ | +---------------+---------------+-------------------------------+ | |||
! Next Payload !C! RESERVED ! Payload Length ! | | Next Payload |C| RESERVED | Payload Length | | |||
+---------------+---------------+-------------------------------+ | +---------------+---------------+-------------------------------+ | |||
! Protocol ID ! SPI Size ! Notify Message Type ! | | Protocol ID | SPI Size | Notify Message Type | | |||
+---------------+---------------+-------------------------------+ | +---------------+---------------+-------------------------------+ | |||
* Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. | (C)ritical flag - MUST be 0. | |||
* SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. | Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. | |||
* Notify Message Error Type (2 octets) - set to [TBD2] | SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. | |||
* There is no data associated with this Notify type. | Notify Message Error Type (2 octets) - set to 48. | |||
There is no data associated with this Notify type. | ||||
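
As a minimal sketch of the two payload layouts above, the following Python code serializes and parses the 8-octet Notify header in network byte order. It uses the type values from the RFC column (16444 and 48). The helper names `build_notify` and `parse_notify` are hypothetical, and the generic payload header handling (Next Payload set to 0, no chaining) is simplified for illustration.

```python
import struct

SA_RESOURCE_INFO = 16444  # Notify Message Status Type (RFC column above)
TS_MAX_QUEUE = 48         # Notify Message Error Type (RFC column above)

# Next Payload, C + RESERVED, Payload Length, Protocol ID, SPI Size, Type
_HEADER = struct.Struct("!BBHBBH")  # 8 octets, big endian


def build_notify(notify_type, data=b"", next_payload=0):
    """Serialize a Notify payload with C = 0, Protocol ID = 0, SPI Size = 0."""
    if notify_type == TS_MAX_QUEUE and data:
        raise ValueError("TS_MAX_QUEUE carries no notification data")
    length = _HEADER.size + len(data)
    # The second octet holds the Critical bit and RESERVED; both are 0 here.
    return _HEADER.pack(next_payload, 0, length, 0, 0, notify_type) + data


def parse_notify(payload):
    """Return (notify_type, data); Protocol ID and SPI Size are ignored."""
    if len(payload) < _HEADER.size:
        raise ValueError("truncated Notify payload")
    _next, _flags, length, _proto, _spi, ntype = _HEADER.unpack_from(payload)
    if length < _HEADER.size or length > len(payload):
        raise ValueError("bad Payload Length")
    return ntype, payload[_HEADER.size:length]


# SA_RESOURCE_INFO with an opaque, debugging-only Resource Identifier.
info = build_notify(SA_RESOURCE_INFO, b"cpu7")
# TS_MAX_QUEUE has an empty Notification Data field.
refusal = build_notify(TS_MAX_QUEUE)
```

The Resource Identifier is treated as opaque bytes because the text only allows the peer to use it for debugging.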
6. Operational Considerations | 6. Operational Considerations | |||
Implementations supporting per-CPU SAs SHOULD extend their local SPD | Implementations supporting per-CPU SAs SHOULD extend their local SPD | |||
selector, and the mechanism of on-demand negotiation that is | selector, and the mechanism of on-demand negotiation that is | |||
triggered by traffic to include a CPU (or queue) identifier in their | triggered by traffic to include a CPU (or queue) identifier in their | |||
packet trigger (e.g. SADB_ACQUIRE) message from the SPD to the IKE | packet trigger (e.g., SADB_ACQUIRE) message from the SPD to the IKE | |||
daemon. An implementation which does not support receiving per-CPU | daemon. An implementation that does not support receiving per-CPU | |||
packet trigger messages MAY initiate all its Child SAs immediately | packet trigger messages MAY initiate all its Child SAs immediately | |||
upon receiving the (only) packet trigger message it will receive from | upon receiving the (only) packet trigger message it will receive from | |||
the IPsec stack. Such implementations also need to be careful when | the IPsec stack. Such an implementation also needs to be careful | |||
receiving a Delete Notify request for a per-CPU Child SA, as it has | when receiving a Delete Notify request for a per-CPU Child SA, as it | |||
no method to detect when it should bring up such a per-CPU Child SA | has no method to detect when it should bring up such a per-CPU Child | |||
again later. And bringing the deleted per-CPU Child SA up again | SA again later. Also, bringing the deleted per-CPU Child SA up again | |||
immediately after receiving the Delete Notify might cause an infinite | immediately after receiving the Delete Notify might cause an infinite | |||
loop between the peers. Another issue of not bringing up all its | loop between the peers. Another issue with not bringing up all its | |||
per-CPU Child SAs is that if the peer acts similarly, the two peers | per-CPU Child SAs is that if the peer acts similarly, the two peers | |||
might end up with only the first Child SA without ever activating any | might end up with only the first Child SA without ever activating any | |||
per-CPU Child SAs. It is therefor RECOMMENDED to implement per-CPU | per-CPU Child SAs. It is therefore RECOMMENDED to implement per-CPU | |||
packet trigger messages. | packet trigger messages. | |||
Peers SHOULD be flexible with the maximum number of Child SAs they | Peers SHOULD be flexible with the maximum number of Child SAs they | |||
allow for a given TSi/TSr combination to account for corner cases. | allow for a given TSi/TSr combination in order to account for corner | |||
For example, during Child SA rekeying, there might be a large number | cases. For example, during Child SA rekeying, there might be a large | |||
of additional Child SAs created before the old Child SAs are torn | number of additional Child SAs created before the old Child SAs are | |||
down. Similarly, when using on-demand Child SAs, both ends could | torn down. Similarly, when using on-demand Child SAs, both ends | |||
trigger multiple Child SA requests as the initial packet causing the | could trigger multiple Child SA requests as the initial packet | |||
Child SA negotiation might have been transported to the peer via the | causing the Child SA negotiation might have been transported to the | |||
first Child SA where its reply packet might also trigger an on-demand | peer via the first Child SA, where its reply packet might also | |||
Child SA negotiation to start. As additional Child SAs consume | trigger an on-demand Child SA negotiation to start. As additional | |||
little additional resources, allowing at the very least double the | Child SAs consume little additional resources, allowing at the very | |||
number of available CPUs is RECOMMENDED. An implementation MAY allow | least double the number of available CPUs is RECOMMENDED. An | |||
unlimited additional Child SAs and only limit this number based on | implementation MAY allow unlimited additional Child SAs and only | |||
its generic resource protection strategies that are used to require | limit this number based on its generic resource protection strategies | |||
COOKIES or refuse new IKE or Child SA negotiations. Although having | that are used to require COOKIES or refuse new IKE or Child SA | |||
a very large number (e.g. hundreds or thousands) of SAs may slow down | negotiations. Although having a very large number (e.g., hundreds or | |||
per-packet SAD lookup. | thousands) of SAs may slow down per-packet SAD lookup. | |||
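The limit policy in the paragraph above can be sketched as follows. This is only an illustration under stated assumptions: the class name ChildSaLimiter, the headroom_factor parameter, and the try_add and remove methods are hypothetical and are not taken from the document or from any implementation. The default factor of 2 reflects the RECOMMENDED minimum of double the number of available CPUs.

```python
import os
from collections import defaultdict


class ChildSaLimiter:
    """Counts Child SAs per (TSi, TSr) pair and enforces a soft upper bound.

    Hypothetical helper for illustration only.
    """

    def __init__(self, cpu_count=None, headroom_factor=2):
        # Assumed policy: allow at least headroom_factor * CPUs so that
        # rekeying and simultaneous on-demand negotiations are not rejected.
        cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
        self.limit = headroom_factor * cpus
        self._count = defaultdict(int)

    def try_add(self, tsi, tsr):
        """Return True and count the SA if the pair is below the limit."""
        key = (tsi, tsr)
        if self._count[key] >= self.limit:
            return False  # the caller would answer with TS_MAX_QUEUE
        self._count[key] += 1
        return True

    def remove(self, tsi, tsr):
        """Forget one SA, e.g. after the old SA of a rekey is deleted."""
        key = (tsi, tsr)
        if self._count[key] > 0:
            self._count[key] -= 1
```

A fixed factor keeps the check cheap. An implementation that allows unlimited additional Child SAs would skip this counter and rely on its generic resource protection instead.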
Implementations might support dynamically moving a per-CPU Child SAs | Implementations might support dynamically moving a per-CPU Child SA | |||
from one CPU to another CPU. If this method is supported, | from one CPU to another CPU. If this method is supported, | |||
implementations must be careful to move both the inbound and outbound | implementations must be careful to move both the inbound and outbound | |||
SAs. If the IPsec endpoint is a gateway, it can move the inbound SA | SAs. If the IPsec endpoint is a gateway, it can move the inbound SA | |||
and outbound SA independently of each other. It is likely that for a | and outbound SA independently of each other. It is likely that for a | |||
gateway, IPsec traffic would be asymmetric. If the IPsec endpoint is | gateway, IPsec traffic would be asymmetric. If the IPsec endpoint is | |||
the same host responsible for generating the traffic, the inbound and | the same host responsible for generating the traffic, the inbound and | |||
outbound SAs SHOULD remain as a pair on the same CPU. If a host | outbound SAs SHOULD remain as a pair on the same CPU. If a host | |||
previously skipped installing an outbound SA because it would be an | previously skipped installing an outbound SA because it would be an | |||
unused duplicate outbound SA, it will have to create and add the | unused duplicate outbound SA, it will have to create and add the | |||
previously skipped outbound SA to the SAD with the new CPU ID. The | previously skipped outbound SA to the SAD with the new CPU ID. The | |||
inbound SA may not have CPU ID in the SAD. Adding the outbound SA to | inbound SA may not have a CPU ID in the SAD. Adding the outbound SA | |||
the SAD requires access to the key material, whereas for updating the | to the SAD requires access to the key material, whereas updating the | |||
CPU selector on an existing outbound SAs access to key material might | CPU selector on an existing outbound SAs might not require access to | |||
not be needed. To support this, the IKE software might have to hold | key material. To support this, the IKE software might have to hold | |||
on to the key material longer than it normally would, as it might | on to the key material longer than it normally would, as it might | |||
actively attempt to destroy key material from memory that the IKE | actively attempt to destroy key material from memory that the IKE | |||
daemon no longer needs access to. | daemon no longer needs access to. | |||
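The difference between updating an installed outbound SA and creating a previously skipped one can be sketched as below. The SAD is modelled as a plain dictionary keyed by SPI, and OutboundSa, move_outbound_sa, and retained_keys are hypothetical names chosen for illustration. A real kernel interface would look different.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class OutboundSa:
    spi: int
    cpu_id: int
    key: bytes


def move_outbound_sa(sad: Dict[int, OutboundSa],
                     retained_keys: Dict[int, bytes],
                     spi: int, new_cpu: int) -> OutboundSa:
    """Re-pin the outbound SA identified by spi to new_cpu."""
    sa: Optional[OutboundSa] = sad.get(spi)
    if sa is not None:
        # Installed SA: only the CPU selector changes, no key material needed.
        sa.cpu_id = new_cpu
        return sa
    # The SA was skipped as an unused duplicate, so it has to be created now.
    # That requires key material the IKE daemon chose to retain.
    key = retained_keys.get(spi)
    if key is None:
        raise LookupError(f"key material for SPI {spi:#x} was already destroyed")
    sa = OutboundSa(spi=spi, cpu_id=new_cpu, key=key)
    sad[spi] = sa
    return sa
```

The LookupError branch shows why the IKE software might have to hold on to key material longer than usual: once the keys are wiped, a skipped outbound SA can no longer be installed on the new CPU.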
An implementation that does not accept any further resource specific | An implementation that does not accept any further resource-specific | |||
Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because this | Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because it | |||
can be interpreted by the peer that no other Child SAs with different | could be misinterpreted by the peer to mean that no other Child SA | |||
TSi/TSr are allowed either. Instead, it MUST return TS_MAX_QUEUE. | with a different TSi and/or TSr is allowed either. Instead, it MUST | |||
return TS_MAX_QUEUE. | ||||
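The choice between the two errors can be sketched as a small decision function. The function name rejection_notify and its two boolean inputs are hypothetical. The value 35 for NO_ADDITIONAL_SAS comes from RFC 7296, and 48 for TS_MAX_QUEUE is the value shown in Table 2 of the right-hand column.

```python
from enum import IntEnum
from typing import Optional


class NotifyError(IntEnum):
    NO_ADDITIONAL_SAS = 35  # RFC 7296
    TS_MAX_QUEUE = 48       # value registered for this document (Table 2)


def rejection_notify(group_full: bool,
                     no_more_child_sas: bool) -> Optional[NotifyError]:
    """Pick the error to return for a CREATE_CHILD_SA request, if any.

    group_full: the per-resource group for this TSi/TSr accepts no more SAs.
    no_more_child_sas: the IKE SA accepts no further Child SAs at all.
    """
    if no_more_child_sas:
        return NotifyError.NO_ADDITIONAL_SAS
    if group_full:
        # Only this TSi/TSr group is exhausted; other Traffic Selectors
        # remain negotiable, so NO_ADDITIONAL_SAS would be misleading.
        return NotifyError.TS_MAX_QUEUE
    return None
```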
7. Security Considerations | 7. Security Considerations | |||
Similar to how an implementation should limit the number of half-open | Similar to how an implementation should limit the number of half-open | |||
SAs to limit the impact of a denial of service attack, it is | SAs to limit the impact of a denial-of-service attack, it is | |||
RECOMMENDED that an implementation limits the maximum number of | RECOMMENDED that an implementation limits the maximum number of | |||
additional Child SAs allowed per unique TSi/TSr. | additional Child SAs allowed per unique TSi/TSr. | |||
Using multiple resource specific child SAs makes sense for high | Using multiple resource-specific child SAs makes sense for high- | |||
volume IPsec connections on IPsec gateway machines where the | volume IPsec connections on IPsec gateway machines where the | |||
administrator has a trust relationship with the peer's administrator | administrator has a trust relationship with the peer's administrator | |||
and abuse is unlikely and easily escalated to resolve. | and abuse is unlikely and easily escalated to resolve. | |||
This trust relationship is usually not present for the Remote Access | This trust relationship is usually not present for the deployments of | |||
VPN type deployments, and allowing per-CPU Child SA's is NOT | remote access VPNs, and allowing per-CPU Child SAs is NOT RECOMMENDED | |||
RECOMMENDED in these scenarios. Therefore, it is also NOT | in these scenarios. Therefore, it is also NOT RECOMMENDED to allow | |||
RECOMMENDED to allow per-CPU Child SAs per default. | per-CPU Child SAs by default. | |||
The SA_RESOURCE_INFO notify contains an optional data payload that | The SA_RESOURCE_INFO notify contains an optional data payload that | |||
can be used by the peer to identify the Child SA belonging to a | can be used by the peer to identify the Child SA belonging to a | |||
specific resource. The notify data SHOULD NOT be an identifier that | specific resource. Notification data SHOULD NOT be an identifier | |||
can be used to gain information about the hardware. For example, | that can be used to gain information about the hardware. For | |||
using the CPU number itself as identifier might give an attacker | example, using the CPU number itself as the identifier might give an | |||
knowledge which packets are handled by which CPU ID and it might | attacker knowledge of which packets are handled by which CPU ID, and | |||
optimize a brute force attack against the system. | it might optimize a brute-force attack against the system. | |||
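One way to avoid exposing CPU numbers is to map each CPU to a random opaque identifier and carry that in the notification data. The sketch below is an illustration, not the document's method: opaque_resource_ids and build_notify are hypothetical names, the 4-octet identifier length is an arbitrary choice, and the Protocol ID and SPI Size of 0 are assumptions. The payload layout follows the Notify payload of RFC 7296, and 16444 is the SA_RESOURCE_INFO value shown in Table 1 of the right-hand column.

```python
import secrets
import struct

SA_RESOURCE_INFO = 16444  # value registered for this document (Table 1)


def opaque_resource_ids(cpu_count: int, length: int = 4) -> dict:
    """Map each CPU number to a distinct random identifier."""
    ids = {}
    used = set()
    for cpu in range(cpu_count):
        token = secrets.token_bytes(length)
        while token in used:  # retry on the rare collision
            token = secrets.token_bytes(length)
        used.add(token)
        ids[cpu] = token
    return ids


def build_notify(notify_type: int, data: bytes = b"",
                 next_payload: int = 0) -> bytes:
    """Encode an IKEv2 Notify payload without an SPI."""
    length = 8 + len(data)
    # Fields: next payload, critical/reserved, payload length,
    # protocol ID (assumed 0), SPI size (assumed 0), notify message type.
    header = struct.pack("!BBHBBH", next_payload, 0, length, 0, 0, notify_type)
    return header + data
```

The mapping stays local to the sender, so the peer can still tell the Child SAs of a group apart without learning which CPU handles which SA.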
8. Implementation Status | ||||
[Note to RFC Editor: Please remove this section and the reference to | ||||
[RFC7942] before publication.] | ||||
This section records the status of known implementations of the | ||||
protocol defined by this specification at the time of posting of this | ||||
Internet-Draft, and is based on a proposal described in [RFC7942]. | ||||
The description of implementations in this section is intended to | ||||
assist the IETF in its decision processes in progressing drafts to | ||||
RFCs. Please note that the listing of any individual implementation | ||||
here does not imply endorsement by the IETF. Furthermore, no effort | ||||
has been spent to verify the information presented here that was | ||||
supplied by IETF contributors. This is not intended as, and must not | ||||
be construed to be, a catalog of available implementations or their | ||||
features. Readers are advised to note that other implementations may | ||||
exist. | ||||
According to [RFC7942], "this will allow reviewers and working groups | ||||
to assign due consideration to documents that have the benefit of | ||||
running code, which may serve as evidence of valuable experimentation | ||||
and feedback that have made the implemented protocols more mature. | ||||
It is up to the individual working groups to use this information as | ||||
they see fit". | ||||
Authors are requested to add a note to the RFC Editor at the top of | ||||
this section, advising the Editor to remove the entire section before | ||||
publication, as well as the reference to [RFC7942]. | ||||
8.1. Linux XFRM | ||||
Organization: Linux kernel XFRM | ||||
Name: XFRM-PCPU-v7 | ||||
https://git.kernel.org/pub/scm/linux/kernel/git/klassert/linux- | ||||
stk.git/log/?h=xfrm-pcpu-v7 | ||||
Description: An initial Kernel IPsec implementation of the per-CPU | ||||
method. | ||||
Level of maturity: Alpha | ||||
Coverage: Implements a general Child SA and per-CPU Child SAs. It | ||||
only supports the NETLINK API. The PFKEYv2 API is not supported. | ||||
Licensing: GPLv2 | ||||
Implementation experience: The Linux XFRM implementation added two | ||||
additional attributes to support per-CPU SAs. There is a new | ||||
attribute XFRMA_SA_PCPU, u32, for the SAD entry. This attribute | ||||
should present on the outgoing SA, per-CPU Child SAs, starting | ||||
from 0. This attribute MUST NOT be present on the first XFRM SA. | ||||
It is used by the kernel only for the outgoing traffic, (clear to | ||||
encrypted). The incoming SAs do not need XFRMA_SA_PCPU attribute. | ||||
XFRM stack can not use CPU id on the incoming SA. The kernel | ||||
internally sets the value to 0xFFFFFF for the incoming SA and the | ||||
initial Child SA that can be used by any CPU. However, one may | ||||
add XFRMA_SA_PCPU to the incoming per-CPU SA to steer the ESP | ||||
flow, to a specific Q or CPU e.g ethtool ntuple configuration. | ||||
The SPD entry has new flag XFRM_POLICY_CPU_ACQUIRE. It should be | ||||
set only on the "out" policy. The flag should be disabled when | ||||
the policy is a trap policy, without SPD entries. After a | ||||
successful negotiation of SA_RESOURCE_INFO, while adding the first | ||||
Child SA, the SPD entry can be updated with the | ||||
XFRM_POLICY_CPU_ACQUIRE flag. When XFRM_POLICY_CPU_ACQUIRE is | ||||
set, the XFRM_MSG_ACQUIRE generated will include the XFRMA_SA_PCPU | ||||
attribute. | ||||
Contact: Steffen Klassert steffen.klassert@secunet.com | ||||
8.2. Libreswan | ||||
Organization: The Libreswan Project | ||||
Name: pcpu-3 https://libreswan.org/wiki/XFRM_pCPU | ||||
Description: An initial IKE implementation of the per-CPU method. | ||||
Level of maturity: Alpha | ||||
Coverage: implements combining a regular (all-CPUs) Child SA and | ||||
per-CPU additional Child SAs | ||||
Licensing: GPLv2 | ||||
Implementation experience: TBD | ||||
Contact: Libreswan Development: swan-dev@libreswan.org | ||||
8.3. strongSwan | ||||
Organization: The StrongSwan Project | ||||
Name: StrongSwan https://github.com/strongswan/strongswan/tree/per- | ||||
cpu-sas-poc/ | ||||
Description: An initial IKE implementation of the per-CPU method. | ||||
Level of maturity: Alpha | ||||
Coverage: implements combining a regular (all-CPUs) Child SA and | ||||
per-CPU additional Child SAs | ||||
Licensing: GPLv2 | ||||
Implementation experience: StrongSwan use private space values for | ||||
notifications SA_RESOURCE_INFO (40970). | ||||
Contact: Tobias Brunner tobias@strongswan.org | ||||
8.4. iproute2 | ||||
Organization: The iproute2 Project | ||||
Name: iproute2 https://github.com/antonyantony/iproute2/tree/pcpu-v1 | ||||
Description: Implemented the per-CPU attributes for the "ip xfrm" | ||||
command. | ||||
Level of maturity: Alpha | ||||
Licensing: GPLv2 | ||||
Implementation experience: TBD | ||||
Contact: Antony Antony antony.antony@secunet.com | ||||
9. IANA Considerations | ||||
This document defines one new registration for the IANA "IKEv2 Notify | ||||
Message Status Types" registry. | ||||
Value Notify Message Status Type Reference | 8. IANA Considerations | |||
----- ------------------------------ --------------- | ||||
[TBD1] SA_RESOURCE_INFO [this document] | ||||
Figure 1 | IANA has registered one new value in the "IKEv2 Notify Message Status | |||
Types" registry. | ||||
This document defines one new registration for the IANA "IKEv2 Notify | +=======+============================+===========+ | |||
Message Error Types" registry. | | Value | Notify Message Status Type | Reference | | |||
+=======+============================+===========+ | ||||
| 16444 | SA_RESOURCE_INFO | RFC 9611 | | ||||
+-------+----------------------------+-----------+ | ||||
Value Notify Message Error Type Reference | Table 1 | |||
----- ------------------------------ --------------- | ||||
[TBD2] TS_MAX_QUEUE [this document] | ||||
Figure 2 | IANA has registered one new value in the "IKEv2 Notify Message Error | |||
Types" registry. | ||||
10. Acknowledgements | +=======+===========================+===========+ | |||
| Value | Notify Message Error Type | Reference | | ||||
+=======+===========================+===========+ | ||||
| 48 | TS_MAX_QUEUE | RFC 9611 | | ||||
+-------+---------------------------+-----------+ | ||||
The following people provided reviews and valuable feedback: Roman | Table 2 | |||
Danyliw, Warren Kumari Tero Kivinen, Murray Kucherawy, John Scudder, | ||||
Valery Smyslov, Gunter van de Velde and Eric Vyncke. | ||||
11. References | 9. References | |||
11.1. Normative References | 9.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<https://www.rfc-editor.org/info/rfc2119>. | <https://www.rfc-editor.org/info/rfc2119>. | |||
[RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. | [RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. | |||
Kivinen, "Internet Key Exchange Protocol Version 2 | Kivinen, "Internet Key Exchange Protocol Version 2 | |||
(IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October | (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October | |||
2014, <https://www.rfc-editor.org/info/rfc7296>. | 2014, <https://www.rfc-editor.org/info/rfc7296>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | |||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | |||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | May 2017, <https://www.rfc-editor.org/info/rfc8174>. | |||
11.2. Informative References | 9.2. Informative References | |||
[RFC2367] McDonald, D., Metz, C., and B. Phan, "PF_KEY Key | [RFC2367] McDonald, D., Metz, C., and B. Phan, "PF_KEY Key | |||
Management API, Version 2", RFC 2367, | Management API, Version 2", RFC 2367, | |||
DOI 10.17487/RFC2367, July 1998, | DOI 10.17487/RFC2367, July 1998, | |||
<https://www.rfc-editor.org/info/rfc2367>. | <https://www.rfc-editor.org/info/rfc2367>. | |||
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the | [RFC4301] Kent, S. and K. Seo, "Security Architecture for the | |||
Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, | |||
December 2005, <https://www.rfc-editor.org/info/rfc4301>. | December 2005, <https://www.rfc-editor.org/info/rfc4301>. | |||
[RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running | Acknowledgements | |||
Code: The Implementation Status Section", BCP 205, | ||||
RFC 7942, DOI 10.17487/RFC7942, July 2016, | The following people provided reviews and valuable feedback: Roman | |||
<https://www.rfc-editor.org/info/rfc7942>. | Danyliw, Warren Kumari, Tero Kivinen, Murray Kucherawy, John Scudder, | |||
Valery Smyslov, Gunter van de Velde, and Éric Vyncke. | ||||
Authors' Addresses | Authors' Addresses | |||
Antony Antony | Antony Antony | |||
secunet Security Networks AG | secunet Security Networks AG | |||
Email: antony.antony@secunet.com | Email: antony.antony@secunet.com | |||
Tobias Brunner | Tobias Brunner | |||
codelabs GmbH | codelabs GmbH | |||
Email: tobias@codelabs.ch | Email: tobias@codelabs.ch | |||
End of changes. 73 change blocks. | ||||
301 lines changed or deleted | 169 lines changed or added | |||