rfc8881v1.txt   rfc8881.txt 
Internet Engineering Task Force (IETF) D. Noveck, Ed. Internet Engineering Task Force (IETF) D. Noveck, Ed.
Request for Comments: 8881 NetApp Request for Comments: 8881 NetApp
Obsoletes: 5661 C. Lever Obsoletes: 5661 C. Lever
Category: Standards Track ORACLE Category: Standards Track ORACLE
ISSN: 2070-1721 July 2020 ISSN: 2070-1721 August 2020
Network File System (NFS) Version 4 Minor Version 1 Protocol Network File System (NFS) Version 4 Minor Version 1 Protocol
Abstract Abstract
This document describes the Network File System (NFS) version 4 minor This document describes the Network File System (NFS) version 4 minor
version 1, including features retained from the base protocol (NFS version 1, including features retained from the base protocol (NFS
version 4 minor version 0, which is specified in RFC 7530) and version 4 minor version 0, which is specified in RFC 7530) and
protocol extensions made subsequently. The later minor version has protocol extensions made subsequently. The later minor version has
no dependencies on NFS version 4 minor version 0, and is considered a no dependencies on NFS version 4 minor version 0, and is considered a
skipping to change at line 335 skipping to change at line 335
simultaneous use of multiple connections between a client and server, simultaneous use of multiple connections between a client and server,
potentially to different network addresses, and Transparent State potentially to different network addresses, and Transparent State
Migration, which allows a file system to be transferred between Migration, which allows a file system to be transferred between
servers in a way that provides to the client the ability to maintain servers in a way that provides to the client the ability to maintain
its existing locking state across the transfer. its existing locking state across the transfer.
The revised description of the NFS version 4 minor version 1 The revised description of the NFS version 4 minor version 1
(NFSv4.1) protocol presented in this update is necessary to enable (NFSv4.1) protocol presented in this update is necessary to enable
full use of these features together with other multi-server namespace full use of these features together with other multi-server namespace
features. This document is in the form of an updated description of features. This document is in the form of an updated description of
the NFSv4.1 protocol previously defined in RFC 5661 [65]. RFC 5661 the NFSv4.1 protocol previously defined in RFC 5661 [66]. RFC 5661
is obsoleted by this document. However, the update has a limited is obsoleted by this document. However, the update has a limited
scope and is focused on enabling full use of trunking and Transparent scope and is focused on enabling full use of trunking and Transparent
State Migration. The need for these changes is discussed in State Migration. The need for these changes is discussed in
Appendix A. Appendix B describes the specific changes made to arrive Appendix A. Appendix B describes the specific changes made to arrive
at the current text. at the current text.
This limited-scope update replaces the current NFSv4.1 RFC with the This limited-scope update replaces the current NFSv4.1 RFC with the
intention of providing an authoritative and complete specification, intention of providing an authoritative and complete specification,
the motivation for which is discussed in [35], addressing the issues the motivation for which is discussed in [36], addressing the issues
within the scope of the update. However, it will not address issues within the scope of the update. However, it will not address issues
that are known but outside of this limited scope as could be expected that are known but outside of this limited scope as could be expected
by a full update of the protocol. Below are some areas that are by a full update of the protocol. Below are some areas that are
known to need addressing in a future update of the protocol: known to need addressing in a future update of the protocol:
* Work needs to be done with regard to RFC 8178 [66], which * Work needs to be done with regard to RFC 8178 [67], which
establishes NFSv4-wide versioning rules. As RFC 5661 is currently establishes NFSv4-wide versioning rules. As RFC 5661 is currently
inconsistent with that document, changes are needed in order to inconsistent with that document, changes are needed in order to
arrive at a situation in which there would be no need for RFC 8178 arrive at a situation in which there would be no need for RFC 8178
to update the NFSv4.1 specification. to update the NFSv4.1 specification.
* Work needs to be done with regard to RFC 8434 [69], which * Work needs to be done with regard to RFC 8434 [70], which
establishes the requirements for parallel NFS (pNFS) layout types, establishes the requirements for parallel NFS (pNFS) layout types,
which are not clearly defined in RFC 5661. When that work is done which are not clearly defined in RFC 5661. When that work is done
and the resulting documents approved, the new NFSv4.1 and the resulting documents approved, the new NFSv4.1
specification document will provide a clear set of requirements specification document will provide a clear set of requirements
for layout types and a description of the file layout type that for layout types and a description of the file layout type that
conforms to those requirements. Other layout types will have conforms to those requirements. Other layout types will have
their own specification documents that conform to those their own specification documents that conform to those
requirements as well. requirements as well.
* Work needs to be done to address many errata reports relevant to * Work needs to be done to address many errata reports relevant to
RFC 5661, other than errata report 2006 [63], which is addressed RFC 5661, other than errata report 2006 [64], which is addressed
in this document. Addressing that report was not deferrable in this document. Addressing that report was not deferrable
because of the interaction of the changes suggested there and the because of the interaction of the changes suggested there and the
newly described handling of state and session migration. newly described handling of state and session migration.
The errata reports that have been deferred and that will need to The errata reports that have been deferred and that will need to
be addressed in a later document include reports currently be addressed in a later document include reports currently
assigned a range of statuses in the errata reporting system, assigned a range of statuses in the errata reporting system,
including reports marked Accepted and those marked Hold For including reports marked Accepted and those marked Hold For
Document Update because the change was too minor to address Document Update because the change was too minor to address
immediately. immediately.
skipping to change at line 390 skipping to change at line 390
one in state Rejected, that will need to be addressed in a later one in state Rejected, that will need to be addressed in a later
document. This will involve making changes to consensus decisions document. This will involve making changes to consensus decisions
reflected in RFC 5661, in situations in which the working group reflected in RFC 5661, in situations in which the working group
has decided that the treatment in RFC 5661 is incorrect and needs has decided that the treatment in RFC 5661 is incorrect and needs
to be revised to reflect the working group's new consensus and to to be revised to reflect the working group's new consensus and to
ensure compatibility with existing implementations that do not ensure compatibility with existing implementations that do not
follow the handling described in RFC 5661. follow the handling described in RFC 5661.
Note that it is expected that all such errata reports will remain Note that it is expected that all such errata reports will remain
relevant to implementors and the authors of an eventual relevant to implementors and the authors of an eventual
rfc5661bis, despite the fact that this document, when approved, rfc5661bis, despite the fact that this document obsoletes RFC 5661
will obsolete RFC 5661 [65]. [66].
* There is a need for a new approach to the description of * There is a need for a new approach to the description of
internationalization since the current internationalization internationalization since the current internationalization
section (Section 14) has never been implemented and does not meet section (Section 14) has never been implemented and does not meet
the needs of the NFSv4 protocol. Possible solutions are to create the needs of the NFSv4 protocol. Possible solutions are to create
a new internationalization section modeled on that in [67] or to a new internationalization section modeled on that in [68] or to
create a new document describing internationalization for all create a new document describing internationalization for all
NFSv4 minor versions and reference that document in the RFCs NFSv4 minor versions and reference that document in the RFCs
defining both NFSv4.0 and NFSv4.1. defining both NFSv4.0 and NFSv4.1.
* There is a need for a revised treatment of security in NFSv4.1. * There is a need for a revised treatment of security in NFSv4.1.
The issues with the existing treatment are discussed in The issues with the existing treatment are discussed in
Appendix C. Appendix C.
Until the above work is done, there will not be a consistent set of Until the above work is done, there will not be a consistent set of
documents that provides a description of the NFSv4.1 protocol, and documents that provides a description of the NFSv4.1 protocol, and
any full description would involve documents updating other documents any full description would involve documents updating other documents
within the specification. The updates applied by RFC 8434 [69] and within the specification. The updates applied by RFC 8434 [70] and
RFC 8178 [66] to RFC 5661 also apply to this specification, and will RFC 8178 [67] to RFC 5661 also apply to this specification, and will
apply to any subsequent v4.1 specification until that work is done. apply to any subsequent v4.1 specification until that work is done.
1.2. The NFS Version 4 Minor Version 1 Protocol 1.2. The NFS Version 4 Minor Version 1 Protocol
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0, is now described in RFC 7530 [67]. It generally version, NFSv4.0, is now described in RFC 7530 [68]. It generally
follows the guidelines for minor versioning that are listed in follows the guidelines for minor versioning that are listed in
Section 10 of RFC 3530 [36]. However, it diverges from guidelines 11 Section 10 of RFC 3530 [37]. However, it diverges from guidelines 11
("a client and server that support minor version X must support minor ("a client and server that support minor version X must support minor
versions 0 through X-1") and 12 ("no new features may be introduced versions 0 through X-1") and 12 ("no new features may be introduced
as mandatory in a minor version"). These divergences are due to the as mandatory in a minor version"). These divergences are due to the
introduction of the sessions model for managing non-idempotent introduction of the sessions model for managing non-idempotent
operations and the RECLAIM_COMPLETE operation. These two new operations and the RECLAIM_COMPLETE operation. These two new
features are infrastructural in nature and simplify implementation of features are infrastructural in nature and simplify implementation of
existing and other new features. Making them anything but REQUIRED existing and other new features. Making them anything but REQUIRED
would add undue complexity to protocol definition and implementation. would add undue complexity to protocol definition and implementation.
NFSv4.1 accordingly updates the minor versioning guidelines NFSv4.1 accordingly updates the minor versioning guidelines
(Section 2.7). (Section 2.7).
skipping to change at line 458 skipping to change at line 458
* describe the NFSv4.0 protocol, except where needed to contrast * describe the NFSv4.0 protocol, except where needed to contrast
with NFSv4.1. with NFSv4.1.
* modify the specification of the NFSv4.0 protocol. * modify the specification of the NFSv4.0 protocol.
* clarify the NFSv4.0 protocol. * clarify the NFSv4.0 protocol.
1.5. NFSv4 Goals 1.5. NFSv4 Goals
The NFSv4 protocol is a further revision of the NFS protocol defined The NFSv4 protocol is a further revision of the NFS protocol defined
already by NFSv3 [37]. It retains the essential characteristics of already by NFSv3 [38]. It retains the essential characteristics of
previous versions: easy recovery; independence of transport previous versions: easy recovery; independence of transport
protocols, operating systems, and file systems; simplicity; and good protocols, operating systems, and file systems; simplicity; and good
performance. NFSv4 has the following goals: performance. NFSv4 has the following goals:
* Improved access and good performance on the Internet * Improved access and good performance on the Internet
The protocol is designed to transit firewalls easily, perform well The protocol is designed to transit firewalls easily, perform well
where latency is high and bandwidth is low, and scale to very where latency is high and bandwidth is low, and scale to very
large numbers of clients per server. large numbers of clients per server.
skipping to change at line 726 skipping to change at line 726
filehandles. filehandles.
1.8.3.2. File Attributes 1.8.3.2. File Attributes
The NFSv4.1 protocol has a rich and extensible file object attribute The NFSv4.1 protocol has a rich and extensible file object attribute
structure, which is divided into REQUIRED, RECOMMENDED, and named structure, which is divided into REQUIRED, RECOMMENDED, and named
attributes (see Section 5). attributes (see Section 5).
Several (but not all) of the REQUIRED attributes are derived from the Several (but not all) of the REQUIRED attributes are derived from the
attributes of NFSv3 (see the definition of the fattr3 data type in attributes of NFSv3 (see the definition of the fattr3 data type in
[37]). An example of a REQUIRED attribute is the file object's type [38]). An example of a REQUIRED attribute is the file object's type
(Section 5.8.1.2) so that regular files can be distinguished from (Section 5.8.1.2) so that regular files can be distinguished from
directories (also known as folders in some operating environments) directories (also known as folders in some operating environments)
and other types of objects. REQUIRED attributes are discussed in and other types of objects. REQUIRED attributes are discussed in
Section 5.1. Section 5.1.
An example of three RECOMMENDED attributes are acl, sacl, and dacl. An example of three RECOMMENDED attributes are acl, sacl, and dacl.
These attributes define an Access Control List (ACL) on a file object These attributes define an Access Control List (ACL) on a file object
(Section 6). An ACL provides directory and file access control (Section 6). An ACL provides directory and file access control
beyond the model used in NFSv3. The ACL definition allows for beyond the model used in NFSv3. The ACL definition allows for
specification of specific sets of permissions for individual users specification of specific sets of permissions for individual users
skipping to change at line 876 skipping to change at line 876
* Data retention (Section 5.13). * Data retention (Section 5.13).
* Identification of the implementation of the NFS client and server * Identification of the implementation of the NFS client and server
(Section 18.35). (Section 18.35).
* Support for notification of the availability of byte-range locks * Support for notification of the availability of byte-range locks
(see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in (see the new OPEN4_RESULT_MAY_NOTIFY_LOCK reply flag in
Section 18.16 and see Section 20.11). Section 18.16 and see Section 20.11).
* In NFSv4.1, LIPKEY and SPKM-3 are not required security mechanisms * In NFSv4.1, LIPKEY and SPKM-3 are not required security mechanisms
[38]. [39].
2. Core Infrastructure 2. Core Infrastructure
2.1. Introduction 2.1. Introduction
NFSv4.1 relies on core infrastructure common to nearly every NFSv4.1 relies on core infrastructure common to nearly every
operation. This core infrastructure is described in the remainder of operation. This core infrastructure is described in the remainder of
this section. this section.
2.2. RPC and XDR 2.2. RPC and XDR
skipping to change at line 995 skipping to change at line 995
------------------------------------------------------------------ ------------------------------------------------------------------
390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes 390003 krb5 1.2.840.113554.1.2.2 rpc_gss_svc_none yes yes
390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes 390004 krb5i 1.2.840.113554.1.2.2 rpc_gss_svc_integrity yes yes
390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes 390005 krb5p 1.2.840.113554.1.2.2 rpc_gss_svc_privacy no yes
Note that the number and name of the pseudo flavor are presented here Note that the number and name of the pseudo flavor are presented here
as a mapping aid to the implementor. Because the NFSv4.1 protocol as a mapping aid to the implementor. Because the NFSv4.1 protocol
includes a method to negotiate security and it understands the GSS- includes a method to negotiate security and it understands the GSS-
API mechanism, the pseudo flavor is not needed. The pseudo flavor is API mechanism, the pseudo flavor is not needed. The pseudo flavor is
needed for the NFSv3 since the security negotiation is done via the needed for the NFSv3 since the security negotiation is done via the
MOUNT protocol as described in [39]. MOUNT protocol as described in [40].
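As a mapping aid in the same spirit, the three rows above can be written down as a small lookup table. This is only a sketch: the names PseudoFlavor, KRB5_PSEUDO_FLAVORS, and lookup_pseudo_flavor are hypothetical, and only the first four columns (number, name, mechanism OID, RPCSEC_GSS service) are carried over from the table.

```python
from typing import NamedTuple, Optional


class PseudoFlavor(NamedTuple):
    number: int          # pseudo flavor number
    name: str            # pseudo flavor name
    mechanism_oid: str   # GSS-API mechanism OID
    service: str         # RPCSEC_GSS service


# Rows copied from the table above (first four columns only).
KRB5_PSEUDO_FLAVORS = [
    PseudoFlavor(390003, "krb5", "1.2.840.113554.1.2.2", "rpc_gss_svc_none"),
    PseudoFlavor(390004, "krb5i", "1.2.840.113554.1.2.2", "rpc_gss_svc_integrity"),
    PseudoFlavor(390005, "krb5p", "1.2.840.113554.1.2.2", "rpc_gss_svc_privacy"),
]


def lookup_pseudo_flavor(oid: str, service: str) -> Optional[PseudoFlavor]:
    """Return the table row for an (OID, service) pair, or None if absent."""
    for row in KRB5_PSEUDO_FLAVORS:
        if row.mechanism_oid == oid and row.service == service:
            return row
    return None
```

The lookup is keyed on the (OID, service) pair because, as the note says, NFSv4.1 negotiates security in those terms and the pseudo flavor number is only a convenience.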
At the time NFSv4.1 was specified, the Advanced Encryption Standard At the time NFSv4.1 was specified, the Advanced Encryption Standard
(AES) with HMAC-SHA1 was a REQUIRED algorithm set for Kerberos V5. (AES) with HMAC-SHA1 was a REQUIRED algorithm set for Kerberos V5.
In contrast, when NFSv4.0 was specified, weaker algorithm sets were In contrast, when NFSv4.0 was specified, weaker algorithm sets were
REQUIRED for Kerberos V5, and were REQUIRED in the NFSv4.0 REQUIRED for Kerberos V5, and were REQUIRED in the NFSv4.0
specification, because the Kerberos V5 specification at the time did specification, because the Kerberos V5 specification at the time did
not specify stronger algorithms. The NFSv4.1 specification does not not specify stronger algorithms. The NFSv4.1 specification does not
specify REQUIRED algorithms for Kerberos V5, and instead, the specify REQUIRED algorithms for Kerberos V5, and instead, the
implementor is expected to track the evolution of the Kerberos V5 implementor is expected to track the evolution of the Kerberos V5
standard if and when stronger algorithms are specified. standard if and when stronger algorithms are specified.
skipping to change at line 1157 skipping to change at line 1157
the same string. The implementor is cautioned from an approach the same string. The implementor is cautioned from an approach
that requires the string to be recorded in a local file because that requires the string to be recorded in a local file because
this precludes the use of the implementation in an environment this precludes the use of the implementation in an environment
where there is no local disk and all file access is from an where there is no local disk and all file access is from an
NFSv4.1 server. NFSv4.1 server.
* The string should be the same for each server network address that * The string should be the same for each server network address that
the client accesses. This way, if a server has multiple the client accesses. This way, if a server has multiple
interfaces, the client can trunk traffic over multiple network interfaces, the client can trunk traffic over multiple network
paths as described in Section 2.10.5. (Note: the precise opposite paths as described in Section 2.10.5. (Note: the precise opposite
was advised in the NFSv4.0 specification [36].) was advised in the NFSv4.0 specification [37].)
* The algorithm for generating the string should not assume that the * The algorithm for generating the string should not assume that the
client's network address will not change, unless the client client's network address will not change, unless the client
implementation knows it is using statically assigned network implementation knows it is using statically assigned network
addresses. This includes changes between client incarnations and addresses. This includes changes between client incarnations and
even changes while the client is still running in its current even changes while the client is still running in its current
incarnation. Thus, with dynamic address assignment, if the client incarnation. Thus, with dynamic address assignment, if the client
includes just the client's network address in the co_ownerid includes just the client's network address in the co_ownerid
string, there is a real risk that after the client gives up the string, there is a real risk that after the client gives up the
network address, another client, using a similar algorithm for network address, another client, using a similar algorithm for
skipping to change at line 1258 skipping to change at line 1258
To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a To facilitate upgrade from NFSv4.0 to NFSv4.1, a server may compare a
value of data type client_owner4 in an EXCHANGE_ID with a value of value of data type client_owner4 in an EXCHANGE_ID with a value of
data type nfs_client_id4 that was established using the SETCLIENTID data type nfs_client_id4 that was established using the SETCLIENTID
operation of NFSv4.0. A server that does so will allow an upgraded operation of NFSv4.0. A server that does so will allow an upgraded
client to avoid waiting until the lease (i.e., the lease established client to avoid waiting until the lease (i.e., the lease established
by the NFSv4.0 instance client) expires. This requires that the by the NFSv4.0 instance client) expires. This requires that the
value of data type client_owner4 be constructed the same way as the value of data type client_owner4 be constructed the same way as the
value of data type nfs_client_id4. If the latter's contents included value of data type nfs_client_id4. If the latter's contents included
the server's network address (per the recommendations of the NFSv4.0 the server's network address (per the recommendations of the NFSv4.0
specification [36]), and the NFSv4.1 client does not wish to use a specification [37]), and the NFSv4.1 client does not wish to use a
client ID that prevents trunking, it should send two EXCHANGE_ID client ID that prevents trunking, it should send two EXCHANGE_ID
operations. The first EXCHANGE_ID will have a client_owner4 equal to operations. The first EXCHANGE_ID will have a client_owner4 equal to
the nfs_client_id4. This will clear the state created by the NFSv4.0 the nfs_client_id4. This will clear the state created by the NFSv4.0
client. The second EXCHANGE_ID will not have the server's network client. The second EXCHANGE_ID will not have the server's network
address. The state created for the second EXCHANGE_ID will not have address. The state created for the second EXCHANGE_ID will not have
to wait for lease expiration, because there will be no state to to wait for lease expiration, because there will be no state to
expire. expire.
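The two-step upgrade described above can be illustrated with a deliberately simplified in-memory model. Everything here is an assumption made for illustration: ToyServer, its methods, and the owner strings are hypothetical, and the model ignores verifiers, confirmation, and leases entirely. It shows only the order of the two EXCHANGE_ID operations and why no old state is left to expire.

```python
from typing import Dict, List


class ToyServer:
    """Hypothetical stand-in for a server that compares client_owner4
    values against nfs_client_id4 values; not a real NFS API."""

    def __init__(self):
        # Maps an owner string to the state held under that owner.
        self.state_by_owner = {}  # type: Dict[bytes, List[str]]

    def setclientid_v40(self, nfs_client_id: bytes, state: List[str]) -> None:
        # State established earlier by the NFSv4.0 client instance.
        self.state_by_owner[nfs_client_id] = list(state)

    def exchange_id(self, client_owner: bytes) -> List[str]:
        # Simplifying assumption: a new EXCHANGE_ID for a known owner
        # string discards whatever state was held under that string.
        discarded = self.state_by_owner.pop(client_owner, [])
        self.state_by_owner[client_owner] = []
        return discarded


def upgrade(server: ToyServer, old_v40_id: bytes, new_owner: bytes) -> List[str]:
    """Send the two EXCHANGE_IDs in the order described in the text."""
    cleared = server.exchange_id(old_v40_id)  # first: owner equals the v4.0 ID
    server.exchange_id(new_owner)             # second: owner without server address
    return cleared
```

A usage example under the same assumptions, with an invented owner string format:

    srv = ToyServer()
    srv.setclientid_v40(b"client-A/192.0.2.1", ["open:fileX"])
    upgrade(srv, b"client-A/192.0.2.1", b"client-A")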
2.4.2. Server Release of Client ID 2.4.2. Server Release of Client ID
skipping to change at line 1631 skipping to change at line 1631
2.7. Minor Versioning 2.7. Minor Versioning
To address the requirement of an NFS protocol that can evolve as the To address the requirement of an NFS protocol that can evolve as the
need arises, the NFSv4.1 protocol contains the rules and framework to need arises, the NFSv4.1 protocol contains the rules and framework to
allow for future minor changes or versioning. allow for future minor changes or versioning.
The base assumption with respect to minor versioning is that any The base assumption with respect to minor versioning is that any
future accepted minor version will be documented in one or more future accepted minor version will be documented in one or more
Standards Track RFCs. Minor version 0 of the NFSv4 protocol is Standards Track RFCs. Minor version 0 of the NFSv4 protocol is
represented by [36], and minor version 1 is represented by this RFC. represented by [37], and minor version 1 is represented by this RFC.
The COMPOUND and CB_COMPOUND procedures support the encoding of the The COMPOUND and CB_COMPOUND procedures support the encoding of the
minor version being requested by the client. minor version being requested by the client.
The following items represent the basic rules for the development of The following items represent the basic rules for the development of
minor versions. Note that a future minor version may modify or add minor versions. Note that a future minor version may modify or add
to the following rules as part of the minor version definition. to the following rules as part of the minor version definition.
1. Procedures are not added or deleted. 1. Procedures are not added or deleted.
To maintain the general RPC model, NFSv4 minor versions will not To maintain the general RPC model, NFSv4 minor versions will not
skipping to change at line 1784 skipping to change at line 1784
2.9. Transport Layers 2.9. Transport Layers
2.9.1. REQUIRED and RECOMMENDED Properties of Transports 2.9.1. REQUIRED and RECOMMENDED Properties of Transports
NFSv4.1 works over Remote Direct Memory Access (RDMA) and non-RDMA- NFSv4.1 works over Remote Direct Memory Access (RDMA) and non-RDMA-
based transports with the following attributes: based transports with the following attributes:
* The transport supports reliable delivery of data, which NFSv4.1 * The transport supports reliable delivery of data, which NFSv4.1
requires but neither NFSv4.1 nor RPC has facilities for ensuring requires but neither NFSv4.1 nor RPC has facilities for ensuring
[40]. [41].
* The transport delivers data in the order it was sent. Ordered * The transport delivers data in the order it was sent. Ordered
delivery simplifies detection of transmit errors, and simplifies delivery simplifies detection of transmit errors, and simplifies
the sending of arbitrary sized requests and responses via the the sending of arbitrary sized requests and responses via the
record marking protocol [3]. record marking protocol [3].
Where an NFSv4.1 implementation supports operation over the IP Where an NFSv4.1 implementation supports operation over the IP
network protocol, any transport used between NFS and IP MUST be among network protocol, any transport used between NFS and IP MUST be among
the IETF-approved congestion control transport protocols. At the the IETF-approved congestion control transport protocols. At the
time this document was written, the only two transports that had the time this document was written, the only two transports that had the
skipping to change at line 1886 skipping to change at line 1886
contents must not be blindly used when replies are sent from it, contents must not be blindly used when replies are sent from it,
and credit information appropriate to the channel must be and credit information appropriate to the channel must be
refreshed by the RPC layer. refreshed by the RPC layer.
In addition, as described in Section 2.10.6.2, while a session is In addition, as described in Section 2.10.6.2, while a session is
active, the NFSv4.1 requester MUST NOT stop waiting for a reply. active, the NFSv4.1 requester MUST NOT stop waiting for a reply.
2.9.3. Ports 2.9.3. Ports
Historically, NFSv3 servers have listened over TCP port 2049. The Historically, NFSv3 servers have listened over TCP port 2049. The
registered port 2049 [41] for the NFS protocol should be the default registered port 2049 [42] for the NFS protocol should be the default
configuration. NFSv4.1 clients SHOULD NOT use the RPC binding configuration. NFSv4.1 clients SHOULD NOT use the RPC binding
protocols as described in [42]. protocols as described in [43].
2.10. Session 2.10. Session
NFSv4.1 clients and servers MUST support and MUST use the session NFSv4.1 clients and servers MUST support and MUST use the session
feature as described in this section. feature as described in this section.
2.10.1. Motivation and Overview 2.10.1. Motivation and Overview
Previous versions and minor versions of NFS have suffered from the Previous versions and minor versions of NFS have suffered from the
following: following:
skipping to change at line 2577 skipping to change at line 2577
Given that well-formulated XIDs continue to be required, this raises Given that well-formulated XIDs continue to be required, this raises
the question: why do SEQUENCE and CB_SEQUENCE replies have a session the question: why do SEQUENCE and CB_SEQUENCE replies have a session
ID, slot ID, and sequence ID? Having the session ID in the reply ID, slot ID, and sequence ID? Having the session ID in the reply
means that the requester does not have to use the XID to look up the means that the requester does not have to use the XID to look up the
session ID, which would be necessary if the connection were session ID, which would be necessary if the connection were
associated with multiple sessions. Having the slot ID and sequence associated with multiple sessions. Having the slot ID and sequence
ID in the reply means that the requester does not have to use the XID ID in the reply means that the requester does not have to use the XID
to look up the slot ID and sequence ID. Furthermore, since the XID to look up the slot ID and sequence ID. Furthermore, since the XID
is only 32 bits, it is too small to guarantee the re-association of a is only 32 bits, it is too small to guarantee the re-association of a
reply with its request [43]; having session ID, slot ID, and sequence reply with its request [44]; having session ID, slot ID, and sequence
ID in the reply allows the client to validate that the reply in fact ID in the reply allows the client to validate that the reply in fact
belongs to the matched request. belongs to the matched request.
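A minimal sketch of the validation this paragraph describes, assuming a requester that tracks outstanding requests by XID. The names SequenceIds and PendingRequests are hypothetical. The point is only that a reply located through its 32-bit XID is accepted when its session ID, slot ID, and sequence ID also match what was sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceIds:
    session_id: bytes
    slot_id: int
    sequence_id: int


class PendingRequests:
    """Hypothetical requester-side table of outstanding requests."""

    def __init__(self):
        self._by_xid = {}

    def sent(self, xid: int, ids: SequenceIds) -> None:
        # The XID is only 32 bits wide, so values can recur over time.
        self._by_xid[xid & 0xFFFFFFFF] = ids

    def accept_reply(self, xid: int, ids: SequenceIds) -> bool:
        """Accept a reply only if the XID is known and the session ID,
        slot ID, and sequence ID all equal those of the request."""
        key = xid & 0xFFFFFFFF
        expected = self._by_xid.get(key)
        if expected is None or expected != ids:
            return False
        del self._by_xid[key]
        return True
```

A reply whose XID matches but whose sequence ID differs is rejected and the request stays pending, which is the case the XID alone could not detect.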
The SEQUENCE (and CB_SEQUENCE) operation also carries a The SEQUENCE (and CB_SEQUENCE) operation also carries a
"highest_slotid" value, which carries additional requester slot usage "highest_slotid" value, which carries additional requester slot usage
information. The requester MUST always indicate the slot ID information. The requester MUST always indicate the slot ID
representing the outstanding request with the highest-numbered slot representing the outstanding request with the highest-numbered slot
value. The requester should in all cases provide the most value. The requester should in all cases provide the most
conservative value possible, although it can be increased somewhat conservative value possible, although it can be increased somewhat
above the actual instantaneous usage to maintain some minimum or above the actual instantaneous usage to maintain some minimum or
skipping to change at line 2706 skipping to change at line 2706
cache entry for the slot whenever an error is returned from SEQUENCE cache entry for the slot whenever an error is returned from SEQUENCE
or CB_SEQUENCE. or CB_SEQUENCE.
2.10.6.1.3. Optional Reply Caching 2.10.6.1.3. Optional Reply Caching
On a per-request basis, the requester can choose to direct the On a per-request basis, the requester can choose to direct the
replier to cache the reply to all operations after the first replier to cache the reply to all operations after the first
operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE. csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
One reason it might not direct the replier to cache the entire reply One reason it might not direct the replier to cache the entire reply
is that the request is composed entirely of idempotent operations [40]. is that the request is composed entirely of idempotent operations [41].
Caching the reply may offer little benefit. If the reply is too Caching the reply may offer little benefit. If the reply is too
large (see Section 2.10.6.4), it may not be cacheable anyway. Even large (see Section 2.10.6.4), it may not be cacheable anyway. Even
if the reply to an idempotent request is small enough to cache, if the reply to an idempotent request is small enough to cache,
unnecessarily caching the reply slows down the server and increases unnecessarily caching the reply slows down the server and increases
RPC latency. RPC latency.
Whether or not the requester requests the reply to be cached has no Whether or not the requester requests the reply to be cached has no
effect on the slot processing. If the result of SEQUENCE or effect on the slot processing. If the result of SEQUENCE or
CB_SEQUENCE is NFS4_OK, then the slot's sequence ID MUST be CB_SEQUENCE is NFS4_OK, then the slot's sequence ID MUST be
incremented by one. If a requester does not direct the replier to incremented by one. If a requester does not direct the replier to
skipping to change at line 2770 skipping to change at line 2770
Since the replier may only cache a small amount of the information Since the replier may only cache a small amount of the information
that would be required to determine whether this is a case of a false that would be required to determine whether this is a case of a false
retry, the replier may send to the client any of the following retry, the replier may send to the client any of the following
responses: responses:
* The cached reply to the original request (if the replier has * The cached reply to the original request (if the replier has
cached it in its entirety and the users of the original request cached it in its entirety and the users of the original request
and retry match). and retry match).
* A reply that consists only of the Sequence operation with the * A reply that consists only of the Sequence operation with the
error NFS4ERR_FALSE_RETRY. error NFS4ERR_SEQ_FALSE_RETRY.
* A reply consisting of the response to Sequence with the status * A reply consisting of the response to Sequence with the status
NFS4_OK, together with the second operation as it appeared in the NFS4_OK, together with the second operation as it appeared in the
retried request with an error of NFS4ERR_RETRY_UNCACHED_REP or retried request with an error of NFS4ERR_RETRY_UNCACHED_REP or
other error as described above. other error as described above.
* A reply that consists of the response to Sequence with the status * A reply that consists of the response to Sequence with the status
NFS4_OK, together with the second operation as it appeared in the NFS4_OK, together with the second operation as it appeared in the
original request with an error of NFS4ERR_RETRY_UNCACHED_REP or original request with an error of NFS4ERR_RETRY_UNCACHED_REP or
other error as described above. other error as described above.
skipping to change at line 2792 skipping to change at line 2792
2.10.6.1.3.1. False Retry 2.10.6.1.3.1. False Retry
If a requester sent a Sequence operation with a slot ID and sequence If a requester sent a Sequence operation with a slot ID and sequence
ID that are in the reply cache but the replier detected that the ID that are in the reply cache but the replier detected that the
retried request is not the same as the original request, including a retried request is not the same as the original request, including a
retry that has different operations or different arguments in the retry that has different operations or different arguments in the
operations from the original and a retry that uses a different operations from the original and a retry that uses a different
principal in the RPC request's credential field that translates to a principal in the RPC request's credential field that translates to a
different user, then this is a false retry. When the replier detects different user, then this is a false retry. When the replier detects
a false retry, it is permitted (but not always obligated) to return a false retry, it is permitted (but not always obligated) to return
NFS4ERR_FALSE_RETRY in response to the Sequence operation. NFS4ERR_SEQ_FALSE_RETRY in response to the Sequence operation.
Translations of particularly privileged user values to other users Translations of particularly privileged user values to other users
due to the lack of appropriately secure credentials, as configured on due to the lack of appropriately secure credentials, as configured on
the replier, should be applied before determining whether the users the replier, should be applied before determining whether the users
are the same or different. If the replier determines the users are are the same or different. If the replier determines the users are
different between the original request and a retry, then the replier different between the original request and a retry, then the replier
MUST return NFS4ERR_FALSE_RETRY. MUST return NFS4ERR_SEQ_FALSE_RETRY.
If an operation of the retry is an illegal operation, or an operation If an operation of the retry is an illegal operation, or an operation
that was legal in a previous minor version of NFSv4 and MUST NOT be that was legal in a previous minor version of NFSv4 and MUST NOT be
supported in the current minor version (e.g., SETCLIENTID), the supported in the current minor version (e.g., SETCLIENTID), the
replier MAY return NFS4ERR_FALSE_RETRY (and MUST do so if the users replier MAY return NFS4ERR_SEQ_FALSE_RETRY (and MUST do so if the
of the original request and retry differ). Otherwise, the replier users of the original request and retry differ). Otherwise, the
MAY return NFS4ERR_OP_ILLEGAL or NFS4ERR_BADXDR or NFS4ERR_NOTSUPP as replier MAY return NFS4ERR_OP_ILLEGAL or NFS4ERR_BADXDR or
appropriate. Note that the handling is in contrast to how the NFS4ERR_NOTSUPP as appropriate. Note that the handling is in
replier deals with retried requests with no cached reply. The contrast to how the replier deals with retried requests with no
difference is due to NFS4ERR_FALSE_RETRY being a valid error for only cached reply. The difference is due to NFS4ERR_SEQ_FALSE_RETRY being
Sequence operations, whereas NFS4ERR_RETRY_UNCACHED_REP is a valid a valid error for only Sequence operations, whereas
error for all operations except illegal operations and operations NFS4ERR_RETRY_UNCACHED_REP is a valid error for all operations except
that MUST NOT be supported in the current minor version of NFSv4. illegal operations and operations that MUST NOT be supported in the
current minor version of NFSv4.
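
The false-retry rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `CachedEntry` type, the `handle_retry` function, the `"REPLAY_CACHED"` label, and the use of a SHA-256 digest to compare requests are all hypothetical choices, not something the specification prescribes. The error name follows the right-hand (newer) column of the diff.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CachedEntry:
    user: str                        # user after any privilege translation
    request_digest: Optional[bytes]  # None if no request data was retained
    reply: bytes                     # the cached reply


def handle_retry(entry: CachedEntry, retry_user: str,
                 retry_request: bytes) -> Tuple[str, Optional[bytes]]:
    """Decide how to answer a request whose slot ID and sequence ID
    match an entry in the reply cache."""
    # Different users: returning the error is mandatory.
    if retry_user != entry.user:
        return ("NFS4ERR_SEQ_FALSE_RETRY", None)
    # Different operations or arguments: the error is permitted but not
    # always required. This sketch returns it whenever the replier kept
    # enough information to detect the mismatch.
    if entry.request_digest is not None:
        if hashlib.sha256(retry_request).digest() != entry.request_digest:
            return ("NFS4ERR_SEQ_FALSE_RETRY", None)
    # Otherwise treat the request as a genuine retry and replay the reply.
    return ("REPLAY_CACHED", entry.reply)
```

A replier that stores no request digest cannot detect differing arguments, so in this sketch it replays the cached reply whenever the users match.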
2.10.6.2. Retry and Replay of Reply 2.10.6.2. Retry and Replay of Reply
A requester MUST NOT retry a request, unless the connection it used A requester MUST NOT retry a request, unless the connection it used
to send the request disconnects. The requester can then reconnect to send the request disconnects. The requester can then reconnect
and re-send the request, or it can re-send the request over a and re-send the request, or it can re-send the request over a
different connection that is associated with the same session. different connection that is associated with the same session.
If the requester is a server wanting to re-send a callback operation If the requester is a server wanting to re-send a callback operation
over the backchannel of a session, the requester of course cannot over the backchannel of a session, the requester of course cannot
skipping to change at line 3083 skipping to change at line 3084
view the problem is as a single transaction consisting of each view the problem is as a single transaction consisting of each
operation in the COMPOUND followed by storing the result in operation in the COMPOUND followed by storing the result in
persistent storage, then finally a transaction commit. If there is a persistent storage, then finally a transaction commit. If there is a
failure before the transaction is committed, then the server rolls failure before the transaction is committed, then the server rolls
back the transaction. If the server itself fails, then when it back the transaction. If the server itself fails, then when it
restarts, its recovery logic could roll back the transaction before restarts, its recovery logic could roll back the transaction before
starting the NFSv4.1 server. starting the NFSv4.1 server.
While the description of the implementation for atomic execution of While the description of the implementation for atomic execution of
the request and caching of the reply is beyond the scope of this the request and caching of the reply is beyond the scope of this
document, an example implementation for NFSv2 [44] is described in document, an example implementation for NFSv2 [45] is described in
[45]. [46].
2.10.7. RDMA Considerations 2.10.7. RDMA Considerations
A complete discussion of the operation of RPC-based protocols over A complete discussion of the operation of RPC-based protocols over
RDMA transports is in [32]. A discussion of the operation of NFSv4, RDMA transports is in [32]. A discussion of the operation of NFSv4,
including NFSv4.1, over RDMA is in [33]. Where RDMA is considered, including NFSv4.1, over RDMA is in [33]. Where RDMA is considered,
this specification assumes the use of such a layering; it addresses this specification assumes the use of such a layering; it addresses
only the upper-layer issues relevant to making best use of RPC/RDMA. only the upper-layer issues relevant to making best use of RPC/RDMA.
2.10.7.1. RDMA Connection Resources 2.10.7.1. RDMA Connection Resources
skipping to change at line 3235 skipping to change at line 3236
2.10.8.2. Backchannel RPC Security 2.10.8.2. Backchannel RPC Security
When the NFSv4.1 client establishes the backchannel, it informs the When the NFSv4.1 client establishes the backchannel, it informs the
server of the security flavors and principals to use when sending server of the security flavors and principals to use when sending
requests. If the security flavor is RPCSEC_GSS, the client expresses requests. If the security flavor is RPCSEC_GSS, the client expresses
the principal in the form of an established RPCSEC_GSS context. The the principal in the form of an established RPCSEC_GSS context. The
server is free to use any of the flavor/principal combinations the server is free to use any of the flavor/principal combinations the
client offers, but it MUST NOT use unoffered combinations. This way, client offers, but it MUST NOT use unoffered combinations. This way,
the client need not provide a target GSS principal for the the client need not provide a target GSS principal for the
backchannel as it did with NFSv4.0, nor does the server have to backchannel as it did with NFSv4.0, nor does the server have to
implement an RPCSEC_GSS initiator as it did with NFSv4.0 [36]. implement an RPCSEC_GSS initiator as it did with NFSv4.0 [37].
The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL The CREATE_SESSION (Section 18.36) and BACKCHANNEL_CTL
(Section 18.33) operations allow the client to specify flavor/ (Section 18.33) operations allow the client to specify flavor/
principal combinations. principal combinations.
Also note that the SP4_SSV state protection mode (see Sections 18.35 Also note that the SP4_SSV state protection mode (see Sections 18.35
and 2.10.8.3) has the side benefit of providing SSV-derived and 2.10.8.3) has the side benefit of providing SSV-derived
RPCSEC_GSS contexts (Section 2.10.9). RPCSEC_GSS contexts (Section 2.10.9).
2.10.8.3. Protection from Unauthorized State Changes 2.10.8.3. Protection from Unauthorized State Changes
skipping to change at line 3483 skipping to change at line 3484
iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech iso.org.dod.internet.private.enterprise.Michael Eisler.nfs.ssv_mech
(1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define any (1.3.6.1.4.1.28882.1.1). While the SSV mechanism does not define any
initial context tokens, the OID can be used to let servers indicate initial context tokens, the OID can be used to let servers indicate
that the SSV mechanism is acceptable whenever the client sends a that the SSV mechanism is acceptable whenever the client sends a
SECINFO or SECINFO_NO_NAME operation (see Section 2.6). SECINFO or SECINFO_NO_NAME operation (see Section 2.6).
The SSV mechanism defines four subkeys derived from the SSV value. The SSV mechanism defines four subkeys derived from the SSV value.
Each time SET_SSV is invoked, the subkeys are recalculated by the Each time SET_SSV is invoked, the subkeys are recalculated by the
client and server. The calculation of each of the four subkeys client and server. The calculation of each of the four subkeys
depends on each of the four respective ssv_subkey4 enumerated values. depends on each of the four respective ssv_subkey4 enumerated values.
The calculation uses the HMAC [51] algorithm, using the current SSV The calculation uses the HMAC [52] algorithm, using the current SSV
as the key, the one-way hash algorithm as negotiated by EXCHANGE_ID, as the key, the one-way hash algorithm as negotiated by EXCHANGE_ID,
and the input text as represented by the XDR encoded enumeration and the input text as represented by the XDR encoded enumeration
value for that subkey of data type ssv_subkey4. If the length of the value for that subkey of data type ssv_subkey4. If the length of the
output of the HMAC algorithm exceeds the length of the key of the output of the HMAC algorithm exceeds the length of the key of the
encryption algorithm (which is also negotiated by EXCHANGE_ID), then encryption algorithm (which is also negotiated by EXCHANGE_ID), then
the subkey MUST be truncated from the HMAC output, i.e., if the the subkey MUST be truncated from the HMAC output, i.e., if the
subkey is N bytes long, then the first N bytes of the HMAC output subkey is N bytes long, then the first N bytes of the HMAC output
MUST be used for the subkey. The specification of EXCHANGE_ID states MUST be used for the subkey. The specification of EXCHANGE_ID states
that the length of the output of the HMAC algorithm MUST NOT be less that the length of the output of the HMAC algorithm MUST NOT be less
than the length of the subkey needed for the encryption algorithm (see than the length of the subkey needed for the encryption algorithm (see
skipping to change at line 3926 skipping to change at line 3927
1. If the client has other connections to other server network 1. If the client has other connections to other server network
addresses associated with the same session, attempt a COMPOUND addresses associated with the same session, attempt a COMPOUND
with a single operation, SEQUENCE, on each of the other with a single operation, SEQUENCE, on each of the other
connections. connections.
2. If the attempts succeed, the session is still alive, and this is 2. If the attempts succeed, the session is still alive, and this is
a strong indicator that the server's network address has moved. a strong indicator that the server's network address has moved.
The client might send an EXCHANGE_ID on the connection that The client might send an EXCHANGE_ID on the connection that
returned NFS4ERR_BADSESSION to see if there are opportunities for returned NFS4ERR_BADSESSION to see if there are opportunities for
client ID trunking (i.e., the same client ID and so_major value client ID trunking (i.e., the same client ID and so_major_id
are returned). The client might use DNS to see if the moved value are returned). The client might use DNS to see if the
network address was replaced with another, so that the moved network address was replaced with another, so that the
performance and availability benefits of session trunking can performance and availability benefits of session trunking can
continue. continue.
3. If the SEQUENCE requests fail with NFS4ERR_BADSESSION, then the 3. If the SEQUENCE requests fail with NFS4ERR_BADSESSION, then the
session no longer exists on any of the server network addresses session no longer exists on any of the server network addresses
for which the client has connections associated with that session for which the client has connections associated with that session
ID. It is possible the session is still alive and available on ID. It is possible the session is still alive and available on
other network addresses. The client sends an EXCHANGE_ID on all other network addresses. The client sends an EXCHANGE_ID on all
the connections to see if the server owner is still listening on the connections to see if the server owner is still listening on
those network addresses. If the same server owner is returned those network addresses. If the same server owner is returned
skipping to change at line 4359 skipping to change at line 4360
}; };
The fattr4 data type is used to represent file and directory The fattr4 data type is used to represent file and directory
attributes. attributes.
The bitmap is a counted array of 32-bit integers used to contain bit The bitmap is a counted array of 32-bit integers used to contain bit
values. The position of the integer in the array that contains bit n values. The position of the integer in the array that contains bit n
can be computed from the expression (n / 32), and its bit within that can be computed from the expression (n / 32), and its bit within that
integer is (n mod 32). integer is (n mod 32).
0 1 0 1
+-----------+-----------+-----------+-- +-----------+-----------+-----------+--
| count | 31 .. 0 | 63 .. 32 | | count | 31 .. 0 | 63 .. 32 |
+-----------+-----------+-----------+-- +-----------+-----------+-----------+--
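
The index arithmetic above can be sketched in Python as follows. The function names are hypothetical, and the bitmap is modeled as a plain list of 32-bit words without the XDR count prefix; bit 0 is taken to be the least significant bit of each word, as the "31 .. 0" labeling in the diagram suggests.

```python
def bitmap_set(bitmap, n):
    """Set bit n in a list of 32-bit words, growing the list if needed."""
    word, bit = divmod(n, 32)      # word index is n / 32, bit is n mod 32
    while len(bitmap) <= word:
        bitmap.append(0)
    bitmap[word] |= 1 << bit


def bitmap_test(bitmap, n):
    """Return True if bit n is set; bits beyond the array count as clear."""
    word, bit = divmod(n, 32)
    return word < len(bitmap) and bool((bitmap[word] >> bit) & 1)
```

For example, attribute number 58 falls in word 1 (58 / 32) at bit 26 (58 mod 32), so a bitmap holding only that attribute needs two words.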
3.3.8. change_info4 3.3.8. change_info4
struct change_info4 { struct change_info4 {
bool atomic; bool atomic;
changeid4 before; changeid4 before;
changeid4 after; changeid4 after;
skipping to change at line 4469 skipping to change at line 4470
The layouttype4 data type is 32 bits in length. The range The layouttype4 data type is 32 bits in length. The range
represented by the layout type is split into three parts. Type 0x0 represented by the layout type is split into three parts. Type 0x0
is reserved. Types within the range 0x00000001-0x7FFFFFFF are is reserved. Types within the range 0x00000001-0x7FFFFFFF are
globally unique and are assigned according to the description in globally unique and are assigned according to the description in
Section 22.5; they are maintained by IANA. Types within the range Section 22.5; they are maintained by IANA. Types within the range
0x80000000-0xFFFFFFFF are site specific and for private use only. 0x80000000-0xFFFFFFFF are site specific and for private use only.
The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file The LAYOUT4_NFSV4_1_FILES enumeration specifies that the NFSv4.1 file
layout type, as defined in Section 13, is to be used. The layout type, as defined in Section 13, is to be used. The
LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as LAYOUT4_OSD2_OBJECTS enumeration specifies that the object layout, as
defined in [46], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME defined in [47], is to be used. Similarly, the LAYOUT4_BLOCK_VOLUME
enumeration specifies that the block/volume layout, as defined in enumeration specifies that the block/volume layout, as defined in
[47], is to be used. [48], is to be used.
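
The three-way split of the layouttype4 range can be sketched in Python as below. The function name and the returned labels are hypothetical; only the numeric boundaries come from the text.

```python
def classify_layout_type(value):
    """Classify a 32-bit layouttype4 value by the range it falls in."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("layouttype4 is a 32-bit unsigned value")
    if value == 0x0:
        return "reserved"
    if value <= 0x7FFFFFFF:
        return "iana-assigned"     # globally unique, maintained by IANA
    return "site-specific"         # 0x80000000-0xFFFFFFFF, private use only
```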
3.3.14. deviceid4 3.3.14. deviceid4
const NFS4_DEVICEID4_SIZE = 16; const NFS4_DEVICEID4_SIZE = 16;
typedef opaque deviceid4[NFS4_DEVICEID4_SIZE]; typedef opaque deviceid4[NFS4_DEVICEID4_SIZE];
Layout information includes device IDs that specify a storage device Layout information includes device IDs that specify a storage device
through a compact handle. Addressing and type information is through a compact handle. Addressing and type information is
obtained with the GETDEVICEINFO operation. Device IDs are not obtained with the GETDEVICEINFO operation. Device IDs are not
skipping to change at line 4684 skipping to change at line 4685
for a file system object. The contents of the filehandle are opaque for a file system object. The contents of the filehandle are opaque
to the client. Therefore, the server is responsible for translating to the client. Therefore, the server is responsible for translating
the filehandle to an internal representation of the file system the filehandle to an internal representation of the file system
object. object.
4.1. Obtaining the First Filehandle 4.1. Obtaining the First Filehandle
The operations of the NFS protocol are defined in terms of one or The operations of the NFS protocol are defined in terms of one or
more filehandles. Therefore, the client needs a filehandle to more filehandles. Therefore, the client needs a filehandle to
initiate communication with the server. With the NFSv3 protocol (RFC initiate communication with the server. With the NFSv3 protocol (RFC
1813 [37]), there exists an ancillary protocol to obtain this first 1813 [38]), there exists an ancillary protocol to obtain this first
filehandle. The MOUNT protocol, RPC program number 100005, provides filehandle. The MOUNT protocol, RPC program number 100005, provides
the mechanism of translating a string-based file system pathname to a the mechanism of translating a string-based file system pathname to a
filehandle, which can then be used by the NFS protocols. filehandle, which can then be used by the NFS protocols.
The MOUNT protocol has deficiencies in the area of security and use The MOUNT protocol has deficiencies in the area of security and use
via firewalls. This is one reason that the use of the public via firewalls. This is one reason that the use of the public
filehandle was introduced in RFC 2054 [48] and RFC 2055 [49]. With filehandle was introduced in RFC 2054 [49] and RFC 2055 [50]. With
the use of the public filehandle in combination with the LOOKUP the use of the public filehandle in combination with the LOOKUP
operation in the NFSv3 protocol, it has been demonstrated that the operation in the NFSv3 protocol, it has been demonstrated that the
MOUNT protocol is unnecessary for viable interaction between NFS MOUNT protocol is unnecessary for viable interaction between NFS
client and server. client and server.
Therefore, the NFSv4.1 protocol will not use an ancillary protocol Therefore, the NFSv4.1 protocol will not use an ancillary protocol
for translation from string-based pathnames to a filehandle. Two for translation from string-based pathnames to a filehandle. Two
special filehandles will be used as starting points for the NFS special filehandles will be used as starting points for the NFS
client. client.
skipping to change at line 4955 skipping to change at line 4956
Named attributes are accessed by the new OPENATTR operation, which Named attributes are accessed by the new OPENATTR operation, which
accesses a hidden directory of attributes associated with a file accesses a hidden directory of attributes associated with a file
system object. OPENATTR takes a filehandle for the object and system object. OPENATTR takes a filehandle for the object and
returns the filehandle for the attribute hierarchy. The filehandle returns the filehandle for the attribute hierarchy. The filehandle
for the named attributes is a directory object accessible by LOOKUP for the named attributes is a directory object accessible by LOOKUP
or READDIR and contains files whose names represent the named or READDIR and contains files whose names represent the named
attributes and whose data bytes are the value of the attribute. For attributes and whose data bytes are the value of the attribute. For
example: example:
+==========+===========+=================================+ +----------+-----------+---------------------------------+
+==========+===========+=================================+
| LOOKUP | "foo" | ; look up file | | LOOKUP | "foo" | ; look up file |
+----------+-----------+---------------------------------+ +----------+-----------+---------------------------------+
| GETATTR | attrbits | | | GETATTR | attrbits | |
+----------+-----------+---------------------------------+ +----------+-----------+---------------------------------+
| OPENATTR | | ; access foo's named attributes | | OPENATTR | | ; access foo's named attributes |
+----------+-----------+---------------------------------+ +----------+-----------+---------------------------------+
| LOOKUP | "x11icon" | ; look up specific attribute | | LOOKUP | "x11icon" | ; look up specific attribute |
+----------+-----------+---------------------------------+ +----------+-----------+---------------------------------+
| READ | 0,4096 | ; read stream of bytes | | READ | 0,4096 | ; read stream of bytes |
+----------+-----------+---------------------------------+ +----------+-----------+---------------------------------+
skipping to change at line 5158 skipping to change at line 5158
5.6. REQUIRED Attributes - List and Definition References 5.6. REQUIRED Attributes - List and Definition References
The list of REQUIRED attributes appears in Table 4. The meanings of The list of REQUIRED attributes appears in Table 4. The meanings of
the columns of the table are: the columns of the table are:
Name: The name of the attribute. Name: The name of the attribute.
Id: The number assigned to the attribute. In the event of conflicts Id: The number assigned to the attribute. In the event of conflicts
between the assigned number and [10], the latter is likely between the assigned number and [10], the latter is likely
authoritative, but should be resolved with Errata to this document authoritative, but should be resolved with Errata to this document
and/or [10]. See [50] for the Errata process. and/or [10]. See [51] for the Errata process.
Data Type: The XDR data type of the attribute. Data Type: The XDR data type of the attribute.
Acc: Access allowed to the attribute. R means read-only (GETATTR Acc: Access allowed to the attribute. R means read-only (GETATTR
may retrieve, SETATTR may not set). W means write-only (SETATTR may retrieve, SETATTR may not set). W means write-only (SETATTR
may set, GETATTR may not retrieve). R W means read/write (GETATTR may set, GETATTR may not retrieve). R W means read/write (GETATTR
may retrieve, SETATTR may set). may retrieve, SETATTR may set).
Defined in: The section of this specification that describes the Defined in: The section of this specification that describes the
attribute. attribute.
skipping to change at line 5210 skipping to change at line 5210
+--------------------+----+------------+-----+------------------+ +--------------------+----+------------+-----+------------------+
Table 4 Table 4
5.7. RECOMMENDED Attributes - List and Definition References 5.7. RECOMMENDED Attributes - List and Definition References
The RECOMMENDED attributes are defined in Table 5. The meanings of The RECOMMENDED attributes are defined in Table 5. The meanings of
the column headers are the same as in Table 4; see Section 5.6 for the the column headers are the same as in Table 4; see Section 5.6 for the
meanings. meanings.
+====================+====+================+=====+==================+ +====================+====+====================+=====+=============+
| Name | Id | Data Type | Acc | Defined in: | | Name | Id | Data Type | Acc | Defined in: |
+====================+====+================+=====+==================+ +====================+====+====================+=====+=============+
| acl | 12 | nfsace4<> | R W | Section 6.2.1 | | acl | 12 | nfsace4<> | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 6.2.1 |
| aclsupport | 13 | uint32_t | R | Section 6.2.1.2 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | aclsupport | 13 | uint32_t | R | Section |
| archive | 14 | bool | R W | Section 5.8.2.1 | | | | | | 6.2.1.2 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| cansettime | 15 | bool | R | Section 5.8.2.2 | | archive | 14 | bool | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.1 |
| case_insensitive | 16 | bool | R | Section 5.8.2.3 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | cansettime | 15 | bool | R | Section |
| case_preserving | 17 | bool | R | Section 5.8.2.4 | | | | | | 5.8.2.2 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| change_policy | 60 | chg_policy4 | R | Section 5.8.2.5 | | case_insensitive | 16 | bool | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.3 |
| chown_restricted | 18 | bool | R | Section 5.8.2.6 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | case_preserving | 17 | bool | R | Section |
| dacl | 58 | nfsacl41 | R W | Section 6.2.2 | | | | | | 5.8.2.4 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| dir_notif_delay | 56 | nfstime4 | R | Section 5.11.1 | | change_policy | 60 | chg_policy4 | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.5 |
| dirent_notif_delay | 57 | nfstime4 | R | Section 5.11.2 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | chown_restricted | 18 | bool | R | Section |
| fileid | 20 | uint64_t | R | Section 5.8.2.7 | | | | | | 5.8.2.6 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| files_avail | 21 | uint64_t | R | Section 5.8.2.8 | | dacl | 58 | nfsacl41 | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 6.2.2 |
| files_free | 22 | uint64_t | R | Section 5.8.2.9 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | dir_notif_delay | 56 | nfstime4 | R | Section |
| files_total | 23 | uint64_t | R | Section | | | | | | 5.11.1 |
| | | | | 5.8.2.10 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | dirent_notif_delay | 57 | nfstime4 | R | Section |
| fs_charset_cap | 76 | uint32_t | R | Section | | | | | | 5.11.2 |
| | | | | 5.8.2.11 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | fileid | 20 | uint64_t | R | Section |
| fs_layout_type | 62 | layouttype4<> | R | Section 5.12.1 | | | | | | 5.8.2.7 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| fs_locations | 24 | fs_locations | R | Section | | files_avail | 21 | uint64_t | R | Section |
| | | | | 5.8.2.12 | | | | | | 5.8.2.8 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| fs_locations_info | 67 | * | R | Section | | files_free | 22 | uint64_t | R | Section |
| | | | | 5.8.2.13 | | | | | | 5.8.2.9 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| fs_status | 61 | fs4_status | R | Section | | files_total | 23 | uint64_t | R | Section |
| | | | | 5.8.2.14 | | | | | | 5.8.2.10 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| hidden | 25 | bool | R W | Section | | fs_charset_cap | 76 | uint32_t | R | Section |
| | | | | 5.8.2.15 | | | | | | 5.8.2.11 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| homogeneous | 26 | bool | R | Section | | fs_layout_type | 62 | layouttype4<> | R | Section |
| | | | | 5.8.2.16 | | | | | | 5.12.1 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| layout_alignment | 66 | uint32_t | R | Section 5.12.2 | | fs_locations | 24 | fs_locations | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.12 |
| layout_blksize | 65 | uint32_t | R | Section 5.12.3 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | fs_locations_info | 67 | fs_locations_info4 | R | Section |
| layout_hint | 63 | layouthint4 | W | Section 5.12.4 | | | | | | 5.8.2.13 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| layout_type | 64 | layouttype4<> | R | Section 5.12.5 | | fs_status | 61 | fs4_status | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.14 |
| maxfilesize | 27 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.17 | | hidden | 25 | bool | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.15 |
| maxlink | 28 | uint32_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.18 | | homogeneous | 26 | bool | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.16 |
| maxname | 29 | uint32_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.19 | | layout_alignment | 66 | uint32_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.12.2 |
| maxread | 30 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.20 | | layout_blksize | 65 | uint32_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.12.3 |
| maxwrite | 31 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.21 | | layout_hint | 63 | layouthint4 | W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.12.4 |
| mdsthreshold | 68 | mdsthreshold4 | R | Section 5.12.6 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | layout_type | 64 | layouttype4<> | R | Section |
| mimetype | 32 | utf8str_cs | R W | Section | | | | | | 5.12.5 |
| | | | | 5.8.2.22 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | maxfilesize | 27 | uint64_t | R | Section |
| mode | 33 | mode4 | R W | Section 6.2.4 | | | | | | 5.8.2.17 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| mode_set_masked | 74 | mode_masked4 | W | Section 6.2.5 | | maxlink | 28 | uint32_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.18 |
| mounted_on_fileid | 55 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.23 | | maxname | 29 | uint32_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.19 |
| no_trunc | 34 | bool | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.24 | | maxread | 30 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.20 |
| numlinks | 35 | uint32_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.25 | | maxwrite | 31 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.21 |
| owner | 36 | utf8str_mixed | R W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.26 | | mdsthreshold | 68 | mdsthreshold4 | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.12.6 |
| owner_group | 37 | utf8str_mixed | R W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.27 | | mimetype | 32 | utf8str_cs | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.22 |
| quota_avail_hard | 38 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.28 | | mode | 33 | mode4 | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 6.2.4 |
| quota_avail_soft | 39 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.29 | | mode_set_masked | 74 | mode_masked4 | W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 6.2.5 |
| quota_used | 40 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.30 | | mounted_on_fileid | 55 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.23 |
| rawdev | 41 | specdata4 | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.31 | | no_trunc | 34 | bool | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.24 |
| retentevt_get | 71 | retention_get4 | R | Section 5.13.3 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | numlinks | 35 | uint32_t | R | Section |
| retentevt_set | 72 | retention_set4 | W | Section 5.13.4 | | | | | | 5.8.2.25 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| retention_get | 69 | retention_get4 | R | Section 5.13.1 | | owner | 36 | utf8str_mixed | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.26 |
| retention_hold | 73 | uint64_t | R W | Section 5.13.5 | +--------------------+----+--------------------+-----+-------------+
+--------------------+----+----------------+-----+------------------+ | owner_group | 37 | utf8str_mixed | R W | Section |
| retention_set | 70 | retention_set4 | W | Section 5.13.2 | | | | | | 5.8.2.27 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+--------------------+-----+-------------+
| sacl | 59 | nfsacl41 | R W | Section 6.2.3 | | quota_avail_hard | 38 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.28 |
| space_avail | 42 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.32 | | quota_avail_soft | 39 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.29 |
| space_free | 43 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.33 | | quota_used | 40 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.30 |
| space_total | 44 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.34 | | rawdev | 41 | specdata4 | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.31 |
| space_used | 45 | uint64_t | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.35 | | retentevt_get | 71 | retention_get4 | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.13.3 |
| system | 46 | bool | R W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.36 | | retentevt_set | 72 | retention_set4 | W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.13.4 |
| time_access | 47 | nfstime4 | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.37 | | retention_get | 69 | retention_get4 | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.13.1 |
| time_access_set | 48 | settime4 | W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.38 | | retention_hold | 73 | uint64_t | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.13.5 |
| time_backup | 49 | nfstime4 | R W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.39 | | retention_set | 70 | retention_set4 | W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.13.2 |
| time_create | 50 | nfstime4 | R W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.40 | | sacl | 59 | nfsacl41 | R W | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 6.2.3 |
| time_delta | 51 | nfstime4 | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.41 | | space_avail | 42 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.32 |
| time_metadata | 52 | nfstime4 | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.42 | | space_free | 43 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.33 |
| time_modify | 53 | nfstime4 | R | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.43 | | space_total | 44 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.34 |
| time_modify_set | 54 | settime4 | W | Section | +--------------------+----+--------------------+-----+-------------+
| | | | | 5.8.2.44 | | space_used | 45 | uint64_t | R | Section |
+--------------------+----+----------------+-----+------------------+ | | | | | 5.8.2.35 |
| * fs_locations_info4 | +--------------------+----+--------------------+-----+-------------+
+-------------------------------------------------------------------+ | system | 46 | bool | R W | Section |
| | | | | 5.8.2.36 |
+--------------------+----+--------------------+-----+-------------+
| time_access | 47 | nfstime4 | R | Section |
| | | | | 5.8.2.37 |
+--------------------+----+--------------------+-----+-------------+
| time_access_set | 48 | settime4 | W | Section |
| | | | | 5.8.2.38 |
+--------------------+----+--------------------+-----+-------------+
| time_backup | 49 | nfstime4 | R W | Section |
| | | | | 5.8.2.39 |
+--------------------+----+--------------------+-----+-------------+
| time_create | 50 | nfstime4 | R W | Section |
| | | | | 5.8.2.40 |
+--------------------+----+--------------------+-----+-------------+
| time_delta | 51 | nfstime4 | R | Section |
| | | | | 5.8.2.41 |
+--------------------+----+--------------------+-----+-------------+
| time_metadata | 52 | nfstime4 | R | Section |
| | | | | 5.8.2.42 |
+--------------------+----+--------------------+-----+-------------+
| time_modify | 53 | nfstime4 | R | Section |
| | | | | 5.8.2.43 |
+--------------------+----+--------------------+-----+-------------+
| time_modify_set | 54 | settime4 | W | Section |
| | | | | 5.8.2.44 |
+--------------------+----+--------------------+-----+-------------+
Table 5 Table 5
5.8. Attribute Definitions 5.8. Attribute Definitions
5.8.1. Definitions of REQUIRED Attributes 5.8.1. Definitions of REQUIRED Attributes
5.8.1.1. Attribute 0: supported_attrs 5.8.1.1. Attribute 0: supported_attrs
The bit vector that would retrieve all REQUIRED and RECOMMENDED The bit vector that would retrieve all REQUIRED and RECOMMENDED
attributes that are supported for this object. The scope of this attributes that are supported for this object. The scope of this
attribute applies to all objects with a matching fsid. attribute applies to all objects with a matching fsid.
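The attribute numbers listed in Table 5 are the bit positions used in such a bit vector. The following is a minimal sketch, assuming the bitmap4 layout defined elsewhere in the specification (bit n is stored in 32-bit word n / 32 at position n mod 32). The helper names attrs_to_bitmap4 and bitmap4_to_attrs are illustrative and are not part of the protocol.

```python
def attrs_to_bitmap4(attr_numbers):
    """Pack attribute numbers into a list of 32-bit words (assumed bitmap4 layout)."""
    words = []
    for n in attr_numbers:
        index, bit = divmod(n, 32)      # word index, bit position within that word
        while len(words) <= index:      # grow the array only as far as needed
            words.append(0)
        words[index] |= 1 << bit
    return words


def bitmap4_to_attrs(words):
    """Return the attribute numbers whose bits are set, in ascending order."""
    return [i * 32 + b
            for i, w in enumerate(words)
            for b in range(32)
            if w & (1 << b)]


# Attribute numbers taken from Table 5.
FILEID, MODE, FS_LOCATIONS_INFO = 20, 33, 67
bitmap = attrs_to_bitmap4([FILEID, MODE, FS_LOCATIONS_INFO])
```

A client could compare a mask built this way against the value returned for supported_attrs before requesting a RECOMMENDED attribute.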
skipping to change at line 5806 skipping to change at line 5832
5.8.2.44. Attribute 54: time_modify_set 5.8.2.44. Attribute 54: time_modify_set
Sets the time of last modification to the object. SETATTR use only. Sets the time of last modification to the object. SETATTR use only.
5.9. Interpreting owner and owner_group 5.9. Interpreting owner and owner_group
The RECOMMENDED attributes "owner" and "owner_group" (and also users The RECOMMENDED attributes "owner" and "owner_group" (and also users
and groups within the "acl" attribute) are represented in terms of a and groups within the "acl" attribute) are represented in terms of a
UTF-8 string. To avoid a representation that is tied to a particular UTF-8 string. To avoid a representation that is tied to a particular
underlying implementation at the client or server, the use of the underlying implementation at the client or server, the use of the
UTF-8 string has been chosen. Note that Section 6.1 of RFC 2624 [52] UTF-8 string has been chosen. Note that Section 6.1 of RFC 2624 [53]
provides additional rationale. It is expected that the client and provides additional rationale. It is expected that the client and
server will have their own local representation of owner and server will have their own local representation of owner and
owner_group that is used for local storage or presentation to the end owner_group that is used for local storage or presentation to the end
user. Therefore, it is expected that when these attributes are user. Therefore, it is expected that when these attributes are
transferred between the client and server, the local representation transferred between the client and server, the local representation
is translated to a syntax of the form "user@dns_domain". This will is translated to a syntax of the form "user@dns_domain". This will
allow for a client and server that do not use the same local allow for a client and server that do not use the same local
representation the ability to translate to a common syntax that can representation the ability to translate to a common syntax that can
be interpreted by both. be interpreted by both.
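The translation described above might be sketched as follows. This is an illustration only: the function names, the in-memory user table, and the choice to return None for an unrecognized string are assumptions, and a real implementation would consult its own name service.

```python
def to_wire_owner(local_name, dns_domain):
    """Build the "user@dns_domain" form sent between client and server."""
    return f"{local_name}@{dns_domain}"


def from_wire_owner(wire, local_domain, known_users):
    """Map a "user@dns_domain" string to a local id, or None if no mapping exists.

    known_users is a hypothetical dict of local user name -> local numeric id.
    """
    name, sep, domain = wire.rpartition("@")
    if not sep:
        return None                      # no "@": not in the expected syntax
    if domain.lower() != local_domain.lower():
        return None                      # DNS names compare case-insensitively
    return known_users.get(name)


users = {"alice": 1001}                  # hypothetical local account table
wire = to_wire_owner("alice", "example.org")
```

Neither side has to know the other's local representation: only the common string form crosses the wire.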
skipping to change at line 6799 skipping to change at line 6825
trigger log or alarm events. Such ACEs only take effect once they trigger log or alarm events. Such ACEs only take effect once they
are applied (with this bit cleared) to newly created files and are applied (with this bit cleared) to newly created files and
directories as specified by the ACE4_FILE_INHERIT_ACE and directories as specified by the ACE4_FILE_INHERIT_ACE and
ACE4_DIRECTORY_INHERIT_ACE flags. ACE4_DIRECTORY_INHERIT_ACE flags.
If this flag is present on an ACE, but neither If this flag is present on an ACE, but neither
ACE4_DIRECTORY_INHERIT_ACE nor ACE4_FILE_INHERIT_ACE is present, ACE4_DIRECTORY_INHERIT_ACE nor ACE4_FILE_INHERIT_ACE is present,
then an operation attempting to set such an attribute SHOULD fail then an operation attempting to set such an attribute SHOULD fail
with NFS4ERR_ATTRNOTSUPP. with NFS4ERR_ATTRNOTSUPP.
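A server-side check for this condition could look like the sketch below. It assumes that "this flag" is ACE4_INHERIT_ONLY_ACE and that the constant values match the protocol's XDR definitions; check_inherit_flags is a hypothetical helper. Because the requirement is a SHOULD, a server may have reasons to behave differently.

```python
# Flag and status values assumed from the protocol's XDR definitions.
ACE4_FILE_INHERIT_ACE = 0x00000001
ACE4_DIRECTORY_INHERIT_ACE = 0x00000002
ACE4_INHERIT_ONLY_ACE = 0x00000008

NFS4_OK = 0
NFS4ERR_ATTRNOTSUPP = 10032


def check_inherit_flags(ace_flags):
    """Return the status a server might use when asked to store this ACE."""
    inherit_targets = ACE4_FILE_INHERIT_ACE | ACE4_DIRECTORY_INHERIT_ACE
    inherit_only = bool(ace_flags & ACE4_INHERIT_ONLY_ACE)
    has_target = bool(ace_flags & inherit_targets)
    if inherit_only and not has_target:
        # Inherit-only with nothing to inherit into: the ACE could never apply.
        return NFS4ERR_ATTRNOTSUPP
    return NFS4_OK
```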
ACE4_SUCCESSFUL_ACCESS_ACE_FLAG ACE4_SUCCESSFUL_ACCESS_ACE_FLAG and ACE4_FAILED_ACCESS_ACE_FLAG
ACE4_FAILED_ACCESS_ACE_FLAG
The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and The ACE4_SUCCESSFUL_ACCESS_ACE_FLAG (SUCCESS) and
ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits may be set only on ACE4_FAILED_ACCESS_ACE_FLAG (FAILED) flag bits may be set only on
ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE ACE4_SYSTEM_AUDIT_ACE_TYPE (AUDIT) and ACE4_SYSTEM_ALARM_ACE_TYPE
(ALARM) ACE types. If during the processing of the file's ACL, (ALARM) ACE types. If during the processing of the file's ACL,
the server encounters an AUDIT or ALARM ACE that matches the the server encounters an AUDIT or ALARM ACE that matches the
principal attempting the OPEN, the server notes that fact, and the principal attempting the OPEN, the server notes that fact, and the
presence, if any, of the SUCCESS and FAILED flags encountered in presence, if any, of the SUCCESS and FAILED flags encountered in
the AUDIT or ALARM ACE. Once the server completes the ACL the AUDIT or ALARM ACE. Once the server completes the ACL
processing, it then notes if the operation succeeded or failed. processing, it then notes if the operation succeeded or failed.
If the operation succeeded, and if the SUCCESS flag was set for a If the operation succeeded, and if the SUCCESS flag was set for a
skipping to change at line 7573 skipping to change at line 7597
clients should use strong security mechanisms to access the pseudo clients should use strong security mechanisms to access the pseudo
file system in order to prevent man-in-the-middle attacks. file system in order to prevent man-in-the-middle attacks.
8. State Management 8. State Management
Integrating locking into the NFS protocol necessarily causes it to be Integrating locking into the NFS protocol necessarily causes it to be
stateful. With the inclusion of such features as share reservations, stateful. With the inclusion of such features as share reservations,
file and directory delegations, recallable layouts, and support for file and directory delegations, recallable layouts, and support for
mandatory byte-range locking, the protocol becomes substantially more mandatory byte-range locking, the protocol becomes substantially more
dependent on proper management of state than the traditional dependent on proper management of state than the traditional
combination of NFS and NLM (Network Lock Manager) [53]. These combination of NFS and NLM (Network Lock Manager) [54]. These
features include expanded locking facilities, which provide some features include expanded locking facilities, which provide some
measure of inter-client exclusion, but the state also offers features measure of inter-client exclusion, but the state also offers features
not readily providable using a stateless model. There are three not readily providable using a stateless model. There are three
components to making this state manageable: components to making this state manageable:
* clear division between client and server * clear division between client and server
* ability to reliably detect inconsistency in state between client * ability to reliably detect inconsistency in state between client
and server and server
skipping to change at line 8370 skipping to change at line 8394
requests to be processed during the grace period, it MUST determine requests to be processed during the grace period, it MUST determine
that no lock subsequently reclaimed will be rejected and that no lock that no lock subsequently reclaimed will be rejected and that no lock
subsequently reclaimed would have prevented any I/O operation subsequently reclaimed would have prevented any I/O operation
processed during the grace period. processed during the grace period.
Clients should be prepared for the return of NFS4ERR_GRACE errors for Clients should be prepared for the return of NFS4ERR_GRACE errors for
non-reclaim lock and I/O requests. In this case, the client should non-reclaim lock and I/O requests. In this case, the client should
employ a retry mechanism for the request. A delay (on the order of employ a retry mechanism for the request. A delay (on the order of
several seconds) between retries should be used to avoid overwhelming several seconds) between retries should be used to avoid overwhelming
the server. Further discussion of the general issue is included in the server. Further discussion of the general issue is included in
[54]. The client must account for the server that can perform I/O [55]. The client must account for the server that can perform I/O
and non-reclaim locking requests within the grace period as well as and non-reclaim locking requests within the grace period as well as
those that cannot do so. those that cannot do so.
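The retry behavior described above can be sketched as a simple loop. The five-second default delay, the attempt limit, and the send_request callable are assumptions made for illustration; the text only asks for a delay on the order of several seconds.

```python
import time

NFS4_OK = 0
NFS4ERR_GRACE = 10013   # status value assumed from the protocol's XDR definitions


def send_with_grace_retry(send_request, delay_seconds=5.0, max_attempts=20,
                          sleep=time.sleep):
    """Resend a non-reclaim request while the server answers NFS4ERR_GRACE.

    send_request is a hypothetical callable that issues the request and
    returns its status. The last status seen is returned to the caller.
    """
    status = send_request()
    attempts = 1
    while status == NFS4ERR_GRACE and attempts < max_attempts:
        sleep(delay_seconds)             # pause so retries do not flood the server
        status = send_request()
        attempts += 1
    return status
```

The same loop works against a server that processes the request during the grace period, since the first reply is then not NFS4ERR_GRACE and no delay occurs.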
A reclaim-type locking request outside the server's grace period can A reclaim-type locking request outside the server's grace period can
only succeed if the server can guarantee that no conflicting lock or only succeed if the server can guarantee that no conflicting lock or
I/O request has been granted since restart. I/O request has been granted since restart.
A server may, upon restart, establish a new value for the lease A server may, upon restart, establish a new value for the lease
period. Therefore, clients should, once a new client ID is period. Therefore, clients should, once a new client ID is
established, refetch the lease_time attribute and use it as the basis established, refetch the lease_time attribute and use it as the basis
skipping to change at line 9863 skipping to change at line 9887
* The existence of any server-specific semantics of OPEN/CLOSE that * The existence of any server-specific semantics of OPEN/CLOSE that
would make the required handling incompatible with the prescribed would make the required handling incompatible with the prescribed
handling that the delegated client would apply (see below). handling that the delegated client would apply (see below).
There are two types of OPEN delegations: OPEN_DELEGATE_READ and There are two types of OPEN delegations: OPEN_DELEGATE_READ and
OPEN_DELEGATE_WRITE. An OPEN_DELEGATE_READ delegation allows a OPEN_DELEGATE_WRITE. An OPEN_DELEGATE_READ delegation allows a
client to handle, on its own, requests to open a file for reading client to handle, on its own, requests to open a file for reading
that do not deny OPEN4_SHARE_ACCESS_READ access to others. Multiple that do not deny OPEN4_SHARE_ACCESS_READ access to others. Multiple
OPEN_DELEGATE_READ delegations may be outstanding simultaneously and OPEN_DELEGATE_READ delegations may be outstanding simultaneously and
do not conflict. An OPEN_DELEGATE_WRITE delegation allows the client do not conflict. An OPEN_DELEGATE_WRITE delegation allows the client
to handle, on its own, all opens. Only OPEN_DELEGATE_WRITE to handle, on its own, all opens. Only one OPEN_DELEGATE_WRITE
delegation may exist for a given file at a given time, and it is delegation may exist for a given file at a given time, and it is
inconsistent with any OPEN_DELEGATE_READ delegations. inconsistent with any OPEN_DELEGATE_READ delegations.
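The compatibility rules in the preceding paragraph reduce to a small check. The sketch below covers only those rules; a real server also weighs existing opens, callback path health, and local policy before granting anything. The function may_grant and the list-of-strings representation are hypothetical.

```python
READ = "OPEN_DELEGATE_READ"
WRITE = "OPEN_DELEGATE_WRITE"


def may_grant(requested, outstanding):
    """Say whether a new delegation is compatible with those already granted.

    outstanding lists the delegation types currently held on the file.
    """
    if requested == READ:
        # Read delegations coexist with each other but not with a write delegation.
        return WRITE not in outstanding
    if requested == WRITE:
        # A write delegation is exclusive of every other delegation.
        return len(outstanding) == 0
    raise ValueError(f"unknown delegation type: {requested!r}")
```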
When a client has an OPEN_DELEGATE_READ delegation, it is assured When a client has an OPEN_DELEGATE_READ delegation, it is assured
that neither the contents, the attributes (with the exception of that neither the contents, the attributes (with the exception of
time_access), nor the names of any links to the file will change time_access), nor the names of any links to the file will change
without its knowledge, so long as the delegation is held. When a without its knowledge, so long as the delegation is held. When a
client has an OPEN_DELEGATE_WRITE delegation, it may modify the file client has an OPEN_DELEGATE_WRITE delegation, it may modify the file
data locally since no other client will be accessing the file's data. data locally since no other client will be accessing the file's data.
The client holding an OPEN_DELEGATE_WRITE delegation may only locally The client holding an OPEN_DELEGATE_WRITE delegation may only locally
skipping to change at line 10212 skipping to change at line 10236
no previous CLOSE operation has been sent to the server, a CLOSE no previous CLOSE operation has been sent to the server, a CLOSE
operation must be sent to the server. operation must be sent to the server.
* If a file has other open references at the client, then OPEN * If a file has other open references at the client, then OPEN
operations must be sent to the server. The appropriate stateids operations must be sent to the server. The appropriate stateids
will be provided by the server for subsequent use by the client will be provided by the server for subsequent use by the client
since the delegation stateid will no longer be valid. These OPEN since the delegation stateid will no longer be valid. These OPEN
requests are done with the claim type of CLAIM_DELEGATE_CUR. This requests are done with the claim type of CLAIM_DELEGATE_CUR. This
will allow the presentation of the delegation stateid so that the will allow the presentation of the delegation stateid so that the
client can establish the appropriate rights to perform the OPEN. client can establish the appropriate rights to perform the OPEN.
(see Section 18.16, which describes the OPEN operation, for (See Section 18.16, which describes the OPEN operation, for
details.) details.)
* If there are granted byte-range locks, the corresponding LOCK * If there are granted byte-range locks, the corresponding LOCK
operations need to be performed. This applies to the operations need to be performed. This applies to the
OPEN_DELEGATE_WRITE delegation case only. OPEN_DELEGATE_WRITE delegation case only.
* For an OPEN_DELEGATE_WRITE delegation, if at the time of recall * For an OPEN_DELEGATE_WRITE delegation, if at the time of recall
the file is not open for OPEN4_SHARE_ACCESS_WRITE/ the file is not open for OPEN4_SHARE_ACCESS_WRITE/
OPEN4_SHARE_ACCESS_BOTH, all modified data for the file must be OPEN4_SHARE_ACCESS_BOTH, all modified data for the file must be
flushed to the server. If the delegation had not existed, the flushed to the server. If the delegation had not existed, the
skipping to change at line 10903 skipping to change at line 10927
no provision is made for reclaiming directory delegations in the no provision is made for reclaiming directory delegations in the
event of client or server restart. The client can simply establish a event of client or server restart. The client can simply establish a
directory delegation in the same fashion as was done initially. directory delegation in the same fashion as was done initially.
11. Multi-Server Namespace 11. Multi-Server Namespace
NFSv4.1 supports attributes that allow a namespace to extend beyond NFSv4.1 supports attributes that allow a namespace to extend beyond
the boundaries of a single server. It is desirable that clients and the boundaries of a single server. It is desirable that clients and
servers support construction of such multi-server namespaces. Use of servers support construction of such multi-server namespaces. Use of
such multi-server namespaces is OPTIONAL, however, and for many such multi-server namespaces is OPTIONAL, however, and for many
purposes, single-server namespaces are perfectly acceptable. Use of purposes, single-server namespaces are perfectly acceptable. The use
multi-server namespaces can provide many advantages, by separating a of multi-server namespaces can provide many advantages by separating
file system's logical position in a namespace from the (possibly a file system's logical position in a namespace from the (possibly
changing) logistical and administrative considerations that result in changing) logistical and administrative considerations that cause a
particular file systems being located on particular servers via a particular file system to be located on a particular server via a
single network access path known in advance or determined using DNS. single network access path that has to be known in advance or
determined using DNS.
11.1. Terminology 11.1. Terminology
In this section as a whole (i.e., within all of Section 11), the In this section as a whole (i.e., within all of Section 11), the
phrase "client ID" always refers to the 64-bit shorthand identifier phrase "client ID" always refers to the 64-bit shorthand identifier
assigned by the server (a clientid4) and never to the structure that assigned by the server (a clientid4) and never to the structure that
the client uses to identify itself to the server (called an the client uses to identify itself to the server (called an
nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1, respectively). nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1, respectively).
The opaque identifier within those structures is referred to as a The opaque identifier within those structures is referred to as a
"client id string". "client id string".
skipping to change at line 10948 skipping to change at line 10973
trunkable" and "session-trunkable". trunkable" and "session-trunkable".
* Trunking discovery is a process by which a client using one * Trunking discovery is a process by which a client using one
network address can obtain other addresses that are connected to network address can obtain other addresses that are connected to
the same server. Typically, it builds on a trunking detection the same server. Typically, it builds on a trunking detection
facility by providing one or more methods by which candidate facility by providing one or more methods by which candidate
addresses are made available to the client, who can then use addresses are made available to the client, who can then use
trunking detection to appropriately filter them. trunking detection to appropriately filter them.
Despite the support for trunking detection, there was no Despite the support for trunking detection, there was no
description of trunking discovery provided in RFC 5661 [65], description of trunking discovery provided in RFC 5661 [66],
making it necessary to provide those means in this document. making it necessary to provide those means in this document.
The combination of a server network address and a particular The combination of a server network address and a particular
connection type to be used by a connection is referred to as a connection type to be used by a connection is referred to as a
"server endpoint". Although using different connection types may "server endpoint". Although using different connection types may
result in different ports being used, the use of different ports by result in different ports being used, the use of different ports by
multiple connections to the same network address in such cases is not multiple connections to the same network address in such cases is not
the essence of the distinction between the two endpoints used. This the essence of the distinction between the two endpoints used. This
is in contrast to the case of port-specific endpoints, in which the is in contrast to the case of port-specific endpoints, in which the
explicit specification of port numbers within network addresses is explicit specification of port numbers within network addresses is
skipping to change at line 11078 skipping to change at line 11103
able to use client ID trunking, but will only be able to use able to use client ID trunking, but will only be able to use
session trunking if the paths are also session-trunkable. session trunking if the paths are also session-trunkable.
* Two file system location elements are said to be session-trunkable * Two file system location elements are said to be session-trunkable
if they specify the same fs name and the location addresses are if they specify the same fs name and the location addresses are
such that the location addresses are session-trunkable. When the such that the location addresses are session-trunkable. When the
corresponding network paths are used, the client will be able to corresponding network paths are used, the client will be able to
use either client ID trunking or session trunking. use either client ID trunking or session trunking.
Discussion of the term "replica" is complicated by the fact that the Discussion of the term "replica" is complicated by the fact that the
term was used in RFC 5661 [65] with a meaning different from that term was used in RFC 5661 [66] with a meaning different from that
used in this document. In short, in [65] each replica is identified used in this document. In short, in [66] each replica is identified
by a single network access path, while in the current document, a set by a single network access path, while in the current document, a set
of network access paths that have server-trunkable network addresses of network access paths that have server-trunkable network addresses
and the same root-relative file system pathname is considered to be a and the same root-relative file system pathname is considered to be a
single replica with multiple network access paths. single replica with multiple network access paths.
Each set of server-trunkable location elements defines a set of Each set of server-trunkable location elements defines a set of
available network access paths to a particular file system. When available network access paths to a particular file system. When
there are multiple such file systems, each of which contains the there are multiple such file systems, each of which contains the
same data, these file systems are considered replicas of one another. same data, these file systems are considered replicas of one another.
Logically, such replication is symmetric, since the fs currently in Logically, such replication is symmetric, since the fs currently in
skipping to change at line 11124 skipping to change at line 11149
(e.g., priority for use, writability, currency, etc.). (e.g., priority for use, writability, currency, etc.).
* Help the client efficiently effect as seamless a transition as * Help the client efficiently effect as seamless a transition as
possible among multiple file system instances, when and if that possible among multiple file system instances, when and if that
should be necessary. should be necessary.
* Guide the selection of the appropriate connection type to be used * Guide the selection of the appropriate connection type to be used
when establishing a connection. when establishing a connection.
Within the fs_locations_info attribute, each fs_locations_server4 Within the fs_locations_info attribute, each fs_locations_server4
entry corresponds to a file system location entry with the fls_server entry corresponds to a file system location entry: the fls_server
field designating the server and with the location pathname within field designates the server, and the fl_rootpath field of the
the server's pseudo-fs given by the fl_rootpath field of the encompassing fs_locations_item4 gives the location pathname within
encompassing fs_locations_item4. the server's pseudo-fs.
The fs_locations attribute defined in NFSv4.0 is also a part of The fs_locations attribute defined in NFSv4.0 is also a part of
NFSv4.1. This attribute only allows specification of the file system NFSv4.1. This attribute only allows specification of the file system
locations where the data corresponding to a given file system may be locations where the data corresponding to a given file system may be
found. Servers SHOULD make this attribute available whenever found. Servers SHOULD make this attribute available whenever
fs_locations_info is supported, but client use of fs_locations_info fs_locations_info is supported, but client use of fs_locations_info
is preferable because it provides more information. is preferable because it provides more information.
Within the fs_locations attribute, each fs_location4 contains a file Within the fs_locations attribute, each fs_location4 contains a file
system location entry with the server field designating the server system location entry with the server field designating the server
skipping to change at line 11394 skipping to change at line 11419
* The client may fetch the file system location attribute for the * The client may fetch the file system location attribute for the
file system. This will provide either the name of the server file system. This will provide either the name of the server
(which can be turned into a set of network addresses using DNS) or (which can be turned into a set of network addresses using DNS) or
a set of server-trunkable location entries. Using the latter a set of server-trunkable location entries. Using the latter
alternative, the server can provide addresses it regards as alternative, the server can provide addresses it regards as
desirable to use to access the file system in question. Although desirable to use to access the file system in question. Although
these entries can contain port numbers, these port numbers are not these entries can contain port numbers, these port numbers are not
used in determining trunking relationships. Once the candidate used in determining trunking relationships. Once the candidate
addresses have been determined and EXCHANGE_ID done to the proper addresses have been determined and EXCHANGE_ID done to the proper
server, only the value of the so_major field returned by the server, only the value of the so_major_id field returned by the
servers in question determines whether a trunking relationship servers in question determines whether a trunking relationship
actually exists. actually exists.
When the client fetches a location attribute for a file system, it When the client fetches a location attribute for a file system, it
should be noted that the client may encounter multiple entries for a should be noted that the client may encounter multiple entries for a
number of reasons, such that when it determines trunking information, number of reasons, such that when it determines trunking information,
it may have to bypass addresses not trunkable with one already known. it may need to bypass addresses not trunkable with one already known.
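The filtering step described above might be sketched as follows. This is a minimal illustration, not a definitive implementation. The ServerOwner class, the filter_server_trunkable function, and the dictionary of candidates are hypothetical names introduced here. The sketch compares only so_major_id, as the passage describes, and assumes EXCHANGE_ID has already been performed against each candidate address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOwner:
    """Subset of the eir_server_owner value returned by EXCHANGE_ID."""
    so_major_id: bytes
    so_minor_id: int


def filter_server_trunkable(known, candidates):
    """Return the candidate addresses whose so_major_id matches `known`.

    `candidates` maps an address string to the ServerOwner obtained by
    sending EXCHANGE_ID to that address (hypothetical input shape).
    Addresses with a different so_major_id are bypassed.
    """
    return [
        addr
        for addr, owner in candidates.items()
        if owner.so_major_id == known.so_major_id
    ]
```

A client could call filter_server_trunkable with the owner of an address already in use and keep only the returned addresses as trunking candidates. Port numbers play no part in the comparison, consistent with the text above.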
The server can provide location entries that include either names or The server can provide location entries that include either names or
network addresses. It might use the latter form because of DNS- network addresses. It might use the latter form because of DNS-
related security concerns or because the set of addresses to be used related security concerns or because the set of addresses to be used
might require active management by the server. might require active management by the server.
Location entries used to discover candidate addresses for use in Location entries used to discover candidate addresses for use in
trunking are subject to change, as discussed in Section 11.5.7 below. trunking are subject to change, as discussed in Section 11.5.7 below.
The client may respond to such changes by using additional addresses The client may respond to such changes by using additional addresses
once they are verified or by ceasing to use existing ones. The once they are verified or by ceasing to use existing ones. The
skipping to change at line 11435 skipping to change at line 11460
may have to choose a connection type with no possibility of changing may have to choose a connection type with no possibility of changing
it within the scope of a single connection. it within the scope of a single connection.
The two file system location attributes differ as to the information The two file system location attributes differ as to the information
made available in this regard. The fs_locations attribute provides made available in this regard. The fs_locations attribute provides
no information to support connection type selection. As a result, no information to support connection type selection. As a result,
clients supporting multiple connection types would need to attempt to clients supporting multiple connection types would need to attempt to
establish connections using multiple connection types until the one establish connections using multiple connection types until the one
preferred by the client is successfully established. preferred by the client is successfully established.
The fs_locations_info attribute includes a flag, FSLI4TF_RDMA, which, The fs_locations_info attribute includes the FSLI4TF_RDMA flag, which
when set indicates that RPC-over-RDMA support is available using the is convenient for a client wishing to use RDMA. When this flag is
specified location entry, by "stepping up" an existing TCP connection set, it indicates that RPC-over-RDMA support is available using the
to include support for RDMA operation. This flag makes it convenient specified location entry. A client can establish a TCP connection
for a client wishing to use RDMA. When this flag is set, it can and then convert that connection to use RDMA by using the step-up
establish a TCP connection and then convert that connection to use facility.
RDMA by using the step-up facility.
Irrespective of the particular attribute used, when there is no Irrespective of the particular attribute used, when there is no
indication that a step-up operation can be performed, a client indication that a step-up operation can be performed, a client
supporting RDMA operation can establish a new RDMA connection, and it supporting RDMA operation can establish a new RDMA connection, and it
can be bound to the session already established by the TCP can be bound to the session already established by the TCP
connection, allowing the TCP connection to be dropped and the session connection, allowing the TCP connection to be dropped and the session
converted to further use in RDMA mode, if the server supports that. converted to further use in RDMA mode, if the server supports that.
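The connection type choices described in the preceding paragraphs can be summarized in a short sketch. The function name and the step labels are hypothetical and serve only to name the actions in the text. The bit value shown for FSLI4TF_RDMA is an assumption that should be confirmed against the protocol's XDR definition.

```python
FSLI4TF_RDMA = 0x01  # assumed bit value; confirm against the XDR definition


def plan_connection(client_supports_rdma, tflags=None):
    """Return an ordered list of (hypothetical) steps for one location entry.

    `tflags` is the transport-flags byte taken from fs_locations_info, or
    None when only fs_locations is available, since that attribute carries
    no information to support connection type selection.
    """
    if not client_supports_rdma:
        return ["connect_tcp"]
    if tflags is not None and tflags & FSLI4TF_RDMA:
        # The flag indicates that an existing TCP connection can be stepped up.
        return ["connect_tcp", "step_up_to_rdma"]
    # No step-up indication: establish a new RDMA connection, bind it to the
    # session created over TCP, then drop TCP (if the server supports that).
    return ["connect_tcp", "connect_rdma", "bind_rdma_to_session", "drop_tcp"]
```

The last branch covers both the fs_locations case and the case in which fs_locations_info is present but the flag is clear, since the text treats them alike.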
11.5.4. File System Replication 11.5.4. File System Replication
skipping to change at line 11479 skipping to change at line 11503
How the difference between replicas affects file system transitions How the difference between replicas affects file system transitions
can be represented within the fs_locations and fs_locations_info can be represented within the fs_locations and fs_locations_info
attributes, and how the client deals with file system transition attributes, and how the client deals with file system transition
issues will be discussed in detail in later sections. issues will be discussed in detail in later sections.
Although the location attributes provide some information about the Although the location attributes provide some information about the
nature of the inter-replica transition, many aspects of the semantics nature of the inter-replica transition, many aspects of the semantics
of possible asynchronous updates are not currently described by the of possible asynchronous updates are not currently described by the
protocol, which makes it necessary for clients using replication to protocol, which makes it necessary for clients using replication to
switch among replicas undergoing change to familiarize themselves switch among replicas undergoing change to familiarize themselves
with the semantics of the update approach used. Because of this lack with the semantics of the update approach used. Due to this lack of
of specificity, many applications may find the use of migration more specificity, many applications may find the use of migration more
appropriate, since, in that case, the server, when effecting the appropriate because a server can propagate all updates made before an
transition, has established a point in time such that all updates established point in time to the new replica as part of the migration
made before that can propagated to the new replica as part of the event.
migration event.
11.5.4.1. File System Trunking Presented as Replication 11.5.4.1. File System Trunking Presented as Replication
In some situations, a file system location entry may indicate a file In some situations, a file system location entry may indicate a file
system access path to be used as an alternate location, where system access path to be used as an alternate location, where
trunking, rather than replication, is to be used. The situations in trunking, rather than replication, is to be used. The situations in
which this is appropriate are limited to those in which both of the which this is appropriate are limited to those in which both of the
following are true: following are true:
* The two file system locations (i.e., the one on which the location * The two file system locations (i.e., the one on which the location
skipping to change at line 11533 skipping to change at line 11556
system location entries with different handle, fileid, write- system location entries with different handle, fileid, write-
verifier, change, and readdir classes, indicates a serious verifier, change, and readdir classes, indicates a serious
problem. The client, if it allows transition to the file system problem. The client, if it allows transition to the file system
instance at all, must not treat any transition as a transparent instance at all, must not treat any transition as a transparent
one. The server SHOULD NOT indicate that these two entries (for one. The server SHOULD NOT indicate that these two entries (for
the same file system on the same server) belong to different the same file system on the same server) belong to different
handle, fileid, write-verifier, change, and readdir classes, handle, fileid, write-verifier, change, and readdir classes,
whether or not the two entries are shown belonging to the same whether or not the two entries are shown belonging to the same
simultaneous-use class. simultaneous-use class.
These situations were recognized by [65], even though that document These situations were recognized by [66], even though that document
made no explicit mention of trunking: made no explicit mention of trunking:
* It treated the situation that we describe as trunking as one of * It treated the situation that we describe as trunking as one of
simultaneous use of two distinct file system instances, even simultaneous use of two distinct file system instances, even
though, in the explanatory framework now used to describe the though, in the explanatory framework now used to describe the
situation, the case is one in which a single file system is situation, the case is one in which a single file system is
accessed by two different trunked addresses. accessed by two different trunked addresses.
* It treated the situation in which two paths are to be used * It treated the situation in which two paths are to be used
serially as a special sort of "transparent transition". However, serially as a special sort of "transparent transition". However,
skipping to change at line 11657 skipping to change at line 11680
located on one server with a file system located on another server. located on one server with a file system located on another server.
When this includes the use of pure referrals, servers are provided a When this includes the use of pure referrals, servers are provided a
way of placing a file system in a location within the namespace way of placing a file system in a location within the namespace
essentially without respect to its physical location on a particular essentially without respect to its physical location on a particular
server. This allows a single server or a set of servers to present a server. This allows a single server or a set of servers to present a
multi-server namespace that encompasses file systems located on a multi-server namespace that encompasses file systems located on a
wider range of servers. Some likely uses of this facility include wider range of servers. Some likely uses of this facility include
establishment of site-wide or organization-wide namespaces, with the establishment of site-wide or organization-wide namespaces, with the
eventual possibility of combining such together into a truly global eventual possibility of combining such together into a truly global
namespace, such as the one provided by AFS (the Andrew File System) namespace, such as the one provided by AFS (the Andrew File System)
[64]. [65].
Referrals occur when a client determines, upon first referencing a Referrals occur when a client determines, upon first referencing a
position in the current namespace, that it is part of a new file position in the current namespace, that it is part of a new file
system and that the file system is absent. When this occurs, system and that the file system is absent. When this occurs,
typically upon receiving the error NFS4ERR_MOVED, the actual location typically upon receiving the error NFS4ERR_MOVED, the actual location
or locations of the file system can be determined by fetching a or locations of the file system can be determined by fetching a
locations attribute. locations attribute.
The file system location attribute may designate a single file system The file system location attribute may designate a single file system
location or multiple file system locations, to be selected based on location or multiple file system locations, to be selected based on
skipping to change at line 11777 skipping to change at line 11800
11.6. Trunking without File System Location Information 11.6. Trunking without File System Location Information
In situations in which a file system is accessed using two server- In situations in which a file system is accessed using two server-
trunkable addresses (as indicated by the same value of the trunkable addresses (as indicated by the same value of the
so_major_id field of the eir_server_owner field returned in response so_major_id field of the eir_server_owner field returned in response
to EXCHANGE_ID), trunked access is allowed even though there might to EXCHANGE_ID), trunked access is allowed even though there might
not be any location entries specifically indicating the use of not be any location entries specifically indicating the use of
trunking for that file system. trunking for that file system.
This situation was recognized by [65], although that document made no This situation was recognized by [66], although that document made no
explicit mention of trunking and treated the situation as one of explicit mention of trunking and treated the situation as one of
simultaneous use of two distinct file system instances. In the simultaneous use of two distinct file system instances. In the
explanatory framework now used to describe the situation, the case is explanatory framework now used to describe the situation, the case is
one in which a single file system is accessed by two different one in which a single file system is accessed by two different
trunked addresses. trunked addresses.
11.7. Users and Groups in a Multi-Server Namespace 11.7. Users and Groups in a Multi-Server Namespace
As in the case of a single-server environment (see Section 5.9), when As in the case of a single-server environment (see Section 5.9), when
an owner or group name of the form "id@domain" is assigned to a file, an owner or group name of the form "id@domain" is assigned to a file,
skipping to change at line 11896 skipping to change at line 11919
How these are dealt with is discussed in Section 11.11. How these are dealt with is discussed in Section 11.11.
* Those in which access to the current file system instance is * Those in which access to the current file system instance is
retained, while the network path used to access that instance is retained, while the network path used to access that instance is
changed. This case is discussed in Section 11.10. changed. This case is discussed in Section 11.10.
11.10. Effecting Network Endpoint Transitions 11.10. Effecting Network Endpoint Transitions
The endpoints used to access a particular file system instance may The endpoints used to access a particular file system instance may
change in a number of ways, as listed below. In each of these cases, change in a number of ways, as listed below. In each of these cases,
the same fsid, filehandles, stateids, client IDs, and are used to the same fsid, client IDs, filehandles, and stateids are used to
continue access, with a continuity of lock state. In many cases, the continue access, with a continuity of lock state. In many cases, the
same sessions can also be used. same sessions can also be used.
The appropriate action depends on the set of replacement addresses The appropriate action depends on the set of replacement addresses
that are available for use (i.e., server endpoints that are server- that are available for use (i.e., server endpoints that are server-
trunkable with one previously being used). trunkable with one previously being used).
* When use of a particular address is to cease, and there is also * When use of a particular address is to cease, and there is also
another address currently in use that is server-trunkable with it, another address currently in use that is server-trunkable with it,
requests that would have been issued on the address whose use is requests that would have been issued on the address whose use is
skipping to change at line 11921 skipping to change at line 11944
will be used. will be used.
* When use of a particular connection is to cease, as indicated by * When use of a particular connection is to cease, as indicated by
receiving NFS4ERR_MOVED when using that connection, but that receiving NFS4ERR_MOVED when using that connection, but that
address is still indicated as accessible according to the address is still indicated as accessible according to the
appropriate file system location entries, it is likely that appropriate file system location entries, it is likely that
requests can be issued on a new connection of a different requests can be issued on a new connection of a different
connection type once that connection is established. Since any connection type once that connection is established. Since any
two non-port-specific server endpoints that share a network two non-port-specific server endpoints that share a network
address are inherently session-trunkable, the client can use address are inherently session-trunkable, the client can use
BIND_CONN_TO_SESSION to access the existing session using the new BIND_CONN_TO_SESSION to access the existing session with the new
connection and proceed to access the file system using the new
connection. connection.
* When there are no potential replacement addresses in use, but * When there are no potential replacement addresses in use, but
there are valid addresses session-trunkable with the one whose use there are valid addresses session-trunkable with the one whose use
is to be discontinued, the client can use BIND_CONN_TO_SESSION to is to be discontinued, the client can use BIND_CONN_TO_SESSION to
access the existing session using the new address. Although the access the existing session using the new address. Although the
target session will generally be accessible, there may be rare target session will generally be accessible, there may be rare
situations in which that session is no longer accessible when an situations in which that session is no longer accessible when an
attempt is made to bind the new connection to it. In this case, attempt is made to bind the new connection to it. In this case,
the client can create a new session to enable continued access to the client can create a new session to enable continued access to
skipping to change at line 12440 skipping to change at line 12462
11.12. Transferring State upon Migration 11.12. Transferring State upon Migration
When the transition is a result of a server-initiated decision to When the transition is a result of a server-initiated decision to
transition access, and the source and destination servers have transition access, and the source and destination servers have
implemented appropriate cooperation, it is possible to do the implemented appropriate cooperation, it is possible to do the
following: following:
* Transfer locking state from the source to the destination server * Transfer locking state from the source to the destination server
in a fashion similar to that provided by Transparent State in a fashion similar to that provided by Transparent State
Migration in NFSv4.0, as described in [68]. Server Migration in NFSv4.0, as described in [69]. Server
responsibilities are described in Section 11.14.2. responsibilities are described in Section 11.14.2.
* Transfer session state from the source to the destination server. * Transfer session state from the source to the destination server.
Server responsibilities in effecting such a transfer are described Server responsibilities in effecting such a transfer are described
in Section 11.14.3. in Section 11.14.3.
The means by which the client determines which of these transfer The means by which the client determines which of these transfer
events has occurred are described in Section 11.13. events has occurred are described in Section 11.13.
11.12.1. Transparent State Migration and pNFS 11.12.1. Transparent State Migration and pNFS
skipping to change at line 12556 skipping to change at line 12578
interrogating a file system location attribute. This enables a interrogating a file system location attribute. This enables a
client to determine a new replica's location or a new network client to determine a new replica's location or a new network
access path. access path.
This condition continues on subsequent attempts to access the file This condition continues on subsequent attempts to access the file
system in question. The only way the client can avoid the error system in question. The only way the client can avoid the error
is to cease accessing the file system in question at its old is to cease accessing the file system in question at its old
server location and access it instead using a different address at server location and access it instead using a different address at
which it is now available. which it is now available.
* Whenever a SEQUENCE operation is sent by a client to a server * Whenever a client sends a SEQUENCE operation to a server that
which generated state held on that client which is associated with generated state held on that client and associated with a file
a file system that is no longer accessible on the server at which system no longer accessible on that server, the response will
it was previously available, the response will contain a lease- contain the status bit SEQ4_STATUS_LEASE_MOVED, indicating that
migrated indication, with the SEQ4_STATUS_LEASE_MOVED status bit there has been a lease migration.
being set.
This condition continues until the client acknowledges the This condition continues until the client acknowledges the
notification by fetching a file system location attribute for the notification by fetching a file system location attribute for the
file system whose network access path is being changed. When file system whose network access path is being changed. When
there are multiple such file systems, a location attribute for there are multiple such file systems, a location attribute for
each such file system needs to be fetched. The location attribute each such file system needs to be fetched. The location attribute
for all migrated file systems needs to be fetched in order to for all migrated file systems needs to be fetched in order to
clear the condition. Even after the condition is cleared, the clear the condition. Even after the condition is cleared, the
client needs to respond by using the location information to client needs to respond by using the location information to
access the file system at its new location to ensure that leases access the file system at its new location to ensure that leases
skipping to change at line 12683 skipping to change at line 12704
the migration discovery process would deal with those indications. the migration discovery process would deal with those indications.
See below for details. See below for details.
* For such indications received in all other contexts, the * For such indications received in all other contexts, the
appropriate response is to initiate or otherwise provide for the appropriate response is to initiate or otherwise provide for the
execution of migration discovery for file systems associated with execution of migration discovery for file systems associated with
the server IP address returning the indication. the server IP address returning the indication.
This leaves a potential difficulty in situations in which the This leaves a potential difficulty in situations in which the
migration discovery process is near to completion but is still migration discovery process is near to completion but is still
operating. One should not ignore a LEASE_MOVED indication if the operating. One should not ignore a SEQ4_STATUS_LEASE_MOVED
migration discovery process is not able to respond to the discovery indication if the migration discovery process is not able to respond
of additional migrating file systems without additional aid. A to the discovery of additional migrating file systems without
further complexity relevant in addressing such situations is that a additional aid. A further complexity relevant in addressing such
lease-migrated indication may reflect the server's state at the time situations is that a lease-migrated indication may reflect the
the SEQUENCE operation was processed, which may be different from server's state at the time the SEQUENCE operation was processed,
that in effect at the time the response is received. Because new which may be different from that in effect at the time the response
migration events may occur at any time, and because a LEASE_MOVED is received. Because new migration events may occur at any time, and
indication may reflect the situation in effect a considerable time because a SEQ4_STATUS_LEASE_MOVED indication may reflect the
before the indication is received, special care needs to be taken to situation in effect a considerable time before the indication is
ensure that LEASE_MOVED indications are not inappropriately ignored. received, special care needs to be taken to ensure that
SEQ4_STATUS_LEASE_MOVED indications are not inappropriately ignored.
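One way to avoid ignoring such indications is to record every one that arrives while discovery is already running. The following is a minimal sketch under that assumption. The LeaseMovedTracker class and its fields are hypothetical, and the rule for when a recorded indication triggers a rescan is left to the discovery process itself.

```python
SEQ4_STATUS_LEASE_MOVED = 0x00000080  # status bit in sr_status_flags


class LeaseMovedTracker:
    """Records lease-migrated indications so that none is silently dropped."""

    def __init__(self):
        self.discovery_running = False
        self.pending = False  # an indication arrived while discovery ran

    def on_sequence_reply(self, sr_status_flags):
        """Return True if the caller should start migration discovery."""
        if not sr_status_flags & SEQ4_STATUS_LEASE_MOVED:
            return False
        if self.discovery_running:
            # Remember the indication; it may reflect a new migration event
            # that the running discovery pass has not yet seen.
            self.pending = True
            return False
        self.discovery_running = True
        return True
```

Because the indication reflects server state at the time the SEQUENCE operation was processed, the sketch records it rather than trying to judge its relevance.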
A useful approach to this issue involves the use of separate A useful approach to this issue involves the use of separate
externally-visible migration discovery states for each server. externally-visible migration discovery states for each server.
Separate values could represent the various possible states for the Separate values could represent the various possible states for the
migration discovery process for a server: migration discovery process for a server:
* Non-operation, in which migration discovery is not being * Non-operation, in which migration discovery is not being
performed. performed.
* Normal operation, in which there is an ongoing scan for migrated * Normal operation, in which there is an ongoing scan for migrated
skipping to change at line 12728 skipping to change at line 12750
* If the fs_status attribute indicates that the file system is a * If the fs_status attribute indicates that the file system is a
migrated one (i.e., fss_absent is true, and fss_type != migrated one (i.e., fss_absent is true, and fss_type !=
STATUS4_REFERRAL), then a migrated file system has been found. In STATUS4_REFERRAL), then a migrated file system has been found. In
this situation, it is likely that the fetch of the file system this situation, it is likely that the fetch of the file system
location attribute has cleared one of the file systems location attribute has cleared one of the file systems
contributing to the lease-migrated indication. contributing to the lease-migrated indication.
* In cases in which that happened, the thread cannot know whether * In cases in which that happened, the thread cannot know whether
the lease-migrated indication has been cleared, and so it enters the lease-migrated indication has been cleared, and so it enters
the completion/verification state and proceeds to issue a COMPOUND the completion/verification state and proceeds to issue a COMPOUND
to see if the LEASE_MOVED indication has been cleared. to see if the SEQ4_STATUS_LEASE_MOVED indication has been cleared.
* When the discovery process is in the completion/verification * When the discovery process is in the completion/verification
state, if other requests get a lease-migrated indication, they state, if other requests get a lease-migrated indication, they
note that it was received. Later, the existence of such note that it was received. Later, the existence of such
indications is used when the request completes, as described indications is used when the request completes, as described
below. below.
When the request used in the completion/verification state completes: When the request used in the completion/verification state completes:
* If a lease-migrated indication is returned, the discovery * If a lease-migrated indication is returned, the discovery
skipping to change at line 12756 skipping to change at line 12778
discovery process remains in the completion/verification state. discovery process remains in the completion/verification state.
* If there have been no lease-migrated indications, the work of * If there have been no lease-migrated indications, the work of
migration discovery is considered completed, and it enters the migration discovery is considered completed, and it enters the
non-operating state. Once it enters this state, subsequent lease- non-operating state. Once it enters this state, subsequent lease-
migrated indications will trigger a new migration discovery migrated indications will trigger a new migration discovery
process. process.
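
The discovery states described above can be pictured as a small per-server state machine. The following is a minimal sketch, not a definitive implementation: the class and method names are illustrative and do not appear in the specification, and the transition taken when the verification request itself returns a lease-migrated indication is an assumption (the sketch simply resumes scanning).

```python
from enum import Enum, auto


class DiscoveryState(Enum):
    NON_OPERATING = auto()  # migration discovery is not being performed
    NORMAL = auto()         # ongoing scan for migrated file systems
    VERIFYING = auto()      # completion/verification state


class MigrationDiscovery:
    """Per-server migration discovery state (names are illustrative)."""

    def __init__(self):
        self.state = DiscoveryState.NON_OPERATING
        self.seen_while_verifying = False

    def on_lease_moved(self):
        """Some request saw SEQ4_STATUS_LEASE_MOVED in a SEQUENCE reply."""
        if self.state is DiscoveryState.NON_OPERATING:
            # A new indication triggers a new discovery process.
            self.state = DiscoveryState.NORMAL
        elif self.state is DiscoveryState.VERIFYING:
            # Only note that it was received; it is used later.
            self.seen_while_verifying = True

    def on_migrated_fs_found(self):
        """The scan found a migrated file system; go check the indication."""
        if self.state is DiscoveryState.NORMAL:
            self.state = DiscoveryState.VERIFYING
            self.seen_while_verifying = False

    def on_verification_reply(self, lease_moved):
        """The COMPOUND issued in the verification state has completed."""
        if self.state is not DiscoveryState.VERIFYING:
            return
        if lease_moved:
            # Assumption: more migrations remain, so resume scanning.
            self.state = DiscoveryState.NORMAL
        elif self.seen_while_verifying:
            # Assumption: clear the record and retry the verification;
            # the process remains in the verification state.
            self.seen_while_verifying = False
        else:
            # No indications at all: discovery is complete.
            self.state = DiscoveryState.NON_OPERATING
```

Keeping the state per server means a late lease-migrated indication is never silently dropped: it either starts a new scan or is recorded for the pending verification.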
It should be noted that the process described above is not guaranteed It should be noted that the process described above is not guaranteed
to terminate, as a long series of new migration events might to terminate, as a long series of new migration events might
continually delay the clearing of the LEASE_MOVED indication. To continually delay the clearing of the SEQ4_STATUS_LEASE_MOVED
prevent unnecessary lease expiration, it is appropriate for clients indication. To prevent unnecessary lease expiration, it is
to use the discovery of migrations to effect lease renewal appropriate for clients to use the discovery of migrations to effect
immediately, rather than waiting for the clearing of the LEASE_MOVED lease renewal immediately, rather than waiting for the clearing of
indication when the complete set of migrations is available. the SEQ4_STATUS_LEASE_MOVED indication when the complete set of
migrations is available.
Lease discovery needs to be provided as described above. This Lease discovery needs to be provided as described above. This
ensures that the client discovers file system migrations soon enough ensures that the client discovers file system migrations soon enough
to renew its leases on each destination server before they expire. to renew its leases on each destination server before they expire.
Non-renewal of leases can lead to loss of locking state. While the Non-renewal of leases can lead to loss of locking state. While the
consequences of such loss can be ameliorated through implementations consequences of such loss can be ameliorated through implementations
of courtesy locks, servers are under no obligation to do so, and a of courtesy locks, servers are under no obligation to do so, and a
conflicting lock request may mean that a lock is revoked conflicting lock request may mean that a lock is revoked
unexpectedly. Clients should be aware of this possibility. unexpectedly. Clients should be aware of this possibility.
skipping to change at line 12801 skipping to change at line 12824
During the first phase of this process, the client proceeds to During the first phase of this process, the client proceeds to
examine file system location entries to find the initial network examine file system location entries to find the initial network
address it will use to continue access to the file system or its address it will use to continue access to the file system or its
replacement. For each location entry that the client examines, the replacement. For each location entry that the client examines, the
process consists of five steps: process consists of five steps:
1. Performing an EXCHANGE_ID directed at the location address. This 1. Performing an EXCHANGE_ID directed at the location address. This
operation is used to register the client owner (in the form of a operation is used to register the client owner (in the form of a
client_owner4) with the server, to obtain a client ID to be used client_owner4) with the server, to obtain a client ID to be used
subsequently to communicate with it, to obtain that client ID's subsequently to communicate with it, to obtain that client ID's
confirmation status, and to determine server_owner and scope for confirmation status, and to determine server_owner4 and scope for
the purpose of determining if the entry is trunkable with the the purpose of determining if the entry is trunkable with the
address previously being used to access the file system (i.e., address previously being used to access the file system (i.e.,
that it represents another network access path to the same file that it represents another network access path to the same file
system and can share locking state with it). system and can share locking state with it).
2. Making an initial determination of whether migration has 2. Making an initial determination of whether migration has
occurred. The initial determination will be based on whether the occurred. The initial determination will be based on whether the
EXCHANGE_ID results indicate that the current location element is EXCHANGE_ID results indicate that the current location element is
server-trunkable with that used to access the file system when server-trunkable with that used to access the file system when
access was terminated by receiving NFS4ERR_MOVED. If it is, then access was terminated by receiving NFS4ERR_MOVED. If it is, then
skipping to change at line 12827 skipping to change at line 12850
3. Obtaining access to existing session state or creating new 3. Obtaining access to existing session state or creating new
sessions. How this is done depends on the initial determination sessions. How this is done depends on the initial determination
of whether migration has occurred and can be done as described in of whether migration has occurred and can be done as described in
Section 11.13.4 below in the case of migration or as described in Section 11.13.4 below in the case of migration or as described in
Section 11.13.5 below in the case of a network address transfer Section 11.13.5 below in the case of a network address transfer
without migration. without migration.
4. Verifying the trunking relationship assumed in step 2 as 4. Verifying the trunking relationship assumed in step 2 as
discussed in Section 2.10.5.1. Although this step will generally discussed in Section 2.10.5.1. Although this step will generally
confirm the initial determination, it is possible for confirm the initial determination, it is possible for
verification to fail with the result that an initial verification to invalidate the initial determination of network
determination that a network address shift (without migration) address shift (without migration) and instead determine that
has occurred may be invalidated and migration determined to have migration had occurred. There is no need to redo step 3 above,
occurred. There is no need to redo step 3 above, since it will since it will be possible to continue use of the session
be possible to continue use of the session established already. established already.
5. Obtaining access to existing locking state and/or re-obtaining 5. Obtaining access to existing locking state and/or re-obtaining
it. How this is done depends on the final determination of it. How this is done depends on the final determination of
whether migration has occurred and can be done as described below whether migration has occurred and can be done as described below
in Section 11.13.4 in the case of migration or as described in in Section 11.13.4 in the case of migration or as described in
Section 11.13.5 in the case of a network address transfer without Section 11.13.5 in the case of a network address transfer without
migration. migration.
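
Steps 2 and 4 above can be illustrated with a short sketch. It assumes that two addresses are treated as server-trunkable when their EXCHANGE_ID results carry the same server scope and the same so_major_id, and that a server-trunkable result leads to an initial determination of a network address transfer without migration. The type and function names are hypothetical, and the verification of step 4 is reduced to a boolean input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExchangeIdResult:
    """Subset of EXCHANGE_ID results used for trunking checks (illustrative)."""
    so_major_id: bytes
    so_minor_id: int
    server_scope: bytes


def server_trunkable(a, b):
    # Assumption: same scope and same major owner ID means server-trunkable.
    return a.server_scope == b.server_scope and a.so_major_id == b.so_major_id


def initial_determination(previous, candidate):
    """Step 2: decide, provisionally, whether migration has occurred."""
    if server_trunkable(previous, candidate):
        return "address-transfer"
    return "migration"


def final_determination(initial, trunking_verified):
    """Step 4: a failed verification turns an address transfer into migration."""
    if initial == "address-transfer" and not trunking_verified:
        return "migration"
    return initial
```

The session obtained in step 3 is not revisited here, matching the statement that step 3 need not be redone when verification changes the outcome.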
Once the initial address has been determined, clients are free to Once the initial address has been determined, clients are free to
apply an abbreviated process to find additional addresses trunkable apply an abbreviated process to find additional addresses trunkable
skipping to change at line 12898 skipping to change at line 12921
it is possible that a session was transferred as well. To deal with it is possible that a session was transferred as well. To deal with
that possibility, clients can, after doing the EXCHANGE_ID, issue a that possibility, clients can, after doing the EXCHANGE_ID, issue a
BIND_CONN_TO_SESSION to connect the transferred session to a BIND_CONN_TO_SESSION to connect the transferred session to a
connection to the new server. If that fails, it is an indication connection to the new server. If that fails, it is an indication
that the session was not transferred and that a new session needs to that the session was not transferred and that a new session needs to
be created to take its place. be created to take its place.
In some situations, it is possible for a BIND_CONN_TO_SESSION to In some situations, it is possible for a BIND_CONN_TO_SESSION to
succeed without session migration having occurred. If state merger succeed without session migration having occurred. If state merger
has taken place, then the associated client ID may have already had a has taken place, then the associated client ID may have already had a
set of existing sessions, with it being possible that the sessionid set of existing sessions, with it being possible that the session ID
of a given session is the same as one that might have been migrated. of a given session is the same as one that might have been migrated.
In that event, a BIND_CONN_TO_SESSION might succeed, even though In that event, a BIND_CONN_TO_SESSION might succeed, even though
there could have been no migration of the session with that there could have been no migration of the session with that session
sessionid. In such cases, the client will receive sequence errors ID. In such cases, the client will receive sequence errors when the
when the slot sequence values used are not appropriate on the new slot sequence values used are not appropriate on the new session.
session. When this occurs, the client can create a new session and When this occurs, the client can create a new session and cease
cease using the existing one. using the existing one.
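
The client-side handling of a possibly transferred session can be sketched as follows. This is only an illustration: FakeServer stands in for the destination server, and its methods (bind_conn_to_session, probe_sequence, create_session) are hypothetical wrappers around the corresponding operations rather than any real client API.

```python
class FakeServer:
    """Stand-in for a destination server; all behaviour is hypothetical."""

    def __init__(self, known_sessions, sequence_ok):
        self.known_sessions = set(known_sessions)
        self.sequence_ok = sequence_ok
        self.created = 0

    def bind_conn_to_session(self, session_id):
        # True if the server recognizes this session ID.
        return session_id in self.known_sessions

    def probe_sequence(self, session_id):
        # True if the client's slot sequence values are accepted.
        return self.sequence_ok

    def create_session(self):
        self.created += 1
        return "new-session-%d" % self.created


def recover_session(server, old_session_id):
    """Return the session ID to use after EXCHANGE_ID has been done."""
    if server.bind_conn_to_session(old_session_id):
        if server.probe_sequence(old_session_id):
            return old_session_id  # the session was transferred
        # The bind succeeded for a pre-existing session with the same ID;
        # sequence errors show it is not the migrated one, so stop using it.
    return server.create_session()
```

For example, recover_session(FakeServer({"s1"}, True), "s1") keeps the old session, while a failed bind or a sequence error leads to a new session.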
Once the client has determined the initial migration status, and Once the client has determined the initial migration status, and
determined that there was a shift to a new server, it needs to re- determined that there was a shift to a new server, it needs to re-
establish its locking state, if possible. To enable this to happen establish its locking state, if possible. To enable this to happen
without loss of the guarantees normally provided by locking, the without loss of the guarantees normally provided by locking, the
destination server needs to implement a per-fs grace period in all destination server needs to implement a per-fs grace period in all
cases in which lock state was lost, including those in which cases in which lock state was lost, including those in which
Transparent State Migration was not implemented. Each client for Transparent State Migration was not implemented. Each client for
which there was a transfer of locking state to the new server will which there was a transfer of locking state to the new server will
have the duration of the grace period to reclaim its locks, from the have the duration of the grace period to reclaim its locks, from the
skipping to change at line 13066 skipping to change at line 13089
* The type of the lock, such as open, byte-range lock, delegation, * The type of the lock, such as open, byte-range lock, delegation,
or layout. or layout.
* For locks such as opens and byte-range locks, there will be * For locks such as opens and byte-range locks, there will be
information about the owner(s) of the lock. information about the owner(s) of the lock.
* For recallable/revocable lock types, the current recall status * For recallable/revocable lock types, the current recall status
needs to be included. needs to be included.
* For each lock type, there will be type-specific information, such * For each lock type, there will be associated type-specific
as share and deny modes for opens and type and byte ranges for information. For opens, this will include share and deny mode
byte-range locks and layouts. while for byte-range locks and layouts, there will be a type and a
byte-range.
Such information will most probably be organized by client id string Such information will most probably be organized by client id string
on the destination server so that it can be used to provide on the destination server so that it can be used to provide
appropriate context to each client when it makes itself known to the appropriate context to each client when it makes itself known to the
server. Issues connected with a client impersonating another by server. Issues connected with a client impersonating another by
presenting another client's client id string can be addressed using presenting another client's client id string can be addressed using
NFSv4.1 state protection features, as described in Section 21. NFSv4.1 state protection features, as described in Section 21.
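
One possible in-memory shape for this transferred information is sketched below. The field names are illustrative, only the items visible in the list above are modeled, and keying the table by client id string follows the "most probably" organization mentioned in the text rather than any required layout.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TransferredLock:
    """One lock as described by the source server (illustrative fields)."""
    lock_type: str                            # "open", "byte-range", "delegation", "layout"
    owner: Optional[bytes] = None             # for opens and byte-range locks
    recall_in_progress: Optional[bool] = None # for recallable/revocable types
    details: dict = field(default_factory=dict)  # type-specific information


def index_by_client(entries):
    """Group (client_id_string, TransferredLock) pairs by client id string."""
    table = defaultdict(list)
    for client_id_string, lock in entries:
        table[client_id_string].append(lock)
    return dict(table)


entries = [
    (b"clientA", TransferredLock("open", owner=b"o1",
                                 details={"share": "READ", "deny": "NONE"})),
    (b"clientA", TransferredLock("byte-range", owner=b"l1",
                                 details={"type": "WRITE", "offset": 0, "length": 4096})),
    (b"clientB", TransferredLock("delegation", recall_in_progress=False)),
]
table = index_by_client(entries)
```

With this organization, the destination server can look up a client's locks as soon as that client presents its client id string.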
A further server responsibility concerns locks that are revoked or A further server responsibility concerns locks that are revoked or
otherwise lost during the process of file system migration. Because otherwise lost during the process of file system migration. Because
skipping to change at line 13112 skipping to change at line 13136
granted until the client does a RECLAIM_COMPLETE, after reclaiming granted until the client does a RECLAIM_COMPLETE, after reclaiming
the locks it had, with the exception of reclaims denied because the locks it had, with the exception of reclaims denied because
they were attempts to reclaim locks that had been lost. they were attempts to reclaim locks that had been lost.
* Implement Transparent State Migration, except for the lock with * Implement Transparent State Migration, except for the lock with
the conflicting stateid. In this case, the client will be aware the conflicting stateid. In this case, the client will be aware
of a lost lock (through the SEQ4_STATUS flags) and be allowed to of a lost lock (through the SEQ4_STATUS flags) and be allowed to
reclaim it. reclaim it.
When transferring state between the source and destination, the When transferring state between the source and destination, the
issues discussed in Section 7.2 of [68] must still be attended to. issues discussed in Section 7.2 of [69] must still be attended to.
In this case, the use of NFS4ERR_DELAY may still be necessary in In this case, the use of NFS4ERR_DELAY may still be necessary in
NFSv4.1, as it was in NFSv4.0, to prevent locking state changing NFSv4.1, as it was in NFSv4.0, to prevent locking state changing
while it is being transferred. See Section 15.1.1.3 for information while it is being transferred. See Section 15.1.1.3 for information
about appropriate client retry approaches in the event that about appropriate client retry approaches in the event that
NFS4ERR_DELAY is returned. NFS4ERR_DELAY is returned.
There are a number of important differences in the NFSv4.1 context: There are a number of important differences in the NFSv4.1 context:
* The absence of RELEASE_LOCKOWNER means that the one case in which * The absence of RELEASE_LOCKOWNER means that the one case in which
an operation could not be deferred by use of NFS4ERR_DELAY no an operation could not be deferred by use of NFS4ERR_DELAY no
longer exists. longer exists.
* Sequencing of operations is no longer done using owner-based * Sequencing of operations is no longer done using owner-based
operation sequence numbers. Instead, sequencing is operation sequence numbers. Instead, sequencing is
session-based. session-based.
As a result, when sessions are not transferred, the techniques As a result, when sessions are not transferred, the techniques
discussed in Section 7.2 of [68] are adequate and will not be further discussed in Section 7.2 of [69] are adequate and will not be further
discussed. discussed.
11.14.3. Server Responsibilities in Effecting Session Transfer 11.14.3. Server Responsibilities in Effecting Session Transfer
The basic responsibility of the source server in effecting session The basic responsibility of the source server in effecting session
transfer is to make available to the destination server a description transfer is to make available to the destination server a description
of the current state of each slot within the session, including the of the current state of each slot within the session, including the
following: following:
* The last sequence value received for that slot. * The last sequence value received for that slot.
skipping to change at line 13238 skipping to change at line 13262
* Avoid enforcing any sequencing semantics for a particular slot * Avoid enforcing any sequencing semantics for a particular slot
until the client has established the starting sequence for that until the client has established the starting sequence for that
slot on the destination server. slot on the destination server.
* For each slot, avoid returning a cached reply returning * For each slot, avoid returning a cached reply returning
NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established
the starting sequence for that slot on the destination server. the starting sequence for that slot on the destination server.
* Until the client has established the starting sequence for a * Until the client has established the starting sequence for a
particular slot on the destination server, avoid reporting particular slot on the destination server, avoid reporting
NFS4ERR_SEQ_MISORDERED or returning a cached reply returning NFS4ERR_SEQ_MISORDERED or returning a cached reply that contains
NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of either NFS4ERR_DELAY or NFS4ERR_MOVED and consists solely of a
a series of operations where the response is NFS4_OK until the series of operations where the response is NFS4_OK until the final
final error. error.
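
The slot-handling rule in the list above might be sketched as follows. This is a deliberately reduced illustration: the Slot class and its return strings are hypothetical, the cached-reply rules are not modeled, and the sketch only shows that sequencing is not enforced on a slot until the client's first request has established the starting sequence there.

```python
class Slot:
    """One slot of a transferred session on the destination server."""

    def __init__(self, transferred_seq):
        self.seq = transferred_seq  # last sequence value reported by the source
        self.established = False    # client has not yet set the starting sequence

    def on_sequence(self, seq):
        if not self.established:
            # Do not enforce sequencing yet: the client may have issued
            # requests the destination never saw, so accept its value.
            self.seq = seq
            self.established = True
            return "NFS4_OK"
        if seq == self.seq + 1:
            self.seq = seq
            return "NFS4_OK"
        if seq == self.seq:
            return "REPLAY"  # would be answered from the reply cache
        return "NFS4ERR_SEQ_MISORDERED"
```

For instance, a slot created with Slot(5) accepts a first request with sequence 9 and applies the normal sequencing rules from then on.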
Because of the considerations mentioned above, including the rules Because of the considerations mentioned above, including the rules
for the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the for the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the
destination server can respond appropriately to SEQUENCE operations destination server can respond appropriately to SEQUENCE operations
received from the client by adopting the three policies listed below: received from the client by adopting the three policies listed below:
* Not responding with NFS4ERR_SEQ_MISORDERED for the initial request * Not responding with NFS4ERR_SEQ_MISORDERED for the initial request
on a slot within a transferred session because the destination on a slot within a transferred session because the destination
server cannot be aware of requests made by the client after the server cannot be aware of requests made by the client after the
server handoff but before the client became aware of the shift. server handoff but before the client became aware of the shift.
skipping to change at line 13911 skipping to change at line 13935
by the server in any number of ways, including specification by the by the server in any number of ways, including specification by the
administrator or by current protocols for transferring data among administrator or by current protocols for transferring data among
replicas and protocols not yet developed. NFSv4.1 only defines how replicas and protocols not yet developed. NFSv4.1 only defines how
this information is presented by the server to the client. this information is presented by the server to the client.
11.17.1. The fs_locations_server4 Structure 11.17.1. The fs_locations_server4 Structure
The fs_locations_server4 structure consists of the following items in The fs_locations_server4 structure consists of the following items in
addition to the fls_server field, which specifies a network address addition to the fls_server field, which specifies a network address
or set of addresses to be used to access the specified file system. or set of addresses to be used to access the specified file system.
Note that both of these items (i.e., fls_currency and flinfo) specify Note that both of these items (i.e., fls_currency and fls_info)
attributes of the file system replica and should not be different specify attributes of the file system replica and should not be
when there are multiple fs_locations_server4 structures, each different when there are multiple fs_locations_server4 structures,
specifying a network path to the chosen replica, for the same each specifying a network path to the chosen replica, for the same
replica. replica.
When these values are different in two fs_locations_server4 When these values are different in two fs_locations_server4
structures, a client has no basis for choosing one over the other and structures, a client has no basis for choosing one over the other and
is best off simply ignoring both entries, whether these entries apply is best off simply ignoring both entries, whether these entries apply
to migration, replication, or referral. When there are more than two to migration, replication, or referral. When there are more than two
such entries, majority voting can be used to exclude a single such entries, majority voting can be used to exclude a single
erroneous entry from consideration. In the case in which trunking erroneous entry from consideration. In the case in which trunking
information is provided for a replica currently being accessed, the information is provided for a replica currently being accessed, the
additional trunked addresses can be ignored while access continues on additional trunked addresses can be ignored while access continues on
skipping to change at line 13977 skipping to change at line 14001
representing the same data, are such that 8 bits provide a quite representing the same data, are such that 8 bits provide a quite
acceptable range of values. Even where there might be more than acceptable range of values. Even where there might be more than
256 such file system instances, having more than 256 distinct 256 such file system instances, having more than 256 distinct
classes or priorities is unlikely. classes or priorities is unlikely.
* Explicit definition of the various specific data items within XDR * Explicit definition of the various specific data items within XDR
would limit expandability in that any extension within would would limit expandability in that any extension within would
require yet another attribute, leading to specification and require yet another attribute, leading to specification and
implementation clumsiness. In the context of the NFSv4 extension implementation clumsiness. In the context of the NFSv4 extension
model in effect at the time fs_locations_info was designed (i.e., model in effect at the time fs_locations_info was designed (i.e.,
that which is described in RFC 5661 [65]), this would necessitate that which is described in RFC 5661 [66]), this would necessitate
a new minor version to effect any Standards Track extension to the a new minor version to effect any Standards Track extension to the
data in fls_info. data in fls_info.
The set of fls_info data is subject to expansion in a future minor The set of fls_info data is subject to expansion in a future minor
version or in a Standards Track RFC within the context of a single version or in a Standards Track RFC within the context of a single
minor version. The server SHOULD NOT send and the client MUST NOT minor version. The server SHOULD NOT send and the client MUST NOT
use indices within the fls_info array or flag bits that are not use indices within the fls_info array or flag bits that are not
defined in Standards Track RFCs. defined in Standards Track RFCs.
In light of the new extension model defined in RFC 8178 [66] and the In light of the new extension model defined in RFC 8178 [67] and the
fact that the individual items within fls_info are not explicitly fact that the individual items within fls_info are not explicitly
referenced in the XDR, the following practices should be followed referenced in the XDR, the following practices should be followed
when extending or otherwise changing the structure of the data when extending or otherwise changing the structure of the data
returned in fls_info within the scope of a single minor version: returned in fls_info within the scope of a single minor version:
* All extensions need to be described by Standards Track documents. * All extensions need to be described by Standards Track documents.
There is no need for such documents to be marked as updating RFC There is no need for such documents to be marked as updating RFC
5661 [65] or this document. 5661 [66] or this document.
* It needs to be made clear whether the information in any added * It needs to be made clear whether the information in any added
data items applies to the replica specified by the entry or to the data items applies to the replica specified by the entry or to the
specific network paths specified in the entry. specific network paths specified in the entry.
* There needs to be a reliable way defined to determine whether the * There needs to be a reliable way defined to determine whether the
server is aware of the extension. This may be based on the length server is aware of the extension. This may be based on the length
field of the fls_info array, but it is more flexible to provide field of the fls_info array, but it is more flexible to provide
fs-scope or server-scope attributes to indicate what extensions fs-scope or server-scope attributes to indicate what extensions
are provided. are provided.
skipping to change at line 14680 skipping to change at line 14704
12.2.5. Storage Protocol 12.2.5. Storage Protocol
As noted in Figure 1, the storage protocol is the method used by the As noted in Figure 1, the storage protocol is the method used by the
client to store and retrieve data directly from the storage devices. client to store and retrieve data directly from the storage devices.
The NFSv4.1 pNFS feature has been structured to allow for a variety The NFSv4.1 pNFS feature has been structured to allow for a variety
of storage protocols to be defined and used. One example storage of storage protocols to be defined and used. One example storage
protocol is NFSv4.1 itself (as documented in Section 13). Other protocol is NFSv4.1 itself (as documented in Section 13). Other
options for the storage protocol are described elsewhere and include: options for the storage protocol are described elsewhere and include:
* Block/volume protocols such as Internet SCSI (iSCSI) [55] and FCP * Block/volume protocols such as Internet SCSI (iSCSI) [56] and FCP
[56]. The block/volume protocol support can be independent of the [57]. The block/volume protocol support can be independent of the
addressing structure of the block/volume protocol used, allowing addressing structure of the block/volume protocol used, allowing
more than one protocol to access the same file data and enabling more than one protocol to access the same file data and enabling
extensibility to other block/volume protocols. See [47] for a extensibility to other block/volume protocols. See [48] for a
layout specification that allows pNFS to use block/volume storage layout specification that allows pNFS to use block/volume storage
protocols. protocols.
* Object protocols such as OSD over iSCSI or Fibre Channel [57]. * Object protocols such as OSD over iSCSI or Fibre Channel [58].
See [46] for a layout specification that allows pNFS to use object See [47] for a layout specification that allows pNFS to use object
storage protocols. storage protocols.
It is possible that various storage protocols are available to both It is possible that various storage protocols are available to both
client and server and it may be possible that a client and server do client and server and it may be possible that a client and server do
not have a matching storage protocol available to them. Because of not have a matching storage protocol available to them. Because of
this, the pNFS server MUST support normal NFSv4.1 access to any file this, the pNFS server MUST support normal NFSv4.1 access to any file
accessible by the pNFS feature; this will allow for continued accessible by the pNFS feature; this will allow for continued
interoperability between an NFSv4.1 client and server. interoperability between an NFSv4.1 client and server.
12.2.6. Control Protocol 12.2.6. Control Protocol
skipping to change at line 14716 skipping to change at line 14740
state required by the storage devices to perform client access state required by the storage devices to perform client access
control, and, depending on the storage protocol, the enforcement of control, and, depending on the storage protocol, the enforcement of
authentication and authorization so that restrictions that would be authentication and authorization so that restrictions that would be
enforced by the metadata server are also enforced by the storage enforced by the metadata server are also enforced by the storage
device. device.
A particular control protocol is not REQUIRED by NFSv4.1 but A particular control protocol is not REQUIRED by NFSv4.1 but
requirements are placed on the control protocol for maintaining requirements are placed on the control protocol for maintaining
attributes like modify time, the change attribute, and the end-of- attributes like modify time, the change attribute, and the end-of-
file (EOF) position. Note that if pNFS is layered over a clustered, file (EOF) position. Note that if pNFS is layered over a clustered,
parallel file system (e.g., PVFS [58]), the mechanisms that enable parallel file system (e.g., PVFS [59]), the mechanisms that enable
clustering and parallelism in that file system can be considered the clustering and parallelism in that file system can be considered the
control protocol. control protocol.
12.2.7. Layout Types 12.2.7. Layout Types
A layout describes the mapping of a file's data to the storage A layout describes the mapping of a file's data to the storage
devices that hold the data. A layout is said to belong to a specific devices that hold the data. A layout is said to belong to a specific
layout type (data type layouttype4, see Section 3.3.13). The layout layout type (data type layouttype4, see Section 3.3.13). The layout
type allows for variants to handle different storage protocols, such type allows for variants to handle different storage protocols, such
as those associated with block/volume [47], object [46], and file as those associated with block/volume [48], object [47], and file
(Section 13) layout types. A metadata server, along with its control (Section 13) layout types. A metadata server, along with its control
protocol, MUST support at least one layout type. A private sub-range protocol, MUST support at least one layout type. A private sub-range
of the layout type namespace is also defined. Values from the of the layout type namespace is also defined. Values from the
private layout type range MAY be used for internal testing or private layout type range MAY be used for internal testing or
experimentation (see Section 3.3.13). experimentation (see Section 3.3.13).
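The split between assigned and private layout type values can be sketched as follows. This is an illustrative sketch only: the function name classify_layout_type is hypothetical, and the numeric values and range boundary are assumed to be those given for layouttype4 in Section 3.3.13, which remains the normative source.

```python
# Assumed layouttype4 values; Section 3.3.13 is the normative source.
LAYOUT4_NFSV4_1_FILES = 0x1
LAYOUT4_OSD2_OBJECTS = 0x2
LAYOUT4_BLOCK_VOLUME = 0x3

# Assumed start of the private sub-range of the layout type namespace.
PRIVATE_RANGE_START = 0x80000000
LAYOUTTYPE4_MAX = 0xFFFFFFFF

KNOWN_LAYOUT_TYPES = {
    LAYOUT4_NFSV4_1_FILES: "LAYOUT4_NFSV4_1_FILES",
    LAYOUT4_OSD2_OBJECTS: "LAYOUT4_OSD2_OBJECTS",
    LAYOUT4_BLOCK_VOLUME: "LAYOUT4_BLOCK_VOLUME",
}


def classify_layout_type(value: int) -> str:
    """Classify a layouttype4 value (hypothetical helper)."""
    if not 0 <= value <= LAYOUTTYPE4_MAX:
        raise ValueError("layouttype4 is a 32-bit unsigned value")
    if value >= PRIVATE_RANGE_START:
        # Private range: internal testing or experimentation only.
        return "private"
    # Anything else is either one of the types named above or unassigned.
    return KNOWN_LAYOUT_TYPES.get(value, "unassigned")
```

A client might use such a check to reject a private-range layout type received from a server it does not share a private agreement with.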
As an example, the organization of the file layout type could be an As an example, the organization of the file layout type could be an
array of tuples (e.g., device ID, filehandle), along with a array of tuples (e.g., device ID, filehandle), along with a
definition of how the data is stored across the devices (e.g., definition of how the data is stored across the devices (e.g.,
striping). A block/volume layout might be an array of tuples that striping). A block/volume layout might be an array of tuples that
skipping to change at line 14950 skipping to change at line 14974
file for which a layout is held does not necessarily conflict with file for which a layout is held does not necessarily conflict with
the holding of the layout that describes the file being modified. the holding of the layout that describes the file being modified.
Therefore, it is the requirement of the storage protocol or layout Therefore, it is the requirement of the storage protocol or layout
type that determines the necessary behavior. For example, block/ type that determines the necessary behavior. For example, block/
volume layout types require that the layout's iomode agree with the volume layout types require that the layout's iomode agree with the
type of I/O being performed. type of I/O being performed.
Depending upon the layout type and storage protocol in use, storage Depending upon the layout type and storage protocol in use, storage
device access permissions may be granted by LAYOUTGET and may be device access permissions may be granted by LAYOUTGET and may be
encoded within the type-specific layout. For an example of storage encoded within the type-specific layout. For an example of storage
device access permissions, see an object-based protocol such as [57]. device access permissions, see an object-based protocol such as [58].
If access permissions are encoded within the layout, the metadata If access permissions are encoded within the layout, the metadata
server SHOULD recall the layout when those permissions become invalid server SHOULD recall the layout when those permissions become invalid
for any reason -- for example, when a file becomes unwritable or for any reason -- for example, when a file becomes unwritable or
inaccessible to a client. Note, clients are still required to inaccessible to a client. Note, clients are still required to
perform the appropriate OPEN, LOCK, and ACCESS operations as perform the appropriate OPEN, LOCK, and ACCESS operations as
described above. The degree to which it is possible for the client described above. The degree to which it is possible for the client
to circumvent these operations and the consequences of doing so must to circumvent these operations and the consequences of doing so must
be clearly specified by the individual layout type specifications. be clearly specified by the individual layout type specifications.
In addition, these specifications must be clear about the In addition, these specifications must be clear about the
requirements and non-requirements for the checking performed by the requirements and non-requirements for the checking performed by the
skipping to change at line 16044 skipping to change at line 16068
pNFS configuration. Such layout types SHOULD NOT be used when pNFS configuration. Such layout types SHOULD NOT be used when
client-only access checks do not provide sufficient assurance that client-only access checks do not provide sufficient assurance that
NFSv4.1 access control is being applied correctly. (This is not a NFSv4.1 access control is being applied correctly. (This is not a
problem for the file layout type described in Section 13 because the problem for the file layout type described in Section 13 because the
storage access protocol for LAYOUT4_NFSV4_1_FILES is NFSv4.1, and storage access protocol for LAYOUT4_NFSV4_1_FILES is NFSv4.1, and
thus the security model for storage device access via thus the security model for storage device access via
LAYOUT4_NFSV4_1_FILES is the same as that of the metadata server.) LAYOUT4_NFSV4_1_FILES is the same as that of the metadata server.)
For handling of access control specific to a layout, the reader For handling of access control specific to a layout, the reader
should examine the layout specification, such as the NFSv4.1/ should examine the layout specification, such as the NFSv4.1/
file-based layout (Section 13) of this document, the blocks layout file-based layout (Section 13) of this document, the blocks layout
[47], and objects layout [46]. [48], and objects layout [47].
13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type 13. NFSv4.1 as a Storage Protocol in pNFS: the File Layout Type
This section describes the semantics and format of NFSv4.1 file-based This section describes the semantics and format of NFSv4.1 file-based
layouts for pNFS. NFSv4.1 file-based layouts use the layouts for pNFS. NFSv4.1 file-based layouts use the
LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type LAYOUT4_NFSV4_1_FILES layout type. The LAYOUT4_NFSV4_1_FILES type
defines striping data across multiple NFSv4.1 data servers. defines striping data across multiple NFSv4.1 data servers.
13.1. Client ID and Session Considerations 13.1. Client ID and Session Considerations
skipping to change at line 17846 skipping to change at line 17870
full information about the state of the session on the source full information about the state of the session on the source
makes it impossible to process the request immediately. makes it impossible to process the request immediately.
In such cases, returning the error NFS4ERR_DELAY allows necessary In such cases, returning the error NFS4ERR_DELAY allows necessary
preparatory operations to proceed without holding up requester preparatory operations to proceed without holding up requester
resources such as a session slot. After delaying for a period of time, resources such as a session slot. After delaying for a period of time,
the client can then re-send the operation in question, often as part the client can then re-send the operation in question, often as part
of a nearly identical request. Because of the need to avoid spurious of a nearly identical request. Because of the need to avoid spurious
reissues of non-idempotent operations and to avoid acting in response reissues of non-idempotent operations and to avoid acting in response
to NFS4ERR_DELAY errors returned on responses returned from the to NFS4ERR_DELAY errors returned on responses returned from the
replier's replay cache, integration with the session-provided replay replier's reply cache, integration with the session-provided reply
cache is necessary. There are a number of cases to deal with, each cache is necessary. There are a number of cases to deal with, each
of which requires different sorts of handling by the requester and of which requires different sorts of handling by the requester and
replier: replier:
* If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request * If NFS4ERR_DELAY is returned on a SEQUENCE operation, the request
is retried in full with the SEQUENCE operation containing the same is retried in full with the SEQUENCE operation containing the same
slot and sequence values. In this case, the replier MUST avoid slot and sequence values. In this case, the replier MUST avoid
returning a response containing NFS4ERR_DELAY as the response to returning a response containing NFS4ERR_DELAY as the response to
SEQUENCE solely on the basis of its presence in the replay cache. SEQUENCE solely because an earlier instance of the same request
If the replier did this, the retries would not be effective as returned that error and it was stored in the reply cache. If the
there would be no opportunity for the replier to see whether the replier did this, the retries would not be effective as there
would be no opportunity for the replier to see whether the
condition that generated the NFS4ERR_DELAY had been rectified condition that generated the NFS4ERR_DELAY had been rectified
during the interim between the original request and the retry. during the interim between the original request and the retry.
* If NFS4ERR_DELAY is returned on an operation other than SEQUENCE * If NFS4ERR_DELAY is returned on an operation other than SEQUENCE
that validly appears as the first operation of a request, the that validly appears as the first operation of a request, the
handling is similar. The request can be retried in full without handling is similar. The request can be retried in full without
modification. In this case as well, the replier MUST avoid modification. In this case as well, the replier MUST avoid
returning a response containing NFS4ERR_DELAY as the response to returning a response containing NFS4ERR_DELAY as the response to
an initial operation of a request solely on the basis of its an initial operation of a request solely on the basis of its
presence in the replay cache. If the replier did this, the presence in the reply cache. If the replier did this, the retries
retries would not be effective as there would be no opportunity would not be effective as there would be no opportunity for the
for the replier to see whether the condition that generated the replier to see whether the condition that generated the
NFS4ERR_DELAY had been rectified during the interim between the NFS4ERR_DELAY had been rectified during the interim between the
original request and the retry. original request and the retry.
* If NFS4ERR_DELAY is returned on an operation other than the first * If NFS4ERR_DELAY is returned on an operation other than the first
in the request, the request when retried MUST contain a SEQUENCE in the request, the request when retried MUST contain a SEQUENCE
operation that is different than the original one, with either the operation that is different than the original one, with either the
bin id or the sequence value different from that in the original slot ID or the sequence value different from that in the original
request. Because requesters do this, there is no need for the request. Because requesters do this, there is no need for the
replier to take special care to avoid returning an NFS4ERR_DELAY replier to take special care to avoid returning an NFS4ERR_DELAY
error obtained from the replay cache. When no non-idempotent error obtained from the reply cache. When no non-idempotent
operations have been processed before the NFS4ERR_DELAY was operations have been processed before the NFS4ERR_DELAY was
returned, the requester should retry the request in full, with the returned, the requester should retry the request in full, with the
only difference from the original request being the modification only difference from the original request being the modification
to the slot ID or sequence value in the reissued SEQUENCE to the slot ID or sequence value in the reissued SEQUENCE
operation. operation.
* When NFS4ERR_DELAY is returned on an operation other than the * When NFS4ERR_DELAY is returned on an operation other than the
first within a request and there has been a non-idempotent first within a request and there has been a non-idempotent
operation processed before the NFS4ERR_DELAY was returned, operation processed before the NFS4ERR_DELAY was returned,
reissuing the request as is normally done would incorrectly cause reissuing the request as is normally done would incorrectly cause
skipping to change at line 18115 skipping to change at line 18140
feature. feature.
15.1.4.1. NFS4ERR_BADTYPE (Error Code 10007) 15.1.4.1. NFS4ERR_BADTYPE (Error Code 10007)
An attempt was made to create an object with an inappropriate type An attempt was made to create an object with an inappropriate type
specified to CREATE. This may be because the type is undefined, specified to CREATE. This may be because the type is undefined,
because the type is not supported by the server, or because the type because the type is not supported by the server, or because the type
is not intended to be created by CREATE (such as a regular file or is not intended to be created by CREATE (such as a regular file or
named attribute, for which OPEN is used to do the file creation). named attribute, for which OPEN is used to do the file creation).
15.1.4.2. NFS4ERR_DQUOT (Error Code 19) 15.1.4.2. NFS4ERR_DQUOT (Error Code 69)
Resource (quota) hard limit exceeded. The user's resource limit on Resource (quota) hard limit exceeded. The user's resource limit on
the server has been exceeded. the server has been exceeded.
15.1.4.3. NFS4ERR_EXIST (Error Code 17) 15.1.4.3. NFS4ERR_EXIST (Error Code 17)
A file of the specified target name (when creating, renaming, or A file of the specified target name (when creating, renaming, or
linking) already exists. linking) already exists.
15.1.4.4. NFS4ERR_FBIG (Error Code 27) 15.1.4.4. NFS4ERR_FBIG (Error Code 27)
skipping to change at line 21261 skipping to change at line 21286
* When a client executes a regular file, it has to read the file * When a client executes a regular file, it has to read the file
from the server. Strictly speaking, the server should not allow from the server. Strictly speaking, the server should not allow
the client to read a file being executed unless the user has read the client to read a file being executed unless the user has read
permissions on the file. Requiring explicit read permissions on permissions on the file. Requiring explicit read permissions on
executable files in order to access them over NFS is not going to executable files in order to access them over NFS is not going to
be acceptable to some users and storage administrators. be acceptable to some users and storage administrators.
Historically, NFS servers have allowed a user to READ a file if Historically, NFS servers have allowed a user to READ a file if
the user has execute access to the file. the user has execute access to the file.
As a practical example, the UNIX specification [59] states that an As a practical example, the UNIX specification [60] states that an
implementation claiming conformance to UNIX may indicate in the implementation claiming conformance to UNIX may indicate in the
access() programming interface's result that a privileged user has access() programming interface's result that a privileged user has
execute rights, even if no execute permission bits are set on the execute rights, even if no execute permission bits are set on the
regular file's attributes. It is possible to claim conformance to regular file's attributes. It is possible to claim conformance to
the UNIX specification and instead not indicate execute rights in the UNIX specification and instead not indicate execute rights in
that situation, which is true for some operating environments. that situation, which is true for some operating environments.
Suppose the operating environments of the client and server are Suppose the operating environments of the client and server are
implementing the access() semantics for privileged users differently, implementing the access() semantics for privileged users differently,
and the ACCESS operation implementations of the client and server and the ACCESS operation implementations of the client and server
follow their respective access() semantics. This can cause undesired follow their respective access() semantics. This can cause undesired
skipping to change at line 23715 skipping to change at line 23740
18.20.3. DESCRIPTION 18.20.3. DESCRIPTION
This operation replaces the current filehandle with the filehandle This operation replaces the current filehandle with the filehandle
that represents the public filehandle of the server's namespace. that represents the public filehandle of the server's namespace.
This filehandle may be different from the "root" filehandle that may This filehandle may be different from the "root" filehandle that may
be associated with some other directory on the server. be associated with some other directory on the server.
PUTPUBFH also clears the current stateid. PUTPUBFH also clears the current stateid.
The public filehandle represents the concepts embodied in RFC 2054 The public filehandle represents the concepts embodied in RFC 2054
[48], RFC 2055 [49], and RFC 2224 [60]. The intent for NFSv4.1 is [49], RFC 2055 [50], and RFC 2224 [61]. The intent for NFSv4.1 is
that the public filehandle (represented by the PUTPUBFH operation) be that the public filehandle (represented by the PUTPUBFH operation) be
used as a method of providing WebNFS server compatibility with NFSv3. used as a method of providing WebNFS server compatibility with NFSv3.
The public filehandle and the root filehandle (represented by the The public filehandle and the root filehandle (represented by the
PUTROOTFH operation) SHOULD be equivalent. If the public and root PUTROOTFH operation) SHOULD be equivalent. If the public and root
filehandles are not equivalent, then the directory corresponding to filehandles are not equivalent, then the directory corresponding to
the public filehandle MUST be a descendant of the directory the public filehandle MUST be a descendant of the directory
corresponding to the root filehandle. corresponding to the root filehandle.
See Section 16.2.3.1.1 for more details on the current filehandle. See Section 16.2.3.1.1 for more details on the current filehandle.
skipping to change at line 23737 skipping to change at line 23762
See Section 16.2.3.1.2 for more details on the current stateid. See Section 16.2.3.1.2 for more details on the current stateid.
18.20.4. IMPLEMENTATION 18.20.4. IMPLEMENTATION
This operation is used in an NFS request to set the context for file This operation is used in an NFS request to set the context for file
accessing operations that follow in the same COMPOUND request. accessing operations that follow in the same COMPOUND request.
With the NFSv3 public filehandle, the client is able to specify With the NFSv3 public filehandle, the client is able to specify
whether the pathname provided in the LOOKUP should be evaluated as whether the pathname provided in the LOOKUP should be evaluated as
either an absolute path relative to the server's root or relative to either an absolute path relative to the server's root or relative to
the public filehandle. RFC 2224 [60] contains further discussion of the public filehandle. RFC 2224 [61] contains further discussion of
the functionality. With NFSv4.1, that type of specification is not the functionality. With NFSv4.1, that type of specification is not
directly available in the LOOKUP operation. This is because the directly available in the LOOKUP operation. This is because the
component separators needed to specify absolute vs. component separators needed to specify absolute vs.
relative are not allowed in NFSv4. Therefore, the client is relative are not allowed in NFSv4. Therefore, the client is
responsible for constructing its request such that the use of either responsible for constructing its request such that the use of either
PUTROOTFH or PUTPUBFH signifies absolute or relative evaluation of an PUTROOTFH or PUTPUBFH signifies absolute or relative evaluation of an
NFS URL, respectively. NFS URL, respectively.
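The client-side choice described above can be sketched as follows. This is a minimal illustration, not a wire-format encoder: the function name build_lookup_compound and the (operation, argument) tuples are hypothetical, and the caller is assumed to have already decided whether the NFS URL calls for absolute or relative evaluation.

```python
def build_lookup_compound(path: str, absolute: bool) -> list:
    """Return an illustrative list of (operation, argument) pairs.

    Hypothetical helper: the tuples stand in for COMPOUND operations
    and are not an XDR encoding.
    """
    # PUTROOTFH signifies absolute evaluation, PUTPUBFH relative evaluation.
    first_op = "PUTROOTFH" if absolute else "PUTPUBFH"
    ops = [(first_op, None)]
    # Component separators are not allowed in an NFSv4 LOOKUP, so the
    # path is split and each component gets its own LOOKUP operation.
    for component in path.split("/"):
        if component:
            ops.append(("LOOKUP", component))
    return ops
```

For example, build_lookup_compound("export/home", absolute=False) starts from the public filehandle and then looks up "export" and "home" in turn.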
Note that there are warnings mentioned in RFC 2224 [60] with respect Note that there are warnings mentioned in RFC 2224 [61] with respect
to the use of absolute evaluation and the restrictions the server may to the use of absolute evaluation and the restrictions the server may
place on that evaluation with respect to how much of its namespace place on that evaluation with respect to how much of its namespace
has been made available. These same warnings apply to NFSv4.1. It has been made available. These same warnings apply to NFSv4.1. It
is likely, therefore, that because of server implementation details, is likely, therefore, that because of server implementation details,
an NFSv3 absolute public filehandle lookup may behave differently an NFSv3 absolute public filehandle lookup may behave differently
than an NFSv4.1 absolute resolution. than an NFSv4.1 absolute resolution.
There is a form of security negotiation as described in RFC 2755 [61] There is a form of security negotiation as described in RFC 2755 [62]
that uses the public filehandle and an overloading of the pathname. that uses the public filehandle and an overloading of the pathname.
This method is not available with NFSv4.1 as filehandles are not This method is not available with NFSv4.1 as filehandles are not
overloaded with special meaning and therefore do not provide the same overloaded with special meaning and therefore do not provide the same
framework as NFSv3. Clients should therefore use the security framework as NFSv3. Clients should therefore use the security
negotiation mechanisms described in Section 2.6. negotiation mechanisms described in Section 2.6.
18.21. Operation 24: PUTROOTFH - Set Root Filehandle 18.21. Operation 24: PUTROOTFH - Set Root Filehandle
18.21.1. ARGUMENTS 18.21.1. ARGUMENTS
skipping to change at line 25073 skipping to change at line 25098
struct gss_cb_handles4 { struct gss_cb_handles4 {
rpc_gss_svc_t gcbp_service; /* RFC 2203 */ rpc_gss_svc_t gcbp_service; /* RFC 2203 */
gsshandle4_t gcbp_handle_from_server; gsshandle4_t gcbp_handle_from_server;
gsshandle4_t gcbp_handle_from_client; gsshandle4_t gcbp_handle_from_client;
}; };
union callback_sec_parms4 switch (uint32_t cb_secflavor) { union callback_sec_parms4 switch (uint32_t cb_secflavor) {
case AUTH_NONE: case AUTH_NONE:
void; void;
case AUTH_SYS: case AUTH_SYS:
authsys_parms cbsp_sys_cred; /* RFC 1831 */ authsys_parms cbsp_sys_cred; /* RFC 5531 */
case RPCSEC_GSS: case RPCSEC_GSS:
gss_cb_handles4 cbsp_gss_handles; gss_cb_handles4 cbsp_gss_handles;
}; };
struct BACKCHANNEL_CTL4args { struct BACKCHANNEL_CTL4args {
uint32_t bca_cb_program; uint32_t bca_cb_program;
callback_sec_parms4 bca_sec_parms<>; callback_sec_parms4 bca_sec_parms<>;
}; };
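As an illustration of how the structures above serialize, the following sketch encodes BACKCHANNEL_CTL4args in XDR for the AUTH_NONE and AUTH_SYS arms only. The function names are hypothetical; the flavor numbers and the authsys_parms field order (stamp, machinename, uid, gid, gids) are assumed to be those of RFC 5531, and the RPCSEC_GSS arm is deliberately left unimplemented.

```python
import struct

# Assumed ONC RPC authentication flavor numbers (RFC 5531).
AUTH_NONE = 0
AUTH_SYS = 1
RPCSEC_GSS = 6


def xdr_uint32(value: int) -> bytes:
    """Encode a 32-bit unsigned integer in big-endian XDR form."""
    return struct.pack(">I", value)


def xdr_opaque_var(data: bytes) -> bytes:
    """Encode variable-length data: length, bytes, zero padding to 4."""
    pad = (4 - len(data) % 4) % 4
    return xdr_uint32(len(data)) + data + b"\x00" * pad


def encode_authsys_parms(stamp, machinename, uid, gid, gids) -> bytes:
    """Encode authsys_parms, assuming the RFC 5531 field order."""
    out = xdr_uint32(stamp) + xdr_opaque_var(machinename.encode("ascii"))
    out += xdr_uint32(uid) + xdr_uint32(gid)
    out += xdr_uint32(len(gids)) + b"".join(xdr_uint32(g) for g in gids)
    return out


def encode_callback_sec_parms4(flavor, body=None) -> bytes:
    """Encode one callback_sec_parms4: discriminant, then the arm."""
    if flavor == AUTH_NONE:
        return xdr_uint32(AUTH_NONE)  # void arm: discriminant only
    if flavor == AUTH_SYS:
        return xdr_uint32(AUTH_SYS) + encode_authsys_parms(**body)
    raise NotImplementedError("RPCSEC_GSS arm is omitted from this sketch")


def encode_backchannel_ctl4args(cb_program, sec_parms) -> bytes:
    """Encode bca_cb_program followed by the bca_sec_parms<> array."""
    out = xdr_uint32(cb_program) + xdr_uint32(len(sec_parms))
    for flavor, body in sec_parms:
        out += encode_callback_sec_parms4(flavor, body)
    return out
```

Because bca_sec_parms is a variable-length array, the encoding carries an element count before the first union, even when only one flavor is offered.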
18.33.2. RESULT 18.33.2. RESULT
skipping to change at line 25354 skipping to change at line 25379
case NFS4_OK: case NFS4_OK:
EXCHANGE_ID4resok eir_resok4; EXCHANGE_ID4resok eir_resok4;
default: default:
void; void;
}; };
18.35.3. DESCRIPTION 18.35.3. DESCRIPTION
The client uses the EXCHANGE_ID operation to register a particular The client uses the EXCHANGE_ID operation to register a particular
client_owner with the server. However, when the client_owner has instance of that client with the server, as represented by a
already been registered by other means (e.g., Transparent State client_owner4. However, when the client_owner4 has already been
Migration), the client may still use EXCHANGE_ID to obtain the client registered by other means (e.g., Transparent State Migration), the
ID assigned previously. client may still use EXCHANGE_ID to obtain the client ID assigned
previously.
The client ID returned from this operation will be associated with The client ID returned from this operation will be associated with
the connection on which the EXCHANGE_ID is received and will serve as the connection on which the EXCHANGE_ID is received and will serve as
a parent object for sessions created by the client on this connection a parent object for sessions created by the client on this connection
or to which the connection is bound. As a result of using those or to which the connection is bound. As a result of using those
sessions to make requests involving the creation of state, that state sessions to make requests involving the creation of state, that state
will become associated with the client ID returned. will become associated with the client ID returned.
In situations in which the registration of the client_owner has not In situations in which the registration of the client_owner has not
occurred previously, the client ID must first be used, along with the occurred previously, the client ID must first be used, along with the
skipping to change at line 25505 skipping to change at line 25531
derived from the SSV, and the derivation is via the hash derived from the SSV, and the derivation is via the hash
algorithm. The selection of an encryption algorithm with a algorithm. The selection of an encryption algorithm with a
key length that exceeded the length of the output of the key length that exceeded the length of the output of the
hash algorithm would require padding, and thus weaken the hash algorithm would require padding, and thus weaken the
use of the encryption algorithm. use of the encryption algorithm.
o hash length SHOULD be <= SSV length. This is because the o hash length SHOULD be <= SSV length. This is because the
SSV is a key used to derive subkeys via an HMAC, and it is SSV is a key used to derive subkeys via an HMAC, and it is
recommended that the key used as input to an HMAC be at recommended that the key used as input to an HMAC be at
least as long as the length of the HMAC's hash algorithm's least as long as the length of the HMAC's hash algorithm's
output (see Section 3 of [51]). output (see Section 3 of [52]).
o key length SHOULD be <= SSV length. This is a transitive o key length SHOULD be <= SSV length. This is a transitive
result of the above two invariants. result of the above two invariants.
o key length SHOULD be >= hash length / 2. This is because o key length SHOULD be >= hash length / 2. This is because
the subkey derivation is via an HMAC and it is recommended the subkey derivation is via an HMAC and it is recommended
that if the HMAC has to be truncated, it should not be that if the HMAC has to be truncated, it should not be
truncated to less than half the hash length (see Section 4 truncated to less than half the hash length (see Section 4
of RFC 2104 [51]). of RFC 2104 [52]).
- Number of concurrent versions of the SSV the client and server - Number of concurrent versions of the SSV the client and server
will support (see Section 2.10.9). This property is will support (see Section 2.10.9). This property is
represented by spi_window in the EXCHANGE_ID results. The represented by spi_window in the EXCHANGE_ID results. The
property may be updated by subsequent EXCHANGE_ID operations. property may be updated by subsequent EXCHANGE_ID operations.
* The client's implementation ID as represented by the * The client's implementation ID as represented by the
eia_client_impl_id field of the arguments. The property may be eia_client_impl_id field of the arguments. The property may be
updated by subsequent EXCHANGE_ID requests. updated by subsequent EXCHANGE_ID requests.
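The length invariants listed above for the SSV algorithms can be checked mechanically. The sketch below is illustrative only: check_ssv_lengths is a hypothetical helper, all three lengths are assumed to be expressed in the same unit, and the first check (key length <= hash length) is inferred from the rationale about padding given at the start of the list.

```python
def check_ssv_lengths(key_len: int, hash_len: int, ssv_len: int) -> list:
    """Return descriptions of violated invariants (hypothetical helper).

    An empty list means every invariant holds.
    """
    violations = []
    # Subkeys are derived via the hash algorithm, so a longer key
    # would require padding (invariant inferred from the rationale).
    if key_len > hash_len:
        violations.append("key length > hash length")
    # The SSV is the HMAC key and should be at least as long as the
    # hash output.
    if hash_len > ssv_len:
        violations.append("hash length > SSV length")
    # Transitive consequence of the two checks above.
    if key_len > ssv_len:
        violations.append("key length > SSV length")
    # A truncated HMAC should keep at least half of the hash output;
    # doubling the key length avoids integer division.
    if 2 * key_len < hash_len:
        violations.append("key length < hash length / 2")
    return violations
```

For instance, a 128-bit key with a 256-bit hash and a 256-bit SSV satisfies all four checks, while a 64-bit key with the same hash fails only the last one.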
skipping to change at line 25613 skipping to change at line 25639
it. If an update to the client ID changes the value of it. If an update to the client ID changes the value of
EXCHGID4_FLAG_BIND_PRINC_STATEID's client ID property, the effect EXCHGID4_FLAG_BIND_PRINC_STATEID's client ID property, the effect
applies only to new stateids. Existing stateids (and all stateids applies only to new stateids. Existing stateids (and all stateids
with the same "other" field) that were created with stateid to with the same "other" field) that were created with stateid to
principal binding in force will continue to have binding in force. principal binding in force will continue to have binding in force.
Existing stateids (and all stateids with the same "other" field) that Existing stateids (and all stateids with the same "other" field) that
were created with stateid to principal not in force will continue to were created with stateid to principal not in force will continue to
have binding not in force. have binding not in force.
The EXCHGID4_FLAG_USE_NON_PNFS, EXCHGID4_FLAG_USE_PNFS_MDS, and The EXCHGID4_FLAG_USE_NON_PNFS, EXCHGID4_FLAG_USE_PNFS_MDS, and
EXCHGID4_FLAG_USE_PNFS_DS bits are described in Section 2.10.2.2 and EXCHGID4_FLAG_USE_PNFS_DS bits are described in Section 13.1 and
convey roles the client ID is to be used for in a pNFS environment. convey roles the client ID is to be used for in a pNFS environment.
The server MUST set one of the acceptable combinations of these bits The server MUST set one of the acceptable combinations of these bits
(roles) in eir_flags, as specified in that section. Note that the (roles) in eir_flags, as specified in that section. Note that the
same client owner/server owner pair can have multiple roles. same client owner/server owner pair can have multiple roles.
Multiple roles can be associated with the same client ID or with Multiple roles can be associated with the same client ID or with
different client IDs. Thus, if a client sends EXCHANGE_ID from the different client IDs. Thus, if a client sends EXCHANGE_ID from the
same client owner to the same server owner multiple times, but same client owner to the same server owner multiple times, but
specifies different pNFS roles each time, the server might return specifies different pNFS roles each time, the server might return
different client IDs. Given that different pNFS roles might have different client IDs. Given that different pNFS roles might have
different client IDs, the client may ask for different properties for different client IDs, the client may ask for different properties for
skipping to change at line 26240 skipping to change at line 26266
is currently in non-RDMA mode but has the capability to operate is currently in non-RDMA mode but has the capability to operate
in RDMA mode, then the client is requesting that the server in RDMA mode, then the client is requesting that the server
"step up" to RDMA mode on the connection. If the server "step up" to RDMA mode on the connection. If the server
agrees, it sets CREATE_SESSION4_FLAG_CONN_RDMA in the result agrees, it sets CREATE_SESSION4_FLAG_CONN_RDMA in the result
field csr_flags. If CREATE_SESSION4_FLAG_CONN_RDMA is not set field csr_flags. If CREATE_SESSION4_FLAG_CONN_RDMA is not set
in csa_flags, then CREATE_SESSION4_FLAG_CONN_RDMA MUST NOT be in csa_flags, then CREATE_SESSION4_FLAG_CONN_RDMA MUST NOT be
set in csr_flags. Note that once the server agrees to step up, set in csr_flags. Note that once the server agrees to step up,
it and the client MUST exchange all future traffic on the it and the client MUST exchange all future traffic on the
connection with RPC RDMA framing and not Record Marking ([32]). connection with RPC RDMA framing and not Record Marking ([32]).
csa_fore_chan_attrs, csa_fore_chan_attrs: The csa_fore_chan_attrs csa_fore_chan_attrs, csa_back_chan_attrs: The csa_fore_chan_attrs
and csa_back_chan_attrs fields apply to attributes of the fore and csa_back_chan_attrs fields apply to attributes of the fore
channel (which conveys requests originating from the client to the channel (which conveys requests originating from the client to the
server), and the backchannel (the channel that conveys callback server), and the backchannel (the channel that conveys callback
requests originating from the server to the client), respectively. requests originating from the server to the client), respectively.
The results are in corresponding structures called The results are in corresponding structures called
csr_fore_chan_attrs and csr_back_chan_attrs. The results csr_fore_chan_attrs and csr_back_chan_attrs. The results
establish attributes for each channel, and on all subsequent use establish attributes for each channel, and on all subsequent use
of each channel of the session. Each structure has the following of each channel of the session. Each structure has the following
fields: fields:
skipping to change at line 27230 skipping to change at line 27256
expansive recovery of file system objects if the metadata server does expansive recovery of file system objects if the metadata server does
not get a positive indication from all clients holding a not get a positive indication from all clients holding a
LAYOUTIOMODE4_RW layout that they have successfully completed all LAYOUTIOMODE4_RW layout that they have successfully completed all
their writes. Sending a LAYOUTCOMMIT (if required) and then their writes. Sending a LAYOUTCOMMIT (if required) and then
following with LAYOUTRETURN can provide such an indication and allow following with LAYOUTRETURN can provide such an indication and allow
for graceful and efficient recovery. for graceful and efficient recovery.
If loca_reclaim is TRUE, the metadata server is free to either If loca_reclaim is TRUE, the metadata server is free to either
examine or ignore the value in the field loca_stateid. The metadata examine or ignore the value in the field loca_stateid. The metadata
server implementation might or might not encode in its layout stateid server implementation might or might not encode in its layout stateid
information that allows the metadate server to perform a consistency information that allows the metadata server to perform a consistency
check on the LAYOUTCOMMIT request. check on the LAYOUTCOMMIT request.
18.43. Operation 50: LAYOUTGET - Get Layout Information 18.43. Operation 50: LAYOUTGET - Get Layout Information
18.43.1. ARGUMENT 18.43.1. ARGUMENT
struct LAYOUTGET4args { struct LAYOUTGET4args {
/* CURRENT_FH: file */ /* CURRENT_FH: file */
bool loga_signal_layout_avail; bool loga_signal_layout_avail;
layouttype4 loga_layout_type; layouttype4 loga_layout_type;
skipping to change at line 28279 skipping to change at line 28305
This operation is used to update the SSV for a client ID. Before This operation is used to update the SSV for a client ID. Before
SET_SSV is called the first time on a client ID, the SSV is zero. SET_SSV is called the first time on a client ID, the SSV is zero.
The SSV is the key used for the SSV GSS mechanism (Section 2.10.9) The SSV is the key used for the SSV GSS mechanism (Section 2.10.9)
SET_SSV MUST be preceded by a SEQUENCE operation in the same SET_SSV MUST be preceded by a SEQUENCE operation in the same
COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV COMPOUND. It MUST NOT be used if the client did not opt for SP4_SSV
state protection when the client ID was created (see Section 18.35); state protection when the client ID was created (see Section 18.35);
the server returns NFS4ERR_INVAL in that case. the server returns NFS4ERR_INVAL in that case.
The field ssa_digest is computed as the output of the HMAC (RFC 2104 The field ssa_digest is computed as the output of the HMAC (RFC 2104
[51]) using the subkey derived from the SSV4_SUBKEY_MIC_I2T and [52]) using the subkey derived from the SSV4_SUBKEY_MIC_I2T and
current SSV as the key (see Section 2.10.9 for a description of current SSV as the key (see Section 2.10.9 for a description of
subkeys), and an XDR encoded value of data type ssa_digest_input4. subkeys), and an XDR encoded value of data type ssa_digest_input4.
The field sdi_seqargs is equal to the arguments of the SEQUENCE The field sdi_seqargs is equal to the arguments of the SEQUENCE
operation for the COMPOUND procedure that SET_SSV is within. operation for the COMPOUND procedure that SET_SSV is within.
The argument ssa_ssv is XORed with the current SSV to produce the new The argument ssa_ssv is XORed with the current SSV to produce the new
SSV. The argument ssa_ssv SHOULD be generated randomly. SSV. The argument ssa_ssv SHOULD be generated randomly.
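The digest and XOR steps described above can be sketched as follows. This is an illustrative sketch and not part of either revision of the RFC text. It assumes SHA-256 as the hash negotiated at EXCHANGE_ID, and it assumes subkeys are derived as an HMAC keyed by the SSV over the XDR-encoded ssv_subkey4 value, which is how this sketch reads Section 2.10.9. The function names are hypothetical, and digest_input_xdr stands for an already XDR-encoded ssa_digest_input4.

```python
import hashlib
import hmac
import struct

# Selector values from the ssv_subkey4 enumeration.
SSV4_SUBKEY_MIC_I2T = 1
SSV4_SUBKEY_MIC_T2I = 2

HASH = hashlib.sha256  # assumption: the hash negotiated at EXCHANGE_ID


def derive_subkey(ssv, which):
    # HMAC keyed by the SSV over the 4-byte big-endian (XDR) enum value.
    return hmac.new(ssv, struct.pack(">i", which), HASH).digest()


def compute_ssa_digest(current_ssv, digest_input_xdr):
    # The request digest uses the I2T MIC subkey of the *current* SSV.
    key = derive_subkey(current_ssv, SSV4_SUBKEY_MIC_I2T)
    return hmac.new(key, digest_input_xdr, HASH).digest()


def update_ssv(current_ssv, ssa_ssv):
    # The new SSV is the byte-wise XOR of the current SSV and ssa_ssv.
    if len(current_ssv) != len(ssa_ssv):
        raise ValueError("ssa_ssv must have the same length as the SSV")
    return bytes(a ^ b for a, b in zip(current_ssv, ssa_ssv))
```

Because the SSV starts as zero, the first SET_SSV on a client ID yields a new SSV equal to ssa_ssv. A caller would typically obtain ssa_ssv from a cryptographic random source such as os.urandom.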
In the response, ssr_digest is the output of the HMAC using the In the response, ssr_digest is the output of the HMAC using the
subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and subkey derived from SSV4_SUBKEY_MIC_T2I and new SSV as the key, and
skipping to change at line 28599 skipping to change at line 28625
DESTROY_CLIENTID allows a server to immediately reclaim the resources DESTROY_CLIENTID allows a server to immediately reclaim the resources
consumed by an unused client ID, and also to forget that it ever consumed by an unused client ID, and also to forget that it ever
generated the client ID. By forgetting that it ever generated the generated the client ID. By forgetting that it ever generated the
client ID, the server can safely reuse the client ID on a future client ID, the server can safely reuse the client ID on a future
EXCHANGE_ID operation. EXCHANGE_ID operation.
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished
18.51.1. ARGUMENT 18.51.1. ARGUMENT
<CODE BEGINS>
struct RECLAIM_COMPLETE4args { struct RECLAIM_COMPLETE4args {
/* /*
* If rca_one_fs TRUE, * If rca_one_fs TRUE,
* *
* CURRENT_FH: object in * CURRENT_FH: object in
* file system reclaim is * file system reclaim is
* complete for. * complete for.
*/ */
bool rca_one_fs; bool rca_one_fs;
}; };
<CODE ENDS>
18.51.2. RESULTS 18.51.2. RESULTS
<CODE BEGINS>
struct RECLAIM_COMPLETE4res { struct RECLAIM_COMPLETE4res {
nfsstat4 rcr_status; nfsstat4 rcr_status;
}; };
<CODE ENDS>
18.51.3. DESCRIPTION 18.51.3. DESCRIPTION
A RECLAIM_COMPLETE operation is used to indicate that the client has A RECLAIM_COMPLETE operation is used to indicate that the client has
reclaimed all of the locking state that it will recover using reclaimed all of the locking state that it will recover using
reclaim, when it is recovering state due to either a server restart reclaim, when it is recovering state due to either a server restart
or the migration of a file system to another server. There are two or the migration of a file system to another server. There are two
types of RECLAIM_COMPLETE operations: types of RECLAIM_COMPLETE operations:
* When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. * When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done.
skipping to change at line 28680 skipping to change at line 28702
These two may be done in any order as long as all necessary lock These two may be done in any order as long as all necessary lock
reclaims have been done before issuing either of them. reclaims have been done before issuing either of them.
Any locks not reclaimed at the point at which RECLAIM_COMPLETE is Any locks not reclaimed at the point at which RECLAIM_COMPLETE is
done become non-reclaimable. The client MUST NOT attempt to reclaim done become non-reclaimable. The client MUST NOT attempt to reclaim
them, either during the current server instance or in any subsequent them, either during the current server instance or in any subsequent
server instance, or on another server to which responsibility for server instance, or on another server to which responsibility for
that file system is transferred. If the client were to do so, it that file system is transferred. If the client were to do so, it
would be violating the protocol by representing itself as owning would be violating the protocol by representing itself as owning
locks that it does not own, and so has no right to reclaim. See locks that it does not own, and so has no right to reclaim. See
Section 8.4.3 of [65] for a discussion of edge conditions related to Section 8.4.3 of [66] for a discussion of edge conditions related to
lock reclaim. lock reclaim.
By sending a RECLAIM_COMPLETE, the client indicates readiness to By sending a RECLAIM_COMPLETE, the client indicates readiness to
proceed to do normal non-reclaim locking operations. The client proceed to do normal non-reclaim locking operations. The client
should be aware that such operations may temporarily result in should be aware that such operations may temporarily result in
NFS4ERR_GRACE errors until the server is ready to terminate its grace NFS4ERR_GRACE errors until the server is ready to terminate its grace
period. period.
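The client-side consequence of the rules above can be sketched as follows. This is an illustrative sketch and not part of either revision of the RFC text. The ReclaimTracker class and its methods are hypothetical, and the sketch covers a single reclaim scope (either global or one file system) rather than both at once.

```python
class ReclaimTracker:
    """Tracks which locks a client may still reclaim within one scope."""

    def __init__(self, held_locks):
        self.pending = set(held_locks)   # locks not yet reclaimed
        self.reclaimed = set()           # locks successfully reclaimed
        self.complete_sent = False

    def reclaim(self, lock):
        if self.complete_sent:
            # After RECLAIM_COMPLETE the remaining locks are non-reclaimable.
            raise RuntimeError("reclaim attempted after RECLAIM_COMPLETE")
        if lock not in self.pending:
            raise KeyError(lock)
        self.pending.discard(lock)
        self.reclaimed.add(lock)

    def send_reclaim_complete(self):
        """Mark the scope complete and return the locks that are now lost."""
        self.complete_sent = True
        lost, self.pending = self.pending, set()
        return lost
```

The set returned by send_reclaim_complete() is the state the client must treat as lost; it must not be presented for reclaim later, including on a subsequent server instance.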
18.51.4. IMPLEMENTATION 18.51.4. IMPLEMENTATION
skipping to change at line 29550 skipping to change at line 29572
The client is to return OPEN_DELEGATE_WRITE delegations on regular The client is to return OPEN_DELEGATE_WRITE delegations on regular
file objects. file objects.
RCA4_TYPE_MASK_DIR_DLG RCA4_TYPE_MASK_DIR_DLG
The client is to return directory delegations. The client is to return directory delegations.
RCA4_TYPE_MASK_FILE_LAYOUT RCA4_TYPE_MASK_FILE_LAYOUT
The client is to return layouts of type LAYOUT4_NFSV4_1_FILES. The client is to return layouts of type LAYOUT4_NFSV4_1_FILES.
RCA4_TYPE_MASK_BLK_LAYOUT RCA4_TYPE_MASK_BLK_LAYOUT
See [47] for a description. See [48] for a description.
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX RCA4_TYPE_MASK_OBJ_LAYOUT_MIN to RCA4_TYPE_MASK_OBJ_LAYOUT_MAX
See [46] for a description. See [47] for a description.
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX
This range is reserved for telling the client to recall layouts of This range is reserved for telling the client to recall layouts of
experimental or site-specific layout types (see Section 3.3.13). experimental or site-specific layout types (see Section 3.3.13).
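How a client might evaluate the type mask listed above, together with the NFS4ERR_INVAL rule stated in the paragraph that follows, can be sketched as below. This is an illustrative sketch and not part of either revision of the RFC text. The bit numbers and the NFS4ERR_INVAL value come from the NFSv4.1 XDR; check_type_mask and its supported argument are hypothetical, and treating the two reserved layout ranges as defined types is a choice made by this sketch.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22

# Bit numbers within the CB_RECALL_ANY type mask.
RCA4_TYPE_MASK_RDATA_DLG = 0
RCA4_TYPE_MASK_WDATA_DLG = 1
RCA4_TYPE_MASK_DIR_DLG = 2
RCA4_TYPE_MASK_FILE_LAYOUT = 3
RCA4_TYPE_MASK_BLK_LAYOUT = 4
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8
RCA4_TYPE_MASK_OBJ_LAYOUT_MAX = 9
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN = 12
RCA4_TYPE_MASK_OTHER_LAYOUT_MAX = 15

# Assumption of this sketch: both reserved ranges count as defined types.
DEFINED_BITS = (
    set(range(RCA4_TYPE_MASK_RDATA_DLG, RCA4_TYPE_MASK_BLK_LAYOUT + 1))
    | set(range(RCA4_TYPE_MASK_OBJ_LAYOUT_MIN, RCA4_TYPE_MASK_OBJ_LAYOUT_MAX + 1))
    | set(range(RCA4_TYPE_MASK_OTHER_LAYOUT_MIN, RCA4_TYPE_MASK_OTHER_LAYOUT_MAX + 1))
)


def check_type_mask(mask, supported):
    """Return (status, bits_to_recall) for a 32-bit type mask.

    supported is the hypothetical set of bit numbers this client implements.
    """
    to_recall = []
    for bit in range(32):
        if not mask & (1 << bit):
            continue
        if bit not in DEFINED_BITS:
            # Undefined recallable type: the whole request is invalid.
            return NFS4ERR_INVAL, []
        if bit in supported:
            to_recall.append(bit)
        # Defined but unsupported types are skipped, not rejected.
    return NFS4_OK, to_recall
```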
When a bit is set in the type mask that corresponds to an undefined When a bit is set in the type mask that corresponds to an undefined
type of recallable object, NFS4ERR_INVAL MUST be returned. When a type of recallable object, NFS4ERR_INVAL MUST be returned. When a
bit is set that corresponds to a defined type of object but the bit is set that corresponds to a defined type of object but the
client does not support an object of the type, NFS4ERR_INVAL MUST NOT client does not support an object of the type, NFS4ERR_INVAL MUST NOT
be returned. Future minor versions of NFSv4 may expand the set of be returned. Future minor versions of NFSv4 may expand the set of
skipping to change at line 30240 skipping to change at line 30262
Similar considerations apply if the threat to be avoided is the Similar considerations apply if the threat to be avoided is the
redirection of client traffic to inappropriate (i.e., poorly redirection of client traffic to inappropriate (i.e., poorly
performing) servers. In both cases, there is no reason for the performing) servers. In both cases, there is no reason for the
information returned to depend on the identity of the client information returned to depend on the identity of the client
principal requesting it, while the validity of the server principal requesting it, while the validity of the server
information, which has the capability to affect all client information, which has the capability to affect all client
principals, is of considerable importance. principals, is of considerable importance.
22. IANA Considerations 22. IANA Considerations
This section uses terms that are defined in [62]. This section uses terms that are defined in [63].
22.1. IANA Actions 22.1. IANA Actions
This update does not require any modification of, or additions to, This update does not require any modification of, or additions to,
registry entries or registry rules associated with NFSv4.1. However, registry entries or registry rules associated with NFSv4.1. However,
since this document obsoletes RFC 8881, IANA has updated all registry since this document obsoletes RFC 5661, IANA has updated all registry
entries and registry rules references that point to RFC 5661 to point entries and registry rules references that point to RFC 5661 to point
to this document instead. to this document instead.
Previous actions by IANA related to NFSv4.1 are listed in the Previous actions by IANA related to NFSv4.1 are listed in the
remaining subsections of Section 22. remaining subsections of Section 22.
22.2. Named Attribute Definitions 22.2. Named Attribute Definitions
IANA created a registry called the "NFSv4 Named Attribute Definitions IANA created a registry called the "NFSv4 Named Attribute Definitions
Registry". Registry".
skipping to change at line 30274 skipping to change at line 30296
attributes as needed, they are encouraged to register the attributes attributes as needed, they are encouraged to register the attributes
with IANA. with IANA.
Such registered named attributes are presumed to apply to all minor Such registered named attributes are presumed to apply to all minor
versions of NFSv4, including those defined subsequently to the versions of NFSv4, including those defined subsequently to the
registration. If the named attribute is intended to be limited to registration. If the named attribute is intended to be limited to
specific minor versions, this will be clearly stated in the specific minor versions, this will be clearly stated in the
registry's assignment. registry's assignment.
All assignments to the registry are made on a First Come First Served All assignments to the registry are made on a First Come First Served
basis, per Section 4.1 of [62]. The policy for each assignment is basis, per Section 4.4 of [63]. The policy for each assignment is
Specification Required, per Section 4.1 of [62]. Specification Required, per Section 4.6 of [63].
Under the NFSv4.1 specification, the name of a named attribute can in Under the NFSv4.1 specification, the name of a named attribute can in
theory be up to 2^(32) - 1 bytes in length, but in practice NFSv4.1 theory be up to 2^(32) - 1 bytes in length, but in practice NFSv4.1
clients and servers will be unable to handle a string that long. clients and servers will be unable to handle a string that long.
IANA should reject any assignment request with a named attribute that IANA should reject any assignment request with a named attribute that
exceeds 128 UTF-8 characters. To give the IESG the flexibility to exceeds 128 UTF-8 characters. To give the IESG the flexibility to
set up bases of assignment of Experimental Use and Standards Action, set up bases of assignment of Experimental Use and Standards Action,
the prefixes of "EXPE" and "STDS" are Reserved. The named attribute the prefixes of "EXPE" and "STDS" are Reserved. The named attribute
with a zero-length name is Reserved. with a zero-length name is Reserved.
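The registration constraints above can be sketched as a simple check. This is an illustrative sketch and not part of either revision of the RFC text. The function name is hypothetical, and counting Python string code points is this sketch's reading of "128 UTF-8 characters".

```python
RESERVED_PREFIXES = ("EXPE", "STDS")
MAX_REGISTERED_NAME_CHARS = 128


def is_registrable_named_attribute(name):
    """Return True if name passes the registry's stated naming limits."""
    if len(name) == 0:
        # The zero-length name is Reserved.
        return False
    if len(name) > MAX_REGISTERED_NAME_CHARS:
        # Longer names are to be rejected by IANA.
        return False
    if name.startswith(RESERVED_PREFIXES):
        # "EXPE" and "STDS" prefixes are Reserved.
        return False
    return True
```

The check covers only the naming limits stated above; the Specification Required review is outside its scope.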
skipping to change at line 30334 skipping to change at line 30356
The potential exists for new notification types to be added to the The potential exists for new notification types to be added to the
CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done CB_NOTIFY_DEVICEID operation (see Section 20.12). This can be done
via changes to the operations that register notifications, or by via changes to the operations that register notifications, or by
adding new operations to NFSv4. This requires a new minor version of adding new operations to NFSv4. This requires a new minor version of
NFSv4, and requires a Standards Track document from the IETF. NFSv4, and requires a Standards Track document from the IETF.
Another way to add a notification is to specify a new layout type Another way to add a notification is to specify a new layout type
(see Section 22.5). (see Section 22.5).
Hence, all assignments to the registry are made on a Standards Action Hence, all assignments to the registry are made on a Standards Action
basis per Section 4.1 of [62], with Expert Review required. basis per Section 4.6 of [63], with Expert Review required.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the notification type. This name must have the 1. The name of the notification type. This name must have the
prefix "NOTIFY_DEVICEID4_". This name must be unique. prefix "NOTIFY_DEVICEID4_". This name must be unique.
2. The value of the notification. IANA will assign this number, and 2. The value of the notification. IANA will assign this number, and
the request from the registrant will use TBD1 instead of an the request from the registrant will use TBD1 instead of an
actual value. IANA MUST use a whole number that can be no higher actual value. IANA MUST use a whole number that can be no higher
skipping to change at line 30405 skipping to change at line 30427
The potential exists for new object types to be added to the The potential exists for new object types to be added to the
CB_RECALL_ANY operation (see Section 20.6). This can be done via CB_RECALL_ANY operation (see Section 20.6). This can be done via
changes to the operations that add recallable types, or by adding new changes to the operations that add recallable types, or by adding new
operations to NFSv4. This requires a new minor version of NFSv4, and operations to NFSv4. This requires a new minor version of NFSv4, and
requires a Standards Track document from IETF. Another way to add a requires a Standards Track document from IETF. Another way to add a
new recallable object is to specify a new layout type (see new recallable object is to specify a new layout type (see
Section 22.5). Section 22.5).
All assignments to the registry are made on a Standards Action basis All assignments to the registry are made on a Standards Action basis
per Section 4.1 of [62], with Expert Review required. per Section 4.9 of [63], with Expert Review required.
Recallable object types are 32-bit unsigned numbers. There are no Recallable object types are 32-bit unsigned numbers. There are no
Reserved values. Values in the range 12 through 15, inclusive, are Reserved values. Values in the range 12 through 15, inclusive, are
designated for Private Use. designated for Private Use.
The registry is a list of assignments, each containing five fields The registry is a list of assignments, each containing five fields
per assignment. per assignment.
1. The name of the recallable object type. This name must have the 1. The name of the recallable object type. This name must have the
prefix "RCA4_TYPE_MASK_". The name must be unique. prefix "RCA4_TYPE_MASK_". The name must be unique.
skipping to change at line 30607 skipping to change at line 30629
access-control models are preserved. That is, if a metadata access-control models are preserved. That is, if a metadata
server would restrict a READ or WRITE operation, how would server would restrict a READ or WRITE operation, how would
pNFS via the layout similarly restrict a corresponding input pNFS via the layout similarly restrict a corresponding input
or output operation? or output operation?
3. The author documents the new layout specification as an Internet- 3. The author documents the new layout specification as an Internet-
Draft. Draft.
4. The author submits the Internet-Draft for review through the IETF 4. The author submits the Internet-Draft for review through the IETF
standards process as defined in "The Internet Standards Process-- standards process as defined in "The Internet Standards Process--
Revision 3" (BCP 9). The new layout specification will be Revision 3" (BCP 9 [35]). The new layout specification will be
submitted for eventual publication as a Standards Track RFC. submitted for eventual publication as a Standards Track RFC.
5. The layout specification progresses through the IETF standards 5. The layout specification progresses through the IETF standards
process. process.
22.6. Path Variable Definitions 22.6. Path Variable Definitions
This section deals with the IANA considerations associated with the This section deals with the IANA considerations associated with the
variable substitution feature for location names as described in variable substitution feature for location names as described in
Section 11.17.3. As described there, variables subject to Section 11.17.3. As described there, variables subject to
skipping to change at line 30896 skipping to change at line 30918
1003.1, 2004 Edition, HTML Version", ISBN 1931624232, 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
2004, <https://www.opengroup.org>. 2004, <https://www.opengroup.org>.
[25] Schaad, J., Kaliski, B., and R. Housley, "Additional [25] Schaad, J., Kaliski, B., and R. Housley, "Additional
Algorithms and Identifiers for RSA Cryptography for use in Algorithms and Identifiers for RSA Cryptography for use in
the Internet X.509 Public Key Infrastructure Certificate the Internet X.509 Public Key Infrastructure Certificate
and Certificate Revocation List (CRL) Profile", RFC 4055, and Certificate Revocation List (CRL) Profile", RFC 4055,
DOI 10.17487/RFC4055, June 2005, DOI 10.17487/RFC4055, June 2005,
<https://www.rfc-editor.org/info/rfc4055>. <https://www.rfc-editor.org/info/rfc4055>.
[26] National Institute of Standards and Technology, [26] National Institute of Standards and Technology, "Computer
"Cryptographic Algorithm Object Registration", November Security Objects Register", May 2016,
2007, <https://csrc.nist.gov/projects/computer-security-objects-
<http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/ register/algorithm-registration>.
algorithms.html>.
[27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC) [27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", RFC 7861, DOI 10.17487/RFC7861, Security Version 3", RFC 7861, DOI 10.17487/RFC7861,
November 2016, <https://www.rfc-editor.org/info/rfc7861>. November 2016, <https://www.rfc-editor.org/info/rfc7861>.
[28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The [28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The
Kerberos Network Authentication Service (V5)", RFC 4120, Kerberos Network Authentication Service (V5)", RFC 4120,
DOI 10.17487/RFC4120, July 2005, DOI 10.17487/RFC4120, July 2005,
<https://www.rfc-editor.org/info/rfc4120>. <https://www.rfc-editor.org/info/rfc4120>.
skipping to change at line 30940 skipping to change at line 30961
[33] Lever, C., "Network File System (NFS) Upper-Layer Binding [33] Lever, C., "Network File System (NFS) Upper-Layer Binding
to RPC-over-RDMA Version 1", RFC 8267, to RPC-over-RDMA Version 1", RFC 8267,
DOI 10.17487/RFC8267, October 2017, DOI 10.17487/RFC8267, October 2017,
<https://www.rfc-editor.org/info/rfc8267>. <https://www.rfc-editor.org/info/rfc8267>.
[34] Hoffman, P. and P. McManus, "DNS Queries over HTTPS [34] Hoffman, P. and P. McManus, "DNS Queries over HTTPS
(DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018, (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018,
<https://www.rfc-editor.org/info/rfc8484>. <https://www.rfc-editor.org/info/rfc8484>.
[35] Bradner, S., "The Internet Standards Process -- Revision
3", BCP 9, RFC 2026, October 1996.
Kolkman, O., Bradner, S., and S. Turner, "Characterization
of Proposed Standards", BCP 9, RFC 7127, January 2014.
Dusseault, L. and R. Sparks, "Guidance on Interoperation
and Implementation Reports for Advancement to Draft
Standard", BCP 9, RFC 5657, September 2009.
Housley, R., Crocker, D., and E. Burger, "Reducing the
Standards Track to Two Maturity Levels", BCP 9, RFC 6410,
October 2011.
Resnick, P., "Retirement of the "Internet Official
Protocol Standards" Summary Document", BCP 9, RFC 7100,
December 2013.
Dawkins, S., "Increasing the Number of Area Directors in
an IETF Area", BCP 9, RFC 7475, March 2015.
<https://www.rfc-editor.org/info/bcp9>
23.2. Informative References 23.2. Informative References
[35] Roach, A., "Process for Handling Non-Major Revisions to [36] Roach, A., "Process for Handling Non-Major Revisions to
Existing RFCs", Work in Progress, Internet-Draft, draft- Existing RFCs", Work in Progress, Internet-Draft, draft-
roach-bis-documents-00, 7 May 2019, roach-bis-documents-00, 7 May 2019,
<https://tools.ietf.org/html/draft-roach-bis-documents- <https://tools.ietf.org/html/draft-roach-bis-documents-
00>. 00>.
[36] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., [37] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
Beame, C., Eisler, M., and D. Noveck, "Network File System Beame, C., Eisler, M., and D. Noveck, "Network File System
(NFS) version 4 Protocol", RFC 3530, DOI 10.17487/RFC3530, (NFS) version 4 Protocol", RFC 3530, DOI 10.17487/RFC3530,
April 2003, <https://www.rfc-editor.org/info/rfc3530>. April 2003, <https://www.rfc-editor.org/info/rfc3530>.
[37] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS [38] Callaghan, B., Pawlowski, B., and P. Staubach, "NFS
Version 3 Protocol Specification", RFC 1813, Version 3 Protocol Specification", RFC 1813,
DOI 10.17487/RFC1813, June 1995, DOI 10.17487/RFC1813, June 1995,
<https://www.rfc-editor.org/info/rfc1813>. <https://www.rfc-editor.org/info/rfc1813>.
[38] Eisler, M., "LIPKEY - A Low Infrastructure Public Key [39] Eisler, M., "LIPKEY - A Low Infrastructure Public Key
Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847, Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847,
June 2000, <https://www.rfc-editor.org/info/rfc2847>. June 2000, <https://www.rfc-editor.org/info/rfc2847>.
[39] Eisler, M., "NFS Version 2 and Version 3 Security Issues [40] Eisler, M., "NFS Version 2 and Version 3 Security Issues
and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5",
RFC 2623, DOI 10.17487/RFC2623, June 1999, RFC 2623, DOI 10.17487/RFC2623, June 1999,
<https://www.rfc-editor.org/info/rfc2623>. <https://www.rfc-editor.org/info/rfc2623>.
[40] Juszczak, C., "Improving the Performance and Correctness [41] Juszczak, C., "Improving the Performance and Correctness
of an NFS Server", USENIX Conference Proceedings, June of an NFS Server", USENIX Conference Proceedings, June
1990. 1990.
[41] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced [42] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced
by an On-line Database", RFC 3232, DOI 10.17487/RFC3232, by an On-line Database", RFC 3232, DOI 10.17487/RFC3232,
January 2002, <https://www.rfc-editor.org/info/rfc3232>. January 2002, <https://www.rfc-editor.org/info/rfc3232>.
[42] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", [43] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
RFC 1833, DOI 10.17487/RFC1833, August 1995, RFC 1833, DOI 10.17487/RFC1833, August 1995,
<https://www.rfc-editor.org/info/rfc1833>. <https://www.rfc-editor.org/info/rfc1833>.
[43] Werme, R., "RPC XID Issues", USENIX Conference [44] Werme, R., "RPC XID Issues", USENIX Conference
Proceedings, February 1996. Proceedings, February 1996.
[44] Nowicki, B., "NFS: Network File System Protocol [45] Nowicki, B., "NFS: Network File System Protocol
specification", RFC 1094, DOI 10.17487/RFC1094, March specification", RFC 1094, DOI 10.17487/RFC1094, March
1989, <https://www.rfc-editor.org/info/rfc1094>. 1989, <https://www.rfc-editor.org/info/rfc1094>.
[45] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly [46] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly
Available Network Server", USENIX Conference Proceedings, Available Network Server", USENIX Conference Proceedings,
January 1991. January 1991.
[46] Halevy, B., Welch, B., and J. Zelenka, "Object-Based [47] Halevy, B., Welch, B., and J. Zelenka, "Object-Based
Parallel NFS (pNFS) Operations", RFC 5664, Parallel NFS (pNFS) Operations", RFC 5664,
DOI 10.17487/RFC5664, January 2010, DOI 10.17487/RFC5664, January 2010,
<https://www.rfc-editor.org/info/rfc5664>. <https://www.rfc-editor.org/info/rfc5664>.
[47] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS [48] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS
(pNFS) Block/Volume Layout", RFC 5663, (pNFS) Block/Volume Layout", RFC 5663,
DOI 10.17487/RFC5663, January 2010, DOI 10.17487/RFC5663, January 2010,
<https://www.rfc-editor.org/info/rfc5663>. <https://www.rfc-editor.org/info/rfc5663>.
[48] Callaghan, B., "WebNFS Client Specification", RFC 2054, [49] Callaghan, B., "WebNFS Client Specification", RFC 2054,
DOI 10.17487/RFC2054, October 1996, DOI 10.17487/RFC2054, October 1996,
<https://www.rfc-editor.org/info/rfc2054>. <https://www.rfc-editor.org/info/rfc2054>.
[49] Callaghan, B., "WebNFS Server Specification", RFC 2055, [50] Callaghan, B., "WebNFS Server Specification", RFC 2055,
DOI 10.17487/RFC2055, October 1996, DOI 10.17487/RFC2055, October 1996,
<https://www.rfc-editor.org/info/rfc2055>. <https://www.rfc-editor.org/info/rfc2055>.
[50] IESG, "IESG Processing of RFC Errata for the IETF Stream", [51] IESG, "IESG Processing of RFC Errata for the IETF Stream",
July 2008, July 2008,
<https://www.ietf.org/about/groups/iesg/statements/ <https://www.ietf.org/about/groups/iesg/statements/
processing-rfc-errata/>. processing-rfc-errata/>.
[51] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- [52] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Hashing for Message Authentication", RFC 2104, Hashing for Message Authentication", RFC 2104,
DOI 10.17487/RFC2104, February 1997, DOI 10.17487/RFC2104, February 1997,
<https://www.rfc-editor.org/info/rfc2104>. <https://www.rfc-editor.org/info/rfc2104>.
[52] Shepler, S., "NFS Version 4 Design Considerations", [53] Shepler, S., "NFS Version 4 Design Considerations",
RFC 2624, DOI 10.17487/RFC2624, June 1999, RFC 2624, DOI 10.17487/RFC2624, June 1999,
<https://www.rfc-editor.org/info/rfc2624>. <https://www.rfc-editor.org/info/rfc2624>.
[53] The Open Group, "Protocols for Interworking: XNFS, Version [54] The Open Group, "Protocols for Interworking: XNFS, Version
3W", ISBN 1-85912-184-5, February 1998. 3W", ISBN 1-85912-184-5, February 1998.
[54] Floyd, S. and V. Jacobson, "The Synchronization of [55] Floyd, S. and V. Jacobson, "The Synchronization of
Periodic Routing Messages", IEEE/ACM Transactions on Periodic Routing Messages", IEEE/ACM Transactions on
Networking, 2(2), pp. 122-136, April 1994. Networking, 2(2), pp. 122-136, April 1994.
[55] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., [56] Chadalapaka, M., Satran, J., Meth, K., and D. Black,
and E. Zeidner, "Internet Small Computer Systems Interface "Internet Small Computer System Interface (iSCSI) Protocol
(iSCSI)", RFC 3720, DOI 10.17487/RFC3720, April 2004, (Consolidated)", RFC 7143, DOI 10.17487/RFC7143, April
<https://www.rfc-editor.org/info/rfc3720>. 2014, <https://www.rfc-editor.org/info/rfc7143>.
[56] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version [57] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version
(FCP-2)", ANSI/INCITS, 350-2003, October 2003. (FCP-2)", ANSI/INCITS, 350-2003, October 2003.
[57] Weber, R.O., "Object-Based Storage Device Commands (OSD)", [58] Weber, R.O., "Object-Based Storage Device Commands (OSD)",
ANSI/INCITS, 400-2004, July 2004, ANSI/INCITS, 400-2004, July 2004,
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. <https://www.t10.org/drafts.htm>.
[58] Carns, P. H., Ligon III, W. B., Ross, R. B., and R. [59] Carns, P. H., Ligon III, W. B., Ross, R. B., and R.
Thakur, "PVFS: A Parallel File System for Linux Thakur, "PVFS: A Parallel File System for Linux
Clusters.", Proceedings of the 4th Annual Linux Showcase Clusters.", Proceedings of the 4th Annual Linux Showcase
and Conference, 2000. and Conference, 2000.
[59] The Open Group, "The Open Group Base Specifications Issue [60] The Open Group, "The Open Group Base Specifications Issue
6, IEEE Std 1003.1, 2004 Edition", 2004, 6, IEEE Std 1003.1, 2004 Edition", 2004,
<https://www.opengroup.org>. <https://www.opengroup.org>.
[60] Callaghan, B., "NFS URL Scheme", RFC 2224, [61] Callaghan, B., "NFS URL Scheme", RFC 2224,
DOI 10.17487/RFC2224, October 1997, DOI 10.17487/RFC2224, October 1997,
<https://www.rfc-editor.org/info/rfc2224>. <https://www.rfc-editor.org/info/rfc2224>.
[61] Chiu, A., Eisler, M., and B. Callaghan, "Security [62] Chiu, A., Eisler, M., and B. Callaghan, "Security
Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755, Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755,
January 2000, <https://www.rfc-editor.org/info/rfc2755>. January 2000, <https://www.rfc-editor.org/info/rfc2755>.
[62] Narten, T. and H. Alvestrand, "Guidelines for Writing an [63] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
IANA Considerations Section in RFCs", RFC 5226, Writing an IANA Considerations Section in RFCs", BCP 26,
DOI 10.17487/RFC5226, May 2008, RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc5226>. <https://www.rfc-editor.org/info/rfc8126>.
[63] RFC Errata, Erratum ID 2006, RFC 5661, [64] RFC Errata, Erratum ID 2006, RFC 5661,
<https://www.rfc-editor.org/errata/eid2006>. <https://www.rfc-editor.org/errata/eid2006>.
[64] Spasojevic, M. and M. Satayanarayanan, "An Empirical Study [65] Spasojevic, M. and M. Satayanarayanan, "An Empirical Study
of a Wide-Area Distributed File System", May 1996, of a Wide-Area Distributed File System", ACM Transactions
<https://www.cs.cmu.edu/~satya/docdir/spasojevic-tocs-afs- on Computer Systems, Vol. 14, No. 2, pp. 200-222,
measurement-1996.pdf>. DOI 10.1145/227695.227698, May 1996,
<https://doi.org/10.1145/227695.227698>.
[65] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [66] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
<https://www.rfc-editor.org/info/rfc5661>. <https://www.rfc-editor.org/info/rfc5661>.
[66] Noveck, D., "Rules for NFSv4 Extensions and Minor [67] Noveck, D., "Rules for NFSv4 Extensions and Minor
Versions", RFC 8178, DOI 10.17487/RFC8178, July 2017, Versions", RFC 8178, DOI 10.17487/RFC8178, July 2017,
<https://www.rfc-editor.org/info/rfc8178>. <https://www.rfc-editor.org/info/rfc8178>.
[67] Haynes, T., Ed. and D. Noveck, Ed., "Network File System [68] Haynes, T., Ed. and D. Noveck, Ed., "Network File System
(NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530, (NFS) Version 4 Protocol", RFC 7530, DOI 10.17487/RFC7530,
March 2015, <https://www.rfc-editor.org/info/rfc7530>. March 2015, <https://www.rfc-editor.org/info/rfc7530>.
[68] Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker, [69] Noveck, D., Ed., Shivam, P., Lever, C., and B. Baker,
"NFSv4.0 Migration: Specification Update", RFC 7931, "NFSv4.0 Migration: Specification Update", RFC 7931,
DOI 10.17487/RFC7931, July 2016, DOI 10.17487/RFC7931, July 2016,
<https://www.rfc-editor.org/info/rfc7931>. <https://www.rfc-editor.org/info/rfc7931>.
[69] Haynes, T., "Requirements for Parallel NFS (pNFS) Layout [70] Haynes, T., "Requirements for Parallel NFS (pNFS) Layout
Types", RFC 8434, DOI 10.17487/RFC8434, August 2018, Types", RFC 8434, DOI 10.17487/RFC8434, August 2018,
<https://www.rfc-editor.org/info/rfc8434>. <https://www.rfc-editor.org/info/rfc8434>.
[70] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an [71] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
2014, <https://www.rfc-editor.org/info/rfc7258>. 2014, <https://www.rfc-editor.org/info/rfc7258>.
[71] Rescorla, E. and B. Korver, "Guidelines for Writing RFC [72] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552, Text on Security Considerations", BCP 72, RFC 3552,
DOI 10.17487/RFC3552, July 2003, DOI 10.17487/RFC3552, July 2003,
<https://www.rfc-editor.org/info/rfc3552>. <https://www.rfc-editor.org/info/rfc3552>.
Appendix A. The Need for This Update Appendix A. The Need for This Update
This document includes an explanation of how clients and servers are This document includes an explanation of how clients and servers are
to determine the particular network access paths to be used to access to determine the particular network access paths to be used to access
a file system. This includes descriptions of how to handle changes a file system. This includes descriptions of how to handle changes
to the specific replica to be used or to the set of addresses to be to the specific replica to be used or to the set of addresses to be
used to access it, and how to deal transparently with transfers of used to access it, and how to deal transparently with transfers of
responsibility that need to be made. This includes cases in which responsibility that need to be made. This includes cases in which
there is a shift between one replica and another and those in which there is a shift between one replica and another and those in which
different network access paths are used to access the same replica. different network access paths are used to access the same replica.
As a result of the following problems in RFC 5661 [65], it was As a result of the following problems in RFC 5661 [66], it was
necessary to provide the specific updates that are made by this necessary to provide the specific updates that are made by this
document. These updates are described in Appendix B. document. These updates are described in Appendix B.
* RFC 5661 [65], while it dealt with situations in which various * RFC 5661 [66], while it dealt with situations in which various
forms of clustering allowed coordination of the state assigned by forms of clustering allowed coordination of the state assigned by
cooperating servers to be used, made no provisions for Transparent cooperating servers to be used, made no provisions for Transparent
State Migration. Within NFSv4.0, Transparent State Migration was State Migration. Within NFSv4.0, Transparent State Migration was
first explained clearly in RFC 7530 [67] and corrected and first explained clearly in RFC 7530 [68] and corrected and
clarified by RFC 7931 [68]. No corresponding explanation for clarified by RFC 7931 [69]. No corresponding explanation for
NFSv4.1 had been provided. NFSv4.1 had been provided.
* Although NFSv4.1 provided a clear definition of how trunking * Although NFSv4.1 provided a clear definition of how trunking
detection was to be done, there was no clear specification of how detection was to be done, there was no clear specification of how
trunking discovery was to be done, despite the fact that the trunking discovery was to be done, despite the fact that the
specification clearly indicated that this information could be specification clearly indicated that this information could be
made available via the file system location attributes. made available via the file system location attributes.
* Because the existence of multiple network access paths to the same * Because the existence of multiple network access paths to the same
file system was dealt with as if there were multiple replicas, file system was dealt with as if there were multiple replicas,
skipping to change at line 31145 skipping to change at line 31190
the addresses used to access a particular file system instance. the addresses used to access a particular file system instance.
As a result, in situations in which both migration and trunking As a result, in situations in which both migration and trunking
configuration changes were involved, neither of these could be configuration changes were involved, neither of these could be
clearly dealt with, and the relationship between these two clearly dealt with, and the relationship between these two
features was not seriously addressed. features was not seriously addressed.
* Because use of two network access paths to the same file system * Because use of two network access paths to the same file system
instance (i.e., trunking) was often treated as if two replicas instance (i.e., trunking) was often treated as if two replicas
were involved, it was considered that two replicas were being used were involved, it was considered that two replicas were being used
simultaneously. As a result, the treatment of replicas being used simultaneously. As a result, the treatment of replicas being used
simultaneously in RFC 5661 [65] was not clear, as it covered the simultaneously in RFC 5661 [66] was not clear, as it covered the
two distinct cases of a single file system instance being accessed two distinct cases of a single file system instance being accessed
by two different network access paths and two replicas being by two different network access paths and two replicas being
accessed simultaneously, with the limitations of the latter case accessed simultaneously, with the limitations of the latter case
not being clearly laid out. not being clearly laid out.
The majority of the consequences of these issues are dealt with by The majority of the consequences of these issues are dealt with by
presenting in Section 11 a replacement for Section 11 of RFC 5661 presenting in Section 11 a replacement for Section 11 of RFC 5661
[65]. This replacement modifies existing subsections within that [66]. This replacement modifies existing subsections within that
section and adds new ones as described in Appendix B.1. Also, some section and adds new ones as described in Appendix B.1. Also, some
existing sections were deleted. These changes were made in order to existing sections were deleted. These changes were made in order to
do the following: do the following:
* Reorganize the description so that the case of two network access * Reorganize the description so that the case of two network access
paths to the same file system instance is distinguished clearly paths to the same file system instance is distinguished clearly
from the case of two different replicas since, in the former case, from the case of two different replicas since, in the former case,
locking state is shared and there also can be sharing of session locking state is shared and there also can be sharing of session
state. state.
* Provide a clear statement regarding the desirability of * Provide a clear statement regarding the desirability of
transparent transfer of state between replicas together with a transparent transfer of state between replicas together with a
recommendation that either transparent transfer or a single-fs recommendation that either transparent transfer or a single-fs
grace period be provided. grace period be provided.
* Specifically delineate how a client is to handle such transfers, * Specifically delineate how a client is to handle such transfers,
taking into account the differences from the treatment in [68] taking into account the differences from the treatment in [69]
made necessary by the major protocol changes to NFSv4.1. made necessary by the major protocol changes to NFSv4.1.
* Discuss the relationship between transparent state transfer and * Discuss the relationship between transparent state transfer and
Parallel NFS (pNFS). Parallel NFS (pNFS).
* Clarify the fs_locations_info attribute in order to specify which * Clarify the fs_locations_info attribute in order to specify which
portions of the provided information apply to a specific network portions of the provided information apply to a specific network
access path and which apply to the replica that the path is used access path and which apply to the replica that the path is used
to access. to access.
In addition, other sections of RFC 5661 [65] were updated to correct In addition, other sections of RFC 5661 [66] were updated to correct
the consequences of the incorrect assumptions underlying the the consequences of the incorrect assumptions underlying the
treatment of multi-server namespace issues. These are described in treatment of multi-server namespace issues. These are described in
Appendices B.2 through B.4. Appendices B.2 through B.4.
* A revised introductory section regarding multi-server namespace * A revised introductory section regarding multi-server namespace
facilities is provided. facilities is provided.
* A more realistic treatment of server scope is provided. This * A more realistic treatment of server scope is provided. This
treatment reflects the more limited coordination of locking state treatment reflects the more limited coordination of locking state
adopted by servers actually sharing a common server scope. adopted by servers actually sharing a common server scope.
skipping to change at line 31213 skipping to change at line 31258
situations that would arise in dealing with Transparent State situations that would arise in dealing with Transparent State
Migration, or because some types of reclaim issues were not Migration, or because some types of reclaim issues were not
adequately dealt with in the context of fs-specific grace periods. adequately dealt with in the context of fs-specific grace periods.
For details, see Appendix B.2. For details, see Appendix B.2.
Appendix B. Changes in This Update Appendix B. Changes in This Update
B.1. Revisions Made to Section 11 of RFC 5661 B.1. Revisions Made to Section 11 of RFC 5661
A number of areas have been revised or extended, in many cases A number of areas have been revised or extended, in many cases
replacing subsections within Section 11 of RFC 5661 [65]: replacing subsections within Section 11 of RFC 5661 [66]:
* New introductory material, including a terminology section, * New introductory material, including a terminology section,
replaces the material in RFC 5661 [65], ranging from the start of replaces the material in RFC 5661 [66], ranging from the start of
the original Section 11 up to and including Section 11.1. The new the original Section 11 up to and including Section 11.1. The new
material starts at the beginning of Section 11 and continues material starts at the beginning of Section 11 and continues
through 11.2. through 11.2.
* A significant reorganization of the material in Sections 11.4 and * A significant reorganization of the material in Sections 11.4 and
11.5 of RFC 5661 [65] was necessary. The reasons for the 11.5 of RFC 5661 [66] was necessary. The reasons for the
reorganization of these sections into a single section with reorganization of these sections into a single section with
multiple subsections are discussed in Appendix B.1.1 below. This multiple subsections are discussed in Appendix B.1.1 below. This
replacement appears as Section 11.5. replacement appears as Section 11.5.
New material relating to the handling of the file system location New material relating to the handling of the file system location
attributes is contained in Sections 11.5.1 and 11.5.7. attributes is contained in Sections 11.5.1 and 11.5.7.
* A new section describing requirements for user and group handling * A new section describing requirements for user and group handling
within a multi-server namespace has been added as Section 11.7. within a multi-server namespace has been added as Section 11.7.
* A major replacement for Section 11.7 of RFC 5661 [65], entitled * A major replacement for Section 11.7 of RFC 5661 [66], entitled
"Effecting File System Transitions", appears as Sections 11.9 "Effecting File System Transitions", appears as Sections 11.9
through 11.14. The reasons for the reorganization of this section through 11.14. The reasons for the reorganization of this section
into multiple sections are discussed in Appendix B.1.2. into multiple sections are discussed in Appendix B.1.2.
* A replacement for Section 11.10 of RFC 5661 [65], entitled "The * A replacement for Section 11.10 of RFC 5661 [66], entitled "The
Attribute fs_locations_info", appears as Section 11.17, with Attribute fs_locations_info", appears as Section 11.17, with
Appendix B.1.3 describing the differences between the new section Appendix B.1.3 describing the differences between the new section
and the treatment within [65]. A revised treatment was necessary and the treatment within [66]. A revised treatment was necessary
because the original treatment did not make clear how the added because the original treatment did not make clear how the added
attribute information relates to the case of trunked paths to the attribute information relates to the case of trunked paths to the
same replica. These issues were not addressed in RFC 5661 [65] same replica. These issues were not addressed in RFC 5661 [66]
where the concepts of a replica and a network path used to access where the concepts of a replica and a network path used to access
a replica were not clearly distinguished. a replica were not clearly distinguished.
B.1.1. Reorganization of Sections 11.4 and 11.5 of RFC 5661 B.1.1. Reorganization of Sections 11.4 and 11.5 of RFC 5661
Previously, issues related to the fact that multiple location entries Previously, issues related to the fact that multiple location entries
directed the client to the same file system instance were dealt with directed the client to the same file system instance were dealt with
in Section 11.5 of RFC 5661 [65]. Because of the new treatment of in Section 11.5 of RFC 5661 [66]. Because of the new treatment of
trunking, these issues now belong within Section 11.5. trunking, these issues now belong within Section 11.5.
In this new section, trunking is covered in Section 11.5.2 together In this new section, trunking is covered in Section 11.5.2 together
with the other uses of file system location information described in with the other uses of file system location information described in
Sections 11.5.3 through 11.5.6. Sections 11.5.3 through 11.5.6.
As a result, Section 11.5, which replaces Section 11.4 of RFC 5661 As a result, Section 11.5, which replaces Section 11.4 of RFC 5661
[65], is substantially different than the section it replaces in that [66], is substantially different than the section it replaces in that
some original sections have been replaced by corresponding sections some original sections have been replaced by corresponding sections
as described below, while new sections have been added: as described below, while new sections have been added:
* The material in Section 11.5, exclusive of subsections, replaces * The material in Section 11.5, exclusive of subsections, replaces
the material in Section 11.4 of RFC 5661 [65] exclusive of the material in Section 11.4 of RFC 5661 [66] exclusive of
subsections. subsections.
* Section 11.5.1 is the new first subsection of the overall section. * Section 11.5.1 is the new first subsection of the overall section.
* Section 11.5.2 is the new second subsection of the overall * Section 11.5.2 is the new second subsection of the overall
section. section.
* Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in * Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in
order) one of the corresponding Sections 11.4.1, 11.4.2, and order) one of the corresponding Sections 11.4.1, 11.4.2, and
11.4.3 of RFC 5661 [65]. 11.4.4, and 11.4.5. 11.4.3 of RFC 5661 [66].
* Section 11.5.7 is the new final subsection of the overall section. * Section 11.5.7 is the new final subsection of the overall section.
B.1.2. Reorganization of Material Dealing with File System Transitions B.1.2. Reorganization of Material Dealing with File System Transitions
The material relating to file system transition, previously contained The material relating to file system transition, previously contained
in Section 11.7 of RFC 5661 [65], has been reorganized and augmented in Section 11.7 of RFC 5661 [66], has been reorganized and augmented
as described below: as described below:
* Because there can be a shift of the network access paths used to * Because there can be a shift of the network access paths used to
access a file system instance without any shift between replicas, access a file system instance without any shift between replicas,
a new Section 11.9 distinguishes between those cases in which a new Section 11.9 distinguishes between those cases in which
there is a shift between distinct replicas and those involving a there is a shift between distinct replicas and those involving a
shift in network access paths with no shift between replicas. shift in network access paths with no shift between replicas.
As a result, the new Section 11.10 deals with network address As a result, the new Section 11.10 deals with network address
transitions, while the bulk of the original Section 11.7 of RFC transitions, while the bulk of the original Section 11.7 of RFC
5661 [65] has been extensively modified as reflected in 5661 [66] has been extensively modified as reflected in
Section 11.11, which is now limited to cases in which there is a Section 11.11, which is now limited to cases in which there is a
shift between two different sets of replicas. shift between two different sets of replicas.
* The additional Section 11.12 discusses the case in which a shift * The additional Section 11.12 discusses the case in which a shift
to a different replica is made and state is transferred to allow to a different replica is made and state is transferred to allow
the client the ability to have continued access to its accumulated the client the ability to have continued access to its accumulated
locking state on the new server. locking state on the new server.
* The additional Section 11.13 discusses the client's response to * The additional Section 11.13 discusses the client's response to
access transitions, how it determines whether migration has access transitions, how it determines whether migration has
occurred, and how it gets access to any transferred locking and occurred, and how it gets access to any transferred locking and
session state. session state.
* The additional Section 11.14 discusses the responsibilities of the * The additional Section 11.14 discusses the responsibilities of the
source and destination servers when transferring locking and source and destination servers when transferring locking and
session state. session state.
This reorganization has caused a renumbering of the sections within This reorganization has caused a renumbering of the sections within
Section 11 of [65] as described below: Section 11 of [66] as described below:
* The new Sections 11.9 and 11.10 have resulted in the renumbering * The new Sections 11.9 and 11.10 have resulted in the renumbering
of existing sections with these numbers. of existing sections with these numbers.
* Section 11.7 of [65] has been substantially modified and appears * Section 11.7 of [66] has been substantially modified and appears
as Section 11.11. The necessary modifications reflect the fact as Section 11.11. The necessary modifications reflect the fact
that this section only deals with transitions between replicas, that this section only deals with transitions between replicas,
while transitions between network addresses are dealt with in while transitions between network addresses are dealt with in
other sections. Details of the reorganization are described later other sections. Details of the reorganization are described later
in this section. in this section.
* Sections 11.12, 11.13, and 11.14 have been added. * Sections 11.12, 11.13, and 11.14 have been added.
* Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [65] now * Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [66] now
appear as Sections 11.15, 11.16, 11.17, and 11.18, respectively. appear as Sections 11.15, 11.16, 11.17, and 11.18, respectively.
As part of this general reorganization, Section 11.7 of RFC 5661 [65] As part of this general reorganization, Section 11.7 of RFC 5661 [66]
has been modified as described below: has been modified as described below:
* Sections 11.7 and 11.7.1 of RFC 5661 [65] have been replaced by * Sections 11.7 and 11.7.1 of RFC 5661 [66] have been replaced by
Sections 11.11 and 11.11.1, respectively. Sections 11.11 and 11.11.1, respectively.
* Section 11.7.2 of RFC 5661 (and included subsections) has been * Section 11.7.2 of RFC 5661 (and included subsections) has been
deleted. deleted.
* Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661 * Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661
[65] have been replaced by Sections 11.11.2, 11.11.3, 11.11.4, [66] have been replaced by Sections 11.11.2, 11.11.3, 11.11.4,
11.11.4.1, and 11.11.5, respectively, in this document. 11.11.4.1, and 11.11.5, respectively, in this document.
* Section 11.7.7 of RFC 5661 [65] has been replaced by * Section 11.7.7 of RFC 5661 [66] has been replaced by
Section 11.11.9. This subsection has been moved to the end of the Section 11.11.9. This subsection has been moved to the end of the
section dealing with file system transitions. section dealing with file system transitions.
* Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [65] have been * Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [66] have been
replaced by Sections 11.11.6, 11.11.7, and 11.11.8, respectively, in replaced by Sections 11.11.6, 11.11.7, and 11.11.8, respectively, in
this document. this document.
B.1.3. Updates to the Treatment of fs_locations_info B.1.3. Updates to the Treatment of fs_locations_info
Various elements of the fs_locations_info attribute contain Various elements of the fs_locations_info attribute contain
information that applies to either a specific file system replica or information that applies to either a specific file system replica or
to a network path or set of network paths used to access such a to a network path or set of network paths used to access such a
replica. The original treatment of fs_locations_info (Section 11.10 replica. The original treatment of fs_locations_info (Section 11.10
of RFC 5661 [65]) did not clearly distinguish these cases, in part of RFC 5661 [66]) did not clearly distinguish these cases, in part
because the document did not clearly distinguish replicas from the because the document did not clearly distinguish replicas from the
paths used to access them. paths used to access them.
In addition, special clarification has been provided with regard to In addition, special clarification has been provided with regard to
the following fields: the following fields:
* With regard to the handling of FSLI4GF_GOING, it was clarified * With regard to the handling of FSLI4GF_GOING, it was clarified
that this only applies to the unavailability of a replica rather that this only applies to the unavailability of a replica rather
than to a path to access a replica. than to a path to access a replica.
* In describing the appropriate value for a server to use for * In describing the appropriate value for a server to use for
fli_valid_for, it was clarified that there is no need for the fli_valid_for, it was clarified that there is no need for the
client to frequently fetch the fs_locations_info value to be client to frequently fetch the fs_locations_info value to be
prepared for shifts in trunking patterns. prepared for shifts in trunking patterns.
* Clarification of the rules for extensions to the fls_info has been * Clarification of the rules for extensions to the fls_info has been
provided. The original treatment reflected the extension model provided. The original treatment reflected the extension model
that was in effect at the time RFC 5661 [65] was written, but has that was in effect at the time RFC 5661 [66] was written, but has
been updated in accordance with the extension model described in been updated in accordance with the extension model described in
RFC 8178 [66]. RFC 8178 [67].
B.2. Revisions Made to Operations in RFC 5661 B.2. Revisions Made to Operations in RFC 5661
Descriptions have been revised to address issues that arose in Descriptions have been revised to address issues that arose in
effecting necessary changes to multi-server namespace features. effecting necessary changes to multi-server namespace features.
* The treatment of EXCHANGE_ID (Section 18.35 of RFC 5661 [65]) * The treatment of EXCHANGE_ID (Section 18.35 of RFC 5661 [66])
assumed that client IDs cannot be created/confirmed other than by assumed that client IDs cannot be created/confirmed other than by
the EXCHANGE_ID and CREATE_SESSION operations. Also, the the EXCHANGE_ID and CREATE_SESSION operations. Also, the
necessary use of EXCHANGE_ID in recovery from migration and necessary use of EXCHANGE_ID in recovery from migration and
related situations was not clearly addressed. A revised treatment related situations was not clearly addressed. A revised treatment
of EXCHANGE_ID was necessary, and it appears in Section 18.35, of EXCHANGE_ID was necessary, and it appears in Section 18.35,
while the specific differences between it and the treatment within while the specific differences between it and the treatment within
[65] are explained in Appendix B.2.1 below. [66] are explained in Appendix B.2.1 below.
* The treatment of RECLAIM_COMPLETE in Section 18.51 of RFC 5661 * The treatment of RECLAIM_COMPLETE in Section 18.51 of RFC 5661
[65] was not sufficiently clear about the purpose and use of the [66] was not sufficiently clear about the purpose and use of the
rca_one_fs and how the server was to deal with inappropriate rca_one_fs and how the server was to deal with inappropriate
values of this argument. Because the resulting confusion raised values of this argument. Because the resulting confusion raised
interoperability issues, a new treatment of RECLAIM_COMPLETE was interoperability issues, a new treatment of RECLAIM_COMPLETE was
necessary, and it appears in Section 18.51, while the specific necessary, and it appears in Section 18.51, while the specific
differences between it and the treatment within RFC 5661 [65] are differences between it and the treatment within RFC 5661 [66] are
discussed in Appendix B.2.2 below. In addition, the definitions discussed in Appendix B.2.2 below. In addition, the definitions
of the reclaim-related errors have received an updated treatment of the reclaim-related errors have received an updated treatment
in Section 15.1.9 to reflect the fact that there are multiple in Section 15.1.9 to reflect the fact that there are multiple
contexts for lock reclaim operations. contexts for lock reclaim operations.
B.2.1. Revision of Treatment of EXCHANGE_ID B.2.1. Revision of Treatment of EXCHANGE_ID
There were a number of issues in the original treatment of EXCHANGE_ID There were a number of issues in the original treatment of EXCHANGE_ID
in RFC 5661 [65] that caused problems for Transparent State Migration in RFC 5661 [66] that caused problems for Transparent State Migration
and for the transfer of access between different network access paths and for the transfer of access between different network access paths
to the same file system instance. to the same file system instance.
These issues arose from the fact that this treatment was written: These issues arose from the fact that this treatment was written:
* Assuming that a client ID can only become known to a server by * Assuming that a client ID can only become known to a server by
having been created by executing an EXCHANGE_ID, with confirmation having been created by executing an EXCHANGE_ID, with confirmation
of the ID only possible by execution of a CREATE_SESSION. of the ID only possible by execution of a CREATE_SESSION.
* Considering the interactions between a client and a server only * Considering the interactions between a client and a server only
occurring on a single network address. occurring on a single network address.
As these assumptions have become invalid in the context of As these assumptions have become invalid in the context of
Transparent State Migration and active use of trunking, the treatment Transparent State Migration and active use of trunking, the treatment
has been modified in several respects: has been modified in several respects:
* It had been assumed that an EXCHANGE_ID executed when the server * It had been assumed that an EXCHANGE_ID executed when the server
is already aware of a given client instance must be either was already aware that a given client instance was either updating
updating associated parameters (e.g., with respect to callbacks) associated parameters (e.g., with respect to callbacks) or dealing
or a lingering retransmission to deal with a previously lost with a previously lost reply by retransmitting. As a result, any
reply. As result, any slot sequence returned by that operation slot sequence returned by that operation would be of no use. The
would be of no use. The original treatment went so far as to say original treatment went so far as to say that it "MUST NOT" be
that it "MUST NOT" be used, although this usage was not in accord used, although this usage was not in accord with [1]. This
with [1]. This created a difficulty when an EXCHANGE_ID is done created a difficulty when an EXCHANGE_ID is done after Transparent
after Transparent State Migration since that slot sequence would State Migration since that slot sequence would need to be used in
need to be used in a subsequent CREATE_SESSION. a subsequent CREATE_SESSION.
In the updated treatment, CREATE_SESSION is a way that client IDs In the updated treatment, CREATE_SESSION is a way that client IDs
are confirmed, but it is understood that other ways are possible. are confirmed, but it is understood that other ways are possible.
The slot sequence can be used as needed, and cases in which it The slot sequence can be used as needed, and cases in which it
would be of no use are appropriately noted. would be of no use are appropriately noted.
* It had been assumed that the only functions of EXCHANGE_ID were to * It had been assumed that the only functions of EXCHANGE_ID were to
inform the server of the client, to create the client ID, and to inform the server of the client, to create the client ID, and to
communicate it to the client. When multiple simultaneous communicate it to the client. When multiple simultaneous
connections are involved, as often happens when trunking, that connections are involved, as often happens when trunking, that
treatment was inadequate in that it ignored the role of treatment was inadequate in that it ignored the role of
EXCHANGE_ID in associating the client ID with the connection on EXCHANGE_ID in associating the client ID with the connection on
which it was done, so that it could be used by a subsequent which it was done, so that it could be used by a subsequent
CREATE_SESSSION whose parameters do not include an explicit client CREATE_SESSION whose parameters do not include an explicit client
ID. ID.
The new treatment explicitly discusses the role of EXCHANGE_ID in The new treatment explicitly discusses the role of EXCHANGE_ID in
associating the client ID with the connection so it can be used by associating the client ID with the connection so it can be used by
CREATE_SESSION and in associating a connection with an existing CREATE_SESSION and in associating a connection with an existing
session. session.
The new treatment can be found in Section 18.35 above. It supersedes The new treatment can be found in Section 18.35 above. It supersedes
the treatment in Section 18.35 of RFC 5661 [65]. the treatment in Section 18.35 of RFC 5661 [66].
B.2.2. Revision of Treatment of RECLAIM_COMPLETE B.2.2. Revision of Treatment of RECLAIM_COMPLETE
The following changes were made to the treatment of RECLAIM_COMPLETE The following changes were made to the treatment of RECLAIM_COMPLETE
in RFC 5661 [65] to arrive at the treatment in Section 18.51: in RFC 5661 [66] to arrive at the treatment in Section 18.51:
* In a number of places, the text was made more explicit about the * In a number of places, the text was made more explicit about the
purpose of rca_one_fs and its connection to file system migration. purpose of rca_one_fs and its connection to file system migration.
* There is a discussion of situations in which particular forms of * There is a discussion of situations in which particular forms of
RECLAIM_COMPLETE would need to be done. RECLAIM_COMPLETE would need to be done.
* There is a discussion of interoperability issues between * There is a discussion of interoperability issues between
implementations that may have arisen due to the lack of clarity of implementations that may have arisen due to the lack of clarity of
the previous treatment of RECLAIM_COMPLETE. the previous treatment of RECLAIM_COMPLETE.
B.3. Revisions Made to Error Definitions in RFC 5661 B.3. Revisions Made to Error Definitions in RFC 5661
The new handling of various situations required revisions to some The new handling of various situations required revisions to some
existing error definitions: existing error definitions:
* Because of the need to appropriately address trunking-related * Because of the need to appropriately address trunking-related
issues, some uses of the term "replica" in RFC 5661 [65] became issues, some uses of the term "replica" in RFC 5661 [66] became
problematic because a shift in network access paths was considered problematic because a shift in network access paths was considered
to be a shift to a different replica. As a result, the original to be a shift to a different replica. As a result, the original
definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC 5661 [65]) definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC 5661 [66])
was updated to reflect the different handling of unavailability of was updated to reflect the different handling of unavailability of
a particular fs via a specific network address. a particular fs via a specific network address.
Since such a situation is no longer considered to constitute Since such a situation is no longer considered to constitute
unavailability of a file system instance, the description has been unavailability of a file system instance, the description has been
changed, even though the set of circumstances in which it is to be changed, even though the set of circumstances in which it is to be
returned remains the same. The new paragraph explicitly returned remains the same. The new paragraph explicitly
recognizes that a different network address might be used, while recognizes that a different network address might be used, while
the previous description, misleadingly, treated this as a shift the previous description, misleadingly, treated this as a shift
between two replicas while only a single file system instance between two replicas while only a single file system instance
might be involved. The updated description appears in might be involved. The updated description appears in
Section 15.1.2.4. Section 15.1.2.4.
* Because of the need to accommodate the use of fs-specific grace * Because of the need to accommodate the use of fs-specific grace
periods, it was necessary to clarify some of the definitions of periods, it was necessary to clarify some of the definitions of
reclaim-related errors in Section 15 of RFC 5661 [65] so that the reclaim-related errors in Section 15 of RFC 5661 [66] so that the
text applies properly to reclaims for all types of grace periods. text applies properly to reclaims for all types of grace periods.
The updated descriptions appear within Section 15.1.9. The updated descriptions appear within Section 15.1.9.
* Because of the need to provide the clarifications in errata report * Because of the need to provide the clarifications in errata report
2006 [63] and to adapt these to properly explain the interaction 2006 [64] and to adapt these to properly explain the interaction
of NFS4ERR_DELAY with the replay cache, a revised description of of NFS4ERR_DELAY with the reply cache, a revised description of
NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report, NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report,
unlike many other RFC 5661 errata reports, is addressed in this unlike many other RFC 5661 errata reports, is addressed in this
document because of the extensive use of NFS4ERR_DELAY in document because of the extensive use of NFS4ERR_DELAY in
connection with state migration and session migration. connection with state migration and session migration.
B.4. Other Revisions Made to RFC 5661 B.4. Other Revisions Made to RFC 5661
Besides the major reworking of Section 11 of RFC 5661 [65] and the Besides the major reworking of Section 11 of RFC 5661 [66] and the
associated revisions to existing operations and errors, there were a associated revisions to existing operations and errors, there were a
number of related changes that were necessary: number of related changes that were necessary:
* The summary in Section 1.7.3.3 of RFC 5661 [65] was revised to * The summary in Section 1.7.3.3 of RFC 5661 [66] was revised to
reflect the changes made to Section 11 above. The updated summary reflect the changes made to Section 11 above. The updated summary
appears as Section 1.8.3.3 above. appears as Section 1.8.3.3 above.
* The discussion of server scope in Section 2.10.4 of RFC 5661 [65] * The discussion of server scope in Section 2.10.4 of RFC 5661 [66]
was replaced since it appeared to require a level of inter-server was replaced since it appeared to require a level of inter-server
coordination incompatible with its basic function of avoiding the coordination incompatible with its basic function of avoiding the
need for a globally uniform means of assigning server_owner need for a globally uniform means of assigning server_owner
values. A revised treatment appears in Section 2.10.4. values. A revised treatment appears in Section 2.10.4.
* The discussion of trunking in Section 2.10.5 of RFC 5661 [65] was * The discussion of trunking in Section 2.10.5 of RFC 5661 [66] was
revised to more clearly explain the multiple types of trunking revised to more clearly explain the multiple types of trunking
support and how the client can be made aware of the existing support and how the client can be made aware of the existing
trunking configuration. In addition, while the last paragraph trunking configuration. In addition, while the last paragraph
(exclusive of subsections) of that section dealing with (exclusive of subsections) of that section dealing with
server_owner changes was literally true, it had been a source of server_owner changes was literally true, it had been a source of
confusion. Since the original paragraph could be read as confusion. Since the original paragraph could be read as
suggesting that such changes be handled nondisruptively, the issue suggesting that such changes be handled nondisruptively, the issue
was clarified in the revised Section 2.10.5. was clarified in the revised Section 2.10.5.
Appendix C. Security Issues That Need to Be Addressed Appendix C. Security Issues That Need to Be Addressed
The following issues in the treatment of security within the NFSv4.1 The following issues in the treatment of security within the NFSv4.1
specification need to be addressed: specification need to be addressed:
* The Security Considerations Section of RFC 5661 [65] was not * The Security Considerations Section of RFC 5661 [66] was not
written in accordance with RFC 3552 (BCP 72) [71]. Of particular written in accordance with RFC 3552 (BCP 72) [72]. Of particular
concern was the fact that the section did not contain a threat concern was the fact that the section did not contain a threat
analysis. analysis.
* Initial analysis of the existing security issues with NFSv4.1 has * Initial analysis of the existing security issues with NFSv4.1 has
made it likely that a revised Security Considerations section for made it likely that a revised Security Considerations section for
the existing protocol (one containing a threat analysis) would be the existing protocol (one containing a threat analysis) would be
likely to conclude that NFSv4.1 does not meet the goal of secure likely to conclude that NFSv4.1 does not meet the goal of secure
use on the Internet. use on the Internet.
The Security Considerations section of this document (Section 21) has The Security Considerations section of this document (Section 21) has
skipping to change at line 31574 skipping to change at line 31619
creates need to be addressed. Addressing this issue must not be creates need to be addressed. Addressing this issue must not be
limited to the questions of whether the designation of this as limited to the questions of whether the designation of this as
OPTIONAL was justified and whether it should be changed. OPTIONAL was justified and whether it should be changed.
In any event, it may not be possible at this point to correct the In any event, it may not be possible at this point to correct the
security problems created by continued use of AUTH_SYS simply by security problems created by continued use of AUTH_SYS simply by
revising this designation. revising this designation.
* The lack of attention within the protocol to the possibility of * The lack of attention within the protocol to the possibility of
pervasive monitoring attacks such as those described in RFC 7258 pervasive monitoring attacks such as those described in RFC 7258
[70] (also BCP 188). [71] (also BCP 188).
In that connection, the use of CREATE_SESSION without privacy In that connection, the use of CREATE_SESSION without privacy
protection needs to be addressed as it exposes the session ID to protection needs to be addressed as it exposes the session ID to
view by an attacker. This is worrisome as this is precisely the view by an attacker. This is worrisome as this is precisely the
type of protocol artifact alluded to in RFC 7258, which can enable type of protocol artifact alluded to in RFC 7258, which can enable
further mischief on the part of the attacker as it enables denial- further mischief on the part of the attacker as it enables denial-
of-service attacks that can be executed effectively with only a of-service attacks that can be executed effectively with only a
single, normally low-value, credential, even when RPCSEC_GSS single, normally low-value, credential, even when RPCSEC_GSS
authentication is in use. authentication is in use.
 End of changes. 205 change blocks. 
475 lines changed or deleted 520 lines changed or added
