rfc8881.form.txt   rfc8881.txt 
Internet Engineering Task Force (IETF) D. Noveck, Ed. Internet Engineering Task Force (IETF) D. Noveck, Ed.
Request for Comments: 0000 NetApp Request for Comments: 8881 NetApp
Obsoletes: 5661 C. Lever Obsoletes: 5661 C. Lever
Category: Standards Track ORACLE Category: Standards Track ORACLE
ISSN: 2070-1721 April 2020 ISSN: 2070-1721 July 2020
Network File System (NFS) Version 4 Minor Version 1 Protocol Network File System (NFS) Version 4 Minor Version 1 Protocol
Abstract Abstract
This document describes the Network File System (NFS) version 4 minor This document describes the Network File System (NFS) version 4 minor
version 1, including features retained from the base protocol (NFS version 1, including features retained from the base protocol (NFS
version 4 minor version 0, which is specified in RFC 7530) and version 4 minor version 0, which is specified in RFC 7530) and
protocol extensions made subsequently. The later minor version has protocol extensions made subsequently. The later minor version has
no dependencies on NFS version 4 minor version 0, and is considered a no dependencies on NFS version 4 minor version 0, and is considered a
separate protocol. separate protocol.
This document obsoletes RFC5661. It substantially revises the This document obsoletes RFC 5661. It substantially revises the
treatment of features relating to multi-server namespace, superseding treatment of features relating to multi-server namespace, superseding
the description of those features appearing in RFC5661. the description of those features appearing in RFC 5661.
Status of This Memo Status of This Memo
This is an Internet Standards Track document. This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has (IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841. Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc0000. https://www.rfc-editor.org/info/rfc8881.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at line 67 skipping to change at line 67
Without obtaining an adequate license from the person(s) controlling Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other it for publication as an RFC or to translate it into languages other
than English. than English.
Table of Contents Table of Contents
1. Introduction 1. Introduction
1.1. Introduction to this Update 1.1. Introduction to This Update
1.2. The NFS Version 4 Minor Version 1 Protocol 1.2. The NFS Version 4 Minor Version 1 Protocol
1.3. Requirements Language 1.3. Requirements Language
1.4. Scope of This Document 1.4. Scope of This Document
1.5. NFSv4 Goals 1.5. NFSv4 Goals
1.6. NFSv4.1 Goals 1.6. NFSv4.1 Goals
1.7. General Definitions 1.7. General Definitions
1.8. Overview of NFSv4.1 Features 1.8. Overview of NFSv4.1 Features
1.9. Differences from NFSv4.0 1.9. Differences from NFSv4.0
2. Core Infrastructure 2. Core Infrastructure
2.1. Introduction 2.1. Introduction
skipping to change at line 162 skipping to change at line 162
10.7. Data and Metadata Caching and Memory Mapped Files 10.7. Data and Metadata Caching and Memory Mapped Files
10.8. Name and Directory Caching without Directory Delegations 10.8. Name and Directory Caching without Directory Delegations
10.9. Directory Delegations 10.9. Directory Delegations
11. Multi-Server Namespace 11. Multi-Server Namespace
11.1. Terminology 11.1. Terminology
11.2. File System Location Attributes 11.2. File System Location Attributes
11.3. File System Presence or Absence 11.3. File System Presence or Absence
11.4. Getting Attributes for an Absent File System 11.4. Getting Attributes for an Absent File System
11.5. Uses of File System Location Information 11.5. Uses of File System Location Information
11.6. Trunking without File System Location Information 11.6. Trunking without File System Location Information
11.7. Users and Groups in a Multi-server Namespace 11.7. Users and Groups in a Multi-Server Namespace
11.8. Additional Client-Side Considerations 11.8. Additional Client-Side Considerations
11.9. Overview of File Access Transitions 11.9. Overview of File Access Transitions
11.10. Effecting Network Endpoint Transitions 11.10. Effecting Network Endpoint Transitions
11.11. Effecting File System Transitions 11.11. Effecting File System Transitions
11.12. Transferring State upon Migration 11.12. Transferring State upon Migration
11.13. Client Responsibilities when Access is Transitioned 11.13. Client Responsibilities When Access Is Transitioned
11.14. Server Responsibilities Upon Migration 11.14. Server Responsibilities Upon Migration
11.15. Effecting File System Referrals 11.15. Effecting File System Referrals
11.16. The Attribute fs_locations 11.16. The Attribute fs_locations
11.17. The Attribute fs_locations_info 11.17. The Attribute fs_locations_info
11.18. The Attribute fs_status 11.18. The Attribute fs_status
12. Parallel NFS (pNFS) 12. Parallel NFS (pNFS)
12.1. Introduction 12.1. Introduction
12.2. pNFS Definitions 12.2. pNFS Definitions
12.3. pNFS Operations 12.3. pNFS Operations
12.4. pNFS Attributes 12.4. pNFS Attributes
skipping to change at line 300 skipping to change at line 300
and Control and Control
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending
Delegation Wants Delegation Wants
20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of Possible 20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of Possible
Lock Availability Lock Availability
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of Device 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of Device
ID Changes ID Changes
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback Operation 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback Operation
21. Security Considerations 21. Security Considerations
22. IANA Considerations 22. IANA Considerations
22.1. IANA Actions Needed 22.1. IANA Actions
22.2. Named Attribute Definitions 22.2. Named Attribute Definitions
22.3. Device ID Notifications 22.3. Device ID Notifications
22.4. Object Recall Types 22.4. Object Recall Types
22.5. Layout Types 22.5. Layout Types
22.6. Path Variable Definitions 22.6. Path Variable Definitions
23. References 23. References
23.1. Normative References 23.1. Normative References
23.2. Informative References 23.2. Informative References
Appendix A. Need for this Update Appendix A. The Need for This Update
Appendix B. Changes in this Update Appendix B. Changes in This Update
B.1. Revisions Made to Section 11 of RFC5661 B.1. Revisions Made to Section 11 of RFC 5661
B.2. Revisions Made to Operations in RFC5661 B.2. Revisions Made to Operations in RFC 5661
B.3. Revisions Made to Error Definitions in RFC5661 B.3. Revisions Made to Error Definitions in RFC 5661
B.4. Other Revisions Made to RFC5661 B.4. Other Revisions Made to RFC 5661
Appendix C. Security Issues that Need to be Addressed Appendix C. Security Issues That Need to Be Addressed
Acknowledgments Acknowledgments
Authors' Addresses Authors' Addresses
1. Introduction 1. Introduction
1.1. Introduction to this Update 1.1. Introduction to This Update
Two important features previously defined in minor version 0 but Two important features previously defined in minor version 0 but
never fully addressed in minor version 1 are trunking, the never fully addressed in minor version 1 are trunking, which is the
simultaneous use of multiple connections between a client and server, simultaneous use of multiple connections between a client and server,
potentially to different network addresses, and transparent state potentially to different network addresses, and Transparent State
migration, which allows a file system to be transferred between Migration, which allows a file system to be transferred between
servers in a way that provides to the client the ability to maintain servers in a way that provides to the client the ability to maintain
its existing locking state across the transfer. its existing locking state across the transfer.
The revised description of the NFS version 4 minor version 1 The revised description of the NFS version 4 minor version 1
(NFSv4.1) protocol presented in this update is necessary to enable (NFSv4.1) protocol presented in this update is necessary to enable
full use of these features together with other multi-server namespace full use of these features together with other multi-server namespace
features. This document is in the form of an updated description of features. This document is in the form of an updated description of
the NFSv4.1 protocol previously defined in RFC 5661 [65]. RFC5661 is the NFSv4.1 protocol previously defined in RFC 5661 [65]. RFC 5661
obsoleted by this document. However, the update has a limited scope is obsoleted by this document. However, the update has a limited
and is focused on enabling full use of trunking and transparent state scope and is focused on enabling full use of trunking and Transparent
migration. The need for these changes is discussed in Appendix A. State Migration. The need for these changes is discussed in
Appendix B describes the specific changes made to arrive at the Appendix A. Appendix B describes the specific changes made to arrive
current text. at the current text.
This limited-scope update replaces the current NFSv4.1 RFC with the This limited-scope update replaces the current NFSv4.1 RFC with the
intention of providing an authoritative and complete specification, intention of providing an authoritative and complete specification,
the motivation for which is discussed in [35], addressing the issues the motivation for which is discussed in [35], addressing the issues
within the scope of the update. However, it will not address issues within the scope of the update. However, it will not address issues
that are known but outside of this limited scope as could expected by that are known but outside of this limited scope as could be expected
a full update of the protocol. Below are some areas which are known by a full update of the protocol. Below are some areas that are
to need addressing in a future update of the protocol. known to need addressing in a future update of the protocol:
* Work needs to be done with regard to RFC 8178 [66] which * Work needs to be done with regard to RFC 8178 [66], which
establishes NFSv4-wide versioning rules. As RFC5661 is currently establishes NFSv4-wide versioning rules. As RFC 5661 is currently
inconsistent with that document, changes are needed in order to inconsistent with that document, changes are needed in order to
arrive at a situation in which there would be no need for RFC8178 arrive at a situation in which there would be no need for RFC 8178
to update the NFSv4.1 specification. to update the NFSv4.1 specification.
* Work needs to be done with regard to RFC 8434 [69], which * Work needs to be done with regard to RFC 8434 [69], which
establishes the requirements for pNFS layout types, which are not establishes the requirements for parallel NFS (pNFS) layout types,
clearly defined in RFC5661. When that work is done and the which are not clearly defined in RFC 5661. When that work is done
resulting documents approved, the new NFSv4.1 specification and the resulting documents approved, the new NFSv4.1
document will provide a clear set of requirements for layout types specification document will provide a clear set of requirements
and a description of the file layout type that conforms to those for layout types and a description of the file layout type that
requirements. Other layout types will have their own conforms to those requirements. Other layout types will have
specification documents that conforms to those requirements as their own specification documents that conform to those
well. requirements as well.
* Work needs to be done to address many errata reports relevant to * Work needs to be done to address many errata reports relevant to
RFC 5661, other than errata report 2006 [63], which is addressed RFC 5661, other than errata report 2006 [63], which is addressed
in this document. Addressing that report was not deferrable in this document. Addressing that report was not deferrable
because of the interaction of the changes suggested there and the because of the interaction of the changes suggested there and the
newly described handling of state and session migration. newly described handling of state and session migration.
The errata reports that have been deferred and that will need to The errata reports that have been deferred and that will need to
be addressed in a later document include reports currently be addressed in a later document include reports currently
assigned a range of statuses in the errata reporting system assigned a range of statuses in the errata reporting system,
including reports marked Accepted and those marked Hold For including reports marked Accepted and those marked Hold For
Document Update because the change was too minor to address Document Update because the change was too minor to address
immediately. immediately.
In addition, there is a set of other reports, including at least In addition, there is a set of other reports, including at least
one in state Rejected, which will need to be addressed in a later one in state Rejected, that will need to be addressed in a later
document. This will involve making changes to consensus decisions document. This will involve making changes to consensus decisions
reflected in RFC 5661, in situation in which the working group has reflected in RFC 5661, in situations in which the working group
decided that the treatment in RFC 5661 is incorrect, and needs to has decided that the treatment in RFC 5661 is incorrect and needs
be revised to reflect the working group's new consensus and ensure to be revised to reflect the working group's new consensus and to
compatibility with existing implementations that do not follow the ensure compatibility with existing implementations that do not
handling described in in RFC 5661. follow the handling described in RFC 5661.
Note that it is expected that all such errata reports will remain Note that it is expected that all such errata reports will remain
relevant to implementers and the authors of an eventual relevant to implementors and the authors of an eventual
rfc5661bis, despite the fact that this document, when approved, rfc5661bis, despite the fact that this document, when approved,
will obsolete RFC 5661 [65]. will obsolete RFC 5661 [65].
* There is a need for a new approach to the description of * There is a need for a new approach to the description of
internationalization since the current internationalization internationalization since the current internationalization
section (Section 14) has never been implemented and does not meet section (Section 14) has never been implemented and does not meet
the needs of the NFSv4 protocol. Possible solutions are to create the needs of the NFSv4 protocol. Possible solutions are to create
a new internationalization section modeled on that in [67] or to a new internationalization section modeled on that in [67] or to
create a new document describing internationalization for all create a new document describing internationalization for all
NFSv4 minor versions and reference that document in the RFCs NFSv4 minor versions and reference that document in the RFCs
defining both NFSv4.0 and NFSv4.1. defining both NFSv4.0 and NFSv4.1.
* There is a need for a revised treatment of security in NFSv4.1. * There is a need for a revised treatment of security in NFSv4.1.
The issues with the existing treatment are discussed in The issues with the existing treatment are discussed in
Appendix C. Appendix C.
Until the above work is done, there will not be a consistent set of Until the above work is done, there will not be a consistent set of
documents providing a description of the NFSv4.1 protocol and any documents that provides a description of the NFSv4.1 protocol, and
full description would involve documents updating other documents any full description would involve documents updating other documents
within the specification. The updates applied by RFC8434 [69] and within the specification. The updates applied by RFC 8434 [69] and
RFC8178 [66] to RFC5661 also apply to this specification, and will RFC 8178 [66] to RFC 5661 also apply to this specification, and will
apply to any subsequent v4.1 specification until that work is done. apply to any subsequent v4.1 specification until that work is done.
1.2. The NFS Version 4 Minor Version 1 Protocol 1.2. The NFS Version 4 Minor Version 1 Protocol
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second The NFS version 4 minor version 1 (NFSv4.1) protocol is the second
minor version of the NFS version 4 (NFSv4) protocol. The first minor minor version of the NFS version 4 (NFSv4) protocol. The first minor
version, NFSv4.0, is now described in RFC 7530 [67]. It generally version, NFSv4.0, is now described in RFC 7530 [67]. It generally
follows the guidelines for minor versioning that are listed in follows the guidelines for minor versioning that are listed in
Section 10 of RFC 3530 [36]. However, it diverges from guidelines 11 Section 10 of RFC 3530 [36]. However, it diverges from guidelines 11
("a client and server that support minor version X must support minor ("a client and server that support minor version X must support minor
skipping to change at line 579 skipping to change at line 579
Server: The Server is the entity responsible for coordinating client Server: The Server is the entity responsible for coordinating client
access to a set of file systems and is identified by a server access to a set of file systems and is identified by a server
owner. A server can span multiple network addresses. owner. A server can span multiple network addresses.
Server Owner: The server owner identifies the server to the client. Server Owner: The server owner identifies the server to the client.
The server owner consists of a major identifier and a minor The server owner consists of a major identifier and a minor
identifier. When the client has two connections each to a peer identifier. When the client has two connections each to a peer
with the same major identifier, the client assumes that both peers with the same major identifier, the client assumes that both peers
are the same server (the server namespace is the same via each are the same server (the server namespace is the same via each
connection) and that lock state is sharable across both connection) and that lock state is shareable across both
connections. When each peer has both the same major and minor connections. When each peer has both the same major and minor
identifiers, the client assumes that each connection might be identifiers, the client assumes that each connection might be
associable with the same session. associable with the same session.
Stable Storage: Stable storage is storage from which data stored by Stable Storage: Stable storage is storage from which data stored by
an NFSv4.1 server can be recovered without data loss from multiple an NFSv4.1 server can be recovered without data loss from multiple
power failures (including cascading power failures, that is, power failures (including cascading power failures, that is,
several power failures in quick succession), operating system several power failures in quick succession), operating system
failures, and/or hardware failure of components other than the failures, and/or hardware failure of components other than the
storage medium itself (such as disk, nonvolatile RAM, flash storage medium itself (such as disk, nonvolatile RAM, flash
skipping to change at line 755 skipping to change at line 755
application-specific data with a regular file or directory. NFSv4.1 application-specific data with a regular file or directory. NFSv4.1
modifies named attributes relative to NFSv4.0 by tightening the modifies named attributes relative to NFSv4.0 by tightening the
allowed operations in order to prevent the development of non- allowed operations in order to prevent the development of non-
interoperable implementations. Named attributes are discussed in interoperable implementations. Named attributes are discussed in
Section 5.3. Section 5.3.
1.8.3.3. Multi-Server Namespace 1.8.3.3. Multi-Server Namespace
NFSv4.1 contains a number of features to allow implementation of NFSv4.1 contains a number of features to allow implementation of
namespaces that cross server boundaries and that allow and facilitate namespaces that cross server boundaries and that allow and facilitate
a non-disruptive transfer of support for individual file systems a nondisruptive transfer of support for individual file systems
between servers. They are all based upon attributes that allow one between servers. They are all based upon attributes that allow one
file system to specify alternate, additional, and new location file system to specify alternate, additional, and new location
information that specifies how the client may access that file information that specifies how the client may access that file
system. system.
These attributes can be used to provide for individual active file These attributes can be used to provide for individual active file
systems: systems:
* Alternate network addresses to access the current file system * Alternate network addresses to access the current file system
instance. instance.
skipping to change at line 783 skipping to change at line 783
namespace is associated with locations on other servers without there namespace is associated with locations on other servers without there
being any corresponding file system instance on the current server. being any corresponding file system instance on the current server.
For example, For example,
* These attributes may be used with absent file systems to implement * These attributes may be used with absent file systems to implement
referrals whereby one server may direct the client to a file referrals whereby one server may direct the client to a file
system provided by another server. This allows extensive multi- system provided by another server. This allows extensive multi-
server namespaces to be constructed. server namespaces to be constructed.
* These attributes may be provided when a previously present file * These attributes may be provided when a previously present file
system becomes absent. This allows non-disruptive migration of system becomes absent. This allows nondisruptive migration of
file systems to alternate servers. file systems to alternate servers.
1.8.4. Locking Facilities 1.8.4. Locking Facilities
As mentioned previously, NFSv4.1 is a single protocol that includes As mentioned previously, NFSv4.1 is a single protocol that includes
locking facilities. These locking facilities include support for locking facilities. These locking facilities include support for
many types of locks including a number of sorts of recallable locks. many types of locks including a number of sorts of recallable locks.
Recallable locks such as delegations allow the client to be assured Recallable locks such as delegations allow the client to be assured
that certain events will not occur so long as that lock is held. that certain events will not occur so long as that lock is held.
When circumstances change, the lock is recalled via a callback When circumstances change, the lock is recalled via a callback
skipping to change at line 908 skipping to change at line 908
forms of RPC authentication, AUTH_SYS, had no strong authentication forms of RPC authentication, AUTH_SYS, had no strong authentication
and required a host-based authentication approach. NFSv4.1 also and required a host-based authentication approach. NFSv4.1 also
depends on RPC for basic security services and mandates RPC support depends on RPC for basic security services and mandates RPC support
for a user-based authentication model. The user-based authentication for a user-based authentication model. The user-based authentication
model has user principals authenticated by a server, and in turn the model has user principals authenticated by a server, and in turn the
server authenticated by user principals. RPC provides some basic server authenticated by user principals. RPC provides some basic
security services that are used by NFSv4.1. security services that are used by NFSv4.1.
2.2.1.1. RPC Security Flavors 2.2.1.1. RPC Security Flavors
As described in "Authentication", Section 7.2 of [3], RPC security is As described in "Authentication", Section 7 of [3], RPC security is
encapsulated in the RPC header, via a security or authentication encapsulated in the RPC header, via a security or authentication
flavor, and information specific to the specified security flavor. flavor, and information specific to the specified security flavor.
Every RPC header conveys information used to identify and Every RPC header conveys information used to identify and
authenticate a client and server. As discussed in Section 2.2.1.1.1, authenticate a client and server. As discussed in Section 2.2.1.1.1,
some security flavors provide additional security services. some security flavors provide additional security services.
NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This
requirement to implement is not a requirement to use.) Other requirement to implement is not a requirement to use.) Other
flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well. flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well.
skipping to change at line 1126 skipping to change at line 1126
Client identification is encapsulated in the following client owner Client identification is encapsulated in the following client owner
data type: data type:
struct client_owner4 { struct client_owner4 {
verifier4 co_verifier; verifier4 co_verifier;
opaque co_ownerid<NFS4_OPAQUE_LIMIT>; opaque co_ownerid<NFS4_OPAQUE_LIMIT>;
}; };
The first field, co_verifier, is a client incarnation verifier, The first field, co_verifier, is a client incarnation verifier,
allowing the server to distinguish successive incarnations (e.g. allowing the server to distinguish successive incarnations (e.g.,
reboots) of the same client. The server will start the process of reboots) of the same client. The server will start the process of
canceling the client's leased state if co_verifier is different than canceling the client's leased state if co_verifier is different than
what the server has previously recorded for the identified client (as what the server has previously recorded for the identified client (as
specified in the co_ownerid field). specified in the co_ownerid field).
The second field, co_ownerid, is a variable length string that The second field, co_ownerid, is a variable length string that
uniquely defines the client so that subsequent instances of the same uniquely defines the client so that subsequent instances of the same
client bear the same co_ownerid with a different verifier. client bear the same co_ownerid with a different verifier.
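As a rough illustration of how the two fields work together, the following Python sketch shows one way a server could classify an incoming client owner. The `ClientOwner` and `ClientTable` classes, the `observe` method, and the returned strings are hypothetical names chosen for this example. The sketch also assumes the server records the verifier immediately, which simplifies the protocol's actual client ID establishment and confirmation steps.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClientOwner:
    """Mirrors client_owner4: an incarnation verifier plus an opaque owner id."""
    co_verifier: bytes  # changes each time the client restarts
    co_ownerid: bytes   # stays the same across restarts of the same client


@dataclass
class ClientTable:
    """Hypothetical server-side map: co_ownerid -> last co_verifier seen."""
    _verifiers: dict = field(default_factory=dict)

    def observe(self, owner: ClientOwner) -> str:
        previous = self._verifiers.get(owner.co_ownerid)
        # Simplifying assumption: record the new verifier right away.
        self._verifiers[owner.co_ownerid] = owner.co_verifier
        if previous is None:
            return "new client"
        if previous == owner.co_verifier:
            return "same incarnation"
        # Known co_ownerid with a different verifier: the client restarted,
        # so the server would start canceling the old leased state.
        return "new incarnation"
```

Keying the table on co_ownerid alone reflects the text above: the owner string identifies the client, while the verifier only distinguishes successive incarnations of it.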
There are several considerations for how the client generates the There are several considerations for how the client generates the
skipping to change at line 2055 skipping to change at line 2055
The backchannel is used for callback requests from server to client, The backchannel is used for callback requests from server to client,
and carries CB_COMPOUND requests and responses. Whether or not there and carries CB_COMPOUND requests and responses. Whether or not there
is a backchannel is decided by the client; however, many features of is a backchannel is decided by the client; however, many features of
NFSv4.1 require a backchannel. NFSv4.1 servers MUST support NFSv4.1 require a backchannel. NFSv4.1 servers MUST support
backchannels. backchannels.
Each session has resources for each channel, including separate reply Each session has resources for each channel, including separate reply
caches (see Section 2.10.6.1). Note that even the backchannel caches (see Section 2.10.6.1). Note that even the backchannel
requires a reply cache (or, at least, a slot table in order to detect requires a reply cache (or, at least, a slot table in order to detect
retries) because some callback operations are nonidempotent. retries) because some callback operations are non-idempotent.
2.10.3.1. Association of Connections, Channels, and Sessions 2.10.3.1. Association of Connections, Channels, and Sessions
Each channel is associated with zero or more transport connections Each channel is associated with zero or more transport connections
(whether of the same transport protocol or different transport (whether of the same transport protocol or different transport
protocols). A connection can be associated with one channel or both protocols). A connection can be associated with one channel or both
channels of a session; the client and server negotiate whether a channels of a session; the client and server negotiate whether a
connection will carry traffic for one channel or both channels via connection will carry traffic for one channel or both channels via
the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION
(Section 18.34) operations. When a session is created via (Section 18.34) operations. When a session is created via
skipping to change at line 2140 skipping to change at line 2140
implementation, but this can be tailored to the specific situations implementation, but this can be tailored to the specific situations
in which that recognition is desired. in which that recognition is desired.
Clients will have occasion to compare the server scope values of Clients will have occasion to compare the server scope values of
multiple servers under a number of circumstances, each of which will multiple servers under a number of circumstances, each of which will
be discussed under the appropriate functional section: be discussed under the appropriate functional section:
* When server owner values received in response to EXCHANGE_ID * When server owner values received in response to EXCHANGE_ID
operations sent to multiple network addresses are compared for the operations sent to multiple network addresses are compared for the
purpose of determining the validity of various forms of trunking, purpose of determining the validity of various forms of trunking,
as described in Section 11.5.2. . as described in Section 11.5.2.
* When network or server reconfiguration causes the same network * When network or server reconfiguration causes the same network
address to possibly be directed to different servers, with the address to possibly be directed to different servers, with the
necessity for the client to determine when lock reclaim should be necessity for the client to determine when lock reclaim should be
attempted, as described in Section 8.4.2.1. attempted, as described in Section 8.4.2.1.
When two replies from EXCHANGE_ID, each from two different server When two replies from EXCHANGE_ID, each from two different server
network addresses, have the same server scope, there are a number of network addresses, have the same server scope, there are a number of
ways a client can validate that the common server scope is due to two ways a client can validate that the common server scope is due to two
servers cooperating in a group. servers cooperating in a group.
skipping to change at line 2184 skipping to change at line 2184
system involved (e.g., a file system being migrated). system involved (e.g., a file system being migrated).
2.10.5. Trunking 2.10.5. Trunking
Trunking is the use of multiple connections between a client and Trunking is the use of multiple connections between a client and
server in order to increase the speed of data transfer. NFSv4.1 server in order to increase the speed of data transfer. NFSv4.1
supports two types of trunking: session trunking and client ID supports two types of trunking: session trunking and client ID
trunking. trunking.
In the context of a single server network address, it can be assumed In the context of a single server network address, it can be assumed
that all connections are accessing the same server and NFSv4.1 that all connections are accessing the same server, and NFSv4.1
servers MUST support both forms of trunking. When multiple servers MUST support both forms of trunking. When multiple
connections use a set of network addresses accessing the same server, connections use a set of network addresses to access the same server,
the server MUST support both forms of trunking. NFSv4.1 servers in a the server MUST support both forms of trunking. NFSv4.1 servers in a
clustered configuration MAY allow network addresses for different clustered configuration MAY allow network addresses for different
servers to use client ID trunking. servers to use client ID trunking.
Clients may use either form of trunking as long as they do not, when Clients may use either form of trunking as long as they do not, when
trunking between different server network addresses, violate the trunking between different server network addresses, violate the
servers' mandates as to the kinds of trunking to be allowed (see servers' mandates as to the kinds of trunking to be allowed (see
below). With regard to callback channels, the client MUST allow the below). With regard to callback channels, the client MUST allow the
server to choose among all callback channels valid for a given client server to choose among all callback channels valid for a given client
ID and MUST support trunking when the connections supporting the ID and MUST support trunking when the connections supporting the
skipping to change at line 2278 skipping to change at line 2278
When doing client ID trunking, locking state is shared across When doing client ID trunking, locking state is shared across
sessions associated with that same client ID. This requires the sessions associated with that same client ID. This requires the
server to coordinate state across sessions and the client to be server to coordinate state across sessions and the client to be
able to associate the same locking state with multiple sessions. able to associate the same locking state with multiple sessions.
It is always possible that, as a result of various sorts of It is always possible that, as a result of various sorts of
reconfiguration events, eir_server_scope and eir_server_owner values reconfiguration events, eir_server_scope and eir_server_owner values
may be different on subsequent EXCHANGE_ID requests made to the same may be different on subsequent EXCHANGE_ID requests made to the same
network address. network address.
In most cases such reconfiguration events will be disruptive and In most cases, such reconfiguration events will be disruptive and
indicate that an IP address formerly connected to one server is now indicate that an IP address formerly connected to one server is now
connected to an entirely different one. connected to an entirely different one.
Some guidelines on client handling of such situations follow: Some guidelines on client handling of such situations follow:
* When eir_server_scope changes, the client has no assurance that * When eir_server_scope changes, the client has no assurance that
any id's it obtained previously (e.g. file handles) can be validly any IDs that it obtained previously (e.g., filehandles) can be
used on the new server, and, even if the new server accepts them, validly used on the new server, and, even if the new server
there is no assurance that this is not due to accident. Thus, it accepts them, there is no assurance that this is not due to
is best to treat all such state as lost/stale although a client accident. Thus, it is best to treat all such state as lost or
may assume that the probability of inadvertent acceptance is low stale, although a client may assume that the probability of
and treat this situation as within the next case. inadvertent acceptance is low and treat this situation as within
the next case.
* When eir_server_scope remains the same and * When eir_server_scope remains the same and
eir_server_owner.so_major_id changes, the client can use the eir_server_owner.so_major_id changes, the client can use the
filehandles it has, consider its locking state lost, and attempt filehandles it has, consider its locking state lost, and attempt
to reclaim or otherwise re-obtain its locks. It might find that to reclaim or otherwise re-obtain its locks. It might find that
its file handle is now stale. However, if NFS4ERR_STALE is not its filehandle is now stale. However, if NFS4ERR_STALE is not
returned, it can proceed to reclaim or otherwise re-obtain its returned, it can proceed to reclaim or otherwise re-obtain its
open locking state. open locking state.
* When eir_server_scope and eir_server_owner.so_major_id remain the * When eir_server_scope and eir_server_owner.so_major_id remain the
same, the client has to use the now-current values of same, the client has to use the now-current values of
eir_server_owner.so_minor_id in deciding on appropriate forms of eir_server_owner.so_minor_id in deciding on appropriate forms of
trunking. This may result in connections being dropped or new trunking. This may result in connections being dropped or new
sessions being created. sessions being created.
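The three guidelines above amount to a small decision procedure. The following is a minimal sketch in Python, not a definitive client implementation. It assumes a hypothetical ServerIdentity record holding the eir_server_scope and eir_server_owner fields from two successive EXCHANGE_ID replies sent to the same network address. The class name, function name, and returned labels are illustrative and are not defined by the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerIdentity:
    """Hypothetical holder for the identity fields of one EXCHANGE_ID reply."""
    scope: bytes      # eir_server_scope
    major_id: bytes   # eir_server_owner.so_major_id
    minor_id: int     # eir_server_owner.so_minor_id


def classify_change(old: ServerIdentity, new: ServerIdentity) -> str:
    """Return an illustrative label for the guideline that applies."""
    if new == old:
        return "no-change"
    if new.scope != old.scope:
        # Previously obtained IDs (e.g., filehandles) carry no assurance
        # of validity on the new server.
        return "treat-all-state-as-lost"
    if new.major_id != old.major_id:
        # Filehandles may still be used (unless NFS4ERR_STALE is returned);
        # locking state is considered lost and is reclaimed or re-obtained.
        return "keep-filehandles-reclaim-locks"
    # Scope and major ID are unchanged: decide on trunking using the
    # now-current so_minor_id, which may drop connections or add sessions.
    return "reevaluate-trunking"
```

A client following the more cautious reading of the first guideline would discard all state on a scope change; the alternative the text permits (treating it as the second case) would simply map the first label to the second.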
2.10.5.1. Verifying Claims of Matching Server Identity 2.10.5.1. Verifying Claims of Matching Server Identity
When the server responds using two different connections claiming When the server responds using two different connections that claim
matching or partially matching eir_server_owner, eir_server_scope, matching or partially matching eir_server_owner, eir_server_scope,
and eir_clientid values, the client does not have to trust the and eir_clientid values, the client does not have to trust the
servers' claims. The client may verify these claims before trunking servers' claims. The client may verify these claims before trunking
traffic in the following ways: traffic in the following ways:
* For session trunking, clients SHOULD reliably verify if * For session trunking, clients SHOULD reliably verify if
connections between different network paths are in fact associated connections between different network paths are in fact associated
with the same NFSv4.1 server and usable on the same session, and with the same NFSv4.1 server and usable on the same session, and
servers MUST allow clients to perform reliable verification. When servers MUST allow clients to perform reliable verification. When
a client ID is created, the client SHOULD specify that a client ID is created, the client SHOULD specify that
skipping to change at line 4138 skipping to change at line 4139
+===============+==============================================+ +===============+==============================================+
| int32_t | typedef int int32_t; | | int32_t | typedef int int32_t; |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| uint32_t | typedef unsigned int uint32_t; | | uint32_t | typedef unsigned int uint32_t; |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| int64_t | typedef hyper int64_t; | | int64_t | typedef hyper int64_t; |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| uint64_t | typedef unsigned hyper uint64_t; | | uint64_t | typedef unsigned hyper uint64_t; |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| attrlist4 | typedef opaque attrlist4<>; | | attrlist4 | typedef opaque attrlist4<>; |
+---------------+----------------------------------------------+ | | |
| | Used for file/directory attributes. | | | Used for file/directory attributes. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| bitmap4 | typedef uint32_t bitmap4<>; | | bitmap4 | typedef uint32_t bitmap4<>; |
+---------------+----------------------------------------------+ | | |
| | Used in attribute array encoding. | | | Used in attribute array encoding. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| changeid4 | typedef uint64_t changeid4; | | changeid4 | typedef uint64_t changeid4; |
+---------------+----------------------------------------------+ | | |
| | Used in the definition of change_info4. | | | Used in the definition of change_info4. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| clientid4 | typedef uint64_t clientid4; | | clientid4 | typedef uint64_t clientid4; |
+---------------+----------------------------------------------+ | | |
| | Shorthand reference to client | | | Shorthand reference to client |
| | identification. | | | identification. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| count4 | typedef uint32_t count4; | | count4 | typedef uint32_t count4; |
+---------------+----------------------------------------------+ | | |
| | Various count parameters (READ, WRITE, | | | Various count parameters (READ, WRITE, |
| | COMMIT). | | | COMMIT). |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| length4 | typedef uint64_t length4; | | length4 | typedef uint64_t length4; |
+---------------+----------------------------------------------+ | | |
| | The length of a byte-range within a file. | | | The length of a byte-range within a file. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| mode4 | typedef uint32_t mode4; | | mode4 | typedef uint32_t mode4; |
+---------------+----------------------------------------------+ | | |
| | Mode attribute data type. | | | Mode attribute data type. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| nfs_cookie4 | typedef uint64_t nfs_cookie4; | | nfs_cookie4 | typedef uint64_t nfs_cookie4; |
+---------------+----------------------------------------------+ | | |
| | Opaque cookie value for READDIR. | | | Opaque cookie value for READDIR. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; | | nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; |
+---------------+----------------------------------------------+ | | |
| | Filehandle definition. | | | Filehandle definition. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| nfs_ftype4 | enum nfs_ftype4; | | nfs_ftype4 | enum nfs_ftype4; |
+---------------+----------------------------------------------+ | | |
| | Various defined file types. | | | Various defined file types. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| nfsstat4 | enum nfsstat4; | | nfsstat4 | enum nfsstat4; |
+---------------+----------------------------------------------+ | | |
| | Return value for operations. | | | Return value for operations. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| offset4 | typedef uint64_t offset4; | | offset4 | typedef uint64_t offset4; |
+---------------+----------------------------------------------+ | | |
| | Various offset designations (READ, WRITE, | | | Various offset designations (READ, WRITE, |
| | LOCK, COMMIT). | | | LOCK, COMMIT). |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| qop4 | typedef uint32_t qop4; | | qop4 | typedef uint32_t qop4; |
+---------------+----------------------------------------------+ | | |
| | Quality of protection designation in | | | Quality of protection designation in |
| | SECINFO. | | | SECINFO. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| sec_oid4 | typedef opaque sec_oid4<>; | | sec_oid4 | typedef opaque sec_oid4<>; |
+---------------+----------------------------------------------+ | | |
| | Security Object Identifier. The sec_oid4 | | | Security Object Identifier. The sec_oid4 |
| | data type is not really opaque. Instead, it | | | data type is not really opaque. Instead, it |
| | contains an ASN.1 OBJECT IDENTIFIER as used | | | contains an ASN.1 OBJECT IDENTIFIER as used |
| | by GSS-API in the mech_type argument to | | | by GSS-API in the mech_type argument to |
| | GSS_Init_sec_context. See [7] for details. | | | GSS_Init_sec_context. See [7] for details. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| sequenceid4 | typedef uint32_t sequenceid4; | | sequenceid4 | typedef uint32_t sequenceid4; |
+---------------+----------------------------------------------+ | | |
| | Sequence number used for various session | | | Sequence number used for various session |
| | operations (EXCHANGE_ID, CREATE_SESSION, | | | operations (EXCHANGE_ID, CREATE_SESSION, |
| | SEQUENCE, CB_SEQUENCE). | | | SEQUENCE, CB_SEQUENCE). |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| seqid4 | typedef uint32_t seqid4; | | seqid4 | typedef uint32_t seqid4; |
+---------------+----------------------------------------------+ | | |
| | Sequence identifier used for locking. | | | Sequence identifier used for locking. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| sessionid4 | typedef opaque | | sessionid4 | typedef opaque |
| | sessionid4[NFS4_SESSIONID_SIZE]; | | | sessionid4[NFS4_SESSIONID_SIZE]; |
+---------------+----------------------------------------------+ | | |
| | Session identifier. | | | Session identifier. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| slotid4 | typedef uint32_t slotid4; | | slotid4 | typedef uint32_t slotid4; |
+---------------+----------------------------------------------+ | | |
| | Sequencing artifact for various session | | | Sequencing artifact for various session |
| | operations (SEQUENCE, CB_SEQUENCE). | | | operations (SEQUENCE, CB_SEQUENCE). |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| utf8string | typedef opaque utf8string<>; | | utf8string | typedef opaque utf8string<>; |
+---------------+----------------------------------------------+ | | |
| | UTF-8 encoding for strings. | | | UTF-8 encoding for strings. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| utf8str_cis | typedef utf8string utf8str_cis; | | utf8str_cis | typedef utf8string utf8str_cis; |
+---------------+----------------------------------------------+ | | |
| | Case-insensitive UTF-8 string. | | | Case-insensitive UTF-8 string. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| utf8str_cs | typedef utf8string utf8str_cs; | | utf8str_cs | typedef utf8string utf8str_cs; |
+---------------+----------------------------------------------+ | | |
| | Case-sensitive UTF-8 string. | | | Case-sensitive UTF-8 string. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| utf8str_mixed | typedef utf8string utf8str_mixed; | | utf8str_mixed | typedef utf8string utf8str_mixed; |
+---------------+----------------------------------------------+ | | |
| | UTF-8 strings with a case-sensitive prefix | | | UTF-8 strings with a case-sensitive prefix |
| | and a case-insensitive suffix. | | | and a case-insensitive suffix. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| component4 | typedef utf8str_cs component4; | | component4 | typedef utf8str_cs component4; |
+---------------+----------------------------------------------+ | | |
| | Represents pathname components. | | | Represents pathname components. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| linktext4 | typedef utf8str_cs linktext4; | | linktext4 | typedef utf8str_cs linktext4; |
+---------------+----------------------------------------------+ | | |
| | Symbolic link contents ("symbolic link" is | | | Symbolic link contents ("symbolic link" is |
| | defined in an Open Group [11] standard). | | | defined in an Open Group [11] standard). |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| pathname4 | typedef component4 pathname4<>; | | pathname4 | typedef component4 pathname4<>; |
+---------------+----------------------------------------------+ | | |
| | Represents pathname for fs_locations. | | | Represents pathname for fs_locations. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
| verifier4 | typedef opaque | | verifier4 | typedef opaque |
| | verifier4[NFS4_VERIFIER_SIZE]; | | | verifier4[NFS4_VERIFIER_SIZE]; |
+---------------+----------------------------------------------+ | | |
| | Verifier used for various operations | | | Verifier used for various operations |
| | (COMMIT, CREATE, EXCHANGE_ID, OPEN, READDIR, | | | (COMMIT, CREATE, EXCHANGE_ID, OPEN, READDIR, |
| | WRITE). NFS4_VERIFIER_SIZE is defined as 8. | | | WRITE). NFS4_VERIFIER_SIZE is defined as 8. |
+---------------+----------------------------------------------+ +---------------+----------------------------------------------+
Table 1 Table 1
End of Base Data Types End of Base Data Types
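As an illustration of how two of the base types in Table 1 combine on the wire, the following Python sketch packs a set of attribute numbers into a bitmap4 and encodes it as an XDR variable-length array of uint32_t. It assumes the usual NFSv4 convention that attribute number n occupies bit (n mod 32) of word (n div 32), a convention that Table 1 itself does not state. The function names are hypothetical.

```python
import struct


def attrs_to_bitmap4(attr_ids):
    """Return a list of 32-bit words with one bit set per attribute number."""
    words = []
    for n in attr_ids:
        index, bit = divmod(n, 32)
        # Grow the array so that word `index` exists.
        while len(words) <= index:
            words.append(0)
        words[index] |= 1 << bit
    return words


def xdr_encode_bitmap4(words):
    """Encode as an XDR variable-length array: count, then big-endian words."""
    return struct.pack(">I", len(words)) + b"".join(
        struct.pack(">I", w) for w in words
    )
```

Under that convention, requesting attributes 0, 1, and 53 yields two words: the first with bits 0 and 1 set, the second with bit 21 set.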
3.3. Structured Data Types 3.3. Structured Data Types
skipping to change at line 5152 skipping to change at line 5153
REQUIRED and RECOMMENDED attributes are get-only; i.e., they can be REQUIRED and RECOMMENDED attributes are get-only; i.e., they can be
retrieved via GETATTR but not set via SETATTR. If a client attempts retrieved via GETATTR but not set via SETATTR. If a client attempts
to set a get-only attribute or get a set-only attribute, the server to set a get-only attribute or get a set-only attribute, the server
MUST return NFS4ERR_INVAL. MUST return NFS4ERR_INVAL.
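The access rule can be sketched as a table lookup. In the Python sketch below, ACCESS is a hypothetical excerpt containing four attribute numbers with the R or W access listed for them in the attribute tables of this document. The function name is illustrative. The numeric values of NFS4_OK and NFS4ERR_INVAL are those assigned in the nfsstat4 enumeration.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22

# Hypothetical excerpt: attribute number -> access ("R", "W", or "R W").
ACCESS = {
    0: "R",    # supported_attrs
    1: "R",    # type
    53: "R",   # time_modify
    54: "W",   # time_modify_set
}


def check_attr_access(attr_id, op):
    """Return the nfsstat4 value for a GETATTR or SETATTR on one attribute."""
    if op == "GETATTR":
        needed = "R"
    elif op == "SETATTR":
        needed = "W"
    else:
        raise ValueError("op must be 'GETATTR' or 'SETATTR'")
    # Setting a get-only attribute, or getting a set-only one, is an error.
    if needed not in ACCESS[attr_id]:
        return NFS4ERR_INVAL
    return NFS4_OK
```

A real server would apply the same test to every bit set in the request's bitmap4 rather than to a single attribute number.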
5.6. REQUIRED Attributes - List and Definition References 5.6. REQUIRED Attributes - List and Definition References
The list of REQUIRED attributes appears in Table 4. The meanings of The list of REQUIRED attributes appears in Table 4. The meanings of
the columns of the table are: the columns of the table are:
* Name: The name of the attribute. Name: The name of the attribute.
* Id: The number assigned to the attribute. In the event of Id: The number assigned to the attribute. In the event of conflicts
conflicts between the assigned number and [10], the latter is between the assigned number and [10], the latter is likely
likely authoritative, but should be resolved with Errata to this authoritative, but should be resolved with Errata to this document
document and/or [10]. See [50] for the Errata process. and/or [10]. See [50] for the Errata process.
* Data Type: The XDR data type of the attribute. Data Type: The XDR data type of the attribute.
* Acc: Access allowed to the attribute. R means read-only (GETATTR Acc: Access allowed to the attribute. R means read-only (GETATTR
may retrieve, SETATTR may not set). W means write-only (SETATTR may retrieve, SETATTR may not set). W means write-only (SETATTR
may set, GETATTR may not retrieve). R W means read/write (GETATTR may set, GETATTR may not retrieve). R W means read/write (GETATTR
may retrieve, SETATTR may set). may retrieve, SETATTR may set).
* Defined in: The section of this specification that describes the Defined in: The section of this specification that describes the
attribute. attribute.
+====================+====+============+=====+==================+ +====================+====+============+=====+==================+
| Name | Id | Data Type | Acc | Defined in: | | Name | Id | Data Type | Acc | Defined in: |
+====================+====+============+=====+==================+ +====================+====+============+=====+==================+
| supported_attrs | 0 | bitmap4 | R | Section 5.8.1.1 | | supported_attrs | 0 | bitmap4 | R | Section 5.8.1.1 |
+--------------------+----+------------+-----+------------------+ +--------------------+----+------------+-----+------------------+
| type | 1 | nfs_ftype4 | R | Section 5.8.1.2 | | type | 1 | nfs_ftype4 | R | Section 5.8.1.2 |
+--------------------+----+------------+-----+------------------+ +--------------------+----+------------+-----+------------------+
| fh_expire_type | 2 | uint32_t | R | Section 5.8.1.3 | | fh_expire_type | 2 | uint32_t | R | Section 5.8.1.3 |
skipping to change at line 5373 skipping to change at line 5374
+--------------------+----+----------------+-----+------------------+ +--------------------+----+----------------+-----+------------------+
| time_metadata | 52 | nfstime4 | R | Section | | time_metadata | 52 | nfstime4 | R | Section |
| | | | | 5.8.2.42 | | | | | | 5.8.2.42 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+----------------+-----+------------------+
| time_modify | 53 | nfstime4 | R | Section | | time_modify | 53 | nfstime4 | R | Section |
| | | | | 5.8.2.43 | | | | | | 5.8.2.43 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+----------------+-----+------------------+
| time_modify_set | 54 | settime4 | W | Section | | time_modify_set | 54 | settime4 | W | Section |
| | | | | 5.8.2.44 | | | | | | 5.8.2.44 |
+--------------------+----+----------------+-----+------------------+ +--------------------+----+----------------+-----+------------------+
| * fs_locations_info4 |
+-------------------------------------------------------------------+
Table 5 Table 5
* fs_locations_info4
5.8. Attribute Definitions 5.8. Attribute Definitions
5.8.1. Definitions of REQUIRED Attributes 5.8.1. Definitions of REQUIRED Attributes
5.8.1.1. Attribute 0: supported_attrs 5.8.1.1. Attribute 0: supported_attrs
The bit vector that would retrieve all REQUIRED and RECOMMENDED The bit vector that would retrieve all REQUIRED and RECOMMENDED
attributes that are supported for this object. The scope of this attributes that are supported for this object. The scope of this
attribute applies to all objects with a matching fsid. attribute applies to all objects with a matching fsid.
skipping to change at line 8728 skipping to change at line 8729
within the lease period, it is up to the client to determine which within the lease period, it is up to the client to determine which
locks have been revoked and which have not. It does this by using locks have been revoked and which have not. It does this by using
the TEST_STATEID operation on the appropriate set of stateids. Once the TEST_STATEID operation on the appropriate set of stateids. Once
the set of revoked locks has been determined, the applications can be the set of revoked locks has been determined, the applications can be
notified, and the invalidated stateids can be freed and lock notified, and the invalidated stateids can be freed and lock
revocation acknowledged by using FREE_STATEID. revocation acknowledged by using FREE_STATEID.
8.6. Short and Long Leases 8.6. Short and Long Leases
When determining the time period for the server lease, the usual When determining the time period for the server lease, the usual
lease tradeoffs apply. A short lease is good for fast server lease trade-offs apply. A short lease is good for fast server
recovery at a cost of increased operations to effect lease renewal recovery at a cost of increased operations to effect lease renewal
(when there are no other operations during the period to effect lease (when there are no other operations during the period to effect lease
renewal as a side effect). A long lease is certainly kinder and renewal as a side effect). A long lease is certainly kinder and
gentler to servers trying to handle very large numbers of clients. gentler to servers trying to handle very large numbers of clients.
The number of extra requests to effect lock renewal drops in inverse The number of extra requests to effect lock renewal drops in inverse
proportion to the lease time. The disadvantages of a long lease proportion to the lease time. The disadvantages of a long lease
include the possibility of slower recovery after certain failures. include the possibility of slower recovery after certain failures.
After server failure, a longer grace period may be required when some After server failure, a longer grace period may be required when some
clients do not promptly reclaim their locks and do a global clients do not promptly reclaim their locks and do a global
RECLAIM_COMPLETE. In the event of client failure, the longer period RECLAIM_COMPLETE. In the event of client failure, the longer period
skipping to change at line 10901 skipping to change at line 10902
protected by OPEN_DELEGATE_READ delegations and notifications. Thus, protected by OPEN_DELEGATE_READ delegations and notifications. Thus,
no provision is made for reclaiming directory delegations in the no provision is made for reclaiming directory delegations in the
event of client or server restart. The client can simply establish a event of client or server restart. The client can simply establish a
directory delegation in the same fashion as was done initially. directory delegation in the same fashion as was done initially.
11. Multi-Server Namespace 11. Multi-Server Namespace
NFSv4.1 supports attributes that allow a namespace to extend beyond NFSv4.1 supports attributes that allow a namespace to extend beyond
the boundaries of a single server. It is desirable that clients and the boundaries of a single server. It is desirable that clients and
servers support construction of such multi-server namespaces. Use of servers support construction of such multi-server namespaces. Use of
such multi-server namespaces is OPTIONAL however, and for many such multi-server namespaces is OPTIONAL; however, and for many
purposes, single-server namespaces are perfectly acceptable. Use of purposes, single-server namespaces are perfectly acceptable. Use of
multi-server namespaces can provide many advantages, by separating a multi-server namespaces can provide many advantages, by separating a
file system's logical position in a namespace from the (possibly file system's logical position in a namespace from the (possibly
changing) logistical and administrative considerations that result in changing) logistical and administrative considerations that result in
particular file systems being located on particular servers via a particular file systems being located on particular servers via a
single network access paths known in advance or determined using DNS. single network access path known in advance or determined using DNS.
11.1. Terminology 11.1. Terminology
In this section as a whole (i.e. within all of Section 11), the In this section as a whole (i.e., within all of Section 11), the
phrase "client ID" always refers to the 64-bit shorthand identifier phrase "client ID" always refers to the 64-bit shorthand identifier
assigned by the server (a clientid4) and never to the structure which assigned by the server (a clientid4) and never to the structure that
the client uses to identify itself to the server (called an the client uses to identify itself to the server (called an
nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1 respectively). nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1, respectively).
The opaque identifier within those structures is referred to as a The opaque identifier within those structures is referred to as a
"client id string". "client id string".
11.1.1. Terminology Related to Trunking 11.1.1. Terminology Related to Trunking
It is particularly important to clarify the distinction between It is particularly important to clarify the distinction between
trunking detection and trunking discovery. The definitions we trunking detection and trunking discovery. The definitions we
present are applicable to all minor versions of NFSv4, but we will present are applicable to all minor versions of NFSv4, but we will
focus on how these terms apply to NFS version 4.1. focus on how these terms apply to NFS version 4.1.
skipping to change at line 10937 skipping to change at line 10938
network addresses are connected to the same NFSv4 server. The network addresses are connected to the same NFSv4 server. The
means available to make this determination depends on the protocol means available to make this determination depends on the protocol
version, and, in some cases, on the client implementation. version, and, in some cases, on the client implementation.
In the case of NFS version 4.1 and later minor versions, the means In the case of NFS version 4.1 and later minor versions, the means
of trunking detection are as described in this document and are of trunking detection are as described in this document and are
available to every client. Two network addresses connected to the available to every client. Two network addresses connected to the
same server can always be used together to access a particular same server can always be used together to access a particular
server but cannot necessarily be used together to access a single server but cannot necessarily be used together to access a single
session. See below for definitions of the terms "server- session. See below for definitions of the terms "server-
trunkable" and "session-trunkable" trunkable" and "session-trunkable".
* Trunking discovery is a process by which a client using one * Trunking discovery is a process by which a client using one
network address can obtain other addresses that are connected to network address can obtain other addresses that are connected to
the same server. Typically, it builds on a trunking detection the same server. Typically, it builds on a trunking detection
facility by providing one or more methods by which candidate facility by providing one or more methods by which candidate
addresses are made available to the client who can then use addresses are made available to the client, who can then use
trunking detection to appropriately filter them. trunking detection to appropriately filter them.
Despite the support for trunking detection there was no Despite the support for trunking detection, there was no
description of trunking discovery provided in RFC5661 [65], making description of trunking discovery provided in RFC 5661 [65],
it necessary to provide those means in this document. making it necessary to provide those means in this document.
The combination of a server network address and a particular The combination of a server network address and a particular
connection type to be used by a connection is referred to as a connection type to be used by a connection is referred to as a
"server endpoint". Although using different connection types may "server endpoint". Although using different connection types may
result in different ports being used, the use of different ports by result in different ports being used, the use of different ports by
multiple connections to the same network address in such cases is not multiple connections to the same network address in such cases is not
the essence of the distinction between the two endpoints used. This the essence of the distinction between the two endpoints used. This
is in contrast to the case of port-specific endpoints, in which the is in contrast to the case of port-specific endpoints, in which the
explicit specification of port numbers within network addresses is explicit specification of port numbers within network addresses is
used to allow a single server node to support multiple NFS servers. used to allow a single server node to support multiple NFS servers.
Two network addresses connected to the same server are said to be Two network addresses connected to the same server are said to be
server-trunkable. Two such addresses support the use of clientid ID server-trunkable. Two such addresses support the use of client ID
trunking, as described in Section 2.10.5. trunking, as described in Section 2.10.5.
Two network addresses connected to the same server such that those Two network addresses connected to the same server such that those
addresses can be used to support a single common session are referred addresses can be used to support a single common session are referred
to as session-trunkable. Note that two addresses may be server- to as session-trunkable. Note that two addresses may be server-
trunkable without being session-trunkable and that when two trunkable without being session-trunkable, and that, when two
connections of different connection types are made to the same connections of different connection types are made to the same
network address and are based on a single file system location entry network address and are based on a single file system location entry,
they are always session-trunkable, independent of the connection they are always session-trunkable, independent of the connection
type, as specified by Section 2.10.5, since their derivation from the type, as specified by Section 2.10.5, since their derivation from the
same file system location entry together with the identity of their same file system location entry, together with the identity of their
network addresses assures that both connections are to the same network addresses, assures that both connections are to the same
server and will return server-owner information allowing session server and will return server-owner information, allowing session
trunking to be used. trunking to be used.
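
The distinction between server-trunkable and session-trunkable addresses can be illustrated with a minimal Python sketch. It assumes that each address has already been probed with EXCHANGE_ID and that the comparison rules are those of Section 2.10.5 (matching client ID, so_major_id, and server scope for client ID trunking, plus a matching so_minor_id for session trunking). The ExchangeIdResult type and the classify function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Trunking(Enum):
    NONE = "not trunkable"
    SERVER = "server-trunkable"
    SESSION = "session-trunkable"


@dataclass(frozen=True)
class ExchangeIdResult:
    """Hypothetical summary of one EXCHANGE_ID reply (not a wire format)."""
    clientid: int
    so_major_id: bytes
    so_minor_id: int
    server_scope: bytes


def classify(a: ExchangeIdResult, b: ExchangeIdResult) -> Trunking:
    """Classify the relationship between two probed network addresses."""
    # Assumed rule: both addresses reach the same server when the client ID,
    # major server-owner ID, and server scope all match.
    same_server = (
        a.clientid == b.clientid
        and a.so_major_id == b.so_major_id
        and a.server_scope == b.server_scope
    )
    if not same_server:
        return Trunking.NONE
    # Assumed rule: a matching minor ID additionally allows a shared session.
    # Session-trunkable addresses are also server-trunkable.
    if a.so_minor_id == b.so_minor_id:
        return Trunking.SESSION
    return Trunking.SERVER
```

In this sketch, a result of Trunking.SERVER means the two addresses can be used together to access the server but not a single common session.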
11.1.2. Terminology Related to File System Location 11.1.2. Terminology Related to File System Location
Regarding terminology relating to the construction of multi-server Regarding the terminology that relates to the construction of multi-
namespaces out of a set of local per-server namespaces: server namespaces out of a set of local per-server namespaces:
* Each server has a set of exported file systems which may be * Each server has a set of exported file systems that may be
accessed by NFSv4 clients. Typically, this is done by assigning accessed by NFSv4 clients. Typically, this is done by assigning
each file system a name within the pseudo-fs associated with the each file system a name within the pseudo-fs associated with the
server, although the pseudo-fs may be dispensed with if there is server, although the pseudo-fs may be dispensed with if there is
only a single exported file system. Each such file system is part only a single exported file system. Each such file system is part
of the server's local namespace, and can be considered as a file of the server's local namespace, and can be considered as a file
system instance within a larger multi-server namespace. system instance within a larger multi-server namespace.
* The set of all exported file systems for a given server * The set of all exported file systems for a given server
constitutes that server's local namespace. constitutes that server's local namespace.
* In some cases, a server will have a namespace more extensive than * In some cases, a server will have a namespace more extensive than
its local namespace by using features associated with attributes its local namespace by using features associated with attributes
that provide file system location information. These features, that provide file system location information. These features,
which allow construction of a multi-server namespace, are all which allow construction of a multi-server namespace, are all
described in individual sections below and include referrals described in individual sections below and include referrals
(described in Section 11.5.6), migration (described in (Section 11.5.6), migration (Section 11.5.5), and replication
Section 11.5.5), and replication (described in Section 11.5.4). (Section 11.5.4).
* A file system present in a server's pseudo-fs may have multiple * A file system present in a server's pseudo-fs may have multiple
file system instances on different servers associated with it. file system instances on different servers associated with it.
All such instances are considered replicas of one another. All such instances are considered replicas of one another.
Whether such replicas can be used simultaneously is discussed in Whether such replicas can be used simultaneously is discussed in
Section 11.11.1, while the level of co-ordination between them Section 11.11.1, while the level of coordination between them
(important when switching between them) is discussed in Sections (important when switching between them) is discussed in Sections
11.11.2 through 11.11.8 below. 11.11.2 through 11.11.8 below.
* When a file system is present in a server's pseudo-fs, but there * When a file system is present in a server's pseudo-fs, but there
is no corresponding local file system, it is said to be "absent". is no corresponding local file system, it is said to be "absent".
In such cases, all associated instances will be accessed on other In such cases, all associated instances will be accessed on other
servers. servers.
Regarding terminology relating to attributes used in trunking Regarding the terminology that relates to attributes used in trunking
discovery and other multi-server namespace features: discovery and other multi-server namespace features:
* File system location attributes include the fs_locations and * File system location attributes include the fs_locations and
fs_locations_info attributes. fs_locations_info attributes.
* File system location entries provide the individual file system * File system location entries provide the individual file system
locations within the file system location attributes. Each such locations within the file system location attributes. Each such
entry specifies a server, in the form of a host name or an entry specifies a server, in the form of a hostname or an address,
address, and an fs name, which designates the location of the file and an fs name, which designates the location of the file system
system within the server's local namespace. A file system within the server's local namespace. A file system location entry
location entry designates a set of server endpoints to which the designates a set of server endpoints to which the client may
client may establish connections. There may be multiple endpoints establish connections. There may be multiple endpoints because a
because a host name may map to multiple network addresses and hostname may map to multiple network addresses and because
because multiple connection types may be used to communicate with multiple connection types may be used to communicate with a single
a single network address. However, except where an explicit port network address. However, except where explicit port numbers are
numbers are used to designate a set of server within a single used to designate a set of servers within a single server node,
server node, all such endpoints MUST designate a way of connecting all such endpoints MUST designate a way of connecting to a single
to a single server. The exact form of the location entry varies server. The exact form of the location entry varies with the
with the particular file system location attribute used, as particular file system location attribute used, as described in
described in Section 11.2. Section 11.2.
The network addresses used in file system location entries The network addresses used in file system location entries
typically appear without port number indications and are used to typically appear without port number indications and are used to
designate a server at one of the standard ports for NFS access, designate a server at one of the standard ports for NFS access,
e.g., 2049 for TCP, or 20049 for use with RPC-over-RDMA. Port e.g., 2049 for TCP or 20049 for use with RPC-over-RDMA. Port
numbers may be used in file system location entries to designate numbers may be used in file system location entries to designate
servers (typically user-level ones) accessed using other port servers (typically user-level ones) accessed using other port
numbers. In the case where network addresses indicate trunking numbers. In the case where network addresses indicate trunking
relationships, use of an explicit port number is inappropriate relationships, the use of an explicit port number is inappropriate
since trunking is a relationship between network addresses. See since trunking is a relationship between network addresses. See
Section 11.5.2 for details. Section 11.5.2 for details.
* File system location elements are derived from location entries * File system location elements are derived from location entries,
and each describes a particular network access path, consisting of and each describes a particular network access path consisting of
a network address and a location within the server's local a network address and a location within the server's local
namespace. Such location elements need not appear within a file namespace. Such location elements need not appear within a file
system location attribute, but the existence of each location system location attribute, but the existence of each location
element derives from a corresponding location entry. When a element derives from a corresponding location entry. When a
location entry specifies an IP address there is only a single location entry specifies an IP address, there is only a single
corresponding location element. File system location entries that corresponding location element. File system location entries that
contain a host name are resolved using DNS, and may result in one contain a hostname are resolved using DNS, and may result in one
or more location elements. All location elements consist of a or more location elements. All location elements consist of a
location address which includes the IP address of an interface to location address that includes the IP address of an interface to a
a server and an fs name which is the location of the file system server and an fs name, which is the location of the file system
within the server's local namespace. The fs name can be empty if within the server's local namespace. The fs name can be empty if
the server has no pseudo-fs and only a single exported file system the server has no pseudo-fs and only a single exported file system
at the root filehandle. at the root filehandle.
* Two file system location elements are said to be server-trunkable * Two file system location elements are said to be server-trunkable
if they specify the same fs name and their location addresses are if they specify the same fs name and their location addresses are
server-trunkable. When the server-trunkable. When the
corresponding network paths are used, the client will always be corresponding network paths are used, the client will always be
able to use client ID trunking, but will only be able to use able to use client ID trunking, but will only be able to use
session trunking if the paths are also session-trunkable. session trunking if the paths are also session-trunkable.
* Two file system location elements are said to be session-trunkable * Two file system location elements are said to be session-trunkable
if they specify the same fs name and their location addresses are if they specify the same fs name and their location addresses are
session-trunkable. When the session-trunkable. When the
corresponding network paths are used, the client will be able to corresponding network paths are used, the client will be able to
use either client ID trunking or session trunking. use either client ID trunking or session trunking.
Discussion of the term "replica" is complicated by the fact that the Discussion of the term "replica" is complicated by the fact that the
term was used in RFC5661 [65], with a meaning different from that in term was used in RFC 5661 [65] with a meaning different from that
this document. In short, in [65] each replica is identified by a used in this document. In short, in [65] each replica is identified
single network access path while, in the current document a set of by a single network access path, while in the current document, a set
network access paths which have server-trunkable network addresses of network access paths that have server-trunkable network addresses
and the same root-relative file system pathname is considered to be a and the same root-relative file system pathname is considered to be a
single replica with multiple network access paths. single replica with multiple network access paths.
Each set of server-trunkable location elements defines a set of Each set of server-trunkable location elements defines a set of
available network access paths to a particular file system. When available network access paths to a particular file system. When
there are multiple such file systems, each of which contains the same there are multiple such file systems, each of which contains the
data, these file systems are considered replicas of one another. same data, these file systems are considered replicas of one another.
Logically, such replication is symmetric, since the fs currently in Logically, such replication is symmetric, since the fs currently in
use and an alternate fs are replicas of each other. Often, in other use and an alternate fs are replicas of each other. Often, in other
documents, the term "replica" is not applied to the fs currently in documents, the term "replica" is not applied to the fs currently in
use, despite the fact that the replication relation is inherently use, despite the fact that the replication relation is inherently
symmetric. symmetric.
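
The derivation of location elements from location entries, and their grouping into replicas, might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the resolve callback stands in for a DNS lookup, the server_of callback stands in for the result of trunking detection, and both function names are hypothetical.

```python
import ipaddress
from typing import Callable, Dict, Hashable, List, Tuple

Element = Tuple[str, str]  # (location address, fs name)


def location_elements(
    server: str, fs_name: str, resolve: Callable[[str], List[str]]
) -> List[Element]:
    """Expand one location entry into its location elements."""
    try:
        # An entry holding an IP address yields exactly one element.
        ipaddress.ip_address(server)
    except ValueError:
        # Otherwise treat it as a hostname, which may map to several addresses.
        addresses = resolve(server)
    else:
        addresses = [server]
    return [(addr, fs_name) for addr in addresses]


def group_replicas(
    elements: List[Element], server_of: Callable[[str], Hashable]
) -> Dict[Tuple[Hashable, str], List[str]]:
    """Group elements so that each key identifies a single replica.

    Elements whose addresses are server-trunkable (same server identity)
    and whose fs names match are network access paths to one replica.
    """
    replicas: Dict[Tuple[Hashable, str], List[str]] = {}
    for addr, fs_name in elements:
        replicas.setdefault((server_of(addr), fs_name), []).append(addr)
    return replicas
```

Each key of the returned mapping corresponds to one replica in the sense used in this document, and the associated list holds its network access paths.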
11.2. File System Location Attributes 11.2. File System Location Attributes
NFSv4.1 contains attributes that provide information about how (i.e., NFSv4.1 contains attributes that provide information about how a
at what network address and namespace position) a given file system given file system may be accessed (i.e., at what network address and
may be accessed. As a result, file systems in the namespace of one namespace position). As a result, file systems in the namespace of
server can be associated with one or more instances of that file one server can be associated with one or more instances of that file
system on other servers. These attributes contain file system system on other servers. These attributes contain file system
location entries specifying a server address target (either as a DNS location entries specifying a server address target (either as a DNS
name representing one or more IP addresses or as a specific IP name representing one or more IP addresses or as a specific IP
address) together with the pathname of that file system within the address) together with the pathname of that file system within the
associated single-server namespace. associated single-server namespace.
The fs_locations_info RECOMMENDED attribute allows specification of The fs_locations_info RECOMMENDED attribute allows specification of
one or more file system instance locations where the data one or more file system instance locations where the data
corresponding to a given file system may be found. This attribute corresponding to a given file system may be found. In addition to
provides to the client, in addition to specification of file system the specification of file system instance locations, this attribute
instance locations, other helpful information such as: provides helpful information to do the following:
* Information guiding choices among the various file system * Guide choices among the various file system instances provided
instances provided (e.g., priority for use, writability, currency, (e.g., priority for use, writability, currency, etc.).
etc.).
* Information to help the client efficiently effect as seamless a * Help the client efficiently effect as seamless a transition as
transition as possible among multiple file system instances, when possible among multiple file system instances, when and if that
and if that should be necessary. should be necessary.
* Information helping to guide the selection of the appropriate * Guide the selection of the appropriate connection type to be used
connection type to be used when establishing a connection. when establishing a connection.
Within the fs_locations_info attribute, each fs_locations_server4 Within the fs_locations_info attribute, each fs_locations_server4
entry corresponds to a file system location entry with the fls_server entry corresponds to a file system location entry with the fls_server
field designating the server, with the location pathname within the field designating the server and with the location pathname within
server's pseudo-fs given by the fli_rootpath field of the encompassing the server's pseudo-fs given by the fli_rootpath field of the
fs_locations_item4. encompassing fs_locations_item4.
The fs_locations attribute defined in NFSv4.0 is also a part of The fs_locations attribute defined in NFSv4.0 is also a part of
NFSv4.1. This attribute only allows specification of the file system NFSv4.1. This attribute only allows specification of the file system
locations where the data corresponding to a given file system may be locations where the data corresponding to a given file system may be
found. Servers SHOULD make this attribute available whenever found. Servers SHOULD make this attribute available whenever
fs_locations_info is supported, but client use of fs_locations_info fs_locations_info is supported, but client use of fs_locations_info
is preferable, as it provides more information. is preferable because it provides more information.
Within the fs_location attribute, each fs_location4 contains a file Within the fs_locations attribute, each fs_location4 contains a file
system location entry with the server field designating the server system location entry with the server field designating the server
and the rootpath field giving the location pathname within the and the rootpath field giving the location pathname within the
server's pseudo-fs. server's pseudo-fs.
11.3. File System Presence or Absence 11.3. File System Presence or Absence
A given location in an NFSv4.1 namespace (typically but not A given location in an NFSv4.1 namespace (typically but not
necessarily a multi-server namespace) can have a number of file necessarily a multi-server namespace) can have a number of file
system instance locations associated with it (via the fs_locations or system instance locations associated with it (via the fs_locations or
fs_locations_info attribute). There may also be an actual current fs_locations_info attribute). There may also be an actual current
skipping to change at line 11294 skipping to change at line 11294
with an NFS4ERR_MOVED error. with an NFS4ERR_MOVED error.
* The unavailability of an attribute because of a file system's * The unavailability of an attribute because of a file system's
absence, even one that is ordinarily REQUIRED, does not result in absence, even one that is ordinarily REQUIRED, does not result in
any error indication. The set of attributes returned for the root any error indication. The set of attributes returned for the root
directory of the absent file system in that case is simply directory of the absent file system in that case is simply
restricted to those actually available. restricted to those actually available.
11.5. Uses of File System Location Information 11.5. Uses of File System Location Information
The file system location attributes (i.e. fs_locations and The file system location attributes (i.e., fs_locations and
fs_locations_info), together with the possibility of absent file fs_locations_info), together with the possibility of absent file
systems, provide a number of important facilities in providing systems, provide a number of important facilities for reliable,
reliable, manageable, and scalable data access. manageable, and scalable data access.
When a file system is present, these attributes can provide When a file system is present, these attributes can provide the
following:
* The locations of alternative replicas, to be used to access the * The locations of alternative replicas to be used to access the
same data in the event of server failures, communications same data in the event of server failures, communications
problems, or other difficulties that make continued access to the problems, or other difficulties that make continued access to the
current replica impossible or otherwise impractical. Provision current replica impossible or otherwise impractical. Provisioning
and use of such alternate replicas is referred to as "replication" and use of such alternate replicas is referred to as "replication"
and is discussed in Section 11.5.4 below. and is discussed in Section 11.5.4 below.
* The network address(es) to be used to access the current file * The network address(es) to be used to access the current file
system instance or replicas of it. Client use of this information system instance or replicas of it. Client use of this information
is discussed in Section 11.5.2 below. is discussed in Section 11.5.2 below.
Under some circumstances, multiple replicas may be used Under some circumstances, multiple replicas may be used
simultaneously to provide higher-performance access to the file simultaneously to provide higher-performance access to the file
system in question, although the lack of state sharing between system in question, although the lack of state sharing between
servers may be an impediment to such use. servers may be an impediment to such use.
When a file system is present and becomes absent, clients can be When a file system is present but becomes absent, clients can be
given the opportunity to have continued access to their data, using a given the opportunity to have continued access to their data using a
different replica. In this case, a continued attempt to use the data different replica. In this case, a continued attempt to use the data
in the now-absent file system will result in an NFS4ERR_MOVED error in the now-absent file system will result in an NFS4ERR_MOVED error,
and, at that point, the successor replica or set of possible replica and then the successor replica or set of possible replica choices can
choices can be fetched and used to continue access. Transfer of be fetched and used to continue access. Transfer of access to the
access to the new replica location is referred to as "migration", and new replica location is referred to as "migration" and is discussed
is discussed in Section 11.5.5 below. in Section 11.5.5 below.
Where a file system is currently absent, specification of file system When a file system is currently absent, specification of file system
location provides a means by which file systems located on one server location provides a means by which file systems located on one server
can be associated with a namespace defined by another server, thus can be associated with a namespace defined by another server, thus
allowing a general multi-server namespace facility. A designation of allowing a general multi-server namespace facility. A designation of
such a remote instance, in place of a file system not previously such a remote instance, in place of a file system not previously
present, is called a "pure referral" and is discussed in present, is called a "pure referral" and is discussed in
Section 11.5.6 below. Section 11.5.6 below.
Because client support for attributes related to file system location Because client support for attributes related to file system location
is OPTIONAL, a server may choose to take action to hide migration and is OPTIONAL, a server may choose to take action to hide migration and
referral events from such clients, by acting as a proxy, for example. referral events from such clients, by acting as a proxy, for example.
The server can determine the presence of client support from the The server can determine the presence of client support from the
arguments of the EXCHANGE_ID operation (see Section 18.35.3). arguments of the EXCHANGE_ID operation (see Section 18.35.3).
11.5.1. Combining Multiple Uses in a Single Attribute 11.5.1. Combining Multiple Uses in a Single Attribute
A file system location attribute will sometimes contain information A file system location attribute will sometimes contain information
relating to the location of multiple replicas which may be used in relating to the location of multiple replicas, which may be used in
different ways. different ways:
* File system location entries that relate to the file system * File system location entries that relate to the file system
instance currently in use provide trunking information, allowing instance currently in use provide trunking information, allowing
the client to find additional network addresses by which the the client to find additional network addresses by which the
instance may be accessed. instance may be accessed.
* File system location entries that provide information about * File system location entries that provide information about
replicas to which access is to be transferred. replicas to which access is to be transferred.
* Other file system location entries that relate to replicas that * Other file system location entries that relate to replicas that
are available to use in the event that access to the current are available to use in the event that access to the current
replica becomes unsatisfactory. replica becomes unsatisfactory.
In order to simplify client handling and allow the best choice of In order to simplify client handling and to allow the best choice of
replicas to access, the server should adhere to the following replicas to access, the server should adhere to the following
guidelines. guidelines:
* All file system location entries that relate to a single file * All file system location entries that relate to a single file
system instance should be adjacent. system instance should be adjacent.
* File system location entries that relate to the instance currently * File system location entries that relate to the instance currently
in use should appear first. in use should appear first.
* File system location entries that relate to replica(s) to which * File system location entries that relate to replica(s) to which
migration is occurring should appear before replicas which are migration is occurring should appear before replicas that are
available for later use if the current replica should become available for later use if the current replica should become
inaccessible. inaccessible.
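
The ordering guidelines above can be sketched in Python as a stable sort. The Entry type, its instance and role labels, and the order_entries function are assumptions made for illustration; the protocol does not carry such labels, and a server would derive them from its own configuration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical role labels, ranked in the order the guidelines suggest.
ROLE_RANK = {"current": 0, "migration-target": 1, "alternate": 2}


@dataclass(frozen=True)
class Entry:
    server: str    # hostname or address carried in the location entry
    instance: str  # hypothetical label for the file system instance
    role: str      # one of the keys of ROLE_RANK


def order_entries(entries: List[Entry]) -> List[Entry]:
    """Return entries ordered per the server-side guidelines."""
    # Remember the order in which instances first appear so that entries
    # for one instance stay adjacent without reordering unrelated instances.
    first_seen: Dict[str, int] = {}
    for entry in entries:
        first_seen.setdefault(entry.instance, len(first_seen))
    # sorted() is stable, so entries of one instance keep their relative order.
    return sorted(
        entries, key=lambda e: (ROLE_RANK[e.role], first_seen[e.instance])
    )
```

The sketch assumes that every entry for a given instance carries the same role, which is what keeps the entries for that instance adjacent after sorting.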
11.5.2. File System Location Attributes and Trunking 11.5.2. File System Location Attributes and Trunking
Trunking is the use of multiple connections between a client and Trunking is the use of multiple connections between a client and
server in order to increase the speed of data transfer. A client may server in order to increase the speed of data transfer. A client may
determine the set of network addresses to use to access a given file determine the set of network addresses to use to access a given file
system in a number of ways: system in a number of ways:
* When the name of the server is known to the client, it may use DNS * When the name of the server is known to the client, it may use DNS
to obtain a set of network addresses to use in accessing the to obtain a set of network addresses to use in accessing the
server. server.
* The client may fetch the file system location attribute for the * The client may fetch the file system location attribute for the
file system. This will provide either the name of the server file system. This will provide either the name of the server
(which can be turned into a set of network addresses using DNS), (which can be turned into a set of network addresses using DNS) or
or a set of server-trunkable location entries. Using the latter a set of server-trunkable location entries. Using the latter
alternative, the server can provide addresses it regards as alternative, the server can provide addresses it regards as
desirable to use to access the file system in question. Although desirable to use to access the file system in question. Although
these entries can contain port numbers, these port numbers are not these entries can contain port numbers, these port numbers are not
used in determining trunking relationships. Once the candidate used in determining trunking relationships. Once the candidate
addresses have been determined and EXCHANGE_ID done to the proper addresses have been determined and EXCHANGE_ID done to the proper
server, only the value of the so_major_id field returned by the server, only the value of the so_major_id field returned by the
servers in question determines whether a trunking relationship servers in question determines whether a trunking relationship
actually exists. actually exists.
It should be noted that the client, when it fetches a location When the client fetches a location attribute for a file system, it
attribute for a file system, may encounter multiple entries for a should be noted that the client may encounter multiple entries for a
number of reasons, so that, when determining trunking information, it number of reasons, such that when it determines trunking information,
may have to bypass addresses not trunkable with one already known. it may have to bypass addresses not trunkable with one already known.
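The filtering step described above can be sketched as follows. This is a minimal illustration, not protocol text: the helper name trunkable_with and the probe_major_id callback (standing in for "perform EXCHANGE_ID against this address and return the so_major_id from eir_server_owner") are hypothetical, and error handling for unreachable addresses is omitted.

```python
from typing import Callable, Iterable, List


def trunkable_with(
    known_addr: str,
    candidates: Iterable[str],
    probe_major_id: Callable[[str], bytes],
) -> List[str]:
    """Return the candidate addresses that are trunkable with known_addr.

    probe_major_id is a hypothetical callback that performs EXCHANGE_ID
    against an address and returns the so_major_id the server reported.
    Port numbers in location entries play no part in the comparison.
    """
    known_id = probe_major_id(known_addr)
    usable = []
    for addr in candidates:
        if addr == known_addr:
            continue
        # Addresses reporting a different so_major_id belong to another
        # server and are bypassed when determining trunking information.
        if probe_major_id(addr) == known_id:
            usable.append(addr)
    return usable
```

A client might call this once per fetched location attribute, using the address it is already connected to as known_addr.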
The server can provide location entries that include either names or The server can provide location entries that include either names or
network addresses. It might use the latter form because of DNS- network addresses. It might use the latter form because of DNS-
related security concerns or because the set of addresses to be used related security concerns or because the set of addresses to be used
might require active management by the server. might require active management by the server.
Location entries used to discover candidate addresses for use in Location entries used to discover candidate addresses for use in
trunking are subject to change, as discussed in Section 11.5.7 below. trunking are subject to change, as discussed in Section 11.5.7 below.
The client may respond to such changes by using additional addresses The client may respond to such changes by using additional addresses
once they are verified or by ceasing to use existing ones. The once they are verified or by ceasing to use existing ones. The
server can force the client to cease using an address by returning server can force the client to cease using an address by returning
NFS4ERR_MOVED when that address is used to access a file system. NFS4ERR_MOVED when that address is used to access a file system.
This allows a transfer of client access which is similar to This allows a transfer of client access that is similar to migration,
migration, although the same file system instance is accessed although the same file system instance is accessed throughout.
throughout.
11.5.3. File System Location Attributes and Connection Type Selection 11.5.3. File System Location Attributes and Connection Type Selection
Because of the need to support multiple types of connections, clients Because of the need to support multiple types of connections, clients
face the issue of determining the proper connection type to use when face the issue of determining the proper connection type to use when
establishing a connection to a given server network address. In some establishing a connection to a given server network address. In some
cases, this issue can be addressed through the use of the connection cases, this issue can be addressed through the use of the connection
"step-up" facility described in Section 18.36. However, because "step-up" facility described in Section 18.36. However, because
there are cases is which that facility is not available, the client there are cases in which that facility is not available, the client
may have to choose a connection type with no possibility of changing may have to choose a connection type with no possibility of changing
it within the scope of a single connection. it within the scope of a single connection.
The two file system location attributes differ as to the information The two file system location attributes differ as to the information
made available in this regard. Fs_locations provides no information made available in this regard. The fs_locations attribute provides
to support connection type selection. As a result, clients no information to support connection type selection. As a result,
supporting multiple connection types would need to attempt to clients supporting multiple connection types would need to attempt to
establish connections using multiple connection types until the one establish connections using multiple connection types until the one
preferred by the client is successfully established. preferred by the client is successfully established.
Fs_locations_info includes a flag, FSLI4TF_RDMA, which, when set, The fs_locations_info attribute includes a flag, FSLI4TF_RDMA, which,
indicates that RPC-over-RDMA support is available using the specified when set, indicates that RPC-over-RDMA support is available using the
location entry, by "stepping up" an existing TCP connection to specified location entry, by "stepping up" an existing TCP connection
include support for RDMA operation. This flag makes it convenient to include support for RDMA operation. This flag makes it convenient
for a client wishing to use RDMA. When this flag is set, it can for a client wishing to use RDMA. When this flag is set, it can
establish a TCP connection and then convert that connection to use establish a TCP connection and then convert that connection to use
RDMA by using the step-up facility. RDMA by using the step-up facility.
Irrespective of the particular attribute used, when there is no Irrespective of the particular attribute used, when there is no
indication that a step-up operation can be performed, a client indication that a step-up operation can be performed, a client
supporting RDMA operation can establish a new RDMA connection and it supporting RDMA operation can establish a new RDMA connection, and it
can be bound to the session already established by the TCP can be bound to the session already established by the TCP
connection, allowing the TCP connection to be dropped and the session connection, allowing the TCP connection to be dropped and the session
converted to further use in RDMA mode, if the server supports that. converted to further use in RDMA mode, if the server supports that.
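The choices described in this section can be summarized in a small decision sketch. The function name, the step labels, and the use of None to mean "only fs_locations was available" are illustrative assumptions; the numeric value given for FSLI4TF_RDMA is also an assumption and should be checked against the protocol's XDR definitions.

```python
from typing import List, Optional

# Assumed bit value for illustration only; confirm against the XDR.
FSLI4TF_RDMA = 0x01


def plan_connection(client_supports_rdma: bool,
                    transport_flags: Optional[int]) -> List[str]:
    """Return an ordered list of connection steps for one location entry.

    transport_flags is None when the entry came from fs_locations, which
    carries no connection type information.
    """
    if not client_supports_rdma:
        return ["tcp"]
    if transport_flags is not None and transport_flags & FSLI4TF_RDMA:
        # The server advertises step-up: connect with TCP, then convert.
        return ["tcp", "step-up-to-rdma"]
    # No step-up indication: a separate RDMA connection may be attempted
    # and bound to the existing session, if the server supports that.
    return ["tcp", "new-rdma-connection-bound-to-session"]
```

The last branch is only an attempt; a client would fall back to continued TCP use if the server does not accept the RDMA connection.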
11.5.4. File System Replication 11.5.4. File System Replication
The fs_locations and fs_locations_info attributes provide alternative The fs_locations and fs_locations_info attributes provide alternative
file system locations, to be used to access data in place of or in file system locations, to be used to access data in place of or in
addition to the current file system instance. On first access to a addition to the current file system instance. On first access to a
file system, the client should obtain the set of alternate locations file system, the client should obtain the set of alternate locations
skipping to change at line 11471 skipping to change at line 11471
file system impossible or otherwise impractical, the client can use file system impossible or otherwise impractical, the client can use
the alternate locations as a way to get continued access to its data. the alternate locations as a way to get continued access to its data.
The alternate locations may be physical replicas of the (typically The alternate locations may be physical replicas of the (typically
read-only) file system data supplemented by possible asynchronous read-only) file system data supplemented by possible asynchronous
propagation of updates. Alternatively, they may provide for the use propagation of updates. Alternatively, they may provide for the use
of various forms of server clustering in which multiple servers of various forms of server clustering in which multiple servers
provide alternate ways of accessing the same physical file system. provide alternate ways of accessing the same physical file system.
How the difference between replicas affects file system transitions How the difference between replicas affects file system transitions
can be represented within the fs_locations and fs_locations_info can be represented within the fs_locations and fs_locations_info
attributes and how the client deals with file system transition attributes, and how the client deals with file system transition
issues will be discussed in detail in later sections. issues will be discussed in detail in later sections.
Although the location attributes provide some information about the Although the location attributes provide some information about the
nature of the inter-replica transition, many aspects of the semantics nature of the inter-replica transition, many aspects of the semantics
of possible asynchronous updates are not currently described by the of possible asynchronous updates are not currently described by the
protocol, making it necessary that clients using replication to protocol, which makes it necessary for clients using replication to
switch among replicas undergoing change familiarize themselves with switch among replicas undergoing change to familiarize themselves
the semantics of the update approach used. Because of this lack of with the semantics of the update approach used. Because of this lack
specificity, many applications may find use of migration more of specificity, many applications may find the use of migration more
appropriate, since, in that case, the server, when effecting the appropriate, since, in that case, the server, when effecting the
transition, has established a point in time such that all updates transition, has established a point in time such that all updates
made before that can be propagated to the new replica as part of the made before that can be propagated to the new replica as part of the
migration event. migration event.
11.5.4.1. File System Trunking Presented as Replication 11.5.4.1. File System Trunking Presented as Replication
In some situations, a file system location entry may indicate a file In some situations, a file system location entry may indicate a file
system access path to be used as an alternate location, where system access path to be used as an alternate location, where
trunking, rather than replication, is to be used. The situations in trunking, rather than replication, is to be used. The situations in
which this is appropriate are limited to those in which both of the which this is appropriate are limited to those in which both of the
following are true. following are true:
* The two file system locations (i.e., the one on which the location * The two file system locations (i.e., the one on which the location
attribute is obtained and the one specified in the file system attribute is obtained and the one specified in the file system
location entry) designate the same locations within their location entry) designate the same locations within their
respective single-server namespaces. respective single-server namespaces.
* The two server network addresses (i.e., the one being used to * The two server network addresses (i.e., the one being used to
obtain the location attribute and the one specified in the file obtain the location attribute and the one specified in the file
system location entry) designate the same server (as indicated by system location entry) designate the same server (as indicated by
the same value of the so_major_id field of the eir_server_owner the same value of the so_major_id field of the eir_server_owner
field returned in response to EXCHANGE_ID). field returned in response to EXCHANGE_ID).
When these conditions hold, operations using both access paths are When these conditions hold, operations using both access paths are
generally trunked, although, when the attribute fs_locations_info is generally trunked, although trunking may be disallowed when the
used, trunking may be disallowed: attribute fs_locations_info is used:
* When the fs_locations_info attribute shows the two entries as not * When the fs_locations_info attribute shows the two entries as not
having the same simultaneous-use class, trunking is inhibited and having the same simultaneous-use class, trunking is inhibited, and
the two access paths cannot be used together. the two access paths cannot be used together.
In this case the two paths can be used serially with no transition In this case, the two paths can be used serially with no
activity required on the part of the client. In this case, any transition activity required on the part of the client, and any
transition between access paths is transparent, and the client, in transition between access paths is transparent. In transferring
transferring access from one to the other, is acting as it would access from one to the other, the client acts as if communication
in the event that communication is interrupted, with a new were interrupted, establishing a new connection and possibly a new
connection and possibly a new session being established to session to continue access to the same file system.
continue access to the same file system.
* Note that for two such location entries, any information within * Note that for two such location entries, any information within
the fs_locations_info attribute that indicates the need for the fs_locations_info attribute that indicates the need for
special transition activity, i.e., the appearance of the two file special transition activity, i.e., the appearance of the two file
system location entries with different handle, fileid, write- system location entries with different handle, fileid, write-
verifier, change, and readdir classes, indicates a serious verifier, change, and readdir classes, indicates a serious
problem. The client, if it allows transition to the file system problem. The client, if it allows transition to the file system
instance at all, must not treat any transition as a transparent instance at all, must not treat any transition as a transparent
one. The server SHOULD NOT indicate that these two entries (for one. The server SHOULD NOT indicate that these two entries (for
the same file system on the same server) belong to different the same file system on the same server) belong to different
handle, fileid, write-verifier, change, and readdir classes, handle, fileid, write-verifier, change, and readdir classes,
whether or not the two entries are shown belonging to the same whether or not the two entries are shown belonging to the same
simultaneous-use class. simultaneous-use class.
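The conditions and exceptions above can be sketched as a classification of two location entries. The LocationEntry type, its field names, and the returned labels are hypothetical conveniences for illustration; they do not correspond to protocol-defined structures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LocationEntry:
    so_major_id: bytes              # from eir_server_owner via EXCHANGE_ID
    fs_path: Tuple[str, ...]        # location within the server namespace
    simultaneous_use_class: int
    # handle, fileid, write-verifier, change, and readdir classes
    transition_classes: Tuple[int, int, int, int, int]


def classify_pair(current: LocationEntry, other: LocationEntry) -> str:
    """Classify how two entries relate under the rules sketched above."""
    same_server = current.so_major_id == other.so_major_id
    same_path = current.fs_path == other.fs_path
    if not (same_server and same_path):
        return "not-trunking"       # ordinary replication rules apply
    if current.transition_classes != other.transition_classes:
        # Same file system on the same server, yet shown as needing
        # special transition activity: must not be treated as transparent.
        return "inconsistent"
    if current.simultaneous_use_class != other.simultaneous_use_class:
        return "serial-use"         # paths usable one after the other
    return "trunked"
```

The inconsistency check is made before the simultaneous-use check because the text treats differing transition classes as a problem whether or not the simultaneous-use classes match.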
These situations were recognized by [65], even though that document These situations were recognized by [65], even though that document
made no explicit mention of trunking. made no explicit mention of trunking:
* It treated the situation that we describe as trunking as one of * It treated the situation that we describe as trunking as one of
simultaneous use of two distinct file system instances, even simultaneous use of two distinct file system instances, even
though, in the explanatory framework now used to describe the though, in the explanatory framework now used to describe the
situation, the case is one in which a single file system is situation, the case is one in which a single file system is
accessed by two different trunked addresses. accessed by two different trunked addresses.
* It treated the situation in which two paths are to be used * It treated the situation in which two paths are to be used
serially as a special sort of "transparent transition". however, serially as a special sort of "transparent transition". However,
in the descriptive framework now used to categorize transition in the descriptive framework now used to categorize transition
situations, this is considered a case of a "network endpoint situations, this is considered a case of a "network endpoint
transition" (see Section 11.9). transition" (see Section 11.9).
11.5.5. File System Migration 11.5.5. File System Migration
When a file system is present and becomes inaccessible using the When a file system is present and becomes inaccessible using the
current access path, the NFSv4.1 protocol provides a means by which current access path, the NFSv4.1 protocol provides a means by which
clients can be given the opportunity to have continued access to clients can be given the opportunity to have continued access to
their data. This may involve use of a different access path to the their data. This may involve using a different access path to the
existing replica or by providing a path to a different replica. The existing replica or providing a path to a different replica. The new
new access path or the location of the new replica is specified by a access path or the location of the new replica is specified by a file
file system location attribute. The ensuing migration of access system location attribute. The ensuing migration of access includes
includes the ability to retain locks across the transition. the ability to retain locks across the transition. Depending on
Depending on circumstances, this can involve: circumstances, this can involve:
* The continued use of the existing clientid when accessing the * The continued use of the existing clientid when accessing the
current replica using a new access path. current replica using a new access path.
* Use of lock reclaim, taking advantage of a per-fs grace period. * Use of lock reclaim, taking advantage of a per-fs grace period.
* Use of Transparent State Migration. * Use of Transparent State Migration.
Typically, a client will be accessing the file system in question, Typically, a client will be accessing the file system in question,
get an NFS4ERR_MOVED error, and then use a file system location get an NFS4ERR_MOVED error, and then use a file system location
attribute to determine the new access path for the data. When attribute to determine the new access path for the data. When
fs_locations_info is used, additional information will be available fs_locations_info is used, additional information will be available
that will define the nature of the client's handling of the that will define the nature of the client's handling of the
transition to a new server. transition to a new server.
In most instances, servers will choose to migrate all clients using a In most instances, servers will choose to migrate all clients using a
particular file system to a successor replica at the same time to particular file system to a successor replica at the same time to
avoid cases in which different clients are updating different avoid cases in which different clients are updating different
replicas. However migration of individual client can be helpful in replicas. However, migration of an individual client can be helpful
providing load balancing, as long as the replicas in question are in providing load balancing, as long as the replicas in question are
such that they represent the same data as described in such that they represent the same data as described in
Section 11.11.8. Section 11.11.8.
* In the case in which there is no transition between replicas * In the case in which there is no transition between replicas
(i.e., only a change in access path), there are no special (i.e., only a change in access path), there are no special
difficulties in using this mechanism to effect load balancing. difficulties in using this mechanism to effect load balancing.
* In the case in which the two replicas are sufficiently co- * In the case in which the two replicas are sufficiently coordinated
ordinated as to allow coherent simultaneous access to both by a as to allow a single client coherent, simultaneous access to both,
single client, there is, in general, no obstacle to use of there is, in general, no obstacle to the use of migration of
migration of particular clients to effect load balancing. particular clients to effect load balancing. Generally, such
Generally, such simultaneous use involves co-operation between simultaneous use involves cooperation between servers to ensure
servers to ensure that locks granted on two co-ordinated replicas that locks granted on two coordinated replicas cannot conflict and
cannot conflict and can remain effective when transferred to a can remain effective when transferred to a common replica.
common replica.
* In the case in which a large set of clients are accessing a file * In the case in which a large set of clients is accessing a file
system in a read-only fashion, in can be helpful to migrate all system in a read-only fashion, it can be helpful to migrate all
clients with writable access simultaneously, while using load clients with writable access simultaneously, while using load
balancing on the set of read-only copies, as long as the rules balancing on the set of read-only copies, as long as the rules in
appearing in Section 11.11.8, designed to prevent data reversion Section 11.11.8, which are designed to prevent data reversion, are
are adhered to. followed.
In other cases, the client might not have sufficient guarantees of In other cases, the client might not have sufficient guarantees of
data similarity/coherence to function properly (e.g. the data in the data similarity or coherence to function properly (e.g., the data in
two replicas is similar but not identical), and the possibility that the two replicas is similar but not identical), and the possibility
different clients are updating different replicas can exacerbate the that different clients are updating different replicas can exacerbate
difficulties, making use of load balancing in such situations a the difficulties, making the use of load balancing in such situations
perilous enterprise. a perilous enterprise.
The protocol does not specify how the file system will be moved The protocol does not specify how the file system will be moved
between servers or how updates to multiple replicas will be co- between servers or how updates to multiple replicas will be
ordinated. It is anticipated that a number of different server-to- coordinated. It is anticipated that a number of different server-to-
server co-ordination mechanisms might be used with the choice left to server coordination mechanisms might be used, with the choice left to
the server implementer. The NFSv4.1 protocol specifies the method the server implementer. The NFSv4.1 protocol specifies the method
used to communicate the migration event between client and server. used to communicate the migration event between client and server.
The new location may be, in the case of various forms of server In the case of various forms of server clustering, the new location
clustering, another server providing access to the same physical file may be another server providing access to the same physical file
system. The client's responsibilities in dealing with this system. The client's responsibilities in dealing with this
transition will depend on whether a switch between replicas has transition will depend on whether a switch between replicas has
occurred and the means the server has chosen to provide continuity of occurred and the means the server has chosen to provide continuity of
locking state. These issues will be discussed in detail below. locking state. These issues will be discussed in detail below.
Although a single successor location is typical, multiple locations Although a single successor location is typical, multiple locations
may be provided. When multiple locations are provided, the client may be provided. When multiple locations are provided, the client
will typically use the first one provided. If that is inaccessible will typically use the first one provided. If that is inaccessible
for some reason, later ones can be used. In such cases the client for some reason, later ones can be used. In such cases, the client
might consider the transition to the new replica to be a migration might consider the transition to the new replica to be a migration
event, even though some of the servers involved might not be aware of event, even though some of the servers involved might not be aware of
the use of the server which was inaccessible. In such a case, a the use of the server that was inaccessible. In such a case, a
client might lose access to locking state as a result of the access client might lose access to locking state as a result of the access
transfer. transfer.
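The selection among multiple successor locations can be sketched as below. The function name and the is_reachable callback are hypothetical; the second return value only records that a later entry was used, which is the case in which the text warns that locking state might be lost.

```python
from typing import Callable, Optional, Sequence, Tuple


def pick_successor(
    locations: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Tuple[Optional[str], bool]:
    """Pick the first reachable location, in the order provided.

    Returns (location, skipped_earlier). skipped_earlier is True when an
    earlier entry was inaccessible, so the chosen server might not hold
    the client's locking state.
    """
    for index, location in enumerate(locations):
        if is_reachable(location):
            return location, index > 0
    return None, False
```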
When an alternate location is designated as the target for migration, When an alternate location is designated as the target for migration,
it must designate the same data (with metadata being the same to the it must designate the same data (with metadata being the same to the
degree indicated by the fs_locations_info attribute). Where file degree indicated by the fs_locations_info attribute). Where file
systems are writable, a change made on the original file system must systems are writable, a change made on the original file system must
be visible on all migration targets. Where a file system is not be visible on all migration targets. Where a file system is not
writable but represents a read-only copy (possibly periodically writable but represents a read-only copy (possibly periodically
updated) of a writable file system, similar requirements apply to the updated) of a writable file system, similar requirements apply to the
skipping to change at line 11683 skipping to change at line 11681
to different locations as reported to individual clients, in order to to different locations as reported to individual clients, in order to
adapt to client physical location or to effect load balancing. When adapt to client physical location or to effect load balancing. When
both read-only and read-write file systems are present, some of the both read-only and read-write file systems are present, some of the
read-only locations might not be absolutely up-to-date (as they would read-only locations might not be absolutely up-to-date (as they would
have to be in the case of replication and migration). Servers may have to be in the case of replication and migration). Servers may
also specify file system locations that include client-substituted also specify file system locations that include client-substituted
variables so that different clients are referred to different file variables so that different clients are referred to different file
systems (with different data contents) based on client attributes systems (with different data contents) based on client attributes
such as CPU architecture. such as CPU architecture.
When the fs_locations_info attribute is such that that there are If the fs_locations_info attribute lists multiple possible targets,
multiple possible targets listed, the relationships among them may be the relationships among them may be important to the client in
important to the client in selecting which one to use. The same selecting which one to use. The same rules specified in
rules specified in Section 11.5.5 regarding multiple migration Section 11.5.5 regarding multiple migration targets apply to
targets apply to these multiple replicas as well. For example, the these multiple replicas as well. For example, the client might
client might prefer a writable target on a server that has additional prefer a writable target on a server that has additional writable
writable replicas to which it subsequently might switch. Note that, replicas to which it subsequently might switch. Note that, as
as distinguished from the case of replication, there is no need to distinguished from the case of replication, there is no need to deal
deal with the case of propagation of updates made by the current with the case of propagation of updates made by the current client,
client, since the current client has not accessed the file system in since the current client has not accessed the file system in
question. question.
Use of multi-server namespaces is enabled by NFSv4.1 but is not Use of multi-server namespaces is enabled by NFSv4.1 but is not
required. The use of multi-server namespaces and their scope will required. The use of multi-server namespaces and their scope will
depend on the applications used and system administration depend on the applications used and system administration
preferences. preferences.
Multi-server namespaces can be established by a single server Multi-server namespaces can be established by a single server
providing a large set of pure referrals to all of the included file providing a large set of pure referrals to all of the included file
systems. Alternatively, a single multi-server namespace may be systems. Alternatively, a single multi-server namespace may be
skipping to change at line 11717 skipping to change at line 11715
Generally, multi-server namespaces are for the most part uniform, in Generally, multi-server namespaces are for the most part uniform, in
that the same data made available to one client at a given location that the same data made available to one client at a given location
in the namespace is made available to all clients at that namespace in the namespace is made available to all clients at that namespace
location. However, there are facilities provided that allow location. However, there are facilities provided that allow
different clients to be directed to different sets of data, for different clients to be directed to different sets of data, for
reasons such as enabling adaptation to such client characteristics as reasons such as enabling adaptation to such client characteristics as
CPU architecture. These facilities are described in Section 11.17.3. CPU architecture. These facilities are described in Section 11.17.3.
Note that it is possible, when providing a uniform namespace, to Note that it is possible, when providing a uniform namespace, to
provide different location entries to different clients, in order to provide different location entries to different clients in order to
provide each client with a copy of the data physically closest to it, provide each client with a copy of the data physically closest to it
or otherwise optimize access (e.g. provide load balancing). or otherwise optimize access (e.g., provide load balancing).
11.5.7. Changes in a File System Location Attribute 11.5.7. Changes in a File System Location Attribute
Although clients will typically fetch a file system location Although clients will typically fetch a file system location
attribute when first accessing a file system and when NFS4ERR_MOVED attribute when first accessing a file system and when NFS4ERR_MOVED
is returned, a client can choose to fetch the attribute periodically, is returned, a client can choose to fetch the attribute periodically,
in which case the value fetched may change over time. in which case, the value fetched may change over time.
For clients not prepared to access multiple replicas simultaneously For clients not prepared to access multiple replicas simultaneously
(see Section 11.11.1), the handling of the various cases of location (see Section 11.11.1), the handling of the various cases of location
change is as follows: change is as follows:
* Changes in the list of replicas or in the network addresses * Changes in the list of replicas or in the network addresses
associated with replicas do not require immediate action. The associated with replicas do not require immediate action. The
client will typically update its list of replicas to reflect the client will typically update its list of replicas to reflect the
new information. new information.
* Additions to the list of network addresses for the current file * Additions to the list of network addresses for the current file
system instance need not be acted on promptly. However, to system instance need not be acted on promptly. However, to
prepare for the case in which a migration event occurs prepare for a subsequent migration event, the client can choose to
subsequently, the client can choose to take note of the new take note of the new address and then use it whenever it needs to
address and then use it whenever it needs to switch access to a switch access to a new replica.
new replica.
* Deletions from the list of network addresses for the current file * Deletions from the list of network addresses for the current file
system instance do not need to be acted on immediately by ceasing system instance do not require the client to immediately cease use
use of existing access paths although new connections are not to of existing access paths, although new connections are not to be
be established on addresses that have been deleted. However, established on addresses that have been deleted. However, clients
clients can choose to act on such deletions by making preparations can choose to act on such deletions by preparing for an eventual
for an eventual shift in access which would become unavoidable as shift in access, which becomes unavoidable as soon as the server
soon as the server indicates that a particular network access path returns NFS4ERR_MOVED to indicate that a particular network access
is not usable to access the current file system, by returning path is not usable to access the current file system.
NFS4ERR_MOVED.
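The handling described in the list above can be illustrated with a minimal sketch. The class AddressTracker and all of its method names are hypothetical and are not part of the protocol; the sketch only models the bookkeeping a client might choose to do, assuming addresses are represented as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class AddressTracker:
    """Hypothetical client-side bookkeeping for one file system instance."""
    in_use: set = field(default_factory=set)   # addresses with live access paths
    listed: set = field(default_factory=set)   # addresses in the latest location list

    def apply_update(self, new_list):
        # Neither additions nor deletions force immediate action;
        # the client simply records the new list.
        self.listed = set(new_list)

    def spare(self):
        # Added addresses noted for use if access must be switched later.
        return sorted(self.listed - self.in_use)

    def deprecated(self):
        # Deleted addresses whose existing access paths may continue for now.
        return sorted(self.in_use - self.listed)

    def may_connect(self, addr):
        # New connections are not established on deleted addresses.
        return addr in self.listed

    def on_moved(self, addr):
        # NFS4ERR_MOVED on a path makes the shift in access unavoidable.
        self.in_use.discard(addr)
        return self.spare()
```

A client following this sketch keeps using a deleted address until the server returns NFS4ERR_MOVED on it, and only then falls back to the addresses it noted earlier.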
For clients that are prepared to access several replicas For clients that are prepared to access several replicas
simultaneously, the following additional cases need to be addressed. simultaneously, the following additional cases need to be addressed.
As in the cases discussed above, changes in the set of replicas need As in the cases discussed above, changes in the set of replicas need
not be acted upon promptly, although the client has the option of not be acted upon promptly, although the client has the option of
adjusting its access even in the absence of difficulties that would adjusting its access even in the absence of difficulties that would
lead to a new replica to be selected. lead to the selection of a new replica.
* When a new replica is added which may be accessed simultaneously * When a new replica is added, which may be accessed simultaneously
with one currently in use, the client is free to use the new with one currently in use, the client is free to use the new
replica immediately. replica immediately.
* When a replica currently in use is deleted from the list, the * When a replica currently in use is deleted from the list, the
client need not cease using it immediately. However, since the client need not cease using it immediately. However, since the
server may subsequently force such use to cease (by returning server may subsequently force such use to cease (by returning
NFS4ERR_MOVED), clients might decide to limit the need for later NFS4ERR_MOVED), clients might decide to limit the need for later
state transfer. For example, new opens might be done on other state transfer. For example, new opens might be done on other
replicas, rather than on one not present in the list. replicas, rather than on one not present in the list.
11.6. Trunking without File System Location Information 11.6. Trunking without File System Location Information
In situations in which a file system is accessed using two server- In situations in which a file system is accessed using two server-
trunkable addresses (as indicated by the same value of the trunkable addresses (as indicated by the same value of the
so_major_id field of the eir_server_owner field returned in response so_major_id field of the eir_server_owner field returned in response
to EXCHANGE_ID), trunked access is allowed even though there might to EXCHANGE_ID), trunked access is allowed even though there might
not be any location entries specifically indicating the use of not be any location entries specifically indicating the use of
trunking for that file system. trunking for that file system.
This situation was recognized by [65], even though that document made This situation was recognized by [65], although that document made no
no explicit mention of trunking and treated the situation as one of explicit mention of trunking and treated the situation as one of
simultaneous use of two distinct file system instances, even though, simultaneous use of two distinct file system instances. In the
in the explanatory framework now used to describe the situation, the explanatory framework now used to describe the situation, the case is
case is one in which a single file system is accessed by two one in which a single file system is accessed by two different
different trunked addresses. trunked addresses.
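As a rough illustration of the so_major_id comparison described in this section, the sketch below groups addresses by the so_major_id value obtained from EXCHANGE_ID. The function name group_trunkable and the dictionary input are assumptions made for illustration; any further conditions the full specification places on trunking detection are deliberately omitted.

```python
from collections import defaultdict


def group_trunkable(major_id_by_address):
    """Group addresses that reported the same so_major_id.

    major_id_by_address is a hypothetical mapping from a network address
    to the so_major_id bytes returned in eir_server_owner for that address.
    """
    groups = defaultdict(list)
    for addr, major_id in major_id_by_address.items():
        groups[major_id].append(addr)
    # Only groups with at least two addresses offer trunked access.
    return [sorted(addrs) for addrs in groups.values() if len(addrs) > 1]
```

For example, two addresses that both report the same so_major_id fall into one group and may be used for trunked access even when no location entry mentions trunking.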
11.7. Users and Groups in a Multi-server Namespace 11.7. Users and Groups in a Multi-Server Namespace
As in the case of a single-server environment (see Section 5.9, when As in the case of a single-server environment (see Section 5.9), when
an owner or group name of the form "id@domain" is assigned to a file, an owner or group name of the form "id@domain" is assigned to a file,
there is an implicit promise to return that same string when the there is an implicit promise to return that same string when the
corresponding attribute is interrogated subsequently. In the case of corresponding attribute is interrogated subsequently. In the case of
a multi-server namespace, that same promise applies even if server a multi-server namespace, that same promise applies even if server
boundaries have been crossed. Similarly, when the owner attribute of boundaries have been crossed. Similarly, when the owner attribute of
a file is derived from the security principal which created the file, a file is derived from the security principal that created the file,
that attribute should have the same value even if the interrogation that attribute should have the same value even if the interrogation
occurs on a different server from the file creation. occurs on a different server from the file creation.
Similarly, the set of security principals recognized by all the Similarly, the set of security principals recognized by all the
participating servers needs to be the same, with each such principal participating servers needs to be the same, with each such principal
having the same credentials, regardless of the particular server having the same credentials, regardless of the particular server
being accessed. being accessed.
In order to meet these requirements, those setting up multi-server In order to meet these requirements, those setting up multi-server
namespaces will need to limit the servers included so that: namespaces will need to limit the servers included so that:
* In all cases in which more than a single domain is supported, the * In all cases in which more than a single domain is supported, the
requirements stated in RFC8000 [31] are to be respected. requirements stated in RFC 8000 [31] are to be respected.
* All servers support a common set of domains which includes all of * All servers support a common set of domains that includes all of
the domains clients use and expect to see returned as the domain the domains clients use and expect to see returned as the domain
portion of an owner or group in the form "id@domain". Note that portion of an owner or group in the form "id@domain". Note that,
although this set most often consists of a single domain, it is although this set most often consists of a single domain, it is
possible for multiple domains to be supported. possible for multiple domains to be supported.
* All servers, for each domain that they support, accept the same * All servers, for each domain that they support, accept the same
set of user and group ids as valid. set of user and group ids as valid.
* All servers recognize the same set of security principals. For * All servers recognize the same set of security principals. For
each principal, the same credential is required, independent of each principal, the same credential is required, independent of
the server being accessed. In addition, the group membership for the server being accessed. In addition, the group membership for
each such principal is to be the same, independent of the server each such principal is to be the same, independent of the server
accessed. accessed.
Note that there is no requirement in general that the users Note that there is no requirement in general that the users
corresponding to particular security principals have the same local corresponding to particular security principals have the same local
representation on each server, even though it is most often the case representation on each server, even though it is most often the case
that this is so. that this is so.
When AUTH_SYS is used, the following additional requirements must be When AUTH_SYS is used, the following additional requirements must be
met: met:
* Only a single NFSv4 domain can be supported through use of * Only a single NFSv4 domain can be supported through the use of
AUTH_SYS. AUTH_SYS.
* The "local" representation of all owners and groups must be the * The "local" representation of all owners and groups must be the
same on all servers. The word "local" is used here since that is same on all servers. The word "local" is used here since that is
the way that numeric user and group ids are described in the way that numeric user and group ids are described in
Section 5.9. However, when AUTH_SYS or stringified numeric owners Section 5.9. However, when AUTH_SYS or stringified numeric owners
or groups are used, these identifiers are not truly local, since or groups are used, these identifiers are not truly local, since
they are known to the clients as well as the server. they are known to the clients as well as to the server.
Similarly, when stringified numeric user and group ids are used, the Similarly, when stringified numeric user and group ids are used, the
"local" representation of all owners and groups must be the same on "local" representation of all owners and groups must be the same on
all servers, even when AUTH_SYS is not used. all servers, even when AUTH_SYS is not used.
11.8. Additional Client-Side Considerations 11.8. Additional Client-Side Considerations
When clients make use of servers that implement referrals, When clients make use of servers that implement referrals,
replication, and migration, care should be taken that a user who replication, and migration, care should be taken that a user who
mounts a given file system that includes a referral or a relocated mounts a given file system that includes a referral or a relocated
skipping to change at line 11900 skipping to change at line 11896
How these are dealt with is discussed in Section 11.11. How these are dealt with is discussed in Section 11.11.
* Those in which access to the current file system instance is * Those in which access to the current file system instance is
retained, while the network path used to access that instance is retained, while the network path used to access that instance is
changed. This case is discussed in Section 11.10. changed. This case is discussed in Section 11.10.
11.10. Effecting Network Endpoint Transitions 11.10. Effecting Network Endpoint Transitions
The endpoints used to access a particular file system instance may The endpoints used to access a particular file system instance may
change in a number of ways, as listed below. In each of these cases, change in a number of ways, as listed below. In each of these cases,
the same fsid, filehandles, stateids, and client IDs are used to the same fsid, filehandles, stateids, and client IDs are used to
continue access, with a continuity of lock state. In many cases, the continue access, with a continuity of lock state. In many cases, the
same sessions can also be used. same sessions can also be used.
The appropriate action depends on the set of replacement addresses The appropriate action depends on the set of replacement addresses
(i.e. server endpoints which are server-trunkable with one previously that are available for use (i.e., server endpoints that are server-
being used) which are available for use. trunkable with one previously being used).
* When use of a particular address is to cease and there is also * When use of a particular address is to cease, and there is also
another one currently in use which is server-trunkable with it, another address currently in use that is server-trunkable with it,
requests that would have been issued on the address whose use is requests that would have been issued on the address whose use is
to be discontinued can be issued on the remaining address(es). to be discontinued can be issued on the remaining address(es).
When an address is server-trunkable but not session-trunkable with When an address is server-trunkable but not session-trunkable with
the address whose use is to be discontinued, the request might the address whose use is to be discontinued, the request might
need to be modified to reflect the fact that a different session need to be modified to reflect the fact that a different session
will be used. will be used.
* When use of a particular connection is to cease, as indicated by * When use of a particular connection is to cease, as indicated by
receiving NFS4ERR_MOVED when using that connection but that receiving NFS4ERR_MOVED when using that connection, but that
address is still indicated as accessible according to the address is still indicated as accessible according to the
appropriate file system location entries, it is likely that appropriate file system location entries, it is likely that
requests can be issued on a new connection of a different requests can be issued on a new connection of a different
connection type, once that connection is established. Since any connection type once that connection is established. Since any
two, non-port-specific server endpoints that share a network two non-port-specific server endpoints that share a network
address are inherently session-trunkable, the client can use address are inherently session-trunkable, the client can use
BIND_CONN_TO_SESSION to access the existing session using the new BIND_CONN_TO_SESSION to access the existing session using the new
connection and proceed to access the file system using the new connection and proceed to access the file system using the new
connection. connection.
* When there are no potential replacement addresses in use but there * When there are no potential replacement addresses in use, but
are valid addresses session-trunkable with the one whose use is to there are valid addresses session-trunkable with the one whose use
be discontinued, the client can use BIND_CONN_TO_SESSION to access is to be discontinued, the client can use BIND_CONN_TO_SESSION to
the existing session using the new address. Although the target access the existing session using the new address. Although the
session will generally be accessible, there may be rare situations target session will generally be accessible, there may be rare
in which that session is no longer accessible, when an attempt is situations in which that session is no longer accessible when an
made to bind the new connection to it. In this case, the client attempt is made to bind the new connection to it. In this case,
can create a new session to enable continued access to the the client can create a new session to enable continued access to
existing instance using the new connection, providing for use of the existing instance using the new connection, providing for the
existing filehandles, stateids, and client ids while providing use of existing filehandles, stateids, and client ids while
continuity of locking state. supplying continuity of locking state.
* When there is no potential replacement address in use and there * When there is no potential replacement address in use, and there
are no valid addresses session-trunkable with the one whose use is are no valid addresses session-trunkable with the one whose use is
to be discontinued, other server-trunkable addresses may be used to be discontinued, other server-trunkable addresses may be used
to provide continued access. Although use of CREATE_SESSION is to provide continued access. Although the use of CREATE_SESSION
available to provide continued access to the existing instance, is available to provide continued access to the existing instance,
servers have the option of providing continued access to the servers have the option of providing continued access to the
existing session through the new network access path in a fashion existing session through the new network access path in a fashion
similar to that provided by session migration (see Section 11.12). similar to that provided by session migration (see Section 11.12).
To take advantage of this possibility, clients can perform an To take advantage of this possibility, clients can perform an
initial BIND_CONN_TO_SESSION, as in the previous case, and use initial BIND_CONN_TO_SESSION, as in the previous case, and use
CREATE_SESSION only if that fails. CREATE_SESSION only if that fails.
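The four cases above can be summarized as a decision sketch. The function plan_transition, its boolean parameters, and the order in which the cases are tested are assumptions made for illustration; only the operation names BIND_CONN_TO_SESSION and CREATE_SESSION come from the text.

```python
def plan_transition(replacement_in_use, address_still_listed,
                    session_trunkable_available, server_trunkable_available):
    """Return an ordered list of hypothetical client steps.

    Each flag describes which kind of replacement endpoint is available
    for the address or connection whose use is to cease.
    """
    if replacement_in_use:
        # A server-trunkable address is already in use; requests may need
        # adjusting if it is not also session-trunkable.
        return ["reissue requests on remaining address"]
    if address_still_listed:
        # Only the connection is unusable; the address itself remains valid.
        return ["open connection of a different type",
                "BIND_CONN_TO_SESSION"]
    if session_trunkable_available or server_trunkable_available:
        # Try to keep the existing session first, then fall back.
        return ["connect to new address",
                "BIND_CONN_TO_SESSION",
                "CREATE_SESSION if the bind fails"]
    # No replacement endpoint: not covered by the cases in this section.
    return []
```

In the third and fourth cases, the sketch uses the same sequence because the text suggests attempting BIND_CONN_TO_SESSION in both and using CREATE_SESSION only on failure.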
11.11. Effecting File System Transitions 11.11. Effecting File System Transitions
There are a range of situations in which there is a change to be There are a range of situations in which there is a change to be
skipping to change at line 11970 skipping to change at line 11966
For reasons explained in that section, most transitions will involve For reasons explained in that section, most transitions will involve
a transition from a single replica to a corresponding replacement a transition from a single replica to a corresponding replacement
replica. When effecting replica transition, some types of sharing replica. When effecting replica transition, some types of sharing
between the replicas may affect handling of the transition as between the replicas may affect handling of the transition as
described in Sections 11.11.2 through 11.11.8 below. The attribute described in Sections 11.11.2 through 11.11.8 below. The attribute
fs_locations_info provides helpful information to allow the client to fs_locations_info provides helpful information to allow the client to
determine the degree of inter-replica sharing. determine the degree of inter-replica sharing.
With regard to some types of state, the degree of continuity across With regard to some types of state, the degree of continuity across
the transition depends on the occasion prompting the transition, with the transition depends on the occasion prompting the transition, with
transitions initiated by the servers (i.e. migration) offering much transitions initiated by the servers (i.e., migration) offering much
more scope for a non-disruptive transition than cases in which the more scope for a nondisruptive transition than cases in which the
client on its own shifts its access to another replica (i.e. client on its own shifts its access to another replica (i.e.,
replication). This issue potentially applies to locking state and to replication). This issue potentially applies to locking state and to
session state, which are dealt with below as follows: session state, which are dealt with below as follows:
* An introduction to the possible means of providing continuity in * An introduction to the possible means of providing continuity in
these areas appears in Section 11.11.9 below. these areas appears in Section 11.11.9 below.
* Transparent State Migration is introduced in Section 11.12. The * Transparent State Migration is introduced in Section 11.12. The
possible transfer of session state is addressed there as well. possible transfer of session state is addressed there as well.
* The client handling of transitions, including determining how to * The client handling of transitions, including determining how to
deal with the various means that the server might take to supply deal with the various means that the server might take to supply
effective continuity of locking state is discussed in effective continuity of locking state, is discussed in
Section 11.13. Section 11.13.
* The servers' (source and destination) responsibilities in * The source and destination servers' responsibilities in effecting
effecting Transparent Migration of locking and session state are Transparent State Migration of locking and session state are
discussed in Section 11.14. discussed in Section 11.14.
11.11.1. File System Transitions and Simultaneous Access 11.11.1. File System Transitions and Simultaneous Access
The fs_locations_info attribute (described in Section 11.17) may The fs_locations_info attribute (described in Section 11.17) may
indicate that two replicas may be used simultaneously, although some indicate that two replicas may be used simultaneously, although some
situations in which such simultaneous access is permitted are more situations in which such simultaneous access is permitted are more
appropriately described as instances of trunking (see appropriately described as instances of trunking (see
Section 11.5.4.1). Although situations in which multiple replicas Section 11.5.4.1). Although situations in which multiple replicas
may be accessed simultaneously are somewhat similar to those in which may be accessed simultaneously are somewhat similar to those in which
a single replica is accessed by multiple network addresses, there are a single replica is accessed by multiple network addresses, there are
important differences, since locking state is not shared among important differences since locking state is not shared among
multiple replicas. multiple replicas.
Because of this difference in state handling, many clients will not Because of this difference in state handling, many clients will not
have the ability to take advantage of the fact that such replicas have the ability to take advantage of the fact that such replicas
represent the same data. Such clients will not be prepared to use represent the same data. Such clients will not be prepared to use
multiple replicas simultaneously but will access each file system multiple replicas simultaneously but will access each file system
using only a single replica, although the replica selected might make using only a single replica, although the replica selected might make
multiple server-trunkable addresses available. multiple server-trunkable addresses available.
Clients who are prepared to use multiple replicas simultaneously will Clients who are prepared to use multiple replicas simultaneously can
divide opens among replicas however they choose. Once that choice is divide opens among replicas however they choose. Once that choice is
made, any subsequent transitions will treat the set of locking state made, any subsequent transitions will treat the set of locking state
associated with each replica as a single entity. associated with each replica as a single entity.
For example, if one of the replicas becomes unavailable, access will For example, if one of the replicas becomes unavailable, access will
be transferred to a different replica, also capable of simultaneous be transferred to a different replica, which is also capable of
access with the one still in use. simultaneous access with the one still in use.
When there is no such replica, the transition may be to the replica When there is no such replica, the transition may be to the replica
already in use. At this point, the client has a choice between already in use. At this point, the client has a choice between
merging the locking state for the two replicas under the aegis of the merging the locking state for the two replicas under the aegis of the
sole replica in use or treating these separately, until another sole replica in use or treating these separately until another
replica capable of simultaneous access presents itself. replica capable of simultaneous access presents itself.
11.11.2. Filehandles and File System Transitions 11.11.2. Filehandles and File System Transitions
There are a number of ways in which filehandles can be handled across There are a number of ways in which filehandles can be handled across
a file system transition. These can be divided into two broad a file system transition. These can be divided into two broad
classes depending upon whether the two file systems across which the classes depending upon whether the two file systems across which the
transition happens share sufficient state to effect some sort of transition happens share sufficient state to effect some sort of
continuity of file system handling. continuity of file system handling.
skipping to change at line 12162 skipping to change at line 12158
When the two file systems have consistent change attribute formats, When the two file systems have consistent change attribute formats,
and this fact is communicated to the client by reporting in the same and this fact is communicated to the client by reporting in the same
change class, the client may assume a continuity of change attribute change class, the client may assume a continuity of change attribute
construction and handle this situation just as it would be handled construction and handle this situation just as it would be handled
without any file system transition. without any file system transition.
11.11.6. Write Verifiers and File System Transitions 11.11.6. Write Verifiers and File System Transitions
In a file system transition, the two file systems might be In a file system transition, the two file systems might be
cooperating in the handling of unstably written data. Clients can cooperating in the handling of unstably written data. Clients can
determine if this is the case, by seeing if the two file systems determine if this is the case by seeing if the two file systems
belong to the same write-verifier class. When this is the case, belong to the same write-verifier class. When this is the case,
write verifiers returned from one system may be compared to those write verifiers returned from one system may be compared to those
returned by the other and superfluous writes avoided. returned by the other and superfluous writes can be avoided.
When two file systems belong to different write-verifier classes, any When two file systems belong to different write-verifier classes, any
verifier generated by one must not be compared to one provided by the verifier generated by one must not be compared to one provided by the
other. Instead, the two verifiers should be treated as not equal other. Instead, the two verifiers should be treated as not equal
even when the values are identical. even when the values are identical.
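The comparison rule for write verifiers can be sketched as follows. The function verifiers_match and the representation of a write-verifier class as a plain integer are assumptions made for illustration.

```python
def verifiers_match(class_a, verifier_a, class_b, verifier_b):
    """Compare write verifiers from two file systems.

    class_a and class_b are the (hypothetically integer) write-verifier
    classes of the two file systems; the verifiers are opaque bytes.
    """
    if class_a != class_b:
        # Different classes: treat as not equal even if the bytes agree.
        return False
    return verifier_a == verifier_b
```

A result of False means the client cannot treat the earlier unstable writes as covered by the other file system and would need to handle them as it would after a verifier change.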
11.11.7. Readdir Cookies and Verifiers and File System Transitions 11.11.7. READDIR Cookies and Verifiers and File System Transitions
In a file system transition, the two file systems might be consistent In a file system transition, the two file systems might be consistent
in their handling of READDIR cookies and verifiers. Clients can in their handling of READDIR cookies and verifiers. Clients can
determine if this is the case, by seeing if the two file systems determine if this is the case by seeing if the two file systems
belong to the same readdir class. When this is the case, readdir belong to the same readdir class. When this is the case, readdir
class, READDIR cookies and verifiers from one system will be class, READDIR cookies, and verifiers from one system will be
recognized by the other and READDIR operations started on one server recognized by the other, and READDIR operations started on one server
can be validly continued on the other, simply by presenting the can be validly continued on the other simply by presenting the cookie
cookie and verifier returned by a READDIR operation done on the first and verifier returned by a READDIR operation done on the first file
file system to the second. system to the second.
When two file systems belong to different readdir classes, any When two file systems belong to different readdir classes, any
READDIR cookie and verifier generated by one is not valid on the READDIR cookie and verifier generated by one is not valid on the
second, and must not be presented to that server by the client. The second and must not be presented to that server by the client. The
client should act as if the verifier were rejected. client should act as if the verifier were rejected.
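One way a client might apply this rule is sketched below. The function readdir_resume_args and the integer readdir classes are hypothetical. The sketch also assumes that a client acting as if the verifier were rejected restarts the enumeration with a zero cookie and an all-zero 8-byte verifier; that is one possible reaction, not the only one.

```python
def readdir_resume_args(src_class, dst_class, cookie, verifier):
    """Pick the cookie and verifier to present after a transition.

    src_class and dst_class are the (hypothetically integer) readdir
    classes of the file system left behind and the one now in use.
    """
    if src_class == dst_class:
        # Same class: the READDIR sequence can simply be continued.
        return cookie, verifier
    # Different class: never present the old values; start over instead.
    return 0, bytes(8)
```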
11.11.8. File System Data and File System Transitions 11.11.8. File System Data and File System Transitions
When multiple replicas exist and are used simultaneously or in When multiple replicas exist and are used simultaneously or in
succession by a client, applications using them will normally expect succession by a client, applications using them will normally expect
that they contain either the same data or data that is consistent that they contain either the same data or data that is consistent
with the normal sorts of changes that are made by other clients with the normal sorts of changes that are made by other clients
updating the data of the file system (with metadata being the same to updating the data of the file system (with metadata being the same to
the degree indicated by the fs_locations_info attribute). However, the degree indicated by the fs_locations_info attribute). However,
when multiple file systems are presented as replicas of one another, when multiple file systems are presented as replicas of one another,
the precise relationship between the data of one and the data of the precise relationship between the data of one and the data of
another is not, as a general matter, specified by the NFSv4.1 another is not, as a general matter, specified by the NFSv4.1
protocol. It is quite possible to present as replicas file systems protocol. It is quite possible to present as replicas file systems
where the data of those file systems is sufficiently different that where the data of those file systems is sufficiently different that
some applications have problems dealing with the transition between some applications have problems dealing with the transition between
replicas. The namespace will typically be constructed so that replicas. The namespace will typically be constructed so that
applications can choose an appropriate level of support, so that in applications can choose an appropriate level of support, so that in
one position in the namespace a varied set of replicas might be one position in the namespace, a varied set of replicas might be
listed, while in another only those that are up-to-date would be listed, while in another, only those that are up-to-date would be
considered replicas. The protocol does define three special cases of considered replicas. The protocol does define three special cases of
the relationship among replicas to be specified by the server and the relationship among replicas to be specified by the server and
relied upon by clients: relied upon by clients:
* When multiple replicas exist and are used simultaneously by a * When multiple replicas exist and are used simultaneously by a
client (see the FSLIB4_CLSIMUL definition within client (see the FSLIB4_CLSIMUL definition within
fs_locations_info), they must designate the same data. Where file fs_locations_info), they must designate the same data. Where file
systems are writable, a change made on one instance must be systems are writable, a change made on one instance must be
visible on all instances at the same time, regardless of whether visible on all instances at the same time, regardless of whether
the interrogated instance is the one on which the modification was the interrogated instance is the one on which the modification was
done. This allows a client to use these replicas simultaneously done. This allows a client to use these replicas simultaneously
without any special adaptation to the fact that there are multiple without any special adaptation to the fact that there are multiple
replicas, beyond adapting to the fact that locks obtained on one replicas, beyond adapting to the fact that locks obtained on one
replica are maintained separately (i.e. under a different client replica are maintained separately (i.e., under a different client
ID). In this case, locks (whether share reservations or byte- ID). In this case, locks (whether share reservations or byte-
range locks) and delegations obtained on one replica are range locks) and delegations obtained on one replica are
immediately reflected on all replicas, in the sense that access immediately reflected on all replicas, in the sense that access
from all other servers is prevented regardless of the replica from all other servers is prevented regardless of the replica
used. However, because the servers are not required to treat two used. However, because the servers are not required to treat two
associated client IDs as representing the same client, it is best associated client IDs as representing the same client, it is best
to access each file using only a single client ID. to access each file using only a single client ID.
* When one replica is designated as the successor instance to * When one replica is designated as the successor instance to
another existing instance after return NFS4ERR_MOVED (i.e., the another existing instance after the return of NFS4ERR_MOVED (i.e.,
case of migration), the client may depend on the fact that all the case of migration), the client may depend on the fact that all
changes written to stable storage on the original instance are changes written to stable storage on the original instance are
written to stable storage of the successor (uncommitted writes are written to stable storage of the successor (uncommitted writes are
dealt with in Section 11.11.6 above). dealt with in Section 11.11.6 above).
* Where a file system is not writable but represents a read-only * Where a file system is not writable but represents a read-only
copy (possibly periodically updated) of a writable file system, copy (possibly periodically updated) of a writable file system,
clients have similar requirements with regard to the propagation clients have similar requirements with regard to the propagation
of updates. They may need a guarantee that any change visible on of updates. They may need a guarantee that any change visible on
the original file system instance must be immediately visible on the original file system instance must be immediately visible on
any replica before the client transitions access to that replica, any replica before the client transitions access to that replica,
skipping to change at line 12253 skipping to change at line 12249
transition to a replica, will see any reversion in file system transition to a replica, will see any reversion in file system
state. The specific means of this guarantee varies based on the state. The specific means of this guarantee varies based on the
value of the fss_type field that is reported as part of the value of the fss_type field that is reported as part of the
fs_status attribute (see Section 11.18). Since these file systems fs_status attribute (see Section 11.18). Since these file systems
are presumed to be unsuitable for simultaneous use, there is no are presumed to be unsuitable for simultaneous use, there is no
specification of how locking is handled; in general, locks specification of how locking is handled; in general, locks
obtained on one file system will be separate from those on others. obtained on one file system will be separate from those on others.
Since these are expected to be read-only file systems, this is not Since these are expected to be read-only file systems, this is not
likely to pose an issue for clients or applications. likely to pose an issue for clients or applications.
When none of these special situations apply, there is no basis, When none of these special situations applies, there is no basis
within the protocol for the client to make assumptions about the within the protocol for the client to make assumptions about the
contents of a replica file system or its relationship to previous contents of a replica file system or its relationship to previous
file system instances. Thus switching between nominally identical file system instances. Thus, switching between nominally identical
read-write file systems would not be possible, because either the read-write file systems would not be possible because either the
client does not use or the server does not support the client does not use the fs_locations_info attribute, or the server
fs_locations_info attribute. does not support it.
11.11.9. Lock State and File System Transitions 11.11.9. Lock State and File System Transitions
While accessing a file system, clients obtain locks enforced by the While accessing a file system, clients obtain locks enforced by the
server which may prevent actions by other clients that are server, which may prevent actions by other clients that are
inconsistent with those locks. inconsistent with those locks.
When access is transferred between replicas, clients need to be When access is transferred between replicas, clients need to be
assured that the actions disallowed by holding these locks cannot assured that the actions disallowed by holding these locks cannot
have occurred during the transition. This can be ensured by the have occurred during the transition. This can be ensured by the
methods below. Unless at least one of these is implemented, clients methods below. Unless at least one of these is implemented, clients
will not be assured of continuity of lock possession across a will not be assured of continuity of lock possession across a
migration event. migration event:
* Providing the client an opportunity to re-obtain its locks via a * Providing the client an opportunity to re-obtain its locks via a
per-fs grace period on the destination server, denying all clients per-fs grace period on the destination server, denying all clients
using the destination file system the opportunity to obtain new using the destination file system the opportunity to obtain new
locks that conflict which those held by the transferred client as locks that conflict with those held by the transferred client as
long as that client has not completed its per-fs grace period. long as that client has not completed its per-fs grace period.
Because the lock reclaim mechanism was originally defined to Because the lock reclaim mechanism was originally defined to
support server reboot, it implicitly assumes that file handles support server reboot, it implicitly assumes that filehandles
will, upon reclaim, will be the same as those at open. In the will, upon reclaim, be the same as those at open. In the case of
case of migration, this requires that source and destination migration, this requires that source and destination servers use
servers use the same filehandles, as evidenced by using the same the same filehandles, as evidenced by using the same server scope
server scope (see Section 2.10.4) or by showing this agreement (see Section 2.10.4) or by showing this agreement using
using fs_locations_info (see Section 11.11.2 above). fs_locations_info (see Section 11.11.2 above).
Note that such a grace period can be implemented without Note that such a grace period can be implemented without
interfering with the ability of non-transferred clients to obtain interfering with the ability of non-transferred clients to obtain
new locks while it is going on. As long as the destination server new locks while it is going on. As long as the destination server
is aware of the transferred locks, it can distinguish requests to is aware of the transferred locks, it can distinguish requests to
obtain new locks that conflict with existing locks from those that obtain new locks that conflict with existing locks from those that
do not, allowing it to treat such client requests without do not, allowing it to treat such client requests without
reference to the ongoing grace period. reference to the ongoing grace period.
* Locking state can be transferred as part of the transition by * Locking state can be transferred as part of the transition by
providing Transparent State Migration as described in providing Transparent State Migration as described in
Section 11.12. Section 11.12.
Of these, Transparent State Migration provides the smoother Of these, Transparent State Migration provides the smoother
experience for clients in that there is no need to go through a experience for clients in that there is no need to go through a
reclaim process before new locks can be obtained. However, it reclaim process before new locks can be obtained; however, it
requires a greater degree of inter-server co-ordination. In general, requires a greater degree of inter-server coordination. In general,
the servers taking part in migration are free to provide either the servers taking part in migration are free to provide either
facility. However, when the filehandles can differ across the facility. However, when the filehandles can differ across the
migration event, Transparent State Migration is the only available migration event, Transparent State Migration is the only available
means of providing the needed functionality. means of providing the needed functionality.
It should be noted that these two methods are not mutually exclusive It should be noted that these two methods are not mutually exclusive
and that a server might well provide both. In particular, if there and that a server might well provide both. In particular, if there
is some circumstance preventing a specific lock from being is some circumstance preventing a specific lock from being
transferred transparently, the destination server can allow it to be transferred transparently, the destination server can allow it to be
reclaimed, by implementing a per-fs grace period for the migrated reclaimed by implementing a per-fs grace period for the migrated file
file system. system.
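As an illustration of the choice described in Section 11.11.9, the following minimal Python sketch lists which lock-continuity methods remain usable for a given migration event. The function name, its parameters, and the string labels are hypothetical and are not defined by the protocol; the only rules encoded are those stated above: reclaim during a per-fs grace period presumes that filehandles are the same on source and destination, while Transparent State Migration does not.

```python
def usable_lock_continuity_methods(filehandles_preserved,
                                   tsm_supported,
                                   per_fs_grace_supported):
    """Return the methods that can preserve lock possession (hypothetical helper).

    filehandles_preserved: True when source and destination use the same
        filehandles (same server scope, or agreement shown through
        fs_locations_info).
    tsm_supported: True when the servers implement Transparent State Migration.
    per_fs_grace_supported: True when the destination offers a per-fs grace
        period for reclaim.
    """
    methods = []
    if tsm_supported:
        # Usable whether or not filehandles differ across the migration event.
        methods.append("transparent_state_migration")
    if per_fs_grace_supported and filehandles_preserved:
        # Reclaim assumes filehandles at reclaim match those at open.
        methods.append("per_fs_grace_reclaim")
    return methods
```

An empty result corresponds to the case in which clients are not assured of continuity of lock possession. A server that provides both methods could fall back to reclaim for any lock that could not be transferred transparently.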
11.11.9.1. Security Consideration Related to Reclaiming Lock State 11.11.9.1. Security Consideration Related to Reclaiming Lock State
after File System Transitions after File System Transitions
Although it is possible for a client reclaiming state to misrepresent Although it is possible for a client reclaiming state to misrepresent
its state, in the same fashion as described in Section 8.4.2.1.1, its state in the same fashion as described in Section 8.4.2.1.1, most
most implementations providing for such reclamation in the case of implementations providing for such reclamation in the case of file
file system transitions will have the ability to detect such system transitions will have the ability to detect such
misrepresentations. This limits the ability of unauthenticated misrepresentations. This limits the ability of unauthenticated
clients to execute denial-of-service attacks in these circumstances. clients to execute denial-of-service attacks in these circumstances.
Nevertheless, the rules stated in Section 8.4.2.1.1, regarding Nevertheless, the rules stated in Section 8.4.2.1.1 regarding
principal verification for reclaim requests, apply in this situation principal verification for reclaim requests apply in this situation
as well. as well.
Typically, implementations that support file system transitions will Typically, implementations that support file system transitions will
have extensive information about the locks to be transferred. This have extensive information about the locks to be transferred. This
is because: is because of the following:
* Since failure is not involved, there is no need store to locking * Since failure is not involved, there is no need to store locking
information in persistent storage. information in persistent storage.
* There is no need, as there is in the failure case, to update * There is no need, as there is in the failure case, to update
multiple repositories containing locking state to keep them in multiple repositories containing locking state to keep them in
sync. Instead, there is a one-time communication of locking state sync. Instead, there is a one-time communication of locking state
from the source to the destination server. from the source to the destination server.
* Providing this information avoids potential interference with * Providing this information avoids potential interference with
existing clients using the destination file system, by denying existing clients using the destination file system by denying them
them the ability to obtain new locks during the grace period. the ability to obtain new locks during the grace period.
When such detailed locking information, not necessarily including the When such detailed locking information, not necessarily including the
associated stateids, is available: associated stateids, is available:
* It is possible to detect reclaim requests that attempt to reclaim * It is possible to detect reclaim requests that attempt to reclaim
locks that did not exist before the transfer, rejecting them with locks that did not exist before the transfer, rejecting them with
NFS4ERR_RECLAIM_BAD (Section 15.1.9.4). NFS4ERR_RECLAIM_BAD (Section 15.1.9.4).
* It is possible when dealing with non-reclaim requests, to * It is possible when dealing with non-reclaim requests, to
determine whether they conflict with existing locks, eliminating determine whether they conflict with existing locks, eliminating
the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non-reclaim the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non-reclaim
requests. requests.
It is possible for implementations of grace periods in connection It is possible for implementations of grace periods in connection
with file system transitions not to have detailed locking information with file system transitions not to have detailed locking information
available at the destination server, in which case the security available at the destination server, in which case, the security
situation is exactly as described in Section 8.4.2.1.1. situation is exactly as described in Section 8.4.2.1.1.
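The checks that become possible when the destination server holds detailed information about transferred locks can be sketched as follows. This is a simplified Python model, not server code: the Lock record, the conflict rule, and the function names are assumptions made for illustration, and status codes are represented by their symbolic names. Returning NFS4ERR_DENIED for a conflicting non-reclaim request is likewise an assumption; the text above only states that NFS4ERR_GRACE need not be returned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    """Hypothetical record of a byte-range lock known to the destination."""
    owner: str
    filehandle: bytes
    start: int       # first byte covered
    end: int         # one past the last byte covered
    exclusive: bool


def conflicts(a, b):
    """Simplified rule: different owners, same file, overlapping range,
    and at least one of the two locks is exclusive."""
    if a.owner == b.owner or a.filehandle != b.filehandle:
        return False
    overlap = a.start < b.end and b.start < a.end
    return overlap and (a.exclusive or b.exclusive)


def handle_lock_request(request, is_reclaim, transferred_locks):
    """Decide the status for a lock request during the per-fs grace period."""
    if is_reclaim:
        # A reclaim must match a lock that existed before the transfer.
        if request in transferred_locks:
            return "NFS4_OK"
        return "NFS4ERR_RECLAIM_BAD"
    # Non-reclaim request: compare against the known transferred locks
    # instead of refusing everything with NFS4ERR_GRACE.
    if any(conflicts(request, held) for held in transferred_locks):
        return "NFS4ERR_DENIED"
    return "NFS4_OK"
```

When no such information is available at the destination, neither check can be made, and the situation is as described in Section 8.4.2.1.1.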
11.11.9.2. Leases and File System Transitions 11.11.9.2. Leases and File System Transitions
In the case of lease renewal, the client may not be submitting In the case of lease renewal, the client may not be submitting
requests for a file system that has been transferred to another requests for a file system that has been transferred to another
server. This can occur because of the lease renewal mechanism. The server. This can occur because of the lease renewal mechanism. The
client renews the lease associated with all file systems when client renews the lease associated with all file systems when
submitting a request on an associated session, regardless of the submitting a request on an associated session, regardless of the
specific file system being referenced. specific file system being referenced.
skipping to change at line 12438 skipping to change at line 12434
new server, the client should fetch the value of lease_time on the new server, the client should fetch the value of lease_time on the
new (i.e., destination) server, and use it for subsequent locking new (i.e., destination) server, and use it for subsequent locking
requests. However, the server must respect a grace period of at requests. However, the server must respect a grace period of at
least as long as the lease_time on the source server, in order to least as long as the lease_time on the source server, in order to
ensure that clients have ample time to reclaim their lock before ensure that clients have ample time to reclaim their lock before
potentially conflicting non-reclaimed locks are granted. potentially conflicting non-reclaimed locks are granted.
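The relationship between the two lease_time values can be shown with a short Python sketch. Both function names are hypothetical. Taking the maximum of the two values is an assumption about how a destination server might combine its own lease_time with the requirement stated above; the text itself only requires that the grace period be at least the source server's lease_time.

```python
def client_lease_time_after_migration(source_lease_time, dest_lease_time):
    """Lease time (seconds) the client uses for subsequent locking requests:
    the value fetched from the destination server."""
    return dest_lease_time


def minimum_per_fs_grace_period(source_lease_time, dest_lease_time):
    """Shortest grace period (seconds) the destination may apply to the
    migrated file system. It must cover the source server's lease_time;
    also covering the destination's own value is an assumption."""
    return max(source_lease_time, dest_lease_time)
```

For example, with a source lease_time of 90 seconds and a destination lease_time of 30 seconds, the client renews on a 30-second basis while the destination holds off conflicting non-reclaimed locks for at least 90 seconds.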
11.12. Transferring State upon Migration 11.12. Transferring State upon Migration
When the transition is a result of a server-initiated decision to When the transition is a result of a server-initiated decision to
transition access and the source and destination servers have transition access, and the source and destination servers have
implemented appropriate co-operation, it is possible to: implemented appropriate cooperation, it is possible to do the
following:
* Transfer locking state from the source to the destination server, * Transfer locking state from the source to the destination server
in a fashion similar to that provided by Transparent State in a fashion similar to that provided by Transparent State
Migration in NFSv4.0, as described in [68]. Server Migration in NFSv4.0, as described in [68]. Server
responsibilities are described in Section 11.14.2. responsibilities are described in Section 11.14.2.
* Transfer session state from the source to the destination server. * Transfer session state from the source to the destination server.
Server responsibilities in effecting such a transfer are described Server responsibilities in effecting such a transfer are described
in Section 11.14.3. in Section 11.14.3.
The means by which the client determines which of these transfer The means by which the client determines which of these transfer
events has occurred are described in Section 11.13. events has occurred are described in Section 11.13.
11.12.1. Transparent State Migration and pNFS 11.12.1. Transparent State Migration and pNFS
When pNFS is involved, the protocol is capable of supporting: When pNFS is involved, the protocol is capable of supporting:
* Migration of the Metadata Server (MDS), leaving the Data Servers * Migration of the Metadata Server (MDS), leaving the Data Servers
(DS's) in place. (DSs) in place.
* Migration of the file system as a whole, including the MDS and * Migration of the file system as a whole, including the MDS and
associated DS's. associated DSs.
* Replacement of one DS by another. * Replacement of one DS by another.
* Migration of a pNFS file system to one in which pNFS is not used. * Migration of a pNFS file system to one in which pNFS is not used.
* Migration of a file system not using pNFS to one in which layouts * Migration of a file system not using pNFS to one in which layouts
are available. are available.
Note that migration per se is only involved in the transfer of the Note that migration, per se, is only involved in the transfer of the
MDS function. Although the servicing of a layout may be transferred MDS function. Although the servicing of a layout may be transferred
from one data server to another, this is not done using the file system from one data server to another, this is not done using the file system
location attributes. The MDS can effect such transfers by recalling/ location attributes. The MDS can effect such transfers by recalling
revoking existing layouts and granting new ones on a different data or revoking existing layouts and granting new ones on a different
server. data server.
Migration of the MDS function is directly supported by Transparent Migration of the MDS function is directly supported by Transparent
State Migration. Layout state will normally be transparently State Migration. Layout state will normally be transparently
transferred, just as other state is. As a result, Transparent State transferred, just as other state is. As a result, Transparent State
Migration provides a framework in which, given appropriate inter-MDS Migration provides a framework in which, given appropriate inter-MDS
data transfer, one MDS can be substituted for another. data transfer, one MDS can be substituted for another.
Migration of the file system function as a whole can be accomplished Migration of the file system function as a whole can be accomplished
by recalling all layouts as part of the initial phase of the by recalling all layouts as part of the initial phase of the
migration process. As a result, IO will be done through the MDS migration process. As a result, I/O will be done through the MDS
during the migration process, and new layouts can be granted once the during the migration process, and new layouts can be granted once the
client is interacting with the new MDS. An MDS can also effect this client is interacting with the new MDS. An MDS can also effect this
sort of transition by revoking all layouts as part of Transparent sort of transition by revoking all layouts as part of Transparent
State Migration, as long as the client is notified about the loss of State Migration, as long as the client is notified about the loss of
locking state. locking state.
In order to allow migration to a file system on which pNFS is not In order to allow migration to a file system on which pNFS is not
supported, clients need to be prepared for a situation in which supported, clients need to be prepared for a situation in which
layouts are not available or supported on the destination file system layouts are not available or supported on the destination file system
and so direct IO requests to the destination server, rather than and so direct I/O requests to the destination server, rather than
depending on layouts being available. depending on layouts being available.
Replacement of one DS by another is not addressed by migration as Replacement of one DS by another is not addressed by migration as
such but can be effected by an MDS recalling layouts for the DS to be such but can be effected by an MDS recalling layouts for the DS to be
replaced and issuing new ones to be served by the successor DS. replaced and issuing new ones to be served by the successor DS.
Migration may transfer a file system from a server which does not Migration may transfer a file system from a server that does not
support pNFS to one which does. In order to properly adapt to this support pNFS to one that does. In order to properly adapt to this
situation, clients which support pNFS, but function adequately in its situation, clients that support pNFS, but function adequately in its
absence should check for pNFS support when a file system is migrated absence, should check for pNFS support when a file system is migrated
and be prepared to use pNFS when support is available on the and be prepared to use pNFS when support is available on the
destination. destination.
11.13. Client Responsibilities when Access is Transitioned 11.13. Client Responsibilities When Access Is Transitioned
For a client to respond to an access transition, it must become aware For a client to respond to an access transition, it must become aware
of it. The ways in which this can happen are discussed in of it. The ways in which this can happen are discussed in
Section 11.13.1 which discusses indications that a specific file Section 11.13.1, which discusses indications that a specific file
system access path has transitioned as well as situations in which system access path has transitioned as well as situations in which
additional activity is necessary to determine the set of file systems additional activity is necessary to determine the set of file systems
that have been migrated. Section 11.13.2 goes on to complete the that have been migrated. Section 11.13.2 goes on to complete the
discussion of how the set of migrated file systems might be discussion of how the set of migrated file systems might be
determined. Sections 11.13.3 through 11.13.5 discuss how the client determined. Sections 11.13.3 through 11.13.5 discuss how the client
should deal with each transition it becomes aware of, either directly should deal with each transition it becomes aware of, either directly
or as a result of migration discovery. or as a result of migration discovery.
The following terms are used to describe client activities: The following terms are used to describe client activities:
* "Transition recovery" refers to the process of restoring access to * "Transition recovery" refers to the process of restoring access to
a file system on which NFS4ERR_MOVED was received. a file system on which NFS4ERR_MOVED was received.
* "Migration recovery" to that subset of transition recovery which * "Migration recovery" refers to that subset of transition recovery
applies when the file system has migrated to a different replica. that applies when the file system has migrated to a different
replica.
* "Migration discovery" refers to the process of determining which * "Migration discovery" refers to the process of determining which
file system(s) have been migrated. It is necessary to avoid a file system(s) have been migrated. It is necessary to avoid a
situation in which leases could expire when a file system is not situation in which leases could expire when a file system is not
accessed for a long period of time, since a client unaware of the accessed for a long period of time, since a client unaware of the
migration might be referencing an unmigrated file system and not migration might be referencing an unmigrated file system and not
renewing the lease associated with the migrated file system. renewing the lease associated with the migrated file system.
11.13.1. Client Transition Notifications 11.13.1. Client Transition Notifications
When there is a change in the network access path which a client is When there is a change in the network access path that a client is to
to use to access a file system, there are a number of related status use to access a file system, there are a number of related status
indications with which clients need to deal: indications with which clients need to deal:
* If an attempt is made to use or return a filehandle within a file * If an attempt is made to use or return a filehandle within a file
system that is no longer accessible at the address previously used system that is no longer accessible at the address previously used
to access it, the error NFS4ERR_MOVED is returned. to access it, the error NFS4ERR_MOVED is returned.
Exceptions are made to allow such file handles to be used when Exceptions are made to allow such filehandles to be used when
interrogating a file system location attribute. This enables a interrogating a file system location attribute. This enables a
client to determine a new replica's location or a new network client to determine a new replica's location or a new network
access path. access path.
This condition continues on subsequent attempts to access the file This condition continues on subsequent attempts to access the file
system in question. The only way the client can avoid the error system in question. The only way the client can avoid the error
is to cease accessing the file system in question at its old is to cease accessing the file system in question at its old
server location and access it instead using a different address at server location and access it instead using a different address at
which it is now available. which it is now available.
skipping to change at line 12570 skipping to change at line 12568
a file system that is no longer accessible on the server at which a file system that is no longer accessible on the server at which
it was previously available, the response will contain a lease- it was previously available, the response will contain a lease-
migrated indication, with the SEQ4_STATUS_LEASE_MOVED status bit migrated indication, with the SEQ4_STATUS_LEASE_MOVED status bit
being set. being set.
This condition continues until the client acknowledges the This condition continues until the client acknowledges the
notification by fetching a file system location attribute for the notification by fetching a file system location attribute for the
file system whose network access path is being changed. When file system whose network access path is being changed. When
there are multiple such file systems, a location attribute for there are multiple such file systems, a location attribute for
each such file system needs to be fetched. The location attribute each such file system needs to be fetched. The location attribute
for all migrated file system needs to be fetched in order to clear for all migrated file systems needs to be fetched in order to
the condition. Even after the condition is cleared, the client clear the condition. Even after the condition is cleared, the
needs to respond by using the location information to access the client needs to respond by using the location information to
file system at its new location to ensure that leases are not access the file system at its new location to ensure that leases
needlessly expired. are not needlessly expired.
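One way to picture the persistence of the lease-migrated indication is as a per-client set of migrated file systems awaiting acknowledgment. The Python sketch below is a hypothetical server-side model: the class and method names are invented for illustration, and the flag value is the one assigned to SEQ4_STATUS_LEASE_MOVED in the protocol's XDR description.

```python
SEQ4_STATUS_LEASE_MOVED = 0x00000080  # bit in the SEQUENCE sr_status_flags


class LeaseMovedTracker:
    """Per-client record of migrated file systems not yet acknowledged
    (hypothetical model, not part of the protocol)."""

    def __init__(self):
        self._unacknowledged = set()

    def note_migration(self, fsid):
        """A file system on which this client holds locking state has moved."""
        self._unacknowledged.add(fsid)

    def note_location_fetch(self, fsid):
        """The client fetched a file system location attribute for fsid."""
        self._unacknowledged.discard(fsid)

    def sequence_status_flags(self):
        """The indication stays set until every migrated file system has
        been acknowledged."""
        return SEQ4_STATUS_LEASE_MOVED if self._unacknowledged else 0
```

Clearing the indication is not the end of the client's work: it still needs to access each file system at its new location so that the associated leases are renewed there.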
Unlike the case of NFSv4.0, in which the corresponding conditions are Unlike NFSv4.0, in which the corresponding conditions are both errors
both errors and thus mutually exclusive, in NFSv4.1 the client can, and thus mutually exclusive, in NFSv4.1 the client can, and often
and often will, receive both indications on the same request. As a will, receive both indications on the same request. As a result,
result, implementations need to address the question of how to co- implementations need to address the question of how to coordinate the
ordinate the necessary recovery actions when both indications arrive necessary recovery actions when both indications arrive in the
in the response to the same request. It should be noted that when response to the same request. It should be noted that when
processing an NFSv4 COMPOUND, the server will normally decide whether processing an NFSv4 COMPOUND, the server will normally decide whether
SEQ4_STATUS_LEASE_MOVED is to be set before it determines which file SEQ4_STATUS_LEASE_MOVED is to be set before it determines which file
system will be referenced or whether NFS4ERR_MOVED is to be returned. system will be referenced or whether NFS4ERR_MOVED is to be returned.
Since these indications are not mutually exclusive in NFSv4.1, the Since these indications are not mutually exclusive in NFSv4.1, the
following combinations are possible results when a COMPOUND is following combinations are possible results when a COMPOUND is
issued: issued:
* The COMPOUND status is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED * The COMPOUND status is NFS4ERR_MOVED, and SEQ4_STATUS_LEASE_MOVED
is asserted. is asserted.
In this case, transition recovery is required. While it is In this case, transition recovery is required. While it is
possible that migration discovery is needed in addition, it is possible that migration discovery is needed in addition, it is
likely that only the accessed file system has transitioned. In likely that only the accessed file system has transitioned. In
any case, because addressing NFS4ERR_MOVED is necessary to allow any case, because addressing NFS4ERR_MOVED is necessary to allow
the rejected requests to be processed on the target, dealing with the rejected requests to be processed on the target, dealing with
it will typically have priority over migration discovery. it will typically have priority over migration discovery.
* The COMPOUND status is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED * The COMPOUND status is NFS4ERR_MOVED, and SEQ4_STATUS_LEASE_MOVED
is clear. is clear.
In this case, transition recovery is also required. It is clear In this case, transition recovery is also required. It is clear
that migration discovery is not needed to find file systems that that migration discovery is not needed to find file systems that
have been migrated other that the one returning NFS4ERR_MOVED. have been migrated other than the one returning NFS4ERR_MOVED.
Cases in which this result can arise include a referral or a Cases in which this result can arise include a referral or a
migration for which there is no associated locking state. This migration for which there is no associated locking state. This
can also arise in cases in which an access path transition other can also arise in cases in which an access path transition other
than migration occurs within the same server. In such a case, than migration occurs within the same server. In such a case,
there is no need to set SEQ4_STATUS_LEASE_MOVED, since the lease there is no need to set SEQ4_STATUS_LEASE_MOVED, since the lease
remains associated with the current server even though the access remains associated with the current server even though the access
path has changed. path has changed.
* The COMPOUND status is not NFS4ERR_MOVED and * The COMPOUND status is not NFS4ERR_MOVED, and
SEQ4_STATUS_LEASE_MOVED is asserted. SEQ4_STATUS_LEASE_MOVED is asserted.
In this case, no transition recovery activity is required on the In this case, no transition recovery activity is required on the
file system(s) accessed by the request. However, to prevent file system(s) accessed by the request. However, to prevent
avoidable lease expiration, migration discovery needs to be done avoidable lease expiration, migration discovery needs to be done.
* The COMPOUND status is not NFS4ERR_MOVED and * The COMPOUND status is not NFS4ERR_MOVED, and
SEQ4_STATUS_LEASE_MOVED is clear. SEQ4_STATUS_LEASE_MOVED is clear.
In this case, neither transition-related activity nor migration In this case, neither transition-related activity nor migration
discovery is required. discovery is required.
Note that the specified actions only need to be taken if they are not Note that the specified actions only need to be taken if they are not
already going on. For example, when NFS4ERR_MOVED is received when already going on. For example, when NFS4ERR_MOVED is received while
accessing a file system for which transition recovery already going accessing a file system for which transition recovery is already
on, the client merely waits for that recovery to be completed while occurring, the client merely waits for that recovery to be completed,
the receipt of SEQ4_STATUS_LEASE_MOVED indication only needs to while the receipt of the SEQ4_STATUS_LEASE_MOVED indication only
initiate migration discovery for a server if such discovery is not needs to initiate migration discovery for a server if such discovery
already underway for that server. is not already underway for that server.
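The case analysis above can be sketched in Python as follows. This is only an illustration, not part of the specification: the function name, its keyword arguments, and the action labels are hypothetical. The two numeric constants are the protocol values for NFS4ERR_MOVED and the SEQ4_STATUS_LEASE_MOVED flag. Because the two indications are not mutually exclusive, each is tested independently, and an activity is only requested if it is not already in progress.

```python
NFS4ERR_MOVED = 10019            # nfsstat4 value for NFS4ERR_MOVED
SEQ4_STATUS_LEASE_MOVED = 0x80   # bit in the SEQUENCE sr_status_flags word


def required_actions(compound_status, sr_status_flags,
                     recovery_running=False, discovery_running=False):
    """Return the set of activities the client still has to start.

    recovery_running:  transition recovery is already under way for the
                       file system that returned NFS4ERR_MOVED.
    discovery_running: migration discovery is already under way for
                       the server that sent the reply.
    """
    actions = set()
    # NFS4ERR_MOVED calls for transition recovery on the accessed file system.
    if compound_status == NFS4ERR_MOVED and not recovery_running:
        actions.add("transition_recovery")
    # The lease-moved flag calls for migration discovery on the server.
    if (sr_status_flags & SEQ4_STATUS_LEASE_MOVED) and not discovery_running:
        actions.add("migration_discovery")
    return actions
```

With neither indication present the function returns an empty set, matching the last case in the list above.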
The fact that a lease-migrated condition does not result in an error The fact that a lease-migrated condition does not result in an error
in NFSv4.1 has a number of important consequences. In addition to in NFSv4.1 has a number of important consequences. In addition to
the fact, discussed above, that the two indications are not mutually the fact that the two indications are not mutually exclusive, as
exclusive, there are a number of issues that are important in discussed above, there are a number of issues that are important in
considering implementation of migration discovery, as discussed in considering implementation of migration discovery, as discussed in
Section 11.13.2. Section 11.13.2.
Because SEQ4_STATUS_LEASE_MOVED is not an error condition", it is Because SEQ4_STATUS_LEASE_MOVED is not an error condition, it is
possible for file systems whose access paths have not changed to be possible for file systems whose access paths have not changed to be
successfully accessed on a given server even though recovery is successfully accessed on a given server even though recovery is
necessary for other file systems on the same server. As a result, necessary for other file systems on the same server. As a result,
access can go on while, access can take place while:
* The migration discovery process is going on for that server. * The migration discovery process is happening for that server.
* The transition recovery process is going on for other file systems * The transition recovery process is happening for other file
connected to that server. systems connected to that server.
11.13.2. Performing Migration Discovery 11.13.2. Performing Migration Discovery
Migration discovery can be performed in the same context as Migration discovery can be performed in the same context as
transition recovery, allowing recovery for each migrated file system transition recovery, allowing recovery for each migrated file system
to be invoked as it is discovered. Alternatively, it may be done in to be invoked as it is discovered. Alternatively, it may be done in
a separate migration discovery thread, allowing migration discovery a separate migration discovery thread, allowing migration discovery
to be done in parallel with one or more instances of transition to be done in parallel with one or more instances of transition
recovery. recovery.
In either case, because the lease-migrated indication does not result In either case, because the lease-migrated indication does not result
in an error. other access to file systems on the server can proceed in an error, other access to file systems on the server can proceed
normally, with the possibility that further such indications will be normally, with the possibility that further such indications will be
received, raising the issue of how such indications are to be dealt received, raising the issue of how such indications are to be dealt
with. In general, with. In general:
* No action needs to be taken for such indications received by any * No action needs to be taken for such indications received by any
threads performing migration discovery, since continuation of that threads performing migration discovery, since continuation of that
work will address the issue. work will address the issue.
* In other cases in which migration discovery is currently being * In other cases in which migration discovery is currently being
performed, nothing further needs to be done to respond to such performed, nothing further needs to be done to respond to such
lease migration indications, as long as one can be certain that lease migration indications, as long as one can be certain that
the migration discovery process would deal with those indications. the migration discovery process would deal with those indications.
See below for details. See below for details.
skipping to change at line 12702 skipping to change at line 12700
migration events may occur at any time, and because a LEASE_MOVED migration events may occur at any time, and because a LEASE_MOVED
indication may reflect the situation in effect a considerable time indication may reflect the situation in effect a considerable time
before the indication is received, special care needs to be taken to before the indication is received, special care needs to be taken to
ensure that LEASE_MOVED indications are not inappropriately ignored. ensure that LEASE_MOVED indications are not inappropriately ignored.
A useful approach to this issue involves the use of separate A useful approach to this issue involves the use of separate
externally-visible migration discovery states for each server. externally-visible migration discovery states for each server.
Separate values could represent the various possible states for the Separate values could represent the various possible states for the
migration discovery process for a server: migration discovery process for a server:
* non-operation, in which migration discovery is not being performed * Non-operation, in which migration discovery is not being
performed.
* normal operation, in which there is an ongoing scan for migrated * Normal operation, in which there is an ongoing scan for migrated
file systems. file systems.
* completion/verification of migration discovery processing, in * Completion/verification of migration discovery processing, in
which the possible completion of migration discovery processing which the possible completion of migration discovery processing
needs to be verified. needs to be verified.
Given that framework, migration discovery processing would proceed as Given that framework, migration discovery processing would proceed as
follows. follows:
* While in the normal-operation state, the thread performing * While in the normal-operation state, the thread performing
discovery would fetch, for successive file systems known to the discovery would fetch, for successive file systems known to the
client on the server being worked on, a file system location client on the server being worked on, a file system location
attribute plus the fs_status attribute. attribute plus the fs_status attribute.
* If the fs_status attribute indicates that the file system is a * If the fs_status attribute indicates that the file system is a
migrated one (i.e. fss_absent is true and fss_type != migrated one (i.e., fss_absent is true, and fss_type !=
STATUS4_REFERRAL) then a migrated file system has been found. In STATUS4_REFERRAL), then a migrated file system has been found. In
this situation, it is likely that the fetch of the file system this situation, it is likely that the fetch of the file system
location attribute has cleared one the file systems contributing location attribute has cleared one of the file systems
to the lease-migrated indication. contributing to the lease-migrated indication.
* In cases in which that happened, the thread cannot know whether * In cases in which that happened, the thread cannot know whether
the lease-migrated indication has been cleared and so it enters the lease-migrated indication has been cleared, and so it enters
the completion/verification state and proceeds to issue a COMPOUND the completion/verification state and proceeds to issue a COMPOUND
to see if the LEASE_MOVED indication has been cleared. to see if the LEASE_MOVED indication has been cleared.
* When the discovery process is in the completion/verification * When the discovery process is in the completion/verification
state, if other requests get a lease-migrated indication they note state, if other requests get a lease-migrated indication, they
that it was received. Later, the existence of such indications is note that it was received. Later, the existence of such
used when the request completes, as described below. indications is used when the request completes, as described
below.
When the request used in the completion/verification state completes: When the request used in the completion/verification state completes:
* If a lease-migrated indication is returned, the discovery * If a lease-migrated indication is returned, the discovery
continues normally. Note that this is so even if all file systems continues normally. Note that this is so even if all file systems
have traversed, since new migrations could have occurred while the have been traversed, since new migrations could have occurred
process was going on. while the process was going on.
* Otherwise, if there is any record that other requests saw a lease- * Otherwise, if there is any record that other requests saw a lease-
migrated indication while the request was going on, that record is migrated indication while the request was occurring, that record
cleared and the verification request retried. The discovery is cleared, and the verification request is retried. The
process remains in completion/verification state. discovery process remains in the completion/verification state.
* If there have been no lease-migrated indications, the work of * If there have been no lease-migrated indications, the work of
migration discovery is considered completed and it enters the non- migration discovery is considered completed, and it enters the
operating state. Once it enters this state, subsequent lease- non-operating state. Once it enters this state, subsequent lease-
migrated indication will trigger a new migration discovery migrated indications will trigger a new migration discovery
process. process.
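The per-server discovery states and the handling of the verification request described above might be modeled along the following lines. This is a minimal sketch under stated assumptions: the class, method, and return-value names are hypothetical, and the actual scan of file systems and the sending of COMPOUND requests are left to the caller.

```python
from enum import Enum


class DiscoveryState(Enum):
    NON_OPERATION = 0   # migration discovery is not being performed
    NORMAL = 1          # ongoing scan for migrated file systems
    VERIFYING = 2       # completion/verification of discovery


class MigrationDiscovery:
    """Per-server migration discovery state (hypothetical structure)."""

    def __init__(self):
        self.state = DiscoveryState.NON_OPERATION
        self.seen_during_verify = False

    def on_lease_moved(self):
        """A request outside the discovery thread saw the lease-moved flag."""
        if self.state is DiscoveryState.NON_OPERATION:
            self.state = DiscoveryState.NORMAL      # trigger a new discovery
        elif self.state is DiscoveryState.VERIFYING:
            self.seen_during_verify = True          # note it for later
        # In NORMAL state the continuing scan will address the indication.

    def on_migrated_fs_found(self):
        """The scan found a migrated file system; verify completion next."""
        self.state = DiscoveryState.VERIFYING
        self.seen_during_verify = False

    def on_verify_reply(self, lease_moved):
        """Process the reply to the verification COMPOUND.

        Returns "scan" (continue discovery), "retry" (reissue the
        verification request), or "done" (discovery is complete).
        """
        if lease_moved:
            self.state = DiscoveryState.NORMAL
            return "scan"
        if self.seen_during_verify:
            self.seen_during_verify = False         # clear the record
            return "retry"                          # stay in VERIFYING
        self.state = DiscoveryState.NON_OPERATION
        return "done"
```

Keeping the record of indications seen during verification separate from the state itself is one way to ensure that an indication arriving while the verification request is outstanding is not inappropriately ignored.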
It should be noted that the process described above is not guaranteed It should be noted that the process described above is not guaranteed
to terminate, as a long series of new migration events might to terminate, as a long series of new migration events might
continually delay the clearing of the LEASE_MOVED indication. To continually delay the clearing of the LEASE_MOVED indication. To
prevent unnecessary lease expiration, it is appropriate for clients prevent unnecessary lease expiration, it is appropriate for clients
to use the discovery of migrations to effect lease renewal to use the discovery of migrations to effect lease renewal
immediately, rather than waiting for clearing of the LEASE_MOVED immediately, rather than waiting for the clearing of the LEASE_MOVED
indication when the complete set of migrations is available. indication when the complete set of migrations is available.
Lease discovery needs to be provided as described above. This Lease discovery needs to be provided as described above. This
ensures that the client discovers file system migrations soon enough ensures that the client discovers file system migrations soon enough
to renew its leases on each destination server before they expire. to renew its leases on each destination server before they expire.
Non-renewal of leases can lead to loss of locking state. While the Non-renewal of leases can lead to loss of locking state. While the
consequences of such loss can be ameliorated through implementations consequences of such loss can be ameliorated through implementations
of courtesy locks, servers are under no obligation to do so, and a of courtesy locks, servers are under no obligation to do so, and a
conflicting lock request may mean that a lock is revoked conflicting lock request may mean that a lock is revoked
unexpectedly. Clients should be aware of this possibility. unexpectedly. Clients should be aware of this possibility.
skipping to change at line 12799 skipping to change at line 12799
State Migration. State Migration.
During the first phase of this process, the client proceeds to During the first phase of this process, the client proceeds to
examine file system location entries to find the initial network examine file system location entries to find the initial network
address it will use to continue access to the file system or its address it will use to continue access to the file system or its
replacement. For each location entry that the client examines, the replacement. For each location entry that the client examines, the
process consists of five steps: process consists of five steps:
1. Performing an EXCHANGE_ID directed at the location address. This 1. Performing an EXCHANGE_ID directed at the location address. This
operation is used to register the client owner (in the form of a operation is used to register the client owner (in the form of a
client_owner4) with the server, to obtain a client ID to be use client_owner4) with the server, to obtain a client ID to be used
subsequently to communicate with it, to obtain that client ID's subsequently to communicate with it, to obtain that client ID's
confirmation status, and to determine server_owner and scope for confirmation status, and to determine server_owner and scope for
the purpose of determining if the entry is trunkable with that the purpose of determining if the entry is trunkable with the
previously being used to access the file system (i.e. that it address previously being used to access the file system (i.e.,
represents another network access path to the same file system that it represents another network access path to the same file
and can share locking state with it). system and can share locking state with it).
2. Making an initial determination of whether migration has 2. Making an initial determination of whether migration has
occurred. The initial determination will be based on whether the occurred. The initial determination will be based on whether the
EXCHANGE_ID results indicate that the current location element is EXCHANGE_ID results indicate that the current location element is
server-trunkable with that used to access the file system when server-trunkable with that used to access the file system when
access was terminated by receiving NFS4ERR_MOVED. If it is, then access was terminated by receiving NFS4ERR_MOVED. If it is, then
migration has not occurred. In that case, the transition is migration has not occurred. In that case, the transition is
dealt with, at least initially, as one involving continued access dealt with, at least initially, as one involving continued access
to the same file system on the same server through a new network to the same file system on the same server through a new network
address. address.
3. Obtaining access to existing session state or creating new 3. Obtaining access to existing session state or creating new
sessions. How this is done depends on the initial determination sessions. How this is done depends on the initial determination
of whether migration has occurred and can be done as described in of whether migration has occurred and can be done as described in
Section 11.13.4 below in the case of migration or as described in Section 11.13.4 below in the case of migration or as described in
Section 11.13.5 below in the case of a network address transfer Section 11.13.5 below in the case of a network address transfer
without migration. without migration.
4. Verification of the trunking relationship assumed in step 2 as 4. Verifying the trunking relationship assumed in step 2 as
discussed in Section 2.10.5.1. Although this step will generally discussed in Section 2.10.5.1. Although this step will generally
confirm the initial determination, it is possible for confirm the initial determination, it is possible for
verification to fail with the result that an initial verification to fail with the result that an initial
determination that a network address shift (without migration) determination that a network address shift (without migration)
has occurred may be invalidated and migration determined to have has occurred may be invalidated and migration determined to have
occurred. There is no need to redo step 3 above, since it will occurred. There is no need to redo step 3 above, since it will
be possible to continue use of the session established already. be possible to continue use of the session established already.
5. Obtaining access to existing locking state and/or reobtaining it. 5. Obtaining access to existing locking state and/or re-obtaining
How this is done depends on the final determination of whether it. How this is done depends on the final determination of
migration has occurred and can be done as described below in whether migration has occurred and can be done as described below
Section 11.13.4 in the case of migration or as described in in Section 11.13.4 in the case of migration or as described in
Section 11.13.5 in the case of a network address transfer without Section 11.13.5 in the case of a network address transfer without
migration. migration.
Once the initial address has been determined, clients are free to Once the initial address has been determined, clients are free to
apply an abbreviated process to find additional addresses trunkable apply an abbreviated process to find additional addresses trunkable
with it (clients may seek session-trunkable or server-trunkable with it (clients may seek session-trunkable or server-trunkable
addresses depending on whether they support clientid trunking). addresses depending on whether they support client ID trunking).
During this later phase of the process, further location entries are During this later phase of the process, further location entries are
examined using the abbreviated procedure specified below: examined using the abbreviated procedure specified below:
A: Before the EXCHANGE_ID, the fs name of the location entry is A: Before the EXCHANGE_ID, the fs name of the location entry is
examined and if it does not match that currently being used, the examined, and if it does not match that currently being used, the
entry is ignored. otherwise, one proceeds as specified by step 1 entry is ignored. Otherwise, one proceeds as specified by step 1
above. above.
B: In the case that the network address is session-trunkable with B: In the case that the network address is session-trunkable with
one used previously a BIND_CONN_TO_SESSION is used to access that one used previously, a BIND_CONN_TO_SESSION is used to access
session using the new network address. Otherwise, or if the bind that session using the new network address. Otherwise, or if the
operation fails, a CREATE_SESSION is done. bind operation fails, a CREATE_SESSION is done.
C: The verification procedure referred to in step 4 above is used. C: The verification procedure referred to in step 4 above is used.
However, if it fails, the entry is ignored and the next available However, if it fails, the entry is ignored and the next available
entry is used. entry is used.
11.13.4. Obtaining Access to Sessions and State after Migration 11.13.4. Obtaining Access to Sessions and State after Migration
In the event that migration has occurred, migration recovery will In the event that migration has occurred, migration recovery will
involve determining whether Transparent State Migration has occurred. involve determining whether Transparent State Migration has occurred.
This decision is made based on the client ID returned by the This decision is made based on the client ID returned by the
EXCHANGE_ID and the reported confirmation status. EXCHANGE_ID and the reported confirmation status.
* If the client ID is an unconfirmed client ID not previously known * If the client ID is an unconfirmed client ID not previously known
to the client, then Transparent State Migration has not occurred. to the client, then Transparent State Migration has not occurred.
* If the client ID is a confirmed client ID previously known to the * If the client ID is a confirmed client ID previously known to the
client, then any transferred state would have been merged with an client, then any transferred state would have been merged with an
existing client ID representing the client to the destination existing client ID representing the client to the destination
server. In this state merger case, Transparent State Migration server. In this state merger case, Transparent State Migration
might or might not have occurred and a determination as to whether might or might not have occurred, and a determination as to
it has occurred is deferred until sessions are established and the whether it has occurred is deferred until sessions are established
client is ready to begin state recovery. and the client is ready to begin state recovery.
* If the client ID is a confirmed client ID not previously known to * If the client ID is a confirmed client ID not previously known to
the client, then the client can conclude that the client ID was the client, then the client can conclude that the client ID was
transferred as part of Transparent State Migration. In this transferred as part of Transparent State Migration. In this
transferred client ID case, Transparent State Migration has transferred client ID case, Transparent State Migration has
occurred although some state might have been lost. occurred, although some state might have been lost.
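The three cases above depend on two facts about the client ID returned by EXCHANGE_ID, which can be illustrated as a small Python function. The function name and the returned labels are hypothetical. The remaining combination (unconfirmed but previously known) is not addressed by the list above, so this sketch returns None for it rather than guess.

```python
def classify_client_id(confirmed, previously_known):
    """Classify the EXCHANGE_ID result for migration recovery.

    confirmed:        the server reported the client ID as confirmed.
    previously_known: the client already knew this client ID.
    """
    if not confirmed and not previously_known:
        # Transparent State Migration has not occurred.
        return "no_transparent_state_migration"
    if confirmed and previously_known:
        # Decision deferred until sessions exist and recovery begins.
        return "state_merger"
    if confirmed and not previously_known:
        # Transparent State Migration occurred; some state may be lost.
        return "transferred_client_id"
    return None  # combination not covered by the cases listed above
```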
Once the client ID has been obtained, it is necessary to obtain Once the client ID has been obtained, it is necessary to obtain
access to sessions to continue communication with the new server. In access to sessions to continue communication with the new server. In
any of the cases in which Transparent State Migration has occurred, any of the cases in which Transparent State Migration has occurred,
it is possible that a session was transferred as well. To deal with it is possible that a session was transferred as well. To deal with
that possibility, clients can, after doing the EXCHANGE_ID, issue a that possibility, clients can, after doing the EXCHANGE_ID, issue a
BIND_CONN_TO_SESSION to connect the transferred session to a BIND_CONN_TO_SESSION to connect the transferred session to a
connection to the new server. If that fails, it is an indication connection to the new server. If that fails, it is an indication
that the session was not transferred and that a new session needs to that the session was not transferred and that a new session needs to
be created to take its place. be created to take its place.
In some situations, it is possible for a BIND_CONN_TO_SESSION to In some situations, it is possible for a BIND_CONN_TO_SESSION to
succeed without session migration having occurred. If state merger succeed without session migration having occurred. If state merger
has taken place then the associated client ID may have already had a has taken place, then the associated client ID may have already had a
set of existing sessions, with it being possible that the sessionid set of existing sessions, with it being possible that the sessionid
of a given session is the same as one that might have been migrated. of a given session is the same as one that might have been migrated.
In that event, a BIND_CONN_TO_SESSION might succeed, even though In that event, a BIND_CONN_TO_SESSION might succeed, even though
there could have been no migration of the session with that there could have been no migration of the session with that
sessionid. In such cases, the client will receive sequence errors sessionid. In such cases, the client will receive sequence errors
when the slot sequence values used are not appropriate on the new when the slot sequence values used are not appropriate on the new
session. When this occurs, the client can create a new session and session. When this occurs, the client can create a new session and
cease using the existing one. cease using the existing one.
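The bind-then-create pattern described above can be sketched as follows. This is illustrative only: the exception class and the two callables stand in for whatever a client implementation uses to issue BIND_CONN_TO_SESSION and CREATE_SESSION, and modeling a failed bind as a raised exception is an assumption of this sketch.

```python
class NfsOperationError(Exception):
    """Hypothetical error raised when an NFSv4.1 operation fails."""


def obtain_session(bind_conn_to_session, create_session, old_session_id):
    """Reuse a possibly transferred session, else create a new one.

    bind_conn_to_session(session_id) is expected to raise
    NfsOperationError on failure; create_session() returns a new
    session ID. Returns (session_id, created_new).
    """
    try:
        bind_conn_to_session(old_session_id)
        return old_session_id, False
    except NfsOperationError:
        # The session was not transferred; a new one takes its place.
        return create_session(), True
```

A successful bind does not prove that the session was migrated. A caller that later receives sequence errors on the bound session would call create_session itself and cease using the old session ID.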
Once the client has determined the initial migration status, and Once the client has determined the initial migration status, and
skipping to change at line 12927 skipping to change at line 12927
Clients need to deal with the following cases: Clients need to deal with the following cases:
* In the state merger case, it is possible that the server has not * In the state merger case, it is possible that the server has not
attempted Transparent State Migration, in which case state may attempted Transparent State Migration, in which case state may
have been lost without it being reflected in the SEQ4_STATUS bits. have been lost without it being reflected in the SEQ4_STATUS bits.
To determine whether this has happened, the client can use To determine whether this has happened, the client can use
TEST_STATEID to check whether the stateids created on the source TEST_STATEID to check whether the stateids created on the source
server are still accessible on the destination server. Once a server are still accessible on the destination server. Once a
single stateid is found to have been successfully transferred, the single stateid is found to have been successfully transferred, the
client can conclude that Transparent State Migration was begun and client can conclude that Transparent State Migration was begun,
any failure to transport all of the stateids will be reflected in and any failure to transport all of the stateids will be reflected
the SEQ4_STATUS bits. Otherwise, Transparent State Migration has in the SEQ4_STATUS bits. Otherwise, Transparent State Migration
not occurred. has not occurred.
* In a case in which Transparent State Migration has not occurred, * In a case in which Transparent State Migration has not occurred,
the client can use the per-fs grace period provided by the the client can use the per-fs grace period provided by the
destination server to reclaim locks that were held on the source destination server to reclaim locks that were held on the source
server. server.
* In a case in which Transparent State Migration has occurred, and * In a case in which Transparent State Migration has occurred, and
no lock state was lost (as shown by SEQ4_STATUS flags), no lock no lock state was lost (as shown by SEQ4_STATUS flags), no lock
reclaim is necessary. reclaim is necessary.
* In a case in which Transparent State Migration has occurred, and * In a case in which Transparent State Migration has occurred, and
some lock state was lost (as shown by SEQ4_STATUS flags), existing some lock state was lost (as shown by SEQ4_STATUS flags), existing
stateids need to be checked for validity using TEST_STATEID, and stateids need to be checked for validity using TEST_STATEID, and
reclaim used to re-establish any that were not transferred. reclaim used to re-establish any that were not transferred.
For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value
of TRUE needs to be done before normal use of the file system of TRUE needs to be done before normal use of the file system,
including obtaining new locks for the file system. This applies even including obtaining new locks for the file system. This applies even
if no locks were lost and there was no need for any to be reclaimed. if no locks were lost and there was no need for any to be reclaimed.
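As an illustration of the cases above, the following Python sketch derives the ordered lock-recovery steps from two inputs. The function names and step labels are hypothetical. In the state merger case, the first helper reflects the rule that a single stateid that tests as valid is enough to conclude that Transparent State Migration was begun.

```python
def migration_was_begun(test_stateid_results):
    """State merger case: True if any source-server stateid is still
    accessible on the destination (results are booleans, one per
    stateid checked with TEST_STATEID)."""
    return any(test_stateid_results)


def lock_recovery_steps(tsm_occurred, state_lost):
    """Return the ordered recovery steps for one migrated file system.

    tsm_occurred: Transparent State Migration has occurred.
    state_lost:   the SEQ4_STATUS flags show lost lock state.
    """
    steps = []
    if not tsm_occurred:
        # Reclaim locks held on the source within the per-fs grace period.
        steps.append("reclaim_all_locks")
    elif state_lost:
        # Check existing stateids, then reclaim those not transferred.
        steps.append("test_stateids")
        steps.append("reclaim_missing_locks")
    # Required in every case, even when nothing was reclaimed.
    steps.append("reclaim_complete_one_fs")
    return steps
```

The final step corresponds to RECLAIM_COMPLETE with rca_one_fs set to TRUE and is emitted unconditionally, since it is needed before normal use of the file system even when no locks were lost.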
11.13.5. Obtaining Access to Sessions and State after Network Address 11.13.5. Obtaining Access to Sessions and State after Network Address
Transfer Transfer
The case in which there is a transfer to a new network address The case in which there is a transfer to a new network address
without migration is similar to that described in Section 11.13.4 without migration is similar to that described in Section 11.13.4
above in that there is a need to obtain access to needed sessions and above in that there is a need to obtain access to needed sessions and
locking state. However, the details are simpler and will vary locking state. However, the details are simpler and will vary
depending on the type of trunking between the address receiving depending on the type of trunking between the address receiving
NFS4ERR_MOVED and that to which the transfer is to be made NFS4ERR_MOVED and that to which the transfer is to be made.
To make a session available for use, a BIND_CONN_TO_SESSION should be To make a session available for use, a BIND_CONN_TO_SESSION should be
used to obtain access to the session previously in use. Only if this used to obtain access to the session previously in use. Only if this
fails, should a CREATE_SESSION be done. While this procedure mirrors fails, should a CREATE_SESSION be done. While this procedure mirrors
that in Section 11.13.4 above, there is an important difference in that in Section 11.13.4 above, there is an important difference in
that preservation of the session is not purely optional but depends that preservation of the session is not purely optional but depends
on the type of trunking. on the type of trunking.
Access to appropriate locking state will generally need no actions Access to appropriate locking state will generally need no actions
beyond access to the session. However, the SEQ4_STATUS bits need to beyond access to the session. However, the SEQ4_STATUS bits need to
be checked for lost locking state, including the need to reclaim be checked for lost locking state, including the need to reclaim
locks after a server reboot, since there is always a possibility of locks after a server reboot, since there is always a possibility of
locking state being lost. locking state being lost.
11.14. Server Responsibilities Upon Migration 11.14. Server Responsibilities Upon Migration
In the event of file system migration, when the client connects to In the event of file system migration, when the client connects to
the destination server, that server needs to be able to provide the the destination server, that server needs to be able to provide the
client continued to access the files it had open on the source client continued access to the files it had open on the source
server. There are two ways to provide this: server. There are two ways to provide this:
* By provision of an fs-specific grace period, allowing the client * By provision of an fs-specific grace period, allowing the client
the ability to reclaim its locks, in a fashion similar to what the ability to reclaim its locks, in a fashion similar to what
would have been done in the case of recovery from a server would have been done in the case of recovery from a server
restart. See Section 11.14.1 for a more complete discussion. restart. See Section 11.14.1 for a more complete discussion.
* By implementing Transparent State Migration possibly in connection * By implementing Transparent State Migration possibly in connection
with session migration, the server can provide the client with session migration, the server can provide the client
immediate access to the state built up on the source server, on immediate access to the state built up on the source server on the
the destination. destination server.
These features are discussed separately in Sections 11.14.2 and These features are discussed separately in Sections 11.14.2 and
11.14.3, which discuss Transparent State Migration and session 11.14.3, which discuss Transparent State Migration and session
migration respectively. migration, respectively.
All the features described above can involve transfer of lock-related All the features described above can involve transfer of lock-related
information between source and destination servers. In some cases, information between source and destination servers. In some cases,
this transfer is a necessary part of the implementation while in this transfer is a necessary part of the implementation, while in
other cases it is a helpful implementation aid which servers might or other cases, it is a helpful implementation aid, which servers might
might not use. The sub-sections below discuss the information which or might not use. The subsections below discuss the information that
would be transferred but do not define the specifics of the transfer would be transferred but do not define the specifics of the transfer
protocol. This is left as an implementation choice although protocol. This is left as an implementation choice, although
standards in this area could be developed at a later time. standards in this area could be developed at a later time.
11.14.1. Server Responsibilities in Effecting State Reclaim after 11.14.1. Server Responsibilities in Effecting State Reclaim after
Migration Migration
In this case, the destination server needs no knowledge of the locks In this case, the destination server needs no knowledge of the locks
held on the source server. It relies on the clients to accurately held on the source server. It relies on the clients to accurately
report (via reclaim operations) the locks previously held, and does report (via reclaim operations) the locks previously held, and does
not allow new locks to be granted on migrated file systems until the not allow new locks to be granted on migrated file systems until the
grace period expires. Disallowing of new locks applies to all grace period expires. Disallowing of new locks applies to all
clients accessing these file system, while grace period expiration clients accessing these file systems, while grace period expiration
occurs for each migrated client independently. occurs for each migrated client independently.
During this grace period clients have the opportunity to use reclaim During this grace period, clients have the opportunity to use reclaim
operations to obtain locks for file system objects within the operations to obtain locks for file system objects within the
migrated file system, in the same way that they do when recovering migrated file system, in the same way that they do when recovering
from server restart, and the servers typically rely on clients to from server restart, and the servers typically rely on clients to
accurately report their locks, although they have the option of accurately report their locks, although they have the option of
subjecting these requests to verification. If the clients only subjecting these requests to verification. If the clients only
reclaim locks held on the source server, no conflict can arise. Once reclaim locks held on the source server, no conflict can arise. Once
the client has reclaimed its locks, it indicates the completion of the client has reclaimed its locks, it indicates the completion of
lock reclamation by performing a RECLAIM_COMPLETE specifying lock reclamation by performing a RECLAIM_COMPLETE specifying
rca_one_fs as TRUE. rca_one_fs as TRUE.
While it is not necessary for source and destination servers to co- While it is not necessary for source and destination servers to
operate to transfer information about locks, implementations are cooperate to transfer information about locks, implementations are
well-advised to consider transferring the following useful well advised to consider transferring the following useful
information: information:
* If information about the set of clients that have locking state * If information about the set of clients that have locking state
for the transferred file system is made available, the destination for the transferred file system is made available, the destination
server will be able to terminate the grace period once all such server will be able to terminate the grace period once all such
clients have reclaimed their locks, allowing normal locking clients have reclaimed their locks, allowing normal locking
activity to resume earlier than it would have otherwise. activity to resume earlier than it would have otherwise.
* Locking summary information for individual clients (at various * Locking summary information for individual clients (at various
possible levels of detail) can detect some instances in which possible levels of detail) can detect some instances in which
clients do not accurately represent the locks held on the source clients do not accurately represent the locks held on the source
server. server.
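As a rough illustration of the first item above, the following Python
sketch shows one way a destination server might end the fs-specific
grace period early when the source server has supplied the set of
clients holding locking state. The class name, the method names, and
the 90-second default are hypothetical and are not defined by NFSv4.1.
How the client set is transferred between servers is likewise left to
the implementation. For brevity, the sketch tracks a single deadline
per file system rather than per migrated client.

```python
import time


class FsGracePeriod:
    """Hypothetical per-file-system grace tracker on a destination server."""

    def __init__(self, expected_clients=None, grace_seconds=90.0,
                 now=time.monotonic):
        # expected_clients: client IDs the source reported as holding locks
        # on the migrated file system, or None if no such list was provided.
        self._pending = (None if expected_clients is None
                         else set(expected_clients))
        self._now = now
        self._deadline = now() + grace_seconds

    def reclaim_complete(self, client_id):
        """Record a RECLAIM_COMPLETE (rca_one_fs TRUE) from one client."""
        if self._pending is not None:
            self._pending.discard(client_id)

    def in_grace(self):
        """Return True while new (non-reclaim) locks must still be refused."""
        if self._pending is not None and not self._pending:
            # Every client known to hold locks has finished reclaiming.
            return False
        # Without a client list, only the timer can end the grace period.
        return self._now() < self._deadline
```

When no client list is available (expected_clients is None), the
sketch falls back to waiting out the full grace period, which matches
the case in which the servers do not cooperate.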
11.14.2. Server Responsibilities in Effecting Transparent State 11.14.2. Server Responsibilities in Effecting Transparent State
Migration Migration
The basic responsibility of the source server in effecting The basic responsibility of the source server in effecting
Transparent State Migration is to make available to the destination Transparent State Migration is to make available to the destination
server a description of each piece of locking state associated with server a description of each piece of locking state associated with
the file system being migrated. In addition to client id string and the file system being migrated. In addition to client id string and
verifier, the source server needs to provide, for each stateid: verifier, the source server needs to provide for each stateid:
* The stateid including the current sequence value. * The stateid including the current sequence value.
* The associated client ID. * The associated client ID.
* The handle of the associated file. * The handle of the associated file.
* The type of the lock, such as open, byte-range lock, delegation, * The type of the lock, such as open, byte-range lock, delegation,
or layout. or layout.
skipping to change at line 13091 skipping to change at line 13091
that locks revoked soon before or soon after migration are not that locks revoked soon before or soon after migration are not
inadvertently allowed to be reclaimed in situations in which the inadvertently allowed to be reclaimed in situations in which the
continuity of lock possession cannot be assured. continuity of lock possession cannot be assured.
* For locks lost on the source but whose loss has not yet been * For locks lost on the source but whose loss has not yet been
acknowledged by the client (by using FREE_STATEID), the acknowledged by the client (by using FREE_STATEID), the
destination must be aware of this loss so that it can deny a destination must be aware of this loss so that it can deny a
request to reclaim them. request to reclaim them.
* For locks lost on the destination after the state transfer but * For locks lost on the destination after the state transfer but
before the client's RECLAIM_COMPLTE is done, the destination before the client's RECLAIM_COMPLETE is done, the destination
server should note these and not allow them to be reclaimed. server should note these and not allow them to be reclaimed.
An additional responsibility of the cooperating servers concerns An additional responsibility of the cooperating servers concerns
situations in which a stateid cannot be transferred transparently situations in which a stateid cannot be transferred transparently
because it conflicts with an existing stateid held by the client and because it conflicts with an existing stateid held by the client and
associated with a different file system. In this case there are two associated with a different file system. In this case, there are two
valid choices: valid choices:
* Treat the transfer, as in NFSv4.0, as one without Transparent * Treat the transfer, as in NFSv4.0, as one without Transparent
State Migration. In this case, conflicting locks cannot be State Migration. In this case, conflicting locks cannot be
granted until the client does a RECLAIM_COMPLETE, after reclaiming granted until the client does a RECLAIM_COMPLETE, after reclaiming
the locks it had, with the exception of reclaims denied because the locks it had, with the exception of reclaims denied because
they were attempts to reclaim locks that had been lost. they were attempts to reclaim locks that had been lost.
* Implement Transparent State Migration, except for the lock with * Implement Transparent State Migration, except for the lock with
the conflicting stateid. In this case, the client will be aware the conflicting stateid. In this case, the client will be aware
of a lost lock (through the SEQ4_STATUS flags) and be allowed to of a lost lock (through the SEQ4_STATUS flags) and be allowed to
reclaim it. reclaim it.
When transferring state between the source and destination, the When transferring state between the source and destination, the
issues discussed in Section 7.2 of [68] must still be attended to. issues discussed in Section 7.2 of [68] must still be attended to.
In this case, the use of NFS4ERR_DELAY may still necessary in In this case, the use of NFS4ERR_DELAY may still be necessary in
NFSv4.1, as it was in NFSv4.0, to prevent locking state changing NFSv4.1, as it was in NFSv4.0, to prevent locking state changing
while it is being transferred. See Section 15.1.1.3 for information while it is being transferred. See Section 15.1.1.3 for information
about appropriate client retry approaches in the event that about appropriate client retry approaches in the event that
NFS4ERR_DELAY is returned. NFS4ERR_DELAY is returned.
There are a number of important differences in the NFSv4.1 context: There are a number of important differences in the NFSv4.1 context:
* The absence of RELEASE_LOCKOWNER means that the one case in which * The absence of RELEASE_LOCKOWNER means that the one case in which
an operation could not be deferred by use of NFS4ERR_DELAY no an operation could not be deferred by use of NFS4ERR_DELAY no
longer exists. longer exists.
* Sequencing of operations is no longer done using owner-based * Sequencing of operations is no longer done using owner-based
operation sequence numbers. Instead, sequencing is session- operation sequence numbers. Instead, sequencing is session-
based based.
As a result, when sessions are not transferred, the techniques As a result, when sessions are not transferred, the techniques
discussed in Section 7.2 of [68] are adequate and will not be further discussed in Section 7.2 of [68] are adequate and will not be further
discussed. discussed.
11.14.3. Server Responsibilities in Effecting Session Transfer 11.14.3. Server Responsibilities in Effecting Session Transfer
The basic responsibility of the source server in effecting session The basic responsibility of the source server in effecting session
transfer is to make available to the destination server a description transfer is to make available to the destination server a description
of the current state of each slot with the session, including: of the current state of each slot with the session, including the
following:
* The last sequence value received for that slot. * The last sequence value received for that slot.
* Whether there is cached reply data for the last request executed * Whether there is cached reply data for the last request executed
and, if so, the cached reply. and, if so, the cached reply.
When sessions are transferred, there are a number of issues that pose When sessions are transferred, there are a number of issues that pose
challenges in terms of making the transferred state unmodifiable challenges in terms of making the transferred state unmodifiable
during the period it is gathered up and transferred to the during the period it is gathered up and transferred to the
destination server. destination server:
* A single session may be used to access multiple file systems, not * A single session may be used to access multiple file systems, not
all of which are being transferred. all of which are being transferred.
* Requests made on a session may, even if rejected, affect the state * Requests made on a session may, even if rejected, affect the state
of the session by advancing the sequence number associated with of the session by advancing the sequence number associated with
the slot used. the slot used.
As a result, when the file system state might otherwise be considered As a result, when the file system state might otherwise be considered
unmodifiable, the client might have any number of in-flight requests, unmodifiable, the client might have any number of in-flight requests,
each of which is capable of changing session state, which may be of a each of which is capable of changing session state, which may be of a
number of types: number of types:
1. Those requests that were processed on the migrating file system, 1. Those requests that were processed on the migrating file system
before migration began. before migration began.
2. Those requests which got the error NFS4ERR_DELAY because the file 2. Those requests that received the error NFS4ERR_DELAY because the
system being accessed was in the process of being migrated. file system being accessed was in the process of being migrated.
3. Those requests which got the error NFS4ERR_MOVED because the file 3. Those requests that received the error NFS4ERR_MOVED because the
system being accessed had been migrated. file system being accessed had been migrated.
4. Those requests that accessed the migrating file system, in order 4. Those requests that accessed the migrating file system in order
to obtain location or status information. to obtain location or status information.
5. Those requests that did not reference the migrating file system. 5. Those requests that did not reference the migrating file system.
It should be noted that the history of any particular slot is likely It should be noted that the history of any particular slot is likely
to include a number of these request classes. In the case in which a to include a number of these request classes. In the case in which a
session which is migrated is used by file systems other than the one session that is migrated is used by file systems other than the one
migrated, requests of class 5 may be common and be the last request migrated, requests of class 5 may be common and may be the last
processed, for many slots. request processed for many slots.
Since session state can change even after the locking state has been Since session state can change even after the locking state has been
fixed as part of the migration process, the session state known to fixed as part of the migration process, the session state known to
the client could be different from that on the destination server, the client could be different from that on the destination server,
which necessarily reflects the session state on the source server, at which necessarily reflects the session state on the source server at
an earlier time. In deciding how to deal with this situation, it is an earlier time. In deciding how to deal with this situation, it is
helpful to distinguish between two sorts of behavioral consequences helpful to distinguish between two sorts of behavioral consequences
of the choice of initial sequence ID values. of the choice of initial sequence ID values:
* The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID * The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID
in a request is neither equal to the last one seen for the current in a request is neither equal to the last one seen for the current
slot nor the next greater one. slot nor the next greater one.
In view of the difficulty of arriving at a mutually acceptable In view of the difficulty of arriving at a mutually acceptable
value for the correct last sequence value at the point of value for the correct last sequence value at the point of
migration, it may be necessary for the server to show some degree migration, it may be necessary for the server to show some degree
of forbearance, when the sequence ID is one that would be of forbearance when the sequence ID is one that would be
considered unacceptable if session migration were not involved. considered unacceptable if session migration were not involved.
* Returning the cached reply for a previously executed request when * Returning the cached reply for a previously executed request when
the sequence ID in the request matches the last value recorded for the sequence ID in the request matches the last value recorded for
the slot. the slot.
In the cases in which an error is returned and there is no In the cases in which an error is returned and there is no
possibility of any non-idempotent operation having been executed, possibility of any non-idempotent operation having been executed,
it may not be necessary to adhere to this as strictly as might be it may not be necessary to adhere to this as strictly as might be
proper if session migration were not involved. For example, the proper if session migration were not involved. For example, the
fact that the error NFS4ERR_DELAY was returned may not assist the fact that the error NFS4ERR_DELAY was returned may not assist the
client in any material way, while the fact that NFS4ERR_MOVED was client in any material way, while the fact that NFS4ERR_MOVED was
returned by the source server may not be relevant when the request returned by the source server may not be relevant when the request
was reissued, directed to the destination server. was reissued and directed to the destination server.
An important issue is that the specification needs to take note of An important issue is that the specification needs to take note of
all potential COMPOUNDs, even if they might be unlikely in practice. all potential COMPOUNDs, even if they might be unlikely in practice.
For example, a COMPOUND is allowed to access multiple file systems For example, a COMPOUND is allowed to access multiple file systems
and might perform non-idempotent operations in some of them before and might perform non-idempotent operations in some of them before
accessing a file system being migrated. Also, a COMPOUND may return accessing a file system being migrated. Also, a COMPOUND may return
considerable data in the response, before being rejected with considerable data in the response before being rejected with
NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as
sa_cachethis. However, note that if the client and server adhere to sa_cachethis. However, note that if the client and server adhere to
rules in Section 15.1.1.3, there is no possibility of non-idempotent rules in Section 15.1.1.3, there is no possibility of non-idempotent
operations being spuriously reissued after receiving an NFS4ERR_DELAY operations being spuriously reissued after receiving an NFS4ERR_DELAY
response. response.
To address these issues, a destination server MAY do any of the To address these issues, a destination server MAY do any of the
following when implementing session transfer. following when implementing session transfer:
* Avoid enforcing any sequencing semantics for a particular slot * Avoid enforcing any sequencing semantics for a particular slot
until the client has established the starting sequence for that until the client has established the starting sequence for that
slot on the destination server. slot on the destination server.
* For each slot, avoid returning a cached reply returning * For each slot, avoid returning a cached reply returning
NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established
the starting sequence for that slot on the destination server. the starting sequence for that slot on the destination server.
* Until the client has established the starting sequence for a * Until the client has established the starting sequence for a
particular slot on the destination server, avoid reporting particular slot on the destination server, avoid reporting
NFS4ERR_SEQ_MISORDERED or returning a cached reply returning NFS4ERR_SEQ_MISORDERED or returning a cached reply returning
NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of
a series of operations where the response is NFS4_OK until the a series of operations where the response is NFS4_OK until the
final error. final error.
Because of the considerations mentioned above including the rules for Because of the considerations mentioned above, including the rules
the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the for the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the
destination server can respond appropriately to SEQUENCE operations destination server can respond appropriately to SEQUENCE operations
received from the client by adopting the three policies listed below: received from the client by adopting the three policies listed below:
* Not responding with NFS4ERR_SEQ_MISORDERED for the initial request * Not responding with NFS4ERR_SEQ_MISORDERED for the initial request
on a slot within a transferred session, since the destination on a slot within a transferred session because the destination
server cannot be aware of requests made by the client after the server cannot be aware of requests made by the client after the
server handoff but before the client became aware of the shift. server handoff but before the client became aware of the shift.
In cases in which NFS4ERR_SEQ_MISORDERED would normally have been In cases in which NFS4ERR_SEQ_MISORDERED would normally have been
reported, the request is to be processed normally, as a new reported, the request is to be processed normally as a new
request. request.
* Replying as it would for a retry whenever the sequence matches * Replying as it would for a retry whenever the sequence matches
that transferred by the source server, even though this would not that transferred by the source server, even though this would not
provide retry handling for requests issued after the server provide retry handling for requests issued after the server
handoff, under the assumption that when such requests are issued handoff, under the assumption that, when such requests are issued,
they will never be responded to in a state-changing fashion, they will never be responded to in a state-changing fashion,
making retry support for them unnecessary. making retry support for them unnecessary.
* Once a non-retry SEQUENCE is received for a given slot, using that * Once a non-retry SEQUENCE is received for a given slot, using that
as the basis for further sequence checking, with no further as the basis for further sequence checking, with no further
reference to the sequence value transferred by the source. reference to the sequence value transferred by the source server.
server.
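The three policies above can be sketched as a per-slot sequence check
on the destination server. This is a minimal sketch under stated
assumptions, not a definitive implementation: the TransferredSlot
type, its field names, and the string results are hypothetical, and a
real server would also have to handle the case of a retry for which no
reply was cached. The sequence ID is treated as a 32-bit value that
wraps around.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransferredSlot:
    """Hypothetical destination-side view of one slot of a moved session."""
    seqid: int                     # last sequence value sent by the source
    cached_reply: Optional[bytes]  # cached reply sent by the source, if any
    established: bool = False      # True once the client set the sequence here


def check_sequence(slot: TransferredSlot, seqid: int) -> str:
    """Classify an incoming SEQUENCE as 'new', 'replay', or 'misordered'."""
    if seqid == slot.seqid:
        # Second policy: a match with the recorded value (initially the one
        # transferred by the source) is treated as a retry.
        return "replay"
    if not slot.established:
        # First policy: never report a misordered sequence for the initial
        # request on a transferred slot; process it as a new request.
        # Third policy: its value becomes the basis for later checking.
        slot.seqid = seqid
        slot.cached_reply = None
        slot.established = True
        return "new"
    if seqid == (slot.seqid + 1) & 0xFFFFFFFF:
        # Normal session rule once the starting sequence is established.
        slot.seqid = seqid
        slot.cached_reply = None
        return "new"
    return "misordered"
```

For example, a slot transferred with sequence value 7 accepts a first
request carrying 12 as new, after which 13 is the only acceptable next
value and 12 is treated as a retry.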
11.15. Effecting File System Referrals 11.15. Effecting File System Referrals
Referrals are effected when an absent file system is encountered and Referrals are effected when an absent file system is encountered and
one or more alternate locations are made available by the one or more alternate locations are made available by the
fs_locations or fs_locations_info attributes. The client will fs_locations or fs_locations_info attributes. The client will
typically get an NFS4ERR_MOVED error, fetch the appropriate location typically get an NFS4ERR_MOVED error, fetch the appropriate location
information, and proceed to access the file system on a different information, and proceed to access the file system on a different
server, even though it retains its logical position within the server, even though it retains its logical position within the
original namespace. Referrals differ from migration events in that original namespace. Referrals differ from migration events in that
they happen only when the client has not previously referenced the they happen only when the client has not previously referenced the
file system in question (so there is nothing to transition). file system in question (so there is nothing to transition).
Referrals can only come into effect when an absent file system is Referrals can only come into effect when an absent file system is
encountered at its root. encountered at its root.
The examples given in the sections below are somewhat artificial in The examples given in the sections below are somewhat artificial in
that an actual client will not typically do a multi-component that an actual client will not typically do a multi-component
lookup, but will have cached information regarding the upper levels of lookup, but will have cached information regarding the upper levels of
the name hierarchy. However, these examples are chosen to make the the name hierarchy. However, these examples are chosen to make the
required behavior clear and easy to put within the scope of a small required behavior clear and easy to put within the scope of a small
number of requests, without getting a discussion of the details of number of requests, without getting into a discussion of the details
how specific clients might choose to cache things. of how specific clients might choose to cache things.
11.15.1. Referral Example (LOOKUP) 11.15.1. Referral Example (LOOKUP)
Let us suppose that the following COMPOUND is sent in an environment Let us suppose that the following COMPOUND is sent in an environment
in which /this/is/the/path is absent from the target server. This in which /this/is/the/path is absent from the target server. This
may be for a number of reasons. It may be that the file system has may be for a number of reasons. It may be that the file system has
moved, or it may be that the target server is functioning mainly, or moved, or it may be that the target server is functioning mainly, or
solely, to refer clients to the servers on which various file systems solely, to refer clients to the servers on which various file systems
are located. are located.
skipping to change at line 13731 skipping to change at line 13731
different write-verifier class from the source. different write-verifier class from the source.
The specific choices reflect typical implementation patterns for The specific choices reflect typical implementation patterns for
failover and controlled migration, respectively. Since other choices failover and controlled migration, respectively. Since other choices
are possible and useful, this information is better obtained by using are possible and useful, this information is better obtained by using
fs_locations_info. When a server implementation needs to communicate fs_locations_info. When a server implementation needs to communicate
other choices, it MUST support the fs_locations_info attribute. other choices, it MUST support the fs_locations_info attribute.
See Section 21 for a discussion on the recommendations for the See Section 21 for a discussion on the recommendations for the
security flavor to be used by any GETATTR operation that requests the security flavor to be used by any GETATTR operation that requests the
"fs_locations" attribute. fs_locations attribute.
11.17. The Attribute fs_locations_info 11.17. The Attribute fs_locations_info
The fs_locations_info attribute is intended as a more functional The fs_locations_info attribute is intended as a more functional
replacement for the fs_locations attribute which will continue to replacement for the fs_locations attribute, which will continue to
exist and be supported. Clients can use it to get a more complete exist and be supported. Clients can use it to get a more complete
set of data about alternative file system locations, including set of data about alternative file system locations, including
additional network paths to access replicas in use and additional additional network paths to access replicas in use and additional
replicas. When the server does not support fs_locations_info, replicas. When the server does not support fs_locations_info,
fs_locations can be used to get a subset of the data. A server that fs_locations can be used to get a subset of the data. A server that
supports fs_locations_info MUST support fs_locations as well. supports fs_locations_info MUST support fs_locations as well.
There is additional data present in fs_locations_info, that is not There is additional data present in fs_locations_info that is not
available in fs_locations: available in fs_locations:
* Attribute continuity information. This information will allow a * Attribute continuity information. This information will allow a
client to select a replica that meets the transparency client to select a replica that meets the transparency
requirements of the applications accessing the data and to requirements of the applications accessing the data and to
leverage optimizations due to the server guarantees of attribute leverage optimizations due to the server guarantees of attribute
continuity (e.g., if the change attribute of a file of the file continuity (e.g., if the change attribute of a file of the file
system is continuous between multiple replicas, the client does system is continuous between multiple replicas, the client does
not have to invalidate the file's cache when switching to a not have to invalidate the file's cache when switching to a
different replica). different replica).
skipping to change at line 13782 skipping to change at line 13782
used to implement load-balancing while giving the client the used to implement load-balancing while giving the client the
entire file system list to be used in case the primary fails. entire file system list to be used in case the primary fails.
The fs_locations_info attribute is structured similarly to the The fs_locations_info attribute is structured similarly to the
fs_locations attribute. A top-level structure (fs_locations_info4) fs_locations attribute. A top-level structure (fs_locations_info4)
contains the entire attribute including the root pathname of the file contains the entire attribute including the root pathname of the file
system and an array of lower-level structures that define replicas system and an array of lower-level structures that define replicas
that share a common rootpath on their respective servers. The lower- that share a common rootpath on their respective servers. The lower-
level structure in turn (fs_locations_item4) contains a specific level structure in turn (fs_locations_item4) contains a specific
pathname and information on one or more individual network access pathname and information on one or more individual network access
paths. For that last lowest level, fs_locations_info has an paths. For that last, lowest level, fs_locations_info has an
fs_locations_server4 structure that contains per-server-replica fs_locations_server4 structure that contains per-server-replica
information in addition to the file system location entry. This per- information in addition to the file system location entry. This per-
server-replica information includes a nominally opaque array, server-replica information includes a nominally opaque array,
fls_info, within which specific pieces of information are located at fls_info, within which specific pieces of information are located at
the specific indices listed below. the specific indices listed below.
Two fs_locations_server4 entries that are within different Two fs_locations_server4 entries that are within different
fs_locations_item4 structures are never trunkable, while two entries fs_locations_item4 structures are never trunkable, while two entries
within the same fs_locations_item4 structure might or might not be within the same fs_locations_item4 structure might or might not be
trunkable. Two entries that are trunkable will have identical trunkable. Two entries that are trunkable will have identical
skipping to change at line 13909 skipping to change at line 13909
The data presented in the fs_locations_info attribute may be obtained The data presented in the fs_locations_info attribute may be obtained
by the server in any number of ways, including specification by the by the server in any number of ways, including specification by the
administrator or by current protocols for transferring data among administrator or by current protocols for transferring data among
replicas and protocols not yet developed. NFSv4.1 only defines how replicas and protocols not yet developed. NFSv4.1 only defines how
this information is presented by the server to the client. this information is presented by the server to the client.
11.17.1. The fs_locations_server4 Structure 11.17.1. The fs_locations_server4 Structure
The fs_locations_server4 structure consists of the following items in The fs_locations_server4 structure consists of the following items in
addition to the fls_server field which specifies a network address or addition to the fls_server field, which specifies a network address
set of addresses to be used to access the specified file system. or set of addresses to be used to access the specified file system.
Note that both of these items (i.e., fls_currency and fls_info) specify Note that both of these items (i.e., fls_currency and fls_info) specify
attributes of the file system replica and should not be different attributes of the file system replica and should not be different
when there are multiple fs_locations_server4 structures for the same when there are multiple fs_locations_server4 structures, each
replica, each specifying a network path to the chosen replica. specifying a network path to the chosen replica, for the same
replica.
When these values are different in two fs_locations_server4 When these values are different in two fs_locations_server4
structures, a client has no basis for choosing one over the other and structures, a client has no basis for choosing one over the other and
is best off simply ignoring both entries, whether these entries apply is best off simply ignoring both entries, whether these entries apply
to migration, replication, or referral. When there are more than two to migration, replication, or referral. When there are more than two
such entries, majority voting can be used to exclude a single such entries, majority voting can be used to exclude a single
erroneous entry from consideration. In the case in which trunking erroneous entry from consideration. In the case in which trunking
information is provided for a replica currently being accessed, the information is provided for a replica currently being accessed, the
additional trunked addresses can be ignored while access continues on additional trunked addresses can be ignored while access continues on
the address currently being used, even if the entry corresponding to the address currently being used, even if the entry corresponding to
skipping to change at line 13955 skipping to change at line 13956
* The server string (fls_server). For the case of the replica * The server string (fls_server). For the case of the replica
currently being accessed (via GETATTR), a zero-length string MAY currently being accessed (via GETATTR), a zero-length string MAY
be used to indicate the current address being used for the RPC be used to indicate the current address being used for the RPC
call. The fls_server field can also be an IPv4 or IPv6 address, call. The fls_server field can also be an IPv4 or IPv6 address,
formatted the same way as an IPv4 or IPv6 address in the "server" formatted the same way as an IPv4 or IPv6 address in the "server"
field of the fs_location4 data type (see Section 11.16). field of the fs_location4 data type (see Section 11.16).
With the exception of the transport-flag field (at offset With the exception of the transport-flag field (at offset
FSLI4BX_TFLAGS within the fls_info array), all of this data defined in FSLI4BX_TFLAGS within the fls_info array), all of this data defined in
this specification applies to the replica specified by the entry, this specification applies to the replica specified by the entry,
rather that the specific network path used to access it. The rather than the specific network path used to access it. The
classification of data in extensions to this data is discussed below. classification of data in extensions to this data is discussed below.
Data within the fls_info array is in the form of 8-bit data items Data within the fls_info array is in the form of 8-bit data items
with constants giving the offsets within the array of various values with constants giving the offsets within the array of various values
describing this particular file system instance. This style of describing this particular file system instance. This style of
definition was chosen, in preference to explicit XDR structure definition was chosen, in preference to explicit XDR structure
definitions for these values, for a number of reasons. definitions for these values, for a number of reasons.
* The kinds of data in the fls_info array, representing flags, file * The kinds of data in the fls_info array, representing flags, file
system classes, and priorities among sets of file systems system classes, and priorities among sets of file systems
representing the same data, are such that 8 bits provide a quite representing the same data, are such that 8 bits provide a quite
acceptable range of values. Even where there might be more than acceptable range of values. Even where there might be more than
256 such file system instances, having more than 256 distinct 256 such file system instances, having more than 256 distinct
classes or priorities is unlikely. classes or priorities is unlikely.
* Explicit definition of the various specific data items within XDR * Explicit definition of the various specific data items within XDR
would limit expandability in that any extension within would would limit expandability in that any extension within would
require yet another attribute, leading to specification and require yet another attribute, leading to specification and
implementation clumsiness. In the context of the NFSv4 extension implementation clumsiness. In the context of the NFSv4 extension
model in effect at the time fs_locations_info was designed (i.e. model in effect at the time fs_locations_info was designed (i.e.,
that described in RFC5661 [65]), this would necessitate a new that which is described in RFC 5661 [65]), this would necessitate
minor version to effect any Standards Track extension to the data a new minor version to effect any Standards Track extension to the
in in fls_info. data in fls_info.
The set of fls_info data is subject to expansion in a future minor The set of fls_info data is subject to expansion in a future minor
version, or in a Standards Track RFC, within the context of a single version or in a Standards Track RFC within the context of a single
minor version. The server SHOULD NOT send and the client MUST NOT minor version. The server SHOULD NOT send and the client MUST NOT
use indices within the fls_info array or flag bits that are not use indices within the fls_info array or flag bits that are not
defined in Standards Track RFCs. defined in Standards Track RFCs.
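
The offset-based layout of fls_info described above can be illustrated with a minimal sketch. The numeric values of the byte indices and flag bits below are recalled from the NFSv4.1 XDR description and should be treated as assumptions to verify against it. The helper names (fls_info_byte, describe_entry) are hypothetical, as is the choice to read a missing trailing byte as zero.

```python
# Byte indices and flag bits: assumed values, to be checked against the XDR.
FSLI4BX_GFLAGS = 0   # index of the general-flags byte
FSLI4BX_TFLAGS = 1   # index of the transport-flags byte
FSLI4GF_GOING = 0x08
FSLI4GF_SPLIT = 0x10
FSLI4TF_RDMA = 0x01


def fls_info_byte(fls_info: bytes, index: int) -> int:
    """Return the 8-bit item at `index`.

    Assumption for this sketch: an index beyond the end of a shorter
    array is read as 0 rather than raising an error.
    """
    return fls_info[index] if index < len(fls_info) else 0


def describe_entry(fls_info: bytes) -> dict:
    """Decode only indices and bits that this sketch knows to be defined."""
    gflags = fls_info_byte(fls_info, FSLI4BX_GFLAGS)
    tflags = fls_info_byte(fls_info, FSLI4BX_TFLAGS)
    return {
        "going": bool(gflags & FSLI4GF_GOING),  # property of the replica
        "split": bool(gflags & FSLI4GF_SPLIT),  # property of the replica
        "rdma": bool(tflags & FSLI4TF_RDMA),    # property of the network path
    }
```

Because the array is opaque in the XDR, a client written this way never touches bytes or bits it has no definition for, which is the behavior the MUST NOT above asks for.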
In light of the new extension model defined in RFC8178 [66] and the In light of the new extension model defined in RFC 8178 [66] and the
fact that the individual items within fls_info are not explicitly fact that the individual items within fls_info are not explicitly
referenced in the XDR, the following practices should be followed referenced in the XDR, the following practices should be followed
when extending or otherwise changing the structure of the data when extending or otherwise changing the structure of the data
returned in fls_info within the scope of a single minor version. returned in fls_info within the scope of a single minor version:
* All extensions need to be described by Standards Track documents. * All extensions need to be described by Standards Track documents.
There is no need for such documents to be marked as updating There is no need for such documents to be marked as updating RFC
RFC5661 [65] or this document. 5661 [65] or this document.
* It needs to be made clear whether the information in any added * It needs to be made clear whether the information in any added
data items applies to the replica specified by the entry or to the data items applies to the replica specified by the entry or to the
specific network paths specified in the entry. specific network paths specified in the entry.
* There needs to be a reliable way defined to determine whether the * There needs to be a reliable way defined to determine whether the
server is aware of the extension. This may be based on the length server is aware of the extension. This may be based on the length
field of the fls_info array, but it is more flexible to provide field of the fls_info array, but it is more flexible to provide
fs-scope or server-scope attributes to indicate what extensions fs-scope or server-scope attributes to indicate what extensions
are provided. are provided.
skipping to change at line 14084 skipping to change at line 14085
reasonable. reasonable.
When this flag is seen as part of a transition into a new file When this flag is seen as part of a transition into a new file
system, a client might choose to transfer immediately to another system, a client might choose to transfer immediately to another
replica, or it may reference the current file system and only replica, or it may reference the current file system and only
transition when a migration event occurs. Similarly, when this transition when a migration event occurs. Similarly, when this
flag appears as a replica in the referral, clients would likely flag appears as a replica in the referral, clients would likely
avoid being referred to this instance whenever there is another avoid being referred to this instance whenever there is another
choice. choice.
This flag, like the other items within fls_info applies to the This flag, like the other items within fls_info, applies to the
replica, rather than to a particular path to that replica. When replica rather than to a particular path to that replica. When it
it appears, a transition to a new replica rather than to a appears, a transition to a new replica, rather than to a different
different path to the same replica, is indicated. path to the same replica, is indicated.
* FSLI4GF_SPLIT indicates that when a transition occurs from the * FSLI4GF_SPLIT indicates that when a transition occurs from the
current file system instance to this one, the replacement may current file system instance to this one, the replacement may
consist of multiple file systems. In this case, the client has to consist of multiple file systems. In this case, the client has to
be prepared for the possibility that objects on the same file be prepared for the possibility that objects on the same file
system before migration will be on different ones after. Note system before migration will be on different ones after. Note
that FSLI4GF_SPLIT is not incompatible with the file systems that FSLI4GF_SPLIT is not incompatible with the file systems
belonging to the same fileid class since, if one has a set of belonging to the same fileid class since, if one has a set of
fileids that are unique within a file system, each subset assigned fileids that are unique within a file system, each subset assigned
to a smaller file system after migration would not have any to a smaller file system after migration would not have any
skipping to change at line 14133 skipping to change at line 14134
the server to determine when the need for emulating two file the server to determine when the need for emulating two file
systems as one is over. systems as one is over.
Although it is possible for this flag to be present in the event Although it is possible for this flag to be present in the event
of referral, it would generally be of little interest to the of referral, it would generally be of little interest to the
client, since the client is not expected to have information client, since the client is not expected to have information
regarding the current contents of the absent file system. regarding the current contents of the absent file system.
The transport-flag field (at byte index FSLI4BX_TFLAGS) contains the The transport-flag field (at byte index FSLI4BX_TFLAGS) contains the
following bits related to the transport capabilities of the specific following bits related to the transport capabilities of the specific
network path(s) specified by the entry. network path(s) specified by the entry:
* FSLI4TF_RDMA indicates that any specified network paths provide * FSLI4TF_RDMA indicates that any specified network paths provide
NFSv4.1 clients access using an RDMA-capable transport. NFSv4.1 clients access using an RDMA-capable transport.
Attribute continuity and file system identity information are Attribute continuity and file system identity information are
expressed by defining equivalence relations on the sets of file expressed by defining equivalence relations on the sets of file
systems presented to the client. Each such relation is expressed as systems presented to the client. Each such relation is expressed as
a set of file system equivalence classes. For each relation, a file a set of file system equivalence classes. For each relation, a file
system has an 8-bit class number. Two file systems belong to the system has an 8-bit class number. Two file systems belong to the
same class if both have identical non-zero class numbers. Zero is same class if both have identical non-zero class numbers. Zero is
skipping to change at line 14864 skipping to change at line 14865
Via a notification mechanism (see Section 20.12), device ID to device Via a notification mechanism (see Section 20.12), device ID to device
address mappings can change over the duration of server operation address mappings can change over the duration of server operation
without recalling or revoking the layouts that refer to device ID. without recalling or revoking the layouts that refer to device ID.
The notification mechanism can also delete a device ID, but only if The notification mechanism can also delete a device ID, but only if
the client has no layouts referring to the device ID. A notification the client has no layouts referring to the device ID. A notification
of a change to a device ID to device address mapping will immediately of a change to a device ID to device address mapping will immediately
or eventually invalidate some or all of the device ID's mappings. or eventually invalidate some or all of the device ID's mappings.
The server MUST support notifications and the client must request The server MUST support notifications and the client must request
them before they can be used. For further information about the them before they can be used. For further information about the
notification types Section 20.12. notification types, see Section 20.12.
12.3. pNFS Operations 12.3. pNFS Operations
NFSv4.1 has several operations that are needed for pNFS servers, NFSv4.1 has several operations that are needed for pNFS servers,
regardless of layout type or storage protocol. These operations are regardless of layout type or storage protocol. These operations are
all sent to a metadata server and summarized here. While pNFS is an all sent to a metadata server and summarized here. While pNFS is an
OPTIONAL feature, if pNFS is implemented, some operations are OPTIONAL feature, if pNFS is implemented, some operations are
REQUIRED in order to comply with pNFS. See Section 17. REQUIRED in order to comply with pNFS. See Section 17.
These are the fore channel pNFS operations: These are the fore channel pNFS operations:
skipping to change at line 17829 skipping to change at line 17830
For any of a number of reasons, the replier could not process this For any of a number of reasons, the replier could not process this
operation in what was deemed a reasonable time. The client should operation in what was deemed a reasonable time. The client should
wait and then try the request with a new slot and sequence value. wait and then try the request with a new slot and sequence value.
Some examples of scenarios that might lead to this situation: Some examples of scenarios that might lead to this situation:
* A server that supports hierarchical storage receives a request to * A server that supports hierarchical storage receives a request to
process a file that had been migrated. process a file that had been migrated.
* An operation requires a delegation recall to proceed, so that the * An operation requires a delegation recall to proceed, but the need
need to wait for this delegation to be recalled and returned makes to wait for this delegation to be recalled and returned makes
processing this request in a timely fashion impossible. processing this request in a timely fashion impossible.
* A request is being performed on a session being migrated from * A request is being performed on a session being migrated from
another server as described in Section 11.14.3, and the lack of another server as described in Section 11.14.3, and the lack of
full information about the state of the session on the source full information about the state of the session on the source
makes it impossible to process the request immediately. makes it impossible to process the request immediately.
In such cases, returning the error NFS4ERR_DELAY allows necessary In such cases, returning the error NFS4ERR_DELAY allows necessary
preparatory operations to proceed without holding up requester preparatory operations to proceed without holding up requester
resources such as a session slot. After delaying for a period of time, resources such as a session slot. After delaying for a period of time,
skipping to change at line 17861 skipping to change at line 17862
is retried in full with the SEQUENCE operation containing the same is retried in full with the SEQUENCE operation containing the same
slot and sequence values. In this case, the replier MUST avoid slot and sequence values. In this case, the replier MUST avoid
returning a response containing NFS4ERR_DELAY as the response to returning a response containing NFS4ERR_DELAY as the response to
SEQUENCE solely on the basis of its presence in the replay cache. SEQUENCE solely on the basis of its presence in the replay cache.
If the replier did this, the retries would not be effective as If the replier did this, the retries would not be effective as
there would be no opportunity for the replier to see whether the there would be no opportunity for the replier to see whether the
condition that generated the NFS4ERR_DELAY had been rectified condition that generated the NFS4ERR_DELAY had been rectified
during the interim between the original request and the retry. during the interim between the original request and the retry.
* If NFS4ERR_DELAY is returned on an operation other than SEQUENCE * If NFS4ERR_DELAY is returned on an operation other than SEQUENCE
which validly appears as the first operation of a request, that validly appears as the first operation of a request, the
handling is similar. The request can be retried in full without handling is similar. The request can be retried in full without
modification. In this case as well, the replier MUST avoid modification. In this case as well, the replier MUST avoid
returning a response containing NFS4ERR_DELAY as the response to returning a response containing NFS4ERR_DELAY as the response to
an initial operation of a request solely on the basis of its an initial operation of a request solely on the basis of its
presence in the replay cache. If the replier did this, the presence in the replay cache. If the replier did this, the
retries would not be effective as there would be no opportunity retries would not be effective as there would be no opportunity
for the replier to see whether the condition that generated the for the replier to see whether the condition that generated the
NFS4ERR_DELAY had been rectified during the interim between the NFS4ERR_DELAY had been rectified during the interim between the
original request and the retry. original request and the retry.
* If NFS4ERR_DELAY is returned on an operation other than the first * If NFS4ERR_DELAY is returned on an operation other than the first
in the request, the request when retried MUST contain a SEQUENCE in the request, the request when retried MUST contain a SEQUENCE
operation which is different than the original one, with either operation that is different than the original one, with either the
the bin id or the sequence value different from that in the bin id or the sequence value different from that in the original
original request. Because requesters do this, there is no need request. Because requesters do this, there is no need for the
for the replier to take special care to avoid returning an replier to take special care to avoid returning an NFS4ERR_DELAY
NFS4ERR_DELAY error, obtained from the replay cache. When no non- error obtained from the replay cache. When no non-idempotent
idempotent operations have been processed before the NFS4ERR_DELAY operations have been processed before the NFS4ERR_DELAY was
was returned, the requester should retry the request in full, with returned, the requester should retry the request in full, with the
the only difference from the original request being the only difference from the original request being the modification
modification to the slot id or sequence value in the reissued to the slot ID or sequence value in the reissued SEQUENCE
SEQUENCE operation. operation.
* When NFS4ERR_DELAY is returned on an operation other than the * When NFS4ERR_DELAY is returned on an operation other than the
first within a request and there has been a non-idempotent first within a request and there has been a non-idempotent
operation processed before the NFS4ERR_DELAY was returned, operation processed before the NFS4ERR_DELAY was returned,
reissuing the request as is normally done would incorrectly cause reissuing the request as is normally done would incorrectly cause
the re-execution of the non-idempotent operation. the re-execution of the non-idempotent operation.
To avoid this situation, the client should reissue the request To avoid this situation, the client should reissue the request
without the non-idempotent operation. The request still must use without the non-idempotent operation. The request still must use
a SEQUENCE operation with either a different slot id or sequence a SEQUENCE operation with either a different slot ID or sequence
value from the SEQUENCE in the original request. Because this is value from the SEQUENCE in the original request. Because this is
done, there is no way the replier could avoid spuriously re- done, there is no way the replier could avoid spuriously re-
executing the non-idempotent operation since the different executing the non-idempotent operation since the different
SEQUENCE parameters prevent the replier from recognizing that SEQUENCE parameters prevent the replier from recognizing that
the non-idempotent operation is being retried. the non-idempotent operation is being retried.
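
The retry cases listed above can be summarized in a small client-side sketch. It is illustrative only: the Op type and plan_delay_retry function are hypothetical names, the request is modeled as a flat list whose first element is SEQUENCE, and the sketch ignores any dependence of later operations on the results of a dropped operation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    idempotent: bool = True


def plan_delay_retry(ops: List[Op], failed_index: int) -> Tuple[List[Op], bool]:
    """Plan a retry after NFS4ERR_DELAY was returned at ops[failed_index].

    Returns (operations to resend, whether SEQUENCE must carry a
    different slot ID or sequence value than the original request).
    """
    if failed_index == 0:
        # Error on the first operation: resend unchanged, with the same
        # slot and sequence values.
        return list(ops), False
    # Error on a later operation: operations before failed_index already
    # ran, so drop the non-idempotent ones and keep everything else.
    resend = [
        op for i, op in enumerate(ops)
        if i >= failed_index or op.idempotent
    ]
    return resend, True
```

For a request of SEQUENCE, PUTFH, REMOVE, GETATTR in which GETATTR receives NFS4ERR_DELAY, the plan resends SEQUENCE, PUTFH, GETATTR with changed SEQUENCE parameters, so the REMOVE is not executed a second time.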
Note that without the ability to return NFS4ERR_DELAY and the Note that without the ability to return NFS4ERR_DELAY and the
requester's willingness to re-send when receiving it, deadlock might requester's willingness to re-send when receiving it, deadlock might
result. For example, if a recall is done, and if the delegation result. For example, if a recall is done, and if the delegation
return or operations preparatory to delegation return are held up by return or operations preparatory to delegation return are held up by
skipping to change at line 17947 skipping to change at line 17948
in which the filehandle is a valid filehandle in general but is not in which the filehandle is a valid filehandle in general but is not
of the appropriate object type for the current operation. of the appropriate object type for the current operation.
Where the error description indicates a problem with the current or Where the error description indicates a problem with the current or
saved filehandle, it is to be understood that filehandles are only saved filehandle, it is to be understood that filehandles are only
checked for the condition if they are implicit arguments of the checked for the condition if they are implicit arguments of the
operation in question. operation in question.
15.1.2.1. NFS4ERR_BADHANDLE (Error Code 10001) 15.1.2.1. NFS4ERR_BADHANDLE (Error Code 10001)
Illegal NFS filehandle for the current server. The current file Illegal NFS filehandle for the current server. The current
handle failed internal consistency checks. Once accepted as valid filehandle failed internal consistency checks. Once accepted as
(by PUTFH), no subsequent status change can cause the filehandle to valid (by PUTFH), no subsequent status change can cause the
generate this error. filehandle to generate this error.
15.1.2.2. NFS4ERR_FHEXPIRED (Error Code 10014) 15.1.2.2. NFS4ERR_FHEXPIRED (Error Code 10014)
A current or saved filehandle that is an argument to the current A current or saved filehandle that is an argument to the current
operation is volatile and has expired at the server. operation is volatile and has expired at the server.
15.1.2.3. NFS4ERR_ISDIR (Error Code 21) 15.1.2.3. NFS4ERR_ISDIR (Error Code 21)
The current or saved filehandle designates a directory when the The current or saved filehandle designates a directory when the
current operation does not allow a directory to be accepted as the current operation does not allow a directory to be accepted as the
target of this operation. target of this operation.
15.1.2.4. NFS4ERR_MOVED (Error Code 10019) 15.1.2.4. NFS4ERR_MOVED (Error Code 10019)
The file system that contains the current filehandle object is not The file system that contains the current filehandle object is not
present at the server, or is not accessible using the network address present at the server or is not accessible with the network address
used. It may have been made accessible on a different set of network used. It may have been made accessible on a different set of network
addresses, relocated or migrated to another server, or it may have addresses, relocated or migrated to another server, or it may have
never been present. The client may obtain the new file system never been present. The client may obtain the new file system
location by obtaining the "fs_locations" or "fs_locations_info" location by obtaining the fs_locations or fs_locations_info attribute
attribute for the current filehandle. For further discussion, refer for the current filehandle. For further discussion, refer to
to Section 11.3. Section 11.3.
As with the case of NFS4ERR_DELAY, it is possible that one or more As with the case of NFS4ERR_DELAY, it is possible that one or more
non-idempotent operations may have been successfully executed within non-idempotent operations may have been successfully executed within
a COMPOUND before NFS4ERR_MOVED is returned. Because of this, once a COMPOUND before NFS4ERR_MOVED is returned. Because of this, once
the new location is determined, the original request which received the new location is determined, the original request that received
the NFS4ERR_MOVED should not be re-executed in full. Instead, the the NFS4ERR_MOVED should not be re-executed in full. Instead, the
client should send a new COMPOUND, with any successfully executed client should send a new COMPOUND with any successfully executed non-
non-idempotent operations removed. When the client uses the same idempotent operations removed. When the client uses the same session
session for the new COMPOUND, its SEQUENCE operation should use a for the new COMPOUND, its SEQUENCE operation should use a different
different slot id or sequence. slot ID or sequence.
15.1.2.5. NFS4ERR_NOFILEHANDLE (Error Code 10020) 15.1.2.5. NFS4ERR_NOFILEHANDLE (Error Code 10020)
The logical current or saved filehandle value is required by the The logical current or saved filehandle value is required by the
current operation and is not set. This may be a result of a current operation and is not set. This may be a result of a
malformed COMPOUND operation (i.e., no PUTFH or PUTROOTFH before an malformed COMPOUND operation (i.e., no PUTFH or PUTROOTFH before an
operation that requires the current filehandle be set). operation that requires the current filehandle be set).
15.1.2.6. NFS4ERR_NOTDIR (Error Code 20) 15.1.2.6. NFS4ERR_NOTDIR (Error Code 20)
skipping to change at line 18363 skipping to change at line 18364
specifying the same scope, whether that scope is global or for the specifying the same scope, whether that scope is global or for the
same file system in the case of a per-fs RECLAIM_COMPLETE. An same file system in the case of a per-fs RECLAIM_COMPLETE. An
additional RECLAIM_COMPLETE operation is not necessary and results in additional RECLAIM_COMPLETE operation is not necessary and results in
this error. this error.
15.1.9.2. NFS4ERR_GRACE (Error Code 10013) 15.1.9.2. NFS4ERR_GRACE (Error Code 10013)
This error is returned when the server is in its grace period with This error is returned when the server is in its grace period with
regard to the file system object for which the lock was requested. regard to the file system object for which the lock was requested.
In this situation, a non-reclaim locking request cannot be granted. In this situation, a non-reclaim locking request cannot be granted.
This can occur because either This can occur because either:
* The server does not have sufficient information about locks that * The server does not have sufficient information about locks that
might be potentially reclaimed to determine whether the lock could might be potentially reclaimed to determine whether the lock could
be granted. be granted.
* The request is made by a client responsible for reclaiming its * The request is made by a client responsible for reclaiming its
locks that has not yet done the appropriate RECLAIM_COMPLETE locks that has not yet done the appropriate RECLAIM_COMPLETE
operation, allowing it to proceed to obtain new locks. operation, allowing it to proceed to obtain new locks.
In the case of a per-fs grace period, there may be clients, (i.e., In the case of a per-fs grace period, there may be clients (i.e.,
those currently using the destination file system) who might be those currently using the destination file system) who might be
unaware of the circumstances resulting in the initiation of the grace unaware of the circumstances resulting in the initiation of the grace
period. Such clients need to periodically retry the request until period. Such clients need to periodically retry the request until
the grace period is over, just as other clients do. the grace period is over, just as other clients do.
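
The periodic retry described above might look like the following sketch. The function name, the retry interval, and the attempt limit are assumptions made for illustration; the text does not prescribe any particular timing. Status codes are modeled as strings to keep the example self-contained.

```python
import time
from typing import Callable

NFS4_OK = "NFS4_OK"
NFS4ERR_GRACE = "NFS4ERR_GRACE"


def retry_through_grace(
    send_lock_request: Callable[[], str],
    interval: float = 5.0,          # assumed pause between attempts, in seconds
    max_attempts: int = 20,         # assumed upper bound on attempts
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Resend a non-reclaim locking request while the server reports a grace period."""
    status = send_lock_request()
    attempts = 1
    while status == NFS4ERR_GRACE and attempts < max_attempts:
        sleep(interval)             # wait before asking again
        status = send_lock_request()
        attempts += 1
    return status                   # final status, possibly still NFS4ERR_GRACE
```

Passing the sleep function as a parameter keeps the loop testable without real delays.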
15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033) 15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033)
A reclaim of client state was attempted in circumstances in which the A reclaim of client state was attempted in circumstances in which the
server cannot guarantee that conflicting state has not been provided server cannot guarantee that conflicting state has not been provided
to another client. This occurs in any of the following situations. to another client. This occurs in any of the following situations:
* There is no active grace period applying to the file system object * There is no active grace period applying to the file system object
for which the request was made. for which the request was made.
* The client making the request has no current role in reclaiming * The client making the request has no current role in reclaiming
locks. locks.
* Previous operations have created a situation in which the server * Previous operations have created a situation in which the server
is not able to determine that a reclaim-interfering edge condition is not able to determine that a reclaim-interfering edge condition
does not exist. does not exist.
15.1.9.4. NFS4ERR_RECLAIM_BAD (Error Code 10034) 15.1.9.4. NFS4ERR_RECLAIM_BAD (Error Code 10034)
The server has determined that a reclaim attempted by the client is The server has determined that a reclaim attempted by the client is
not valid, i.e. the lock specified as being reclaimed could not not valid, i.e., the lock specified as being reclaimed could not
possibly have existed before the server restart or file system possibly have existed before the server restart or file system
migration event. A server is not obliged to make this determination migration event. A server is not obliged to make this determination
and will typically rely on the client to only reclaim locks that the and will typically rely on the client to only reclaim locks that the
client was granted prior to restart. However, when a server does client was granted prior to restart. However, when a server does
have reliable information to enable it to make this determination, have reliable information to enable it to make this determination,
this error indicates that the reclaim has been rejected as invalid. this error indicates that the reclaim has been rejected as invalid.
This is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see This is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see
Section 15.1.9.5) where the server can only determine that there has Section 15.1.9.5) where the server can only determine that there has
been an invalid reclaim, but cannot determine which request is been an invalid reclaim, but cannot determine which request is
invalid. invalid.
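
As an illustration only, the two reclaim errors above might be handled on the client side along the following lines. The error code values are those given in the section headings; the enum reclaim_action and the function classify_reclaim_status are hypothetical names, and the recovery policy shown (giving up on all reclaims after NFS4ERR_NO_GRACE, and on the single lock after NFS4ERR_RECLAIM_BAD) is an assumption, not something the text mandates.

```c
/* Status values; 10033 and 10034 come from the section headings above. */
enum {
    NFS4_OK             = 0,
    NFS4ERR_NO_GRACE    = 10033,
    NFS4ERR_RECLAIM_BAD = 10034
};

/* Hypothetical client-side actions; not defined by the protocol. */
enum reclaim_action {
    RECLAIM_DONE,        /* the lock was reclaimed */
    RECLAIM_ABANDON_ALL, /* assumed policy: stop issuing reclaims */
    RECLAIM_ABANDON_ONE, /* assumed policy: treat this one lock as lost */
    RECLAIM_UNHANDLED    /* some other status, handled elsewhere */
};

/* Map the status of a reclaim-type request to an assumed client action. */
enum reclaim_action classify_reclaim_status(int status)
{
    switch (status) {
    case NFS4_OK:
        return RECLAIM_DONE;
    case NFS4ERR_NO_GRACE:
        /* No usable grace period, or no role in reclaiming locks. */
        return RECLAIM_ABANDON_ALL;
    case NFS4ERR_RECLAIM_BAD:
        /* The server determined that this specific reclaim is invalid. */
        return RECLAIM_ABANDON_ONE;
    default:
        return RECLAIM_UNHANDLED;
    }
}
```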
skipping to change at line 21654 skipping to change at line 21655
18.4.3. DESCRIPTION 18.4.3. DESCRIPTION
The CREATE operation creates a file object other than an ordinary The CREATE operation creates a file object other than an ordinary
file in a directory with a given name. The OPEN operation MUST be file in a directory with a given name. The OPEN operation MUST be
used to create a regular file or a named attribute. used to create a regular file or a named attribute.
The current filehandle must be a directory: an object of type NF4DIR. The current filehandle must be a directory: an object of type NF4DIR.
If the current filehandle is an attribute directory (type If the current filehandle is an attribute directory (type
NF4ATTRDIR), the error NFS4ERR_WRONG_TYPE is returned. If the NF4ATTRDIR), the error NFS4ERR_WRONG_TYPE is returned. If the
current file handle designates any other type of object, the error current filehandle designates any other type of object, the error
NFS4ERR_NOTDIR results. NFS4ERR_NOTDIR results.
The objname specifies the name for the new object. The objtype The objname specifies the name for the new object. The objtype
determines the type of object to be created: directory, symlink, etc. determines the type of object to be created: directory, symlink, etc.
If the object type specified is that of an ordinary file, a named If the object type specified is that of an ordinary file, a named
attribute, or a named attribute directory, the error NFS4ERR_BADTYPE attribute, or a named attribute directory, the error NFS4ERR_BADTYPE
results. results.
If an object of the same name already exists in the directory, the If an object of the same name already exists in the directory, the
server will return the error NFS4ERR_EXIST. server will return the error NFS4ERR_EXIST.
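
The type and name checks described above can be sketched as follows. This is a minimal sketch, not a server implementation: the function name create_precheck and its parameters are hypothetical, the numeric values are taken from the protocol's XDR description rather than from this section, and the order in which the checks are applied is an assumption.

```c
#include <stdbool.h>

/* Object types (nfs_ftype4) from the protocol's XDR description. */
enum nfs_ftype4 {
    NF4REG = 1, NF4DIR = 2, NF4BLK = 3, NF4CHR = 4, NF4LNK = 5,
    NF4SOCK = 6, NF4FIFO = 7, NF4ATTRDIR = 8, NF4NAMEDATTR = 9
};

/* Status values from the protocol's XDR description. */
enum {
    NFS4_OK            = 0,
    NFS4ERR_EXIST      = 17,
    NFS4ERR_NOTDIR     = 20,
    NFS4ERR_BADTYPE    = 10007,
    NFS4ERR_WRONG_TYPE = 10083
};

/*
 * Hypothetical pre-check for CREATE.  cur_fh_type is the type of the
 * current filehandle, objtype is the requested type, and name_exists
 * says whether objname is already present in the directory.
 */
int create_precheck(enum nfs_ftype4 cur_fh_type,
                    enum nfs_ftype4 objtype,
                    bool name_exists)
{
    if (cur_fh_type == NF4ATTRDIR)
        return NFS4ERR_WRONG_TYPE;   /* attribute directory */
    if (cur_fh_type != NF4DIR)
        return NFS4ERR_NOTDIR;       /* any other non-directory */
    if (objtype == NF4REG || objtype == NF4NAMEDATTR ||
        objtype == NF4ATTRDIR)
        return NFS4ERR_BADTYPE;      /* OPEN or OPENATTR is used instead */
    if (name_exists)
        return NFS4ERR_EXIST;
    return NFS4_OK;
}
```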
skipping to change at line 22750 skipping to change at line 22751
* to file. Ordinary OPEN of the * to file. Ordinary OPEN of the
* specified file by current filehandle. * specified file by current filehandle.
*/ */
case CLAIM_FH: /* new to v4.1 */ case CLAIM_FH: /* new to v4.1 */
/* CURRENT_FH: regular file to open */ /* CURRENT_FH: regular file to open */
void; void;
/* /*
* Like CLAIM_DELEGATE_PREV. Right to file based on a * Like CLAIM_DELEGATE_PREV. Right to file based on a
* delegation granted to a previous boot * delegation granted to a previous boot
* instance of the client. File is identified by * instance of the client. File is identified
* by filehandle. * by filehandle.
*/ */
case CLAIM_DELEG_PREV_FH: /* new to v4.1 */ case CLAIM_DELEG_PREV_FH: /* new to v4.1 */
/* CURRENT_FH: file being opened */ /* CURRENT_FH: file being opened */
void; void;
/* /*
* Like CLAIM_DELEGATE_CUR. Right to file based on * Like CLAIM_DELEGATE_CUR. Right to file based on
* a delegation granted by the server. * a delegation granted by the server.
* File is identified by filehandle. * File is identified by filehandle.
skipping to change at line 22979 skipping to change at line 22980
| | | and | or EXCLUSIVE4 (SHOULD | | | | and | or EXCLUSIVE4 (SHOULD |
| | | EXCLUSIVE4 | NOT) | | | | EXCLUSIVE4 | NOT) |
+-------------+----------+--------------+-----------------------+ +-------------+----------+--------------+-----------------------+
| no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 |
+-------------+----------+--------------+-----------------------+ +-------------+----------+--------------+-----------------------+
| yes | no | GUARDED4 | GUARDED4 | | yes | no | GUARDED4 | GUARDED4 |
+-------------+----------+--------------+-----------------------+ +-------------+----------+--------------+-----------------------+
| yes | yes | GUARDED4 | GUARDED4 | | yes | yes | GUARDED4 | GUARDED4 |
+-------------+----------+--------------+-----------------------+ +-------------+----------+--------------+-----------------------+
Table 18: Required methods for exclusive create Table 18: Required Methods for Exclusive Create
If CREATE_SESSION4_FLAG_PERSIST is set in the results of If CREATE_SESSION4_FLAG_PERSIST is set in the results of
CREATE_SESSION, the reply cache is persistent (see Section 18.36). CREATE_SESSION, the reply cache is persistent (see Section 18.36).
If the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from If the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from
EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the
client attempts to use EXCLUSIVE4 on a persistent session, or a client attempts to use EXCLUSIVE4 on a persistent session, or a
session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the
server MUST return NFS4ERR_INVAL. server MUST return NFS4ERR_INVAL.
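
A server-side check for the EXCLUSIVE4 restriction in the preceding paragraph might look like the sketch below. The function name check_exclusive4 and its parameters are hypothetical; the value of EXCHGID4_FLAG_USE_PNFS_MDS matches the EXCHANGE_ID argument definitions, while the remaining constants are taken from the protocol's XDR description and are assumptions as far as this section is concerned.

```c
#include <stdint.h>

/* Create modes (createmode4) from the protocol's XDR description. */
enum createmode4 {
    UNCHECKED4 = 0, GUARDED4 = 1, EXCLUSIVE4 = 2, EXCLUSIVE4_1 = 3
};

/* Flag and status values from the protocol's XDR description. */
#define CREATE_SESSION4_FLAG_PERSIST 0x00000001u
#define EXCHGID4_FLAG_USE_PNFS_MDS   0x00020000u

enum { NFS4_OK = 0, NFS4ERR_INVAL = 22 };

/*
 * Hypothetical check.  csr_flags holds the flags returned by
 * CREATE_SESSION for the session; eir_flags holds the flags returned
 * by EXCHANGE_ID for the client ID from which the session is derived.
 */
int check_exclusive4(enum createmode4 mode,
                     uint32_t csr_flags, uint32_t eir_flags)
{
    if (mode != EXCLUSIVE4)
        return NFS4_OK;              /* the restriction covers EXCLUSIVE4 only */
    if (csr_flags & CREATE_SESSION4_FLAG_PERSIST)
        return NFS4ERR_INVAL;        /* persistent session */
    if (eir_flags & EXCHGID4_FLAG_USE_PNFS_MDS)
        return NFS4ERR_INVAL;        /* pNFS metadata server client ID */
    return NFS4_OK;
}
```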
With persistent sessions, exclusive create semantics are fully With persistent sessions, exclusive create semantics are fully
skipping to change at line 23564 skipping to change at line 23565
the object. If none exist, then NFS4ERR_NOENT will be returned. If the object. If none exist, then NFS4ERR_NOENT will be returned. If
createdir has a value of TRUE and no named attribute directory createdir has a value of TRUE and no named attribute directory
exists, one is created and its filehandle becomes the current exists, one is created and its filehandle becomes the current
filehandle. On the other hand, if createdir has a value of TRUE and filehandle. On the other hand, if createdir has a value of TRUE and
the named attribute directory already exists, no error results and the named attribute directory already exists, no error results and
the filehandle of the existing directory becomes the current the filehandle of the existing directory becomes the current
filehandle. The creation of a named attribute directory assumes that filehandle. The creation of a named attribute directory assumes that
the server has implemented named attribute support in this fashion the server has implemented named attribute support in this fashion
and is not required to do so by this definition. and is not required to do so by this definition.
If the current file handle designates an object of type NF4NAMEDATTR If the current filehandle designates an object of type NF4NAMEDATTR
(a named attribute) or NF4ATTRDIR (a named attribute directory), an (a named attribute) or NF4ATTRDIR (a named attribute directory), an
error of NFS4ERR_WRONG_TYPE is returned to the client. Named error of NFS4ERR_WRONG_TYPE is returned to the client. Named
attributes or a named attribute directory MUST NOT have their own attributes or a named attribute directory MUST NOT have their own
named attributes. named attributes.
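
The OPENATTR behavior described above can be summarized in a short sketch. It assumes a server that implements named attributes with an explicit attribute directory; the function name openattr_check and its parameters are hypothetical, the numeric values come from the protocol's XDR description, and the ordering of the checks is an assumption.

```c
#include <stdbool.h>

/* Only the object types needed here; values from the protocol's XDR. */
enum nfs_ftype4 {
    NF4REG = 1, NF4DIR = 2, NF4ATTRDIR = 8, NF4NAMEDATTR = 9
};

/* Status values from the protocol's XDR description. */
enum { NFS4_OK = 0, NFS4ERR_NOENT = 2, NFS4ERR_WRONG_TYPE = 10083 };

/*
 * Hypothetical decision function.  On success, *must_create reports
 * whether the server has to create the named attribute directory
 * before making its filehandle the current filehandle.
 */
int openattr_check(enum nfs_ftype4 cur_fh_type, bool attrdir_exists,
                   bool createdir, bool *must_create)
{
    *must_create = false;
    if (cur_fh_type == NF4NAMEDATTR || cur_fh_type == NF4ATTRDIR)
        return NFS4ERR_WRONG_TYPE;   /* no named attributes on these */
    if (attrdir_exists)
        return NFS4_OK;              /* existing directory is used as-is */
    if (!createdir)
        return NFS4ERR_NOENT;        /* nothing exists, nothing created */
    *must_create = true;
    return NFS4_OK;
}
```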
18.17.4. IMPLEMENTATION 18.17.4. IMPLEMENTATION
If the server does not support named attributes for the current If the server does not support named attributes for the current
filehandle, an error of NFS4ERR_NOTSUPP will be returned to the filehandle, an error of NFS4ERR_NOTSUPP will be returned to the
client. client.
skipping to change at line 24927 skipping to change at line 24928
+============+===================================+ +============+===================================+
| stable | committed | | stable | committed |
+============+===================================+ +============+===================================+
| UNSTABLE4 | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 | | UNSTABLE4 | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 |
+------------+-----------------------------------+ +------------+-----------------------------------+
| DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4 | | DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4 |
+------------+-----------------------------------+ +------------+-----------------------------------+
| FILE_SYNC4 | FILE_SYNC4 | | FILE_SYNC4 | FILE_SYNC4 |
+------------+-----------------------------------+ +------------+-----------------------------------+
Table 20: Valid combinations of the fields Table 20: Valid Combinations of the Fields
stable in the request and committed in the Stable in the Request and Committed in the
reply Reply
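
Because the stable_how4 values are ordered by increasing strength, the combinations in Table 20 reduce to a single comparison. The sketch below illustrates this; the function name committed_is_valid is hypothetical, and the enum values are taken from the protocol's XDR description.

```c
#include <stdbool.h>

/* stable_how4 values from the protocol's XDR description. */
enum stable_how4 { UNSTABLE4 = 0, DATA_SYNC4 = 1, FILE_SYNC4 = 2 };

/*
 * Returns true when the committed value in a WRITE reply is allowed
 * for the stable value in the request: the server may commit more
 * strongly than requested, but never less.
 */
bool committed_is_valid(enum stable_how4 stable,
                        enum stable_how4 committed)
{
    return committed >= stable;
}
```

A client could apply such a check to a WRITE reply before deciding whether a later COMMIT is needed.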
The final portion of the result is the field writeverf. This field The final portion of the result is the field writeverf. This field
is the write verifier and is a cookie that the client can use to is the write verifier and is a cookie that the client can use to
determine whether a server has changed instance state (e.g., server determine whether a server has changed instance state (e.g., server
restart) between a call to WRITE and a subsequent call to either restart) between a call to WRITE and a subsequent call to either
WRITE or COMMIT. This cookie MUST be unchanged during a single WRITE or COMMIT. This cookie MUST be unchanged during a single
instance of the NFSv4.1 server and MUST be unique between instances instance of the NFSv4.1 server and MUST be unique between instances
of the NFSv4.1 server. If the cookie changes, then the client MUST of the NFSv4.1 server. If the cookie changes, then the client MUST
assume that any data written with an UNSTABLE4 value for committed assume that any data written with an UNSTABLE4 value for committed
and an old writeverf in the reply has been lost and will need to be and an old writeverf in the reply has been lost and will need to be
skipping to change at line 25255 skipping to change at line 25256
* The attempted BIND_CONN_TO_SESSION with the old SSV should * The attempted BIND_CONN_TO_SESSION with the old SSV should
succeed. If so, the client re-sends the original SET_SSV. If the succeed. If so, the client re-sends the original SET_SSV. If the
original SET_SSV was not executed, then the server executes it. original SET_SSV was not executed, then the server executes it.
If the original SET_SSV was executed but failed, the server will If the original SET_SSV was executed but failed, the server will
return the SET_SSV from the reply cache. return the SET_SSV from the reply cache.
18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID
The EXCHANGE_ID operation exchanges long-hand client and server The EXCHANGE_ID operation exchanges long-hand client and server
identifiers (owners), and provides access to a client ID, creating identifiers (owners) and provides access to a client ID, creating one
one if necessary. This client ID becomes associated with the if necessary. This client ID becomes associated with the connection
connection on which the operation is done, so that it is available on which the operation is done, so that it is available when a
when a CREATE_SESSION is done or when the connection is used to issue CREATE_SESSION is done or when the connection is used to issue a
a request on an existing session associated with the current client. request on an existing session associated with the current client.
18.35.1. ARGUMENT 18.35.1. ARGUMENT
const EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001; const EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001;
const EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002; const EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002;
const EXCHGID4_FLAG_BIND_PRINC_STATEID = 0x00000100; const EXCHGID4_FLAG_BIND_PRINC_STATEID = 0x00000100;
const EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000; const EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000;
const EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000; const EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000;
skipping to change at line 25354 skipping to change at line 25355
EXCHANGE_ID4resok eir_resok4; EXCHANGE_ID4resok eir_resok4;
default: default:
void; void;
}; };
18.35.3. DESCRIPTION 18.35.3. DESCRIPTION
The client uses the EXCHANGE_ID operation to register a particular The client uses the EXCHANGE_ID operation to register a particular
client_owner with the server. However, when the client_owner has client_owner with the server. However, when the client_owner has
already been registered by other means (e.g. Transparent State already been registered by other means (e.g., Transparent State
Migration), the client may still use EXCHANGE_ID to obtain the client Migration), the client may still use EXCHANGE_ID to obtain the client
ID assigned previously. ID assigned previously.
The client ID returned from this operation will be associated with The client ID returned from this operation will be associated with
the connection on which the EXCHANGE_ID is received and will serve as the connection on which the EXCHANGE_ID is received and will serve as
a parent object for sessions created by the client on this connection a parent object for sessions created by the client on this connection
or to which the connection is bound. As a result of using those or to which the connection is bound. As a result of using those
sessions to make requests involving the creation of state, that state sessions to make requests involving the creation of state, that state
will become associated with the client ID returned. will become associated with the client ID returned.
skipping to change at line 25377 skipping to change at line 25378
returned eir_sequenceid, in creating an associated session using returned eir_sequenceid, in creating an associated session using
CREATE_SESSION. CREATE_SESSION.
If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result, If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result,
eir_flags, then it is an indication that the registration of the eir_flags, then it is an indication that the registration of the
client_owner has already occurred and that a further CREATE_SESSION client_owner has already occurred and that a further CREATE_SESSION
is not needed to confirm it. Of course, subsequent CREATE_SESSION is not needed to confirm it. Of course, subsequent CREATE_SESSION
operations may be needed for other reasons. operations may be needed for other reasons.
The value eir_sequenceid is used to establish an initial sequence The value eir_sequenceid is used to establish an initial sequence
value associate with the client ID returned. In cases in which a value associated with the client ID returned. In cases in which a
CREATE_SESSION has already been done, there is no need for this CREATE_SESSION has already been done, there is no need for this
value, since sequencing of such requests has already been established value, since sequencing of such requests has already been established,
and the client has no need for this value and will ignore it and the client has no need for this value and will ignore it.
EXCHANGE_ID MAY be sent in a COMPOUND procedure that starts with EXCHANGE_ID MAY be sent in a COMPOUND procedure that starts with
SEQUENCE. However, when a client communicates with a server for the SEQUENCE. However, when a client communicates with a server for the
first time, it will not have a session, so using SEQUENCE will not be first time, it will not have a session, so using SEQUENCE will not be
possible. If EXCHANGE_ID is sent without a preceding SEQUENCE, then possible. If EXCHANGE_ID is sent without a preceding SEQUENCE, then
it MUST be the only operation in the COMPOUND procedure's request. it MUST be the only operation in the COMPOUND procedure's request.
If it is not, the server MUST return NFS4ERR_NOT_ONLY_OP. If it is not, the server MUST return NFS4ERR_NOT_ONLY_OP.
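
The placement rule in the preceding paragraph can be sketched as a check over the operation numbers of a COMPOUND request. The function name check_exchange_id_placement and the representation of the request as an array of operation numbers are hypothetical; operation number 42 for EXCHANGE_ID matches the section heading, while the SEQUENCE operation number and the error value are taken from the protocol's XDR description.

```c
#include <stddef.h>

/* Operation numbers and status values from the protocol's XDR. */
enum { OP_EXCHANGE_ID = 42, OP_SEQUENCE = 53 };
enum { NFS4_OK = 0, NFS4ERR_NOT_ONLY_OP = 10081 };

/*
 * ops[] holds the operation numbers of a COMPOUND request in order.
 * Only the EXCHANGE_ID placement rule is checked here; all other
 * validation of the request is out of scope for this sketch.
 */
int check_exchange_id_placement(const int *ops, size_t nops)
{
    size_t i;

    if (nops == 0 || ops[0] == OP_SEQUENCE)
        return NFS4_OK;              /* rule applies only without SEQUENCE */
    for (i = 0; i < nops; i++) {
        if (ops[i] == OP_EXCHANGE_ID && nops > 1)
            return NFS4ERR_NOT_ONLY_OP;
    }
    return NFS4_OK;
}
```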
The eia_clientowner field is composed of a co_verifier field and a The eia_clientowner field is composed of a co_verifier field and a
co_ownerid string. As noted in Section 2.4, the co_ownerid co_ownerid string. As noted in Section 2.4, the co_ownerid
identifies the client, and the co_verifier specifies a particular identifies the client, and the co_verifier specifies a particular
incarnation of that client. An EXCHANGE_ID sent with a new incarnation of that client. An EXCHANGE_ID sent with a new
incarnation of the client will lead to the server removing lock state incarnation of the client will lead to the server removing lock state
of the old incarnation. On the other hand, an EXCHANGE_ID sent with of the old incarnation. On the other hand, when an EXCHANGE_ID sent
the current incarnation and co_ownerid will, when it does not result with the current incarnation and co_ownerid does not result in an
in an unrelated error, potentially update an existing client ID's unrelated error, it will potentially update an existing client ID's
properties, or simply return information about the existing properties or simply return information about the existing client_id.
client_id. That latter would happen when this operation is done to The latter would happen when this operation is done to the same
the same server using different network addresses as part of creating server using different network addresses as part of creating trunked
trunked connections. connections.
A server MUST NOT provide the same client ID to two different A server MUST NOT provide the same client ID to two different
incarnations of an eia_clientowner. incarnations of an eia_clientowner.
In addition to the client ID and sequence ID, the server returns a In addition to the client ID and sequence ID, the server returns a
server owner (eir_server_owner) and server scope (eir_server_scope). server owner (eir_server_owner) and server scope (eir_server_scope).
The former field is used in connection with network trunking as The former field is used in connection with network trunking as
described in Section 2.10.5. The latter field is used to allow described in Section 2.10.5. The latter field is used to allow
clients to determine when client IDs sent by one server may be clients to determine when client IDs sent by one server may be
recognized by another in the event of file system migration (see recognized by another in the event of file system migration (see
skipping to change at line 25799 skipping to change at line 25800
ssp_num_gss_handles to zero; the client can create more handles ssp_num_gss_handles to zero; the client can create more handles
with another EXCHANGE_ID call. with another EXCHANGE_ID call.
Because each SSV RPCSEC_GSS handle shares a common SSV GSS Because each SSV RPCSEC_GSS handle shares a common SSV GSS
context, there are security considerations specific to this context, there are security considerations specific to this
situation discussed in Section 2.10.10. situation discussed in Section 2.10.10.
The seq_window (see Section 5.2.3.1 of RFC 2203 [4]) of each The seq_window (see Section 5.2.3.1 of RFC 2203 [4]) of each
RPCSEC_GSS handle in spi_handle MUST be the same as the seq_window RPCSEC_GSS handle in spi_handle MUST be the same as the seq_window
of the RPCSEC_GSS handle used for the credential of the RPC of the RPCSEC_GSS handle used for the credential of the RPC
request that the EXCHANGE_ID operation was sent as a part of. request of which the EXCHANGE_ID operation was sent as a part.
+======================+===========================+===============+ +======================+===========================+===============+
| Encryption Algorithm | MUST NOT be combined with | SHOULD NOT be | | Encryption Algorithm | MUST NOT be combined with | SHOULD NOT be |
| | | combined with | | | | combined with |
+======================+===========================+===============+ +======================+===========================+===============+
| id-aes128-CBC | | id-sha384, | | id-aes128-CBC | | id-sha384, |
| | | id-sha512 | | | | id-sha512 |
+----------------------+---------------------------+---------------+ +----------------------+---------------------------+---------------+
| id-aes192-CBC | id-sha1 | id-sha512 | | id-aes192-CBC | id-sha1 | id-sha512 |
+----------------------+---------------------------+---------------+ +----------------------+---------------------------+---------------+
skipping to change at line 25839 skipping to change at line 25840
server MUST NOT interpret this implementation identity information in server MUST NOT interpret this implementation identity information in
a way that affects how the implementation interacts with its peer. a way that affects how the implementation interacts with its peer.
The client and server are not allowed to depend on the peer's The client and server are not allowed to depend on the peer's
manifesting a particular allowed behavior based on an implementation manifesting a particular allowed behavior based on an implementation
identifier but are required to interoperate as specified elsewhere in identifier but are required to interoperate as specified elsewhere in
the protocol specification. the protocol specification.
Because it is possible that some implementations might violate the Because it is possible that some implementations might violate the
protocol specification and interpret the identity information, protocol specification and interpret the identity information,
implementations MUST provide facilities to allow the NFSv4 client and implementations MUST provide facilities to allow the NFSv4 client and
server be configured to set the contents of the nfs_impl_id server to be configured to set the contents of the nfs_impl_id
structures sent to any specified value. structures sent to any specified value.
18.35.4. IMPLEMENTATION 18.35.4. IMPLEMENTATION
A server's client record is a 5-tuple: A server's client record is a 5-tuple:
1. co_ownerid: 1. co_ownerid:
The client identifier string, from the eia_clientowner structure The client identifier string, from the eia_clientowner structure
of the EXCHANGE_ID4args structure. of the EXCHANGE_ID4args structure.
skipping to change at line 26296 skipping to change at line 26297
ca_maxresponsesize_cached: ca_maxresponsesize_cached:
Like ca_maxresponsesize, but the maximum size of a reply that Like ca_maxresponsesize, but the maximum size of a reply that
will be stored in the reply cache (Section 2.10.6.1). For each will be stored in the reply cache (Section 2.10.6.1). For each
channel, the server MAY decrease this value, but MUST NOT channel, the server MAY decrease this value, but MUST NOT
increase it. If, in the reply to CREATE_SESSION, the value of increase it. If, in the reply to CREATE_SESSION, the value of
ca_maxresponsesize_cached of a channel is less than the value ca_maxresponsesize_cached of a channel is less than the value
of ca_maxresponsesize of the same channel, then this is an of ca_maxresponsesize of the same channel, then this is an
indication to the requester that it needs to be selective about indication to the requester that it needs to be selective about
which replies it directs the replier to cache; for example, which replies it directs the replier to cache; for example,
large replies from nonidempotent operations (e.g., COMPOUND large replies from non-idempotent operations (e.g., COMPOUND
requests with a READ operation) should not be cached. The requests with a READ operation) should not be cached. The
requester decides which replies to cache via an argument to the requester decides which replies to cache via an argument to the
SEQUENCE (the sa_cachethis field, see Section 18.46) or SEQUENCE (the sa_cachethis field, see Section 18.46) or
CB_SEQUENCE (the csa_cachethis field, see Section 20.9) CB_SEQUENCE (the csa_cachethis field, see Section 20.9)
operations. After the session is created, if a requester sends operations. After the session is created, if a requester sends
a request for which the size of the reply would exceed a request for which the size of the reply would exceed
ca_maxresponsesize_cached, the replier will return ca_maxresponsesize_cached, the replier will return
NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in
Section 2.10.6.4. Section 2.10.6.4.
skipping to change at line 26382 skipping to change at line 26383
gcbp_service, i.e., it MUST set the "service" field of the gcbp_service, i.e., it MUST set the "service" field of the
rpc_gss_cred_t data type in RPCSEC_GSS credential to the value of rpc_gss_cred_t data type in RPCSEC_GSS credential to the value of
gcbp_service (see "RPC Request Header", Section 5.3.1 of [4]). gcbp_service (see "RPC Request Header", Section 5.3.1 of [4]).
If the RPCSEC_GSS handle identified by gcbp_handle_from_server If the RPCSEC_GSS handle identified by gcbp_handle_from_server
does not exist on the server, the server will return does not exist on the server, the server will return
NFS4ERR_NOENT. NFS4ERR_NOENT.
Within each element of csa_sec_parms, the fore and back RPCSEC_GSS Within each element of csa_sec_parms, the fore and back RPCSEC_GSS
contexts MUST share the same GSS context and MUST have the same contexts MUST share the same GSS context and MUST have the same
seq_window (see Section 5.2.3.1 of RFC2203 [4]). The fore and seq_window (see Section 5.2.3.1 of RFC 2203 [4]). The fore and
back RPCSEC_GSS context state are independent of each other as far back RPCSEC_GSS context state are independent of each other as far
as the RPCSEC_GSS sequence number (see the seq_num field in the as the RPCSEC_GSS sequence number (see the seq_num field in the
rpc_gss_cred_t data type of Sections 5 and 5.3.1 of [4]). rpc_gss_cred_t data type of Sections 5 and 5.3.1 of [4]).
If an RPCSEC_GSS handle is using the SSV context (see If an RPCSEC_GSS handle is using the SSV context (see
Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a
common SSV GSS context, there are security considerations specific common SSV GSS context, there are security considerations specific
to this situation discussed in Section 2.10.10. to this situation discussed in Section 2.10.10.
Once the session is created, the first SEQUENCE or CB_SEQUENCE Once the session is created, the first SEQUENCE or CB_SEQUENCE
skipping to change at line 28598 skipping to change at line 28599
DESTROY_CLIENTID allows a server to immediately reclaim the resources DESTROY_CLIENTID allows a server to immediately reclaim the resources
consumed by an unused client ID, and also to forget that it ever consumed by an unused client ID, and also to forget that it ever
generated the client ID. By forgetting that it ever generated the generated the client ID. By forgetting that it ever generated the
client ID, the server can safely reuse the client ID on a future client ID, the server can safely reuse the client ID on a future
EXCHANGE_ID operation. EXCHANGE_ID operation.
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished
18.51.1. ARGUMENT 18.51.1. ARGUMENT
<CODE BEGINS>
struct RECLAIM_COMPLETE4args { struct RECLAIM_COMPLETE4args {
/* /*
* If rca_one_fs TRUE, * If rca_one_fs TRUE,
* *
* CURRENT_FH: object in * CURRENT_FH: object in
* file system reclaim is * file system reclaim is
* complete for. * complete for.
*/ */
bool rca_one_fs; bool rca_one_fs;
}; };
<CODE ENDS>
18.51.2. RESULTS 18.51.2. RESULTS
<CODE BEGINS>
struct RECLAIM_COMPLETE4res { struct RECLAIM_COMPLETE4res {
nfsstat4 rcr_status; nfsstat4 rcr_status;
}; };
<CODE ENDS>
18.51.3. DESCRIPTION 18.51.3. DESCRIPTION
A RECLAIM_COMPLETE operation is used to indicate that the client has A RECLAIM_COMPLETE operation is used to indicate that the client has
reclaimed all of the locking state that it will recover using reclaimed all of the locking state that it will recover using
reclaim, when it is recovering state due to either a server restart reclaim, when it is recovering state due to either a server restart
or the migration of a file system to another server. There are two or the migration of a file system to another server. There are two
types of RECLAIM_COMPLETE operations: types of RECLAIM_COMPLETE operations:
* When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. * When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done.
skipping to change at line 28716 skipping to change at line 28721
When a RECLAIM_COMPLETE is sent, the client effectively acknowledges When a RECLAIM_COMPLETE is sent, the client effectively acknowledges
any locks not yet reclaimed as lost. This allows the server to re- any locks not yet reclaimed as lost. This allows the server to re-
enable the client to recover locks if the occurrence of edge enable the client to recover locks if the occurrence of edge
conditions, as described in Section 8.4.3, had caused the server to conditions, as described in Section 8.4.3, had caused the server to
disable the client's ability to recover locks. disable the client's ability to recover locks.
Because previous descriptions of RECLAIM_COMPLETE were not Because previous descriptions of RECLAIM_COMPLETE were not
sufficiently explicit about the circumstances in which use of sufficiently explicit about the circumstances in which use of
RECLAIM_COMPLETE with rca_one_fs set to TRUE was appropriate, there RECLAIM_COMPLETE with rca_one_fs set to TRUE was appropriate, there
have been cases which it has been misused by clients who have issued have been cases in which it has been misused by clients who have
RECLAIM_COMPLETE with rca_one_fs set to TRUE when it should have not issued RECLAIM_COMPLETE with rca_one_fs set to TRUE when it should
been. There have also been cases in which servers have, in various have not been. There have also been cases in which servers have, in
ways, not responded to such misuse as described above, either various ways, not responded to such misuse as described above, either
ignoring the rca_one_fs setting (treating the operation as a global ignoring the rca_one_fs setting (treating the operation as a global
RECLAIM_COMPLETE) or ignoring the entire operation. RECLAIM_COMPLETE) or ignoring the entire operation.
While clients SHOULD NOT misuse this feature and servers SHOULD While clients SHOULD NOT misuse this feature, and servers SHOULD
respond to such misuse as described above, implementers need to be respond to such misuse as described above, implementors need to be
aware of the following considerations as they make necessary aware of the following considerations as they make necessary trade-
tradeoffs between interoperability with existing implementations and offs between interoperability with existing implementations and
proper support for facilities to allow lock recovery in the event of proper support for facilities to allow lock recovery in the event of
file system migration. file system migration.
* When servers have no support for becoming the destination server * When servers have no support for becoming the destination server
of a file system subject to migration, there is no possibility of of a file system subject to migration, there is no possibility of
a per-fs RECLAIM_COMPLETE being done legitimately and occurrences a per-fs RECLAIM_COMPLETE being done legitimately, and occurrences
of it SHOULD be ignored. However, the negative consequences of of it SHOULD be ignored. However, the negative consequences of
accepting such mistaken use are quite limited as long as the accepting such mistaken use are quite limited as long as the
client does not issue it before all necessary reclaims are done. client does not issue it before all necessary reclaims are done.
* When a server might become the destination for a file system being * When a server might become the destination for a file system being
migrated, inappropriate use of per-fs RECLAIM_COMPLETE is more migrated, inappropriate use of per-fs RECLAIM_COMPLETE is more
concerning. In the case in which the file system designated is concerning. In the case in which the file system designated is
not within a per-fs grace period, the per-fs RECLAIM_COMPLETE not within a per-fs grace period, the per-fs RECLAIM_COMPLETE
SHOULD be ignored, with the negative consequences of accepting it SHOULD be ignored, with the negative consequences of accepting it
being limited, as in the case in which migration is not supported. being limited, as in the case in which migration is not supported.
skipping to change at line 28990 skipping to change at line 28995
+------------------------------+----------------------------------+ +------------------------------+----------------------------------+
| NFS4ERR_TOO_MANY_OPS | | | NFS4ERR_TOO_MANY_OPS | |
+------------------------------+----------------------------------+ +------------------------------+----------------------------------+
| NFS4ERR_REP_TOO_BIG | | | NFS4ERR_REP_TOO_BIG | |
+------------------------------+----------------------------------+ +------------------------------+----------------------------------+
| NFS4ERR_REP_TOO_BIG_TO_CACHE | | | NFS4ERR_REP_TOO_BIG_TO_CACHE | |
+------------------------------+----------------------------------+ +------------------------------+----------------------------------+
| NFS4ERR_REQ_TOO_BIG | | | NFS4ERR_REQ_TOO_BIG | |
+------------------------------+----------------------------------+ +------------------------------+----------------------------------+
Table 24: CB_COMPOUND error returns Table 24: CB_COMPOUND Error Returns
20. NFSv4.1 Callback Operations 20. NFSv4.1 Callback Operations
20.1. Operation 3: CB_GETATTR - Get Attributes 20.1. Operation 3: CB_GETATTR - Get Attributes
20.1.1. ARGUMENT 20.1.1. ARGUMENT
struct CB_GETATTR4args { struct CB_GETATTR4args {
nfs_fh4 fh; nfs_fh4 fh;
bitmap4 attr_request; bitmap4 attr_request;
skipping to change at line 30102 skipping to change at line 30107
Relative to previous NFS versions, NFSv4.1 has additional security Relative to previous NFS versions, NFSv4.1 has additional security
considerations for pNFS (see Sections 12.9 and 13.12), locking and considerations for pNFS (see Sections 12.9 and 13.12), locking and
session state (see Section 2.10.8.3), and state recovery during grace session state (see Section 2.10.8.3), and state recovery during grace
period (see Section 8.4.2.1.1). With respect to locking and session period (see Section 8.4.2.1.1). With respect to locking and session
state, if SP4_SSV state protection is being used, Section 2.10.10 has state, if SP4_SSV state protection is being used, Section 2.10.10 has
specific security considerations for the NFSv4.1 client and server. specific security considerations for the NFSv4.1 client and server.
Security considerations for lock reclaim differ between the two Security considerations for lock reclaim differ between the two
different situations in which state reclaim is to be done. The different situations in which state reclaim is to be done. The
server failure situation is discussed in Section 8.4.2.1.1 while the server failure situation is discussed in Section 8.4.2.1.1, while the
per-fs state reclaim done in support of migration/replication is per-fs state reclaim done in support of migration/replication is
discussed in Section 11.11.9.1. discussed in Section 11.11.9.1.
The use of the multi-server namespace features described in The use of the multi-server namespace features described in
Section 11 raises the possibility that requests to determine the set Section 11 raises the possibility that requests to determine the set
of network addresses corresponding to a given server might be of network addresses corresponding to a given server might be
interfered with or have their responses modified in flight. In light interfered with or have their responses modified in flight. In light
of this possibility, the following considerations should be taken of this possibility, the following considerations should be noted:
note of:
* When DNS is used to convert server names to addresses and DNSSEC * When DNS is used to convert server names to addresses and DNSSEC
[29] is not available, the validity of the network addresses [29] is not available, the validity of the network addresses
returned generally cannot be relied upon. However, when combined returned generally cannot be relied upon. However, when combined
with a trusted resolver, DNS over TLS [30], and DNS over HTTPS with a trusted resolver, DNS over TLS [30] and DNS over HTTPS [34]
[34] can also be relied upon to provide valid address resolutions. can be relied upon to provide valid address resolutions.
In situations in which the validity of the provided addresses In situations in which the validity of the provided addresses
cannot be relied upon and the client uses RPCSEC_GSS to access the cannot be relied upon and the client uses RPCSEC_GSS to access the
designated server, it is possible for mutual authentication to designated server, it is possible for mutual authentication to
discover invalid server addresses as long as the RPCSEC_GSS discover invalid server addresses as long as the RPCSEC_GSS
implementation used does not use insecure DNS queries to implementation used does not use insecure DNS queries to
canonicalize the hostname components of the service principal canonicalize the hostname components of the service principal
names, as explained in [28]. names, as explained in [28].
* The fetching of attributes containing file system location * The fetching of attributes containing file system location
information SHOULD be performed using integrity protection. It is information SHOULD be performed using integrity protection. It is
important to note here that a client making a request of this sort important to note here that a client making a request of this sort
without using integrity protection needs to be aware of the negative without using integrity protection needs to be aware of the negative
consequences of doing so, which can lead to invalid host names or consequences of doing so, which can lead to invalid hostnames or
network addresses being returned. These include cases in which network addresses being returned. These include cases in which
the client is directed to a server under the control of an the client is directed to a server under the control of an
attacker, who might get access to data written or provide attacker, who might get access to data written or provide
incorrect values for data read. In light of this, the client incorrect values for data read. In light of this, the client
needs to recognize that using such returned location information needs to recognize that using such returned location information
to access an NFSv4 server without use of RPCSEC_GSS (i.e. by to access an NFSv4 server without use of RPCSEC_GSS (i.e., by
using AUTH_SYS) poses dangers as it can result in the client using AUTH_SYS) poses dangers as it can result in the client
interacting with such an attacker-controlled server, without any interacting with such an attacker-controlled server without any
authentication facilities to verify the server's identity. authentication facilities to verify the server's identity.
* Despite the fact that it is a requirement that implementations * Despite the fact that it is a requirement that implementations
provide "support" for use of RPCSEC_GSS, it cannot be assumed that provide "support" for use of RPCSEC_GSS, it cannot be assumed that
use of RPCSEC_GSS is always available between any particular use of RPCSEC_GSS is always available between any particular
client-server pair. client-server pair.
* When a client has the network addresses of a server but not the * When a client has the network addresses of a server but not the
associated host names, that would interfere with its ability to associated hostnames, that would interfere with its ability to use
use RPCSEC_GSS. RPCSEC_GSS.
In light of the above, a server SHOULD present file system location In light of the above, a server SHOULD present file system location
entries that correspond to file systems on other servers using a host entries that correspond to file systems on other servers using a
name. This would allow the client to interrogate the fs_locations on hostname. This would allow the client to interrogate the
the destination server to obtain trunking information (as well as fs_locations on the destination server to obtain trunking information
replica information) using integrity protection, validating the name (as well as replica information) using integrity protection,
provided while assuring that the response has not been modified in validating the name provided while assuring that the response has not
flight. been modified in flight.
When RPCSEC_GSS is not available on a server, the client needs to be When RPCSEC_GSS is not available on a server, the client needs to be
aware of the fact that the location entries are subject to aware of the fact that the location entries are subject to
modification in flight and so cannot be relied upon. In the case of modification in flight and so cannot be relied upon. In the case of
a client being directed to another server after NFS4ERR_MOVED, this a client being directed to another server after NFS4ERR_MOVED, this
could vitiate the authentication provided by the use of RPCSEC_GSS on could vitiate the authentication provided by the use of RPCSEC_GSS on
the designated destination server. Even when RPCSEC_GSS the designated destination server. Even when RPCSEC_GSS
authentication is available on the destination, the server might authentication is available on the destination, the server might
still properly authenticate as the server to which the client was still properly authenticate as the server to which the client was
erroneously directed. Without a way to decide whether the server is erroneously directed. Without a way to decide whether the server is
skipping to change at line 30184 skipping to change at line 30188
When a file system location attribute is fetched upon connecting with When a file system location attribute is fetched upon connecting with
an NFS server, it SHOULD, as stated above, be done with integrity an NFS server, it SHOULD, as stated above, be done with integrity
protection. When this is not possible, it is generally best for the protection. When this is not possible, it is generally best for the
client to ignore trunking and replica information or simply not fetch client to ignore trunking and replica information or simply not fetch
the location information for these purposes. the location information for these purposes.
When location information cannot be verified, it can be subjected to When location information cannot be verified, it can be subjected to
additional filtering to prevent the client from being inappropriately additional filtering to prevent the client from being inappropriately
directed. For example, if a range of network addresses can be directed. For example, if a range of network addresses can be
determined that assure that the servers and clients using AUTH_SYS determined that assure that the servers and clients using AUTH_SYS
are subject to the appropriate set of constraints (e.g. physical are subject to the appropriate set of constraints (e.g., physical
network isolation, administrative controls on the operating systems network isolation, administrative controls on the operating systems
used), then network addresses in the appropriate range can be used used), then network addresses in the appropriate range can be used
with others discarded or restricted in their use of AUTH_SYS. with others discarded or restricted in their use of AUTH_SYS.
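The address-range filtering described above can be sketched as follows. This is a minimal illustration, not part of the protocol: the function name filter_unverified_locations and the example ranges in TRUSTED_RANGES are hypothetical, standing in for ranges an administrator has determined to be suitably constrained.

```python
import ipaddress

# Hypothetical ranges that an administrator believes are subject to the
# appropriate constraints (e.g., physical network isolation).
TRUSTED_RANGES = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("fd00:1234::/32"),
]


def filter_unverified_locations(addresses, trusted_ranges=TRUSTED_RANGES):
    """Split unverified location addresses into (kept, discarded).

    An address is kept only if it parses as an IP address and falls
    within one of the trusted ranges; everything else is discarded.
    """
    kept, discarded = [], []
    for text in addresses:
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            # Not a literal network address; it cannot be range-checked.
            discarded.append(text)
            continue
        if any(addr in net for net in trusted_ranges):
            kept.append(text)
        else:
            discarded.append(text)
    return kept, discarded
```

A client that prefers to restrict rather than discard out-of-range addresses could use the second list to disable AUTH_SYS for those entries instead of dropping them.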
To summarize considerations regarding the use of RPCSEC_GSS in To summarize considerations regarding the use of RPCSEC_GSS in
fetching location information, we need to consider the following fetching location information, we need to consider the following
possibilities for requests to interrogate location information, with possibilities for requests to interrogate location information, with
interrogation approaches on the referring and destination servers interrogation approaches on the referring and destination servers
arrived at separately: arrived at separately:
* The use of integrity protection is RECOMMENDED in all cases, since * The use of integrity protection is RECOMMENDED in all cases, since
the absence of integrity protection exposes the client to the the absence of integrity protection exposes the client to the
possibility of the results being modified in transit. possibility of the results being modified in transit.
* The use of requests issued without RPCSEC_GSS (i.e. using AUTH_SYS * The use of requests issued without RPCSEC_GSS (i.e., using
which has no provision to avoid modification of data in flight), AUTH_SYS, which has no provision to avoid modification of data in
while undesirable and a potential security exposure, may not be flight), while undesirable and a potential security exposure, may
avoidable in all cases. Where the use of the returned information not be avoidable in all cases. Where the use of the returned
cannot be avoided, it is made subject to filtering as described information cannot be avoided, it is made subject to filtering as
above to eliminate the possibility that the client would treat an described above to eliminate the possibility that the client would
invalid address as if it were an NFSv4 server. The specifics will treat an invalid address as if it were an NFSv4 server. The
vary depending on the degree of network isolation and whether the specifics will vary depending on the degree of network isolation
request is to the referring or destination servers. and whether the request is to the referring or destination
servers.
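The two possibilities summarized above suggest a simple client-side decision, sketched below under stated assumptions. The names Handling and location_handling are hypothetical, and the policy shown is one reading of the text, not a normative rule: integrity-protected results are used, unprotected results are filtered when their use cannot be avoided, and are otherwise ignored.

```python
from enum import Enum


class Handling(Enum):
    USE = "use as returned"
    FILTER = "use only after filtering"
    IGNORE = "ignore trunking and replica information"


def location_handling(integrity_protected, use_unavoidable):
    """Choose how to treat fetched location information (illustrative)."""
    if integrity_protected:
        # Integrity protection guards against modification in transit;
        # it does not protect against a compromised server.
        return Handling.USE
    if use_unavoidable:
        # Fetched without RPCSEC_GSS (i.e., using AUTH_SYS) but needed:
        # subject the results to filtering before use.
        return Handling.FILTER
    return Handling.IGNORE
```

The decision would be made separately for the referring server and for the destination server, since the text notes that the two interrogation approaches are arrived at separately.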
Even if such requests are not interfered with in flight, it is Even if such requests are not interfered with in flight, it is
possible for a compromised server to direct the client to use possible for a compromised server to direct the client to use
inappropriate servers, such as those under the control of the inappropriate servers, such as those under the control of the
attacker. It is not clear that being directed to such servers attacker. It is not clear that being directed to such servers
represents a greater threat to the client than the damage that could represents a greater threat to the client than the damage that could
be done by the compromised server itself. However, it is possible be done by the compromised server itself. However, it is possible
that some sorts of transient server compromises might be taken that some sorts of transient server compromises might be exploited to
advantage of to direct a client to a server capable of doing greater direct a client to a server capable of doing greater damage over a
damage over a longer time. One useful step to guard against this longer time. One useful step to guard against this possibility is to
possibility is to issue requests to fetch location data using issue requests to fetch location data using RPCSEC_GSS, even if no
RPCSEC_GSS, even if no mapping to an RPCSEC_GSS principal is mapping to an RPCSEC_GSS principal is available. In this case,
available. In this case, RPCSEC_GSS would not be used, as it RPCSEC_GSS would not be used, as it typically is, to identify the
typically is, to identify the client principal to the server, but client principal to the server, but rather to make sure (via
rather to make sure (via RPCSEC_GSS mutual authentication) that the RPCSEC_GSS mutual authentication) that the server being contacted is
server being contacted is the one intended. the one intended.
Similar considerations apply if the threat to be avoided is the Similar considerations apply if the threat to be avoided is the
redirection of client traffic to inappropriate (i.e. poorly redirection of client traffic to inappropriate (i.e., poorly
performing) servers. In both cases, there is no reason for the performing) servers. In both cases, there is no reason for the
information returned to depend on the identity of the client information returned to depend on the identity of the client
principal requesting it, while the validity of the server principal requesting it, while the validity of the server
information, which has the capability to affect all client information, which has the capability to affect all client
principals, is of considerable importance. principals, is of considerable importance.
22. IANA Considerations 22. IANA Considerations
This section uses terms that are defined in [62]. This section uses terms that are defined in [62].
22.1. IANA Actions Needed 22.1. IANA Actions
This update does not require any modification of or additions to This update does not require any modification of, or additions to,
registry entries or registry rules associated with NFSv4.1. However, registry entries or registry rules associated with NFSv4.1. However,
since this document is intended to obsolete RFC5661, it will be since this document obsoletes RFC 5661, IANA has updated all registry
necessary for IANA to update all registry entries and registry rules entries and registry rules references that point to RFC 5661 to point
references that points to RFC5661 to point to this document instead. to this document instead.
Previous actions by IANA related to NFSv4.1 are listed in the Previous actions by IANA related to NFSv4.1 are listed in the
remaining subsections of Section 22. remaining subsections of Section 22.
22.2. Named Attribute Definitions 22.2. Named Attribute Definitions
IANA created a registry called the "NFSv4 Named Attribute Definitions IANA created a registry called the "NFSv4 Named Attribute Definitions
Registry". Registry".
The NFSv4.1 protocol supports the association of a file with zero or The NFSv4.1 protocol supports the association of a file with zero or
skipping to change at line 30373 skipping to change at line 30378
22.3.1. Initial Registry 22.3.1. Initial Registry
The initial registry is in Table 25. Note that the next available The initial registry is in Table 25. Note that the next available
value is zero. value is zero.
+=========================+=======+==========+=====+================+ +=========================+=======+==========+=====+================+
| Notification Name | Value | RFC | How | Minor Versions | | Notification Name | Value | RFC | How | Minor Versions |
+=========================+=======+==========+=====+================+ +=========================+=======+==========+=====+================+
| NOTIFY_DEVICEID4_CHANGE | 1 | RFC | N | 1 | | NOTIFY_DEVICEID4_CHANGE | 1 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------+-------+----------+-----+----------------+ +-------------------------+-------+----------+-----+----------------+
| NOTIFY_DEVICEID4_DELETE | 2 | RFC | N | 1 | | NOTIFY_DEVICEID4_DELETE | 2 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------+-------+----------+-----+----------------+ +-------------------------+-------+----------+-----+----------------+
Table 25: Initial Device ID Notification Assignments Table 25: Initial Device ID Notification Assignments
22.3.2. Updating Registrations 22.3.2. Updating Registrations
The update of a registration will require IESG Approval on the advice The update of a registration will require IESG Approval on the advice
of a Designated Expert. of a Designated Expert.
22.4. Object Recall Types 22.4. Object Recall Types
skipping to change at line 30448 skipping to change at line 30453
22.4.1. Initial Registry 22.4.1. Initial Registry
The initial registry is in Table 26. Note that the next available The initial registry is in Table 26. Note that the next available
value is five. value is five.
+===============================+=======+======+=====+==========+ +===============================+=======+======+=====+==========+
| Recallable Object Type Name | Value | RFC | How | Minor | | Recallable Object Type Name | Value | RFC | How | Minor |
| | | | | Versions | | | | | | Versions |
+===============================+=======+======+=====+==========+ +===============================+=======+======+=====+==========+
| RCA4_TYPE_MASK_RDATA_DLG | 0 | RFC | N | 1 | | RCA4_TYPE_MASK_RDATA_DLG | 0 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_WDATA_DLG | 1 | RFC | N | 1 | | RCA4_TYPE_MASK_WDATA_DLG | 1 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_DIR_DLG | 2 | RFC | N | 1 | | RCA4_TYPE_MASK_DIR_DLG | 2 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFC | N | 1 | | RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFC | N | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFC | L | 1 | | RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFC | L | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFC | L | 1 | | RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFC | L | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
| RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFC | L | 1 | | RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFC | L | 1 |
| | | 5661 | | | | | | 8881 | | |
+-------------------------------+-------+------+-----+----------+ +-------------------------------+-------+------+-----+----------+
Table 26: Initial Recallable Object Type Assignments Table 26: Initial Recallable Object Type Assignments
22.4.2. Updating Registrations 22.4.2. Updating Registrations
The update of a registration will require IESG Approval on the advice The update of a registration will require IESG Approval on the advice
of a Designated Expert. of a Designated Expert.
22.5. Layout Types 22.5. Layout Types
skipping to change at line 30527 skipping to change at line 30532
minor version of NFSv4 approved, a Designated Expert should minor version of NFSv4 approved, a Designated Expert should
review the registry to make recommended updates as needed. review the registry to make recommended updates as needed.
22.5.1. Initial Registry 22.5.1. Initial Registry
The initial registry is in Table 27. The initial registry is in Table 27.
+=======================+=======+==========+=====+================+ +=======================+=======+==========+=====+================+
| Layout Type Name | Value | RFC | How | Minor Versions | | Layout Type Name | Value | RFC | How | Minor Versions |
+=======================+=======+==========+=====+================+ +=======================+=======+==========+=====+================+
| LAYOUT4_NFSV4_1_FILES | 0x1 | RFC 5661 | N | 1 | | LAYOUT4_NFSV4_1_FILES | 0x1 | RFC 8881 | N | 1 |
+-----------------------+-------+----------+-----+----------------+ +-----------------------+-------+----------+-----+----------------+
| LAYOUT4_OSD2_OBJECTS | 0x2 | RFC 5664 | L | 1 | | LAYOUT4_OSD2_OBJECTS | 0x2 | RFC 5664 | L | 1 |
+-----------------------+-------+----------+-----+----------------+ +-----------------------+-------+----------+-----+----------------+
| LAYOUT4_BLOCK_VOLUME | 0x3 | RFC 5663 | L | 1 | | LAYOUT4_BLOCK_VOLUME | 0x3 | RFC 5663 | L | 1 |
+-----------------------+-------+----------+-----+----------------+ +-----------------------+-------+----------+-----+----------------+
Table 27: Initial Layout Type Assignments Table 27: Initial Layout Type Assignments
22.5.2. Updating Registrations 22.5.2. Updating Registrations
skipping to change at line 30675 skipping to change at line 30680
For assignments made on a Standards Action basis, the point of For assignments made on a Standards Action basis, the point of
contact is always IESG. contact is always IESG.
22.6.1.1.1. Initial Registry 22.6.1.1.1. Initial Registry
The initial registry is in Table 28. The initial registry is in Table 28.
+========================+==========+==================+ +========================+==========+==================+
| Variable Name | RFC | Point of Contact | | Variable Name | RFC | Point of Contact |
+========================+==========+==================+ +========================+==========+==================+
| ${ietf.org:CPU_ARCH} | RFC 5661 | IESG | | ${ietf.org:CPU_ARCH} | RFC 8881 | IESG |
+------------------------+----------+------------------+ +------------------------+----------+------------------+
| ${ietf.org:OS_TYPE} | RFC 5661 | IESG | | ${ietf.org:OS_TYPE} | RFC 8881 | IESG |
+------------------------+----------+------------------+ +------------------------+----------+------------------+
| ${ietf.org:OS_VERSION} | RFC 5661 | IESG | | ${ietf.org:OS_VERSION} | RFC 8881 | IESG |
+------------------------+----------+------------------+ +------------------------+----------+------------------+
Table 28: Initial List of Path Variables Table 28: Initial List of Path Variables
IANA has created registries for the values of the variable names IANA has created registries for the values of the variable names
${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}. See Sections 22.6.2 and ${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}. See Sections 22.6.2 and
22.6.3. 22.6.3.
For the values of the variable ${ietf.org:OS_VERSION}, no registry is For the values of the variable ${ietf.org:OS_VERSION}, no registry is
needed as the specifics of the values of the variable will vary with needed as the specifics of the values of the variable will vary with
skipping to change at line 30792 skipping to change at line 30797
1997, <https://www.rfc-editor.org/info/rfc2203>. 1997, <https://www.rfc-editor.org/info/rfc2203>.
[5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos [5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos
Version 5 Generic Security Service Application Program Version 5 Generic Security Service Application Program
Interface (GSS-API) Mechanism: Version 2", RFC 4121, Interface (GSS-API) Mechanism: Version 2", RFC 4121,
DOI 10.17487/RFC4121, July 2005, DOI 10.17487/RFC4121, July 2005,
<https://www.rfc-editor.org/info/rfc4121>. <https://www.rfc-editor.org/info/rfc4121>.
[6] The Open Group, "Section 3.191 of Chapter 3 of Base [6] The Open Group, "Section 3.191 of Chapter 3 of Base
Definitions of The Open Group Base Specifications Issue 6 Definitions of The Open Group Base Specifications Issue 6
IEEE Std 1003.1, 2004 Edition, HTML Version IEEE Std 1003.1, 2004 Edition, HTML Version",
(www.opengroup.org), ISBN 1931624232", 2004. ISBN 1931624232, 2004, <https://www.opengroup.org>.
[7] Linn, J., "Generic Security Service Application Program [7] Linn, J., "Generic Security Service Application Program
Interface Version 2, Update 1", RFC 2743, Interface Version 2, Update 1", RFC 2743,
DOI 10.17487/RFC2743, January 2000, DOI 10.17487/RFC2743, January 2000,
<https://www.rfc-editor.org/info/rfc2743>. <https://www.rfc-editor.org/info/rfc2743>.
[8] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. [8] Recio, R., Metzler, B., Culley, P., Hilland, J., and D.
Garcia, "A Remote Direct Memory Access Protocol Garcia, "A Remote Direct Memory Access Protocol
Specification", RFC 5040, DOI 10.17487/RFC5040, October Specification", RFC 5040, DOI 10.17487/RFC5040, October
2007, <https://www.rfc-editor.org/info/rfc5040>. 2007, <https://www.rfc-editor.org/info/rfc5040>.
skipping to change at line 30817 skipping to change at line 30822
<https://www.rfc-editor.org/info/rfc5403>. <https://www.rfc-editor.org/info/rfc5403>.
[10] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [10] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
External Data Representation Standard (XDR) Description", External Data Representation Standard (XDR) Description",
RFC 5662, DOI 10.17487/RFC5662, January 2010, RFC 5662, DOI 10.17487/RFC5662, January 2010,
<https://www.rfc-editor.org/info/rfc5662>. <https://www.rfc-editor.org/info/rfc5662>.
[11] The Open Group, "Section 3.372 of Chapter 3 of Base [11] The Open Group, "Section 3.372 of Chapter 3 of Base
Definitions of The Open Group Base Specifications Issue 6 Definitions of The Open Group Base Specifications Issue 6
IEEE Std 1003.1, 2004 Edition, HTML Version IEEE Std 1003.1, 2004 Edition, HTML Version",
(www.opengroup.org), ISBN 1931624232", 2004. ISBN 1931624232, 2004, <https://www.opengroup.org>.
[12] Eisler, M., "IANA Considerations for Remote Procedure Call [12] Eisler, M., "IANA Considerations for Remote Procedure Call
(RPC) Network Identifiers and Universal Address Formats", (RPC) Network Identifiers and Universal Address Formats",
RFC 5665, DOI 10.17487/RFC5665, January 2010, RFC 5665, DOI 10.17487/RFC5665, January 2010,
<https://www.rfc-editor.org/info/rfc5665>. <https://www.rfc-editor.org/info/rfc5665>.
[13] The Open Group, "Section 'read()' of System Interfaces of [13] The Open Group, "Section 'read()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[14] The Open Group, "Section 'readdir()' of System Interfaces [14] The Open Group, "Section 'readdir()' of System Interfaces
of The Open Group Base Specifications Issue 6 IEEE Std of The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[15] The Open Group, "Section 'write()' of System Interfaces of [15] The Open Group, "Section 'write()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[16] Hoffman, P. and M. Blanchet, "Preparation of [16] Hoffman, P. and M. Blanchet, "Preparation of
Internationalized Strings ("stringprep")", RFC 3454, Internationalized Strings ("stringprep")", RFC 3454,
DOI 10.17487/RFC3454, December 2002, DOI 10.17487/RFC3454, December 2002,
<https://www.rfc-editor.org/info/rfc3454>. <https://www.rfc-editor.org/info/rfc3454>.
[17] The Open Group, "Section 'chmod()' of System Interfaces of [17] The Open Group, "Section 'chmod()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[18] International Organization for Standardization, [18] International Organization for Standardization,
"Information Technology - Universal Multiple-octet coded "Information Technology - Universal Multiple-octet coded
Character Set (UCS) - Part 1: Architecture and Basic Character Set (UCS) - Part 1: Architecture and Basic
Multilingual Plane", ISO Standard 10646-1, May 1993. Multilingual Plane", ISO Standard 10646-1, May 1993.
[19] Alvestrand, H., "IETF Policy on Character Sets and [19] Alvestrand, H., "IETF Policy on Character Sets and
Languages", BCP 18, RFC 2277, DOI 10.17487/RFC2277, Languages", BCP 18, RFC 2277, DOI 10.17487/RFC2277,
January 1998, <https://www.rfc-editor.org/info/rfc2277>. January 1998, <https://www.rfc-editor.org/info/rfc2277>.
[20] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep [20] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)", Profile for Internationalized Domain Names (IDN)",
RFC 3491, DOI 10.17487/RFC3491, March 2003, RFC 3491, DOI 10.17487/RFC3491, March 2003,
<https://www.rfc-editor.org/info/rfc3491>. <https://www.rfc-editor.org/info/rfc3491>.
[21] The Open Group, "Section 'fcntl()' of System Interfaces of [21] The Open Group, "Section 'fcntl()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[22] The Open Group, "Section 'fsync()' of System Interfaces of [22] The Open Group, "Section 'fsync()' of System Interfaces of
The Open Group Base Specifications Issue 6 IEEE Std The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[23] The Open Group, "Section 'getpwnam()' of System Interfaces [23] The Open Group, "Section 'getpwnam()' of System Interfaces
of The Open Group Base Specifications Issue 6 IEEE Std of The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[24] The Open Group, "Section 'unlink()' of System Interfaces [24] The Open Group, "Section 'unlink()' of System Interfaces
of The Open Group Base Specifications Issue 6 IEEE Std of The Open Group Base Specifications Issue 6 IEEE Std
1003.1, 2004 Edition, HTML Version (www.opengroup.org), 1003.1, 2004 Edition, HTML Version", ISBN 1931624232,
ISBN 1931624232", 2004. 2004, <https://www.opengroup.org>.
[25] Schaad, J., Kaliski, B., and R. Housley, "Additional [25] Schaad, J., Kaliski, B., and R. Housley, "Additional
Algorithms and Identifiers for RSA Cryptography for use in Algorithms and Identifiers for RSA Cryptography for use in
the Internet X.509 Public Key Infrastructure Certificate the Internet X.509 Public Key Infrastructure Certificate
and Certificate Revocation List (CRL) Profile", RFC 4055, and Certificate Revocation List (CRL) Profile", RFC 4055,
DOI 10.17487/RFC4055, June 2005, DOI 10.17487/RFC4055, June 2005,
<https://www.rfc-editor.org/info/rfc4055>. <https://www.rfc-editor.org/info/rfc4055>.
[26] National Institute of Standards and Technology, [26] National Institute of Standards and Technology,
"Cryptographic Algorithm Object Registration", URL "Cryptographic Algorithm Object Registration", November
http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/ 2007,
algorithms.html, November 2007. <http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/
algorithms.html>.
[27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC) [27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC)
Security Version 3", RFC 7861, DOI 10.17487/RFC7861, Security Version 3", RFC 7861, DOI 10.17487/RFC7861,
November 2016, <https://www.rfc-editor.org/info/rfc7861>. November 2016, <https://www.rfc-editor.org/info/rfc7861>.
[28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The [28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The
Kerberos Network Authentication Service (V5)", RFC 4120, Kerberos Network Authentication Service (V5)", RFC 4120,
DOI 10.17487/RFC4120, July 2005, DOI 10.17487/RFC4120, July 2005,
<https://www.rfc-editor.org/info/rfc4120>. <https://www.rfc-editor.org/info/rfc4120>.
skipping to change at line 30962 skipping to change at line 30968
[38] Eisler, M., "LIPKEY - A Low Infrastructure Public Key [38] Eisler, M., "LIPKEY - A Low Infrastructure Public Key
Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847, Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847,
June 2000, <https://www.rfc-editor.org/info/rfc2847>. June 2000, <https://www.rfc-editor.org/info/rfc2847>.
[39] Eisler, M., "NFS Version 2 and Version 3 Security Issues [39] Eisler, M., "NFS Version 2 and Version 3 Security Issues
and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5",
RFC 2623, DOI 10.17487/RFC2623, June 1999, RFC 2623, DOI 10.17487/RFC2623, June 1999,
<https://www.rfc-editor.org/info/rfc2623>. <https://www.rfc-editor.org/info/rfc2623>.
[40] Juszczak, C., "Improving the Performance and Correctness [40] Juszczak, C., "Improving the Performance and Correctness
of an NFS Server", USENIX Conference Proceedings , June of an NFS Server", USENIX Conference Proceedings, June
1990. 1990.
[41] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced [41] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced
by an On-line Database", RFC 3232, DOI 10.17487/RFC3232, by an On-line Database", RFC 3232, DOI 10.17487/RFC3232,
January 2002, <https://www.rfc-editor.org/info/rfc3232>. January 2002, <https://www.rfc-editor.org/info/rfc3232>.
[42] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", [42] Srinivasan, R., "Binding Protocols for ONC RPC Version 2",
RFC 1833, DOI 10.17487/RFC1833, August 1995, RFC 1833, DOI 10.17487/RFC1833, August 1995,
<https://www.rfc-editor.org/info/rfc1833>. <https://www.rfc-editor.org/info/rfc1833>.
[43] Werme, R., "RPC XID Issues", USENIX Conference [43] Werme, R., "RPC XID Issues", USENIX Conference
Proceedings , February 1996. Proceedings, February 1996.
[44] Nowicki, B., "NFS: Network File System Protocol [44] Nowicki, B., "NFS: Network File System Protocol
specification", RFC 1094, DOI 10.17487/RFC1094, March specification", RFC 1094, DOI 10.17487/RFC1094, March
1989, <https://www.rfc-editor.org/info/rfc1094>. 1989, <https://www.rfc-editor.org/info/rfc1094>.
[45] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly [45] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly
Available Network Server", USENIX Conference Proceedings , Available Network Server", USENIX Conference Proceedings,
January 1991. January 1991.
[46] Halevy, B., Welch, B., and J. Zelenka, "Object-Based [46] Halevy, B., Welch, B., and J. Zelenka, "Object-Based
Parallel NFS (pNFS) Operations", RFC 5664, Parallel NFS (pNFS) Operations", RFC 5664,
DOI 10.17487/RFC5664, January 2010, DOI 10.17487/RFC5664, January 2010,
<https://www.rfc-editor.org/info/rfc5664>. <https://www.rfc-editor.org/info/rfc5664>.
[47] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS [47] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS
(pNFS) Block/Volume Layout", RFC 5663, (pNFS) Block/Volume Layout", RFC 5663,
DOI 10.17487/RFC5663, January 2010, DOI 10.17487/RFC5663, January 2010,
skipping to change at line 31003 skipping to change at line 31009
[48] Callaghan, B., "WebNFS Client Specification", RFC 2054, [48] Callaghan, B., "WebNFS Client Specification", RFC 2054,
DOI 10.17487/RFC2054, October 1996, DOI 10.17487/RFC2054, October 1996,
<https://www.rfc-editor.org/info/rfc2054>. <https://www.rfc-editor.org/info/rfc2054>.
[49] Callaghan, B., "WebNFS Server Specification", RFC 2055, [49] Callaghan, B., "WebNFS Server Specification", RFC 2055,
DOI 10.17487/RFC2055, October 1996, DOI 10.17487/RFC2055, October 1996,
<https://www.rfc-editor.org/info/rfc2055>. <https://www.rfc-editor.org/info/rfc2055>.
[50] IESG, "IESG Processing of RFC Errata for the IETF Stream", [50] IESG, "IESG Processing of RFC Errata for the IETF Stream",
July 2008, <http://www.ietf.org/IESG/STATEMENTS/iesg- July 2008,
statement-07-30-2008.txt>. <https://www.ietf.org/about/groups/iesg/statements/
processing-rfc-errata/>.
[51] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- [51] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-
Hashing for Message Authentication", RFC 2104, Hashing for Message Authentication", RFC 2104,
DOI 10.17487/RFC2104, February 1997, DOI 10.17487/RFC2104, February 1997,
<https://www.rfc-editor.org/info/rfc2104>. <https://www.rfc-editor.org/info/rfc2104>.
[52] Shepler, S., "NFS Version 4 Design Considerations", [52] Shepler, S., "NFS Version 4 Design Considerations",
RFC 2624, DOI 10.17487/RFC2624, June 1999, RFC 2624, DOI 10.17487/RFC2624, June 1999,
<https://www.rfc-editor.org/info/rfc2624>. <https://www.rfc-editor.org/info/rfc2624>.
[53] The Open Group, "Protocols for Interworking: XNFS, Version [53] The Open Group, "Protocols for Interworking: XNFS, Version
3W, ISBN 1-85912-184-5", February 1998. 3W", ISBN 1-85912-184-5, February 1998.
[54] Floyd, S. and V. Jacobson, "The Synchronization of [54] Floyd, S. and V. Jacobson, "The Synchronization of
Periodic Routing Messages", IEEE/ACM Transactions on Periodic Routing Messages", IEEE/ACM Transactions on
Networking 2(2), pp. 122-136, April 1994. Networking, 2(2), pp. 122-136, April 1994.
[55] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., [55] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M.,
and E. Zeidner, "Internet Small Computer Systems Interface and E. Zeidner, "Internet Small Computer Systems Interface
(iSCSI)", RFC 3720, DOI 10.17487/RFC3720, April 2004, (iSCSI)", RFC 3720, DOI 10.17487/RFC3720, April 2004,
<https://www.rfc-editor.org/info/rfc3720>. <https://www.rfc-editor.org/info/rfc3720>.
[56] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version [56] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version
(FCP-2)", ANSI/INCITS 350-2003, October 2003. (FCP-2)", ANSI/INCITS, 350-2003, October 2003.
[57] Weber, R.O., "Object-Based Storage Device Commands (OSD)", [57] Weber, R.O., "Object-Based Storage Device Commands (OSD)",
ANSI/INCITS 400-2004, July 2004, ANSI/INCITS, 400-2004, July 2004,
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>.
[58] Carns, P. H., Ligon III, W. B., Ross, R. B., and R. [58] Carns, P. H., Ligon III, W. B., Ross, R. B., and R.
Thakur, "PVFS: A Parallel File System for Linux Thakur, "PVFS: A Parallel File System for Linux
Clusters.", Proceedings of the 4th Annual Linux Showcase Clusters.", Proceedings of the 4th Annual Linux Showcase
and Conference , 2000. and Conference, 2000.
[59] The Open Group, "The Open Group Base Specifications Issue [59] The Open Group, "The Open Group Base Specifications Issue
6, IEEE Std 1003.1, 2004 Edition", 2004. 6, IEEE Std 1003.1, 2004 Edition", 2004,
<https://www.opengroup.org>.
[60] Callaghan, B., "NFS URL Scheme", RFC 2224, [60] Callaghan, B., "NFS URL Scheme", RFC 2224,
DOI 10.17487/RFC2224, October 1997, DOI 10.17487/RFC2224, October 1997,
<https://www.rfc-editor.org/info/rfc2224>. <https://www.rfc-editor.org/info/rfc2224>.
[61] Chiu, A., Eisler, M., and B. Callaghan, "Security [61] Chiu, A., Eisler, M., and B. Callaghan, "Security
Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755, Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755,
January 2000, <https://www.rfc-editor.org/info/rfc2755>. January 2000, <https://www.rfc-editor.org/info/rfc2755>.
[62] Narten, T. and H. Alvestrand, "Guidelines for Writing an [62] Narten, T. and H. Alvestrand, "Guidelines for Writing an
IANA Considerations Section in RFCs", RFC 5226, IANA Considerations Section in RFCs", RFC 5226,
DOI 10.17487/RFC5226, May 2008, DOI 10.17487/RFC5226, May 2008,
<https://www.rfc-editor.org/info/rfc5226>. <https://www.rfc-editor.org/info/rfc5226>.
[63] Eisler, M., "Errata 2006 for RFC 5661", January 2010, [63] RFC Errata, Erratum ID 2006, RFC 5661,
<https://www.rfc-editor.org/errata_search.php?eid=2006>. <https://www.rfc-editor.org/errata/eid2006>.
[64] Spasojevic, M. and M. Satyanarayanan, "An Empirical Study [64] Spasojevic, M. and M. Satyanarayanan, "An Empirical Study
of a Wide-Area Distributed File System", May 1996, of a Wide-Area Distributed File System", May 1996,
<https://www.cs.cmu.edu/~satya/docdir/spasojevic-tocs-afs- <https://www.cs.cmu.edu/~satya/docdir/spasojevic-tocs-afs-
measurement-1996.pdf>. measurement-1996.pdf>.
[65] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., [65] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
"Network File System (NFS) Version 4 Minor Version 1 "Network File System (NFS) Version 4 Minor Version 1
Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010,
<https://www.rfc-editor.org/info/rfc5661>. <https://www.rfc-editor.org/info/rfc5661>.
skipping to change at line 31094 skipping to change at line 31102
[70] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an [70] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
2014, <https://www.rfc-editor.org/info/rfc7258>. 2014, <https://www.rfc-editor.org/info/rfc7258>.
[71] Rescorla, E. and B. Korver, "Guidelines for Writing RFC [71] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552, Text on Security Considerations", BCP 72, RFC 3552,
DOI 10.17487/RFC3552, July 2003, DOI 10.17487/RFC3552, July 2003,
<https://www.rfc-editor.org/info/rfc3552>. <https://www.rfc-editor.org/info/rfc3552>.
Appendix A. Need for this Update Appendix A. The Need for This Update
This document includes an explanation of how clients and servers are This document includes an explanation of how clients and servers are
to determine the particular network access paths to be used to access to determine the particular network access paths to be used to access
a file system. This includes describing how changes to the specific a file system. This includes descriptions of how to handle changes
replica to be used or to the set of addresses to be used to access it to the specific replica to be used or to the set of addresses to be
are to be dealt with, and how transfers of responsibility that need used to access it, and how to deal transparently with transfers of
to be made can be dealt with transparently. This includes cases in responsibility that need to be made. This includes cases in which
which there is a shift between one replica and another and those in there is a shift between one replica and another and those in which
which different network access paths are used to access the same different network access paths are used to access the same replica.
replica.
As a result of the following problems in RFC5661 [65], it is As a result of the following problems in RFC 5661 [65], it was
necessary to provide the specific updates which are made by this necessary to provide the specific updates that are made by this
document. These updates are described in Appendix B document. These updates are described in Appendix B.
* RFC5661 [65], while it dealt with situations in which various * RFC 5661 [65], while it dealt with situations in which various
forms of clustering allowed co-ordination of the state assigned by forms of clustering allowed coordination of the state assigned by
co-operating servers to be used, made no provisions for cooperating servers to be used, made no provisions for Transparent
Transparent State Migration. Within NFSv4.0, Transparent State Migration. Within NFSv4.0, Transparent State Migration was
Migration was first explained clearly in RFC7530 [67] and first explained clearly in RFC 7530 [67] and corrected and
corrected and clarified by RFC7931 [68]. No corresponding clarified by RFC 7931 [68]. No corresponding explanation for
explanation for NFSv4.1 had been provided. NFSv4.1 had been provided.
* Although NFSv4.1 was defined with a clear definition of how * Although NFSv4.1 provided a clear definition of how trunking
trunking detection was to be done, there was no clear detection was to be done, there was no clear specification of how
specification of how trunking discovery was to be done, despite trunking discovery was to be done, despite the fact that the
the fact that the specification clearly indicated that this specification clearly indicated that this information could be
information could be made available via the file system location made available via the file system location attributes.
attributes.
* Because the existence of multiple network access paths to the same * Because the existence of multiple network access paths to the same
file system was dealt with as if there were multiple replicas, file system was dealt with as if there were multiple replicas,
issues relating to transitions between replicas could never be issues relating to transitions between replicas could never be
clearly distinguished from trunking-related transitions between clearly distinguished from trunking-related transitions between
the addresses used to access a particular file system instance. the addresses used to access a particular file system instance.
As a result, in situations in which both migration and trunking As a result, in situations in which both migration and trunking
configuration changes were involved, neither of these could be configuration changes were involved, neither of these could be
clearly dealt with and the relationship between these two features clearly dealt with, and the relationship between these two
was not seriously addressed. features was not seriously addressed.
* Because use of two network access paths to the same file system * Because use of two network access paths to the same file system
instance (i.e. trunking) was often treated as if two replicas were instance (i.e., trunking) was often treated as if two replicas
involved, it was considered that two replicas were being used were involved, it was considered that two replicas were being used
simultaneously. As a result, the treatment of replicas being used simultaneously. As a result, the treatment of replicas being used
simultaneously in RFC5661 [65] was not clear as it covered the two simultaneously in RFC 5661 [65] was not clear, as it covered the
distinct cases of a single file system instance being accessed by two distinct cases of a single file system instance being accessed
two different network access paths and two replicas being accessed by two different network access paths and two replicas being
simultaneously, with the limitations of the latter case not being accessed simultaneously, with the limitations of the latter case
clearly laid out. not being clearly laid out.
The majority of the consequences of these issues are dealt with by The majority of the consequences of these issues are dealt with by
presenting in Section 11 a replacement for Section 11 of RFC 5661 presenting in Section 11 a replacement for Section 11 of RFC 5661
[65]. This replacement modifies existing sub-sections within that [65]. This replacement modifies existing subsections within that
section and adds new ones, as described in Appendix B.1. Also, some section and adds new ones as described in Appendix B.1. Also, some
existing sections are deleted. These changes were made in order to: existing sections were deleted. These changes were made in order to
do the following:
* Reorganize the description so that the case of two network access * Reorganize the description so that the case of two network access
paths to the same file system instance needs to be distinguished paths to the same file system instance is distinguished clearly
clearly from the case of two different replicas since, in the from the case of two different replicas since, in the former case,
former case, locking state is shared and there also can be sharing locking state is shared and there also can be sharing of session
of session state. state.
* Provide a clear statement regarding the desirability of * Provide a clear statement regarding the desirability of
transparent transfer of state between replicas together with a transparent transfer of state between replicas together with a
recommendation that either that or a single-fs grace period be recommendation that either transparent transfer or a single-fs
provided. grace period be provided.
* Specifically delineate how such transfers are to be dealt with by * Specifically delineate how a client is to handle such transfers,
the client, taking into account the differences from the treatment taking into account the differences from the treatment in [68]
in [68] made necessary by the major protocol changes made in made necessary by the major protocol changes to NFSv4.1.
NFSv4.1.
* Provide discussion of the relationship between transparent state * Discuss the relationship between transparent state transfer and
transfer and Parallel NFS (pNFS). Parallel NFS (pNFS).
* Provide clarification of the fs_locations_info attribute in order * Clarify the fs_locations_info attribute in order to specify which
to specify which portions of the information provided apply to a portions of the provided information apply to a specific network
specific network access path and which to the replica which that access path and which apply to the replica that the path is used
path is used to access. to access.
In addition, there are also updates to other sections of RFC5661 In addition, other sections of RFC 5661 [65] were updated to correct
[65], where the consequences of the incorrect assumptions underlying the consequences of the incorrect assumptions underlying the
the current treatment of multi-server namespace issues also needed to treatment of multi-server namespace issues. These are described in
be corrected. These are to be dealt with as described in Appendices Appendices B.2 through B.4.
B.2 through B.4.
* A revised introductory section regarding multi-server namespace * A revised introductory section regarding multi-server namespace
facilities is provided. facilities is provided.
* A more realistic treatment of server scope is provided, which * A more realistic treatment of server scope is provided. This
reflects the more limited co-ordination of locking state adopted treatment reflects the more limited coordination of locking state
by servers actually sharing a common server scope. adopted by servers actually sharing a common server scope.
* Some confusing text regarding changes in server_owner has been * Some confusing text regarding changes in server_owner has been
clarified. clarified.
* The description of some existing errors has been modified to more * The description of some existing errors has been modified to more
clearly explain certain errors situations to reflect the existence clearly explain certain error situations to reflect the existence
of trunking and the possible use of fs-specific grace periods. of trunking and the possible use of fs-specific grace periods.
For details, see Appendix B.3. For details, see Appendix B.3.
* New descriptions of certain existing operations are provided, * New descriptions of certain existing operations are provided,
either because the existing treatment did not account for either because the existing treatment did not account for
situations that would arise in dealing with transparent state situations that would arise in dealing with Transparent State
migration, or because some types of reclaim issues were not Migration, or because some types of reclaim issues were not
adequately dealt with in the context of fs-specific grace periods. adequately dealt with in the context of fs-specific grace periods.
For details, see Appendix B.2. For details, see Appendix B.2.
Appendix B. Changes in this Update Appendix B. Changes in This Update
B.1. Revisions Made to Section 11 of RFC5661 B.1. Revisions Made to Section 11 of RFC 5661
A number of areas needed to be revised or extended, in many case A number of areas have been revised or extended, in many cases
replacing existing sub-sections within Section 11 of RFC 5661 [65]: replacing subsections within Section 11 of RFC 5661 [65]:
* New introductory material, including a terminology section, * New introductory material, including a terminology section,
replaces the existing material in RFC5661 [65] ranging from the replaces the material in RFC 5661 [65], ranging from the start of
start of the existing Section 11 up to and including the existing the original Section 11 up to and including Section 11.1. The new
Section 11.1. The new material starts at the beginning of material starts at the beginning of Section 11 and continues
Section 11 and continues through 11.2 below. through 11.2.
* A significant reorganization of the material in the existing * A significant reorganization of the material in Sections 11.4 and
Sections 11.4 and 11.5 of RFC 5661 [65]) is necessary. The 11.5 of RFC 5661 [65] was necessary. The reasons for the
reasons for the reorganization of these sections into a single reorganization of these sections into a single section with
section with multiple subsections are discussed in Appendix B.1.1 multiple subsections are discussed in Appendix B.1.1 below. This
below. This replacement appears as Section 11.5 below. replacement appears as Section 11.5.
New material relating to the handling of the file system location New material relating to the handling of the file system location
attributes is contained in Sections 11.5.1 and 11.5.7 below. attributes is contained in Sections 11.5.1 and 11.5.7.
* A new section describing requirements for user and group handling * A new section describing requirements for user and group handling
within a multi-server namespace has been added as Section 11.7 within a multi-server namespace has been added as Section 11.7.
* A major replacement for the existing Section 11.7 of RFC 5661 [65] * A major replacement for Section 11.7 of RFC 5661 [65], entitled
entitled "Effecting File System Transitions", will appear as "Effecting File System Transitions", appears as Sections 11.9
Sections 11.9 through 11.14. The reasons for the reorganization through 11.14. The reasons for the reorganization of this section
of this section into multiple sections are discussed in into multiple sections are discussed in Appendix B.1.2.
Appendix B.1.2.
* A replacement for the existing Section 11.10 of RFC 5661 [65] * A replacement for Section 11.10 of RFC 5661 [65], entitled "The
entitled "The Attribute fs_locations_info", will appear as Attribute fs_locations_info", appears as Section 11.17, with
Section 11.17, with Appendix B.1.3 describing the differences Appendix B.1.3 describing the differences between the new section
between the new section and the treatment within [65]. A revised and the treatment within [65]. A revised treatment was necessary
treatment is necessary because the existing treatment did not make because the original treatment did not make clear how the added
clear how the added attribute information relates to the case of attribute information relates to the case of trunked paths to the
trunked paths to the same replica. These issues were not same replica. These issues were not addressed in RFC 5661 [65]
addressed in RFC5661 [65] where the concepts of a replica and a where the concepts of a replica and a network path used to access
network path used to access a replica were not clearly a replica were not clearly distinguished.
distinguished.
B.1.1. Re-organization of Sections 11.4 and 11.5 of RFC5661 B.1.1. Reorganization of Sections 11.4 and 11.5 of RFC 5661
Previously, issues related to the fact that multiple location entries Previously, issues related to the fact that multiple location entries
directed the client to the same file system instance were dealt with directed the client to the same file system instance were dealt with
in a separate Section 11.5 of RFC 5661 [65]. Because of the new in Section 11.5 of RFC 5661 [65]. Because of the new treatment of
treatment of trunking, these issues now belong within Section 11.5 trunking, these issues now belong within Section 11.5.
below.
In this new section, trunking is dealt with in Section 11.5.2 In this new section, trunking is covered in Section 11.5.2 together
together with the other uses of file system location information with the other uses of file system location information described in
described in Sections 11.5.3 through 11.5.6. Sections 11.5.3 through 11.5.6.
As a result, Section 11.5 which will replace Section 11.4 of RFC 5661 As a result, Section 11.5, which replaces Section 11.4 of RFC 5661
[65] is substantially different than the section it replaces in that [65], is substantially different than the section it replaces in that
some existing sections will be replaced by corresponding sections some original sections have been replaced by corresponding sections
below while, at the same time, new sections will be added, resulting as described below, while new sections have been added:
in a replacement containing some renumbered sections, as follows:
* The material in Section 11.5, exclusive of subsections, replaces * The material in Section 11.5, exclusive of subsections, replaces
the material in Section 11.4 of RFC 5661 [65] exclusive of the material in Section 11.4 of RFC 5661 [65] exclusive of
subsections. subsections.
* Section 11.5.1 is a new first subsection of the overall section. * Section 11.5.1 is the new first subsection of the overall section.
* Section 11.5.2 is a new second subsection of the overall section. * Section 11.5.2 is the new second subsection of the overall
section.
* Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in * Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in
order) one of the corresponding Sections 11.4.1, 11.4.2, and order) one of the corresponding Sections 11.4.1, 11.4.2, and
11.4.3 of RFC 5661 [65]. 11.4.3 of RFC 5661 [65].
* Section 11.5.7 is a new final subsection of the overall section. * Section 11.5.7 is the new final subsection of the overall section.
B.1.2. Re-organization of Material Dealing with File System Transitions B.1.2. Reorganization of Material Dealing with File System Transitions
The material relating to file system transition, previously contained The material relating to file system transition, previously contained
in Section 11.7 of RFC 5661 [65], has been reorganized and augmented in Section 11.7 of RFC 5661 [65], has been reorganized and augmented
as described below: as described below:
* Because there can be a shift of the network access paths used to * Because there can be a shift of the network access paths used to
access a file system instance without any shift between replicas, access a file system instance without any shift between replicas,
a new Section 11.9 distinguishes between those cases in which a new Section 11.9 distinguishes between those cases in which
there is a shift between distinct replicas and those involving a there is a shift between distinct replicas and those involving a
shift in network access paths with no shift between replicas. shift in network access paths with no shift between replicas.
As a result, a new Section 11.10 deals with network address As a result, the new Section 11.10 deals with network address
transitions while the bulk of the former Section 11.7 of RFC 5661 transitions, while the bulk of the original Section 11.7 of RFC
[65] is extensively modified as reflected in Section 11.11 which 5661 [65] has been extensively modified as reflected in
is now limited to cases in which there is a shift between two Section 11.11, which is now limited to cases in which there is a
different sets of replicas. shift between two different sets of replicas.
* The additional Section 11.12 discusses the case in which a shift * The additional Section 11.12 discusses the case in which a shift
to a different replica is made and state is transferred to allow to a different replica is made and state is transferred to allow
the client the ability to have continued access to its accumulated the client the ability to have continued access to its accumulated
locking state on the new server. locking state on the new server.
* The additional Section 11.13 discusses the client's response to * The additional Section 11.13 discusses the client's response to
access transitions and how it determines whether migration has access transitions, how it determines whether migration has
occurred, and how it gets access to any transferred locking and occurred, and how it gets access to any transferred locking and
session state. session state.
* The additional Section 11.14 discusses the responsibilities of the * The additional Section 11.14 discusses the responsibilities of the
source and destination servers when transferring locking and source and destination servers when transferring locking and
session state. session state.
This re-organization has caused a renumbering of the sections within This reorganization has caused a renumbering of the sections within
Section 11 of [65] as described below: Section 11 of [65] as described below:
* The new Sections 11.9 and 11.10 have resulted in existing sections * The new Sections 11.9 and 11.10 have resulted in the renumbering
with these numbers to be renumbered. of existing sections with these numbers.
* Section 11.7 of [65] will be substantially modified and appear as * Section 11.7 of [65] has been substantially modified and appears
Section 11.11. The necessary modifications reflect the fact that as Section 11.11. The necessary modifications reflect the fact
this section will only deal with transitions between replicas that this section only deals with transitions between replicas,
while transitions between network addresses are dealt with in while transitions between network addresses are dealt with in
other sections. Details of the reorganization are described later other sections. Details of the reorganization are described later
in this section. in this section.
* The additional Sections 11.12, 11.13, and 11.14 have been added. * Sections 11.12, 11.13, and 11.14 have been added.
* Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [65] now * Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [65] now
appear as Sections 11.13, 11.14, 11.15, and 11.16, respectively. appear as Sections 11.15, 11.16, 11.17, and 11.18, respectively.
As part of this general re-organization, Section 11.7 of RFC 5661 As part of this general reorganization, Section 11.7 of RFC 5661 [65]
[65] will be modified as described below: has been modified as described below:
* Sections 11.7 and 11.7.1 of RFC 5661 [65] are to be replaced by * Sections 11.7 and 11.7.1 of RFC 5661 [65] have been replaced by
Sections 11.11 and 11.11.1, respectively. Sections 11.11 and 11.11.1, respectively.
* Section 11.7.2 of RFC 5661 (and included subsections) are to be * Section 11.7.2 of RFC 5661 (and included subsections) has been
deleted. deleted.
* Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661 * Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661
[65] are to be replaced by Sections 11.11.2, 11.11.3, 11.11.4, [65] have been replaced by Sections 11.11.2, 11.11.3, 11.11.4,
11.11.4.1, and 11.11.5 respectively in this document. 11.11.4.1, and 11.11.5 respectively in this document.
* Section 11.7.7 of RFC 5661 [65] is to be replaced by * Section 11.7.7 of RFC 5661 [65] has been replaced by
Section 11.11.9 This sub-section has been moved to the end of the Section 11.11.9. This subsection has been moved to the end of the
section dealing with file system transitions. section dealing with file system transitions.
* Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [65] are to be * Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [65] have been
replaced by Sections 11.11.6, 11.11.7, and 11.11.8 respectively in replaced by Sections 11.11.6, 11.11.7, and 11.11.8 respectively in
this document. this document.
B.1.3. Updates to treatment of fs_locations_info B.1.3. Updates to the Treatment of fs_locations_info
Various elements of the fs_locations_info attribute contain Various elements of the fs_locations_info attribute contain
information that applies to either a specific file system replica or information that applies to either a specific file system replica or
to a network path or set of network paths used to access such a to a network path or set of network paths used to access such a
replica. The existing treatment of fs_locations info (Section 11.10 replica. The original treatment of fs_locations_info (Section 11.10
of RFC 5661 [65]) does not clearly distinguish these cases, in part of RFC 5661 [65]) did not clearly distinguish these cases, in part
because the document did not clearly distinguish replicas from the because the document did not clearly distinguish replicas from the
paths used to access them. paths used to access them.
In addition, special clarification needed to be provided with regard In addition, special clarification has been provided with regard to
to the following fields: the following fields:
* With regard to the handling of FSLI4GF_GOING, it needs to be made * With regard to the handling of FSLI4GF_GOING, it was clarified
clear that this only applies to the unavailability of a replica that this only applies to the unavailability of a replica rather
rather than to a path to access a replica. than to a path to access a replica.
* In describing the appropriate value for a server to use for * In describing the appropriate value for a server to use for
fli_valid_for, it needs to be made clear that there is no need for fli_valid_for, it was clarified that there is no need for the
the client to frequently fetch the fs_locations_info value to be client to frequently fetch the fs_locations_info value to be
prepared for shifts in trunking patterns. prepared for shifts in trunking patterns.
* Clarification of the rules for extensions to the fls_info needs to * Clarification of the rules for extensions to the fls_info has been
be provided. The existing treatment reflects the extension model provided. The original treatment reflected the extension model
in effect at the time RFC5661 [65] was written, and needed to be that was in effect at the time RFC 5661 [65] was written, but has
updated in accordance with the extension model described in been updated in accordance with the extension model described in
RFC8178 [66]. RFC 8178 [66].
B.2. Revisions Made to Operations in RFC5661 B.2. Revisions Made to Operations in RFC 5661
Revised descriptions were needed to address issues that arose in Descriptions have been revised to address issues that arose in
effecting necessary changes to multi-server namespace features. effecting necessary changes to multi-server namespace features.
* The existing treatment of EXCHANGE_ID (Section 13.35 of RFC 5661 * The treatment of EXCHANGE_ID (Section 18.35 of RFC 5661 [65])
[65]) assumes that client IDs cannot be created/ confirmed other assumed that client IDs cannot be created/confirmed other than by
than by the EXCHANGE_ID and CREATE_SESSION operations. Also, the the EXCHANGE_ID and CREATE_SESSION operations. Also, the
necessary use of EXCHANGE_ID in recovery from migration and necessary use of EXCHANGE_ID in recovery from migration and
related situations is not addressed clearly. A revised treatment related situations was not clearly addressed. A revised treatment
of EXCHANGE_ID is necessary and it appears in Section 18.35 while of EXCHANGE_ID was necessary, and it appears in Section 18.35,
the specific differences between it and the treatment within [65] while the specific differences between it and the treatment within
are explained in Appendix B.2.1 below. [65] are explained in Appendix B.2.1 below.
* The existing treatment of RECLAIM_COMPLETE in Section 18.51 of RFC * The treatment of RECLAIM_COMPLETE in Section 18.51 of RFC 5661
5661 [65]) is not sufficiently clear about the purpose and use of [65] was not sufficiently clear about the purpose and use of the
the rca_one_fs and how the server is to deal with inappropriate rca_one_fs and how the server was to deal with inappropriate
values of this argument. Because the resulting confusion raises values of this argument. Because the resulting confusion raised
interoperability issues, a new treatment of RECLAIM_COMPLETE is interoperability issues, a new treatment of RECLAIM_COMPLETE was
necessary and it appears in Section 18.51 below while the specific necessary, and it appears in Section 18.51, while the specific
differences between it and the treatment within RFC5661 [65] are differences between it and the treatment within RFC 5661 [65] are
discussed in Appendix B.2.2 below. In addition, the definitions discussed in Appendix B.2.2 below. In addition, the definitions
of the reclaim-related errors receive an updated treatment in of the reclaim-related errors have received an updated treatment
Section 15.1.9 to reflect the fact that there are multiple in Section 15.1.9 to reflect the fact that there are multiple
contexts for lock reclaim operations. contexts for lock reclaim operations.
B.2.1. Revision to Treatment of EXCHANGE_ID B.2.1. Revision of Treatment of EXCHANGE_ID
There are a number of issues in the original treatment of EXCHANGE_ID There were a number of issues in the original treatment of EXCHANGE_ID
(in RFC5661 [65]) that cause problems for Transparent State Migration in RFC 5661 [65] that caused problems for Transparent State Migration
and for the transfer of access between different network access paths and for the transfer of access between different network access paths
to the same file system instance. to the same file system instance.
These issues arise from the fact that this treatment was written, These issues arose from the fact that this treatment was written:
* Assuming that a client ID can only become known to a server by * Assuming that a client ID can only become known to a server by
having been created by executing an EXCHANGE_ID, with confirmation having been created by executing an EXCHANGE_ID, with confirmation
of the ID only possible by execution of a CREATE_SESSION. of the ID only possible by execution of a CREATE_SESSION.
* Considering the interactions between a client and a server only * Considering the interactions between a client and a server only
occurring on a single network address occurring on a single network address.
As these assumptions have become invalid in the context of As these assumptions have become invalid in the context of
Transparent State Migration and active use of trunking, the treatment Transparent State Migration and active use of trunking, the treatment
has been modified in several respects. has been modified in several respects:
* It had been assumed that an EXCHANGED_ID executed when the server * It had been assumed that an EXCHANGE_ID executed when the server
is already aware of a given client instance must be either is already aware of a given client instance must be either
updating associated parameters (e.g. with respect to callbacks) or updating associated parameters (e.g., with respect to callbacks)
a lingering retransmission to deal with a previously lost reply. or a lingering retransmission to deal with a previously lost
As a result, any slot sequence returned by that operation would be reply. As a result, any slot sequence returned by that operation
of no use. The existing treatment goes so far as to say that it would be of no use. The original treatment went so far as to say
"MUST NOT" be used, although this usage is not in accord with [1]. that it "MUST NOT" be used, although this usage was not in accord
This created a difficulty when an EXCHANGE_ID is done after with [1]. This created a difficulty when an EXCHANGE_ID is done
Transparent State Migration since that slot sequence would need to after Transparent State Migration since that slot sequence would
be used in a subsequent CREATE_SESSION. need to be used in a subsequent CREATE_SESSION.
In the updated treatment, CREATE_SESSION is a way that client IDs In the updated treatment, CREATE_SESSION is a way that client IDs
are confirmed but it is understood that other ways are possible. are confirmed, but it is understood that other ways are possible.
The slot sequence can be used as needed and cases in which it The slot sequence can be used as needed, and cases in which it
would be of no use are appropriately noted. would be of no use are appropriately noted.
* It was assumed that the only functions of EXCHANGE_ID were to * It had been assumed that the only functions of EXCHANGE_ID were to
inform the server of the client, create the client ID, and inform the server of the client, to create the client ID, and to
communicate it to the client. When multiple simultaneous communicate it to the client. When multiple simultaneous
connections are involved, as often happens when trunking, that connections are involved, as often happens when trunking, that
treatment was inadequate in that it ignored the role of treatment was inadequate in that it ignored the role of
EXCHANGE_ID in associating the client ID with the connection on EXCHANGE_ID in associating the client ID with the connection on
which it was done, so that it could be used by a subsequent which it was done, so that it could be used by a subsequent
CREATE_SESSION, whose parameters do not include an explicit CREATE_SESSION whose parameters do not include an explicit client
client ID. ID.
The new treatment explicitly discusses the role of EXCHANGE_ID in The new treatment explicitly discusses the role of EXCHANGE_ID in
associating the client ID with the connection so it can be used by associating the client ID with the connection so it can be used by
CREATE_SESSION and in associating a connection with an existing CREATE_SESSION and in associating a connection with an existing
session. session.
The new treatment can be found in Section 18.35 above. It supersedes The new treatment can be found in Section 18.35 above. It supersedes
the treatment in Section 18.35 of RFC 5661 [65]. the treatment in Section 18.35 of RFC 5661 [65].
B.2.2. Revision to Treatment of RECLAIM_COMPLETE B.2.2. Revision of Treatment of RECLAIM_COMPLETE
The following changes were made to the treatment of RECLAIM_COMPLETE The following changes were made to the treatment of RECLAIM_COMPLETE
in RFC5661 [65] to arrive at the treatment in Section 18.51. in RFC 5661 [65] to arrive at the treatment in Section 18.51:
* In a number of places the text is made more explicit about the * In a number of places, the text was made more explicit about the
purpose of rca_one_fs and its connection to file system migration. purpose of rca_one_fs and its connection to file system migration.
* There is a discussion of situations in which particular forms of * There is a discussion of situations in which particular forms of
RECLAIM_COMPLETE would need to be done. RECLAIM_COMPLETE would need to be done.
* There is a discussion of interoperability issues that result from * There is a discussion of interoperability issues between
implementations that may have arisen due to the lack of clarity of implementations that may have arisen due to the lack of clarity of
the previous treatment of RECLAIM_COMPLETE. the previous treatment of RECLAIM_COMPLETE.
B.3. Revisions Made to Error Definitions in RFC5661 B.3. Revisions Made to Error Definitions in RFC 5661
The new handling of various situations required revisions of some The new handling of various situations required revisions to some
existing error definition: existing error definitions:
* Because of the need to appropriately address trunking-related * Because of the need to appropriately address trunking-related
issues, some uses of the term "replica" in RFC5661 [65] have issues, some uses of the term "replica" in RFC 5661 [65] became
become problematic since a shift in network access paths was problematic because a shift in network access paths was considered
considered to be a shift to a different replica. As a result, the to be a shift to a different replica. As a result, the original
existing definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC 5661 [65])
5661 [65]) needs to be updated to reflect the different handling was updated to reflect the different handling of unavailability of
of unavailability of a particular fs via a specific network a particular fs via a specific network address.
address.
Since such a situation is no longer considered to constitute Since such a situation is no longer considered to constitute
unavailability of a file system instance, the description needs to unavailability of a file system instance, the description has been
change even though the set of circumstances in which it is to be changed, even though the set of circumstances in which it is to be
returned remain the same. The new paragraph explicitly recognizes returned remains the same. The new paragraph explicitly
that a different network address might be used, while the previous recognizes that a different network address might be used, while
description, misleadingly, treated this as a shift between two the previous description, misleadingly, treated this as a shift
replicas while only a single file system instance might be between two replicas while only a single file system instance
involved. The updated description appears in Section 15.1.2.4 might be involved. The updated description appears in
below. Section 15.1.2.4.
* Because of the need to accommodate use of fs-specific grace * Because of the need to accommodate the use of fs-specific grace
periods, it is necessary to clarify some of the error definitions periods, it was necessary to clarify some of the definitions of
of reclaim-related errors in Section 15 of RFC 5661 [65], so the reclaim-related errors in Section 15 of RFC 5661 [65] so that the
text applies properly to reclaims for all types of grace periods. text applies properly to reclaims for all types of grace periods.
The updated descriptions appear within Section 15.1.9 below. The updated descriptions appear within Section 15.1.9.
* Because of the need to provide the clarifications in errata report * Because of the need to provide the clarifications in errata report
2006 [63] and to adapt these to properly explain the interaction 2006 [63] and to adapt these to properly explain the interaction
of NFS4ERR_DELAY with the replay cache, a revised description of of NFS4ERR_DELAY with the replay cache, a revised description of
NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report, NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report,
unlike many other RFC5661 errata reports, is addressed in this unlike many other RFC 5661 errata reports, is addressed in this
document because of the extensive use of NFS4ERR_DELAY in document because of the extensive use of NFS4ERR_DELAY in
connection with state migration and session migration. connection with state migration and session migration.
B.4. Other Revisions Made to RFC5661 B.4. Other Revisions Made to RFC 5661
Beside the major reworking of Section 11 of RFC 5661 [65] and the Besides the major reworking of Section 11 of RFC 5661 [65] and the
associated revisions to existing operations and errors, there are a associated revisions to existing operations and errors, there were a
number of related changes that are necessary: number of related changes that were necessary:
* The summary that appeared in Section 1.7.3.3 of RFC 5661 [65] was * The summary in Section 1.7.3.3 of RFC 5661 [65] was revised to
revised to reflect the changes made in the revised Section 11 reflect the changes made to Section 11 above. The updated summary
above. The updated summary appears as Section 1.8.3.3 above. appears as Section 1.8.3.3 above.
* The discussion of server scope which appeared in Section 2.10.4 of * The discussion of server scope in Section 2.10.4 of RFC 5661 [65]
RFC 5661 [65] needed to be replaced, since the previous text was replaced since it appeared to require a level of inter-server
appears to require a level of inter-server co-ordination coordination incompatible with its basic function of avoiding the
incompatible with its basic function of avoiding the need for a need for a globally uniform means of assigning server_owner
globally uniform means of assigning server_owner values. A values. A revised treatment appears in Section 2.10.4.
revised treatment appears in Section 2.10.4.
* The discussion of trunking which appeared in Section 2.10.5 of RFC * The discussion of trunking in Section 2.10.5 of RFC 5661 [65] was
5661 [65] needed to be revised, to more clearly explain the revised to more clearly explain the multiple types of trunking
multiple types of trunking support and how the client can be made support and how the client can be made aware of the existing
aware of the existing trunking configuration. In addition, while trunking configuration. In addition, while the last paragraph
the last paragraph (exclusive of sub-sections) of that section, (exclusive of subsections) of that section dealing with
dealing with server_owner changes, is literally true, it has been server_owner changes was literally true, it had been a source of
a source of confusion. Since the existing paragraph can be read confusion. Since the original paragraph could be read as
as suggesting that such changes be dealt with non-disruptively, suggesting that such changes be handled nondisruptively, the issue
the issue needs to be clarified in the revised section, which was clarified in the revised Section 2.10.5.
appears in Section 2.10.5.
Appendix C. Security Issues that Need to be Addressed Appendix C. Security Issues That Need to Be Addressed
The following issues in the treatment of security within the NFSv4.1 The following issues in the treatment of security within the NFSv4.1
specification need to be addressed: specification need to be addressed:
* The Security Considerations Section of RFC5661 [65] is not written * The Security Considerations Section of RFC 5661 [65] was not
in accord with RFC3552 [71] (also BCP72). Of particular concern written in accordance with RFC 3552 (BCP 72) [71]. Of particular
is the fact that the section does not contain a threat analysis. concern was the fact that the section did not contain a threat
analysis.
* Initial analysis of the existing security issues with NFSv4.1 has * Initial analysis of the existing security issues with NFSv4.1 has
made it likely that a revised Security Considerations Section for made it likely that a revised Security Considerations section for
the existing protocol (one containing a threat analysis) would be the existing protocol (one containing a threat analysis) would be
likely to conclude that NFSv4.1 does not meet the goal of secure likely to conclude that NFSv4.1 does not meet the goal of secure
use on the internet. use on the Internet.
The Security Considerations Section of this document (in Section 21) The Security Considerations section of this document (Section 21) has
has not been thoroughly revised to correct the difficulties mentioned not been thoroughly revised to correct the difficulties mentioned
above. Instead, it has been modified to take proper account of above. Instead, it has been modified to take proper account of
issues related to the multi-server namespace features discussed in issues related to the multi-server namespace features discussed in
Section 11, leaving the incomplete discussion and security weaknesses Section 11, leaving the incomplete discussion and security weaknesses
pretty much as they were. pretty much as they were.
The following major security issues need to be addressed in a The following major security issues need to be addressed in a
satisfactory fashion before an updated Security Considerations satisfactory fashion before an updated Security Considerations
section can be published as part of a bis document for NFSv4.1: section can be published as part of a bis document for NFSv4.1:
* The continued use of AUTH_SYS and the security exposures it * The continued use of AUTH_SYS and the security exposures it
creates needs to be addressed. Addressing this issue must not be creates need to be addressed. Addressing this issue must not be
limited to the questions of whether the designation of this as limited to the questions of whether the designation of this as
OPTIONAL was justified and whether it should be changed. OPTIONAL was justified and whether it should be changed.
In any event, it may not be possible, at this point, to correct In any event, it may not be possible at this point to correct the
the security problems created by continued use of AUTH_SYS simply security problems created by continued use of AUTH_SYS simply by
by revising this designation. revising this designation.
* The lack of attention within the protocol to the possibility of * The lack of attention within the protocol to the possibility of
pervasive monitoring attacks such as those described in RFC7258 pervasive monitoring attacks such as those described in RFC 7258
[70] (also BCP188). [70] (also BCP 188).
In that connection, the use of CREATE_SESSION without privacy In that connection, the use of CREATE_SESSION without privacy
protection needs to be addressed as it exposes the session ID to protection needs to be addressed as it exposes the session ID to
view by an attacker. This is worrisome as this is precisely the view by an attacker. This is worrisome as this is precisely the
type of protocol artifact alluded to in RFC7258, which can enable type of protocol artifact alluded to in RFC 7258, which can enable
further mischief on the part of the attacker as it enables denial- further mischief on the part of the attacker as it enables denial-
of-service attacks which can be executed effectively with only a of-service attacks that can be executed effectively with only a
single, normally low-value, credential, even when RPCSEC_GSS single, normally low-value, credential, even when RPCSEC_GSS
authentication is in use. authentication is in use.
* The lack of effective use of privacy and integrity, even where the * The lack of effective use of privacy and integrity, even where the
infrastructure to support use of RPCSEC_GSS in present, needs to infrastructure to support use of RPCSEC_GSS is present, needs to
be addressed. be addressed.
In light of the security exposures that this situation creates, it In light of the security exposures that this situation creates, it
is not enough to define a protocol that could, with the provision is not enough to define a protocol that could address this problem
of sufficient resources, address the problem. Instead, what is with the provision of sufficient resources. Instead, what is
needed is a way to provide the necessary security, with very needed is a way to provide the necessary security with very
limited performance costs and without requiring security limited performance costs and without requiring security
infrastructure that experience has shown is difficult for many infrastructure, which experience has shown is difficult for many
clients and servers to provide. clients and servers to provide.
In trying to provide a major security upgrade for a deployed protocol In trying to provide a major security upgrade for a deployed protocol
such as NFSv4.1, the working group, and the internet community is such as NFSv4.1, the working group and the Internet community are
likely to find itself dealing with a number of considerations such as likely to find themselves dealing with a number of considerations
the following: such as the following:
* The need to accommodate existing deployments of existing protocols * The need to accommodate existing deployments of protocols
as specified previously in existing Proposed Standards. specified previously in existing Proposed Standards.
* The difficulty of effecting changes to existing interoperating * The difficulty of effecting changes to existing, interoperating
implementations. implementations.
* The difficulty of making changes to NFSv4 protocols other than * The difficulty of making changes to NFSv4 protocols other than
those in the form of OPTIONAL extensions. those in the form of OPTIONAL extensions.
* The tendency of those responsible for existing NFSv4 deployments * The tendency of those responsible for existing NFSv4 deployments
to ignore security flaws in the context of local area networks to ignore security flaws in the context of local area networks
under the mistaken impression that network isolation provides, in under the mistaken impression that network isolation provides, in
and of itself, isolation from all potential attackers. and of itself, isolation from all potential attackers.
Given that the difficulties mentioned above apply to minor version Given that the above-mentioned difficulties apply to minor version
zero as well, it may make sense to deal with these security issues in zero as well, it may make sense to deal with these security issues in
a common document applying to all NFSv4 minor versions. If that a common document that applies to all NFSv4 minor versions. If that
approach is taken the, Security Considerations section of an eventual approach is taken, the Security Considerations section of an eventual
NFSv4.1 bis document would reference that common document and the NFSv4.1 bis document would reference that common document, and the
defining RFCs for other minor versions might do so as well. defining RFCs for other minor versions might do so as well.
Acknowledgments Acknowledgments
Acknowledgments for this Update Acknowledgments for This Update
The authors wish to acknowledge the important role of Andy Adamson of The authors wish to acknowledge the important role of Andy Adamson of
NetApp in clarifying the need for trunking discovery functionality, NetApp in clarifying the need for trunking discovery functionality,
and exploring the role of the file system location attributes in and exploring the role of the file system location attributes in
providing the necessary support. providing the necessary support.
The authors wish to thank Tom Haynes of Hammerspace for drawing our The authors wish to thank Tom Haynes of Hammerspace for drawing our
attention to the fact that internationalization and security might attention to the fact that internationalization and security might
best be handled in documents dealing with such protocol issues as best be handled in documents dealing with such protocol issues as
they apply to all NFSv4 minor versions. they apply to all NFSv4 minor versions.
The authors also wish to acknowledge the work of Xuan Qi of Oracle The authors also wish to acknowledge the work of Xuan Qi of Oracle
with NFSv4.1 client and server prototypes of transparent state with NFSv4.1 client and server prototypes of Transparent State
migration functionality. Migration functionality.
The authors wish to thank others who brought attention to important The authors wish to thank others who brought attention to important
issues. The comments of Trond Myklebust of Primary Data related to issues. The comments of Trond Myklebust of Primary Data related to
trunking helped to clarify the role of DNS in trunking discovery. trunking helped to clarify the role of DNS in trunking discovery.
Rick Macklem's comments brought attention to problems in the handling Rick Macklem's comments brought attention to problems in the handling
of the per-fs version of RECLAIM_COMPLETE. of the per-fs version of RECLAIM_COMPLETE.
The authors wish to thank Olga Kornievskaia of NetApp for her helpful The authors wish to thank Olga Kornievskaia of NetApp for her helpful
review comments. review comments.
Acknowledgments for RFC5661 Acknowledgments for RFC 5661
The initial text for the SECINFO extensions was edited by Mike The initial text for the SECINFO extensions was edited by Mike
Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl
Burnett. Burnett.
The initial text for the SESSIONS extensions was edited by Tom The initial text for the SESSIONS extensions was edited by Tom
Talpey, Spencer Shepler, and Jon Bauman with contributions from Charles Talpey, Spencer Shepler, and Jon Bauman with contributions from Charles
Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak, Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak,
Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk, and Mark Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk, and Mark
Wittle. Wittle.
skipping to change at line 31690 skipping to change at line 31690
The initial text for the parallel NFS support was edited by Brent The initial text for the parallel NFS support was edited by Brent
Welch and Garth Goodson. Additional authors for those documents were Welch and Garth Goodson. Additional authors for those documents were
Benny Halevy, David Black, and Andy Adamson. Additional input came Benny Halevy, David Black, and Andy Adamson. Additional input came
from the informal group that contributed to the construction of the from the informal group that contributed to the construction of the
initial pNFS drafts; specific acknowledgment goes to Gary Grider, initial pNFS drafts; specific acknowledgment goes to Gary Grider,
Peter Corbett, Dave Noveck, Peter Honeyman, and Stephen Fridella. Peter Corbett, Dave Noveck, Peter Honeyman, and Stephen Fridella.
Fredric Isaman found several errors in draft versions of the ONC RPC Fredric Isaman found several errors in draft versions of the ONC RPC
XDR description of the NFSv4.1 protocol. XDR description of the NFSv4.1 protocol.
Audrey Van Belleghem provided, in numerous ways, essential co- Audrey Van Belleghem provided, in numerous ways, essential
ordination and management of the process of editing the specification coordination and management of the process of editing the
documents. specification documents.
Richard Jernigan gave feedback on the file layout's striping pattern Richard Jernigan gave feedback on the file layout's striping pattern
design. design.
Several formal inspection teams were formed to review various areas Several formal inspection teams were formed to review various areas
of the protocol. All the inspections found significant errors and of the protocol. All the inspections found significant errors and
room for improvement. NFSv4.1's inspection teams were: room for improvement. NFSv4.1's inspection teams were:
* ACLs, with the following inspectors: Sam Falkner, Bruce Fields, * ACLs, with the following inspectors: Sam Falkner, Bruce Fields,
Rahul Iyer, Saadia Khan, Dave Noveck, Lisa Week, Mario Wurzl, and Rahul Iyer, Saadia Khan, Dave Noveck, Lisa Week, Mario Wurzl, and
skipping to change at line 31778 skipping to change at line 31778
Phone: +1-781-768-5347 Phone: +1-781-768-5347
Email: dnoveck@netapp.com Email: dnoveck@netapp.com
Charles Lever Charles Lever
Oracle Corporation Oracle Corporation
1015 Granger Avenue 1015 Granger Avenue
Ann Arbor, MI 48104 Ann Arbor, MI 48104
United States of America United States of America
Phone: +1 248 614 5091 Phone: +1-248-614-5091
Email: chuck.lever@oracle.com Email: chuck.lever@oracle.com
 End of changes. 491 change blocks. 
985 lines changed or deleted 985 lines changed or added
