rfc8881.form.txt | rfc8881.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) D. Noveck, Ed. | Internet Engineering Task Force (IETF) D. Noveck, Ed. | |||
Request for Comments: 0000 NetApp | Request for Comments: 8881 NetApp | |||
Obsoletes: 5661 C. Lever | Obsoletes: 5661 C. Lever | |||
Category: Standards Track ORACLE | Category: Standards Track ORACLE | |||
ISSN: 2070-1721 April 2020 | ISSN: 2070-1721 July 2020 | |||
Network File System (NFS) Version 4 Minor Version 1 Protocol | Network File System (NFS) Version 4 Minor Version 1 Protocol | |||
Abstract | Abstract | |||
This document describes the Network File System (NFS) version 4 minor | This document describes the Network File System (NFS) version 4 minor | |||
version 1, including features retained from the base protocol (NFS | version 1, including features retained from the base protocol (NFS | |||
version 4 minor version 0, which is specified in RFC 7530) and | version 4 minor version 0, which is specified in RFC 7530) and | |||
protocol extensions made subsequently. The later minor version has | protocol extensions made subsequently. The later minor version has | |||
no dependencies on NFS version 4 minor version 0, and is considered a | no dependencies on NFS version 4 minor version 0, and is considered a | |||
separate protocol. | separate protocol. | |||
This document obsoletes RFC5661. It substantially revises the | This document obsoletes RFC 5661. It substantially revises the | |||
treatment of features relating to multi-server namespace, superseding | treatment of features relating to multi-server namespace, superseding | |||
the description of those features appearing in RFC5661. | the description of those features appearing in RFC 5661. | |||
Status of This Memo | Status of This Memo | |||
This is an Internet Standards Track document. | This is an Internet Standards Track document. | |||
This document is a product of the Internet Engineering Task Force | This document is a product of the Internet Engineering Task Force | |||
(IETF). It represents the consensus of the IETF community. It has | (IETF). It represents the consensus of the IETF community. It has | |||
received public review and has been approved for publication by the | received public review and has been approved for publication by the | |||
Internet Engineering Steering Group (IESG). Further information on | Internet Engineering Steering Group (IESG). Further information on | |||
Internet Standards is available in Section 2 of RFC 7841. | Internet Standards is available in Section 2 of RFC 7841. | |||
Information about the current status of this document, any errata, | Information about the current status of this document, any errata, | |||
and how to provide feedback on it may be obtained at | and how to provide feedback on it may be obtained at | |||
https://www.rfc-editor.org/info/rfc0000. | https://www.rfc-editor.org/info/rfc8881. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2020 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at line 67 | skipping to change at line 67 | |||
Without obtaining an adequate license from the person(s) controlling | Without obtaining an adequate license from the person(s) controlling | |||
the copyright in such materials, this document may not be modified | the copyright in such materials, this document may not be modified | |||
outside the IETF Standards Process, and derivative works of it may | outside the IETF Standards Process, and derivative works of it may | |||
not be created outside the IETF Standards Process, except to format | not be created outside the IETF Standards Process, except to format | |||
it for publication as an RFC or to translate it into languages other | it for publication as an RFC or to translate it into languages other | |||
than English. | than English. | |||
Table of Contents | Table of Contents | |||
1. Introduction | 1. Introduction | |||
1.1. Introduction to this Update | 1.1. Introduction to This Update | |||
1.2. The NFS Version 4 Minor Version 1 Protocol | 1.2. The NFS Version 4 Minor Version 1 Protocol | |||
1.3. Requirements Language | 1.3. Requirements Language | |||
1.4. Scope of This Document | 1.4. Scope of This Document | |||
1.5. NFSv4 Goals | 1.5. NFSv4 Goals | |||
1.6. NFSv4.1 Goals | 1.6. NFSv4.1 Goals | |||
1.7. General Definitions | 1.7. General Definitions | |||
1.8. Overview of NFSv4.1 Features | 1.8. Overview of NFSv4.1 Features | |||
1.9. Differences from NFSv4.0 | 1.9. Differences from NFSv4.0 | |||
2. Core Infrastructure | 2. Core Infrastructure | |||
2.1. Introduction | 2.1. Introduction | |||
skipping to change at line 162 | skipping to change at line 162 | |||
10.7. Data and Metadata Caching and Memory Mapped Files | 10.7. Data and Metadata Caching and Memory Mapped Files | |||
10.8. Name and Directory Caching without Directory Delegations | 10.8. Name and Directory Caching without Directory Delegations | |||
10.9. Directory Delegations | 10.9. Directory Delegations | |||
11. Multi-Server Namespace | 11. Multi-Server Namespace | |||
11.1. Terminology | 11.1. Terminology | |||
11.2. File System Location Attributes | 11.2. File System Location Attributes | |||
11.3. File System Presence or Absence | 11.3. File System Presence or Absence | |||
11.4. Getting Attributes for an Absent File System | 11.4. Getting Attributes for an Absent File System | |||
11.5. Uses of File System Location Information | 11.5. Uses of File System Location Information | |||
11.6. Trunking without File System Location Information | 11.6. Trunking without File System Location Information | |||
11.7. Users and Groups in a Multi-server Namespace | 11.7. Users and Groups in a Multi-Server Namespace | |||
11.8. Additional Client-Side Considerations | 11.8. Additional Client-Side Considerations | |||
11.9. Overview of File Access Transitions | 11.9. Overview of File Access Transitions | |||
11.10. Effecting Network Endpoint Transitions | 11.10. Effecting Network Endpoint Transitions | |||
11.11. Effecting File System Transitions | 11.11. Effecting File System Transitions | |||
11.12. Transferring State upon Migration | 11.12. Transferring State upon Migration | |||
11.13. Client Responsibilities when Access is Transitioned | 11.13. Client Responsibilities When Access Is Transitioned | |||
11.14. Server Responsibilities Upon Migration | 11.14. Server Responsibilities Upon Migration | |||
11.15. Effecting File System Referrals | 11.15. Effecting File System Referrals | |||
11.16. The Attribute fs_locations | 11.16. The Attribute fs_locations | |||
11.17. The Attribute fs_locations_info | 11.17. The Attribute fs_locations_info | |||
11.18. The Attribute fs_status | 11.18. The Attribute fs_status | |||
12. Parallel NFS (pNFS) | 12. Parallel NFS (pNFS) | |||
12.1. Introduction | 12.1. Introduction | |||
12.2. pNFS Definitions | 12.2. pNFS Definitions | |||
12.3. pNFS Operations | 12.3. pNFS Operations | |||
12.4. pNFS Attributes | 12.4. pNFS Attributes | |||
skipping to change at line 300 | skipping to change at line 300 | |||
and Control | and Control | |||
20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | 20.10. Operation 12: CB_WANTS_CANCELLED - Cancel Pending | |||
Delegation Wants | Delegation Wants | |||
20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of Possible | 20.11. Operation 13: CB_NOTIFY_LOCK - Notify Client of Possible | |||
Lock Availability | Lock Availability | |||
20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of Device | 20.12. Operation 14: CB_NOTIFY_DEVICEID - Notify Client of Device | |||
ID Changes | ID Changes | |||
20.13. Operation 10044: CB_ILLEGAL - Illegal Callback Operation | 20.13. Operation 10044: CB_ILLEGAL - Illegal Callback Operation | |||
21. Security Considerations | 21. Security Considerations | |||
22. IANA Considerations | 22. IANA Considerations | |||
22.1. IANA Actions Needed | 22.1. IANA Actions | |||
22.2. Named Attribute Definitions | 22.2. Named Attribute Definitions | |||
22.3. Device ID Notifications | 22.3. Device ID Notifications | |||
22.4. Object Recall Types | 22.4. Object Recall Types | |||
22.5. Layout Types | 22.5. Layout Types | |||
22.6. Path Variable Definitions | 22.6. Path Variable Definitions | |||
23. References | 23. References | |||
23.1. Normative References | 23.1. Normative References | |||
23.2. Informative References | 23.2. Informative References | |||
Appendix A. Need for this Update | Appendix A. The Need for This Update | |||
Appendix B. Changes in this Update | Appendix B. Changes in This Update | |||
B.1. Revisions Made to Section 11 of RFC5661 | B.1. Revisions Made to Section 11 of RFC 5661 | |||
B.2. Revisions Made to Operations in RFC5661 | B.2. Revisions Made to Operations in RFC 5661 | |||
B.3. Revisions Made to Error Definitions in RFC5661 | B.3. Revisions Made to Error Definitions in RFC 5661 | |||
B.4. Other Revisions Made to RFC5661 | B.4. Other Revisions Made to RFC 5661 | |||
Appendix C. Security Issues that Need to be Addressed | Appendix C. Security Issues That Need to Be Addressed | |||
Acknowledgments | Acknowledgments | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
1.1. Introduction to this Update | 1.1. Introduction to This Update | |||
Two important features previously defined in minor version 0 but | Two important features previously defined in minor version 0 but | |||
never fully addressed in minor version 1 are trunking, the | never fully addressed in minor version 1 are trunking, which is the | |||
simultaneous use of multiple connections between a client and server, | simultaneous use of multiple connections between a client and server, | |||
potentially to different network addresses, and transparent state | potentially to different network addresses, and Transparent State | |||
migration, which allows a file system to be transferred between | Migration, which allows a file system to be transferred between | |||
servers in a way that provides to the client the ability to maintain | servers in a way that provides to the client the ability to maintain | |||
its existing locking state across the transfer. | its existing locking state across the transfer. | |||
The revised description of the NFS version 4 minor version 1 | The revised description of the NFS version 4 minor version 1 | |||
(NFSv4.1) protocol presented in this update is necessary to enable | (NFSv4.1) protocol presented in this update is necessary to enable | |||
full use of these features together with other multi-server namespace | full use of these features together with other multi-server namespace | |||
features. This document is in the form of an updated description of | features. This document is in the form of an updated description of | |||
the NFSv4.1 protocol previously defined in RFC 5661 [65]. RFC5661 is | the NFSv4.1 protocol previously defined in RFC 5661 [65]. RFC 5661 | |||
obsoleted by this document. However, the update has a limited scope | is obsoleted by this document. However, the update has a limited | |||
and is focused on enabling full use of trunking and transparent state | scope and is focused on enabling full use of trunking and Transparent | |||
migration. The need for these changes is discussed in Appendix A. | State Migration. The need for these changes is discussed in | |||
Appendix B describes the specific changes made to arrive at the | Appendix A. Appendix B describes the specific changes made to arrive | |||
current text. | at the current text. | |||
This limited-scope update replaces the current NFSv4.1 RFC with the | This limited-scope update replaces the current NFSv4.1 RFC with the | |||
intention of providing an authoritative and complete specification, | intention of providing an authoritative and complete specification, | |||
the motivation for which is discussed in [35], addressing the issues | the motivation for which is discussed in [35], addressing the issues | |||
within the scope of the update. However, it will not address issues | within the scope of the update. However, it will not address issues | |||
that are known but outside of this limited scope as could expected by | that are known but outside of this limited scope as could be expected | |||
a full update of the protocol. Below are some areas which are known | by a full update of the protocol. Below are some areas that are | |||
to need addressing in a future update of the protocol. | known to need addressing in a future update of the protocol: | |||
* Work needs to be done with regard to RFC 8178 [66] which | * Work needs to be done with regard to RFC 8178 [66], which | |||
establishes NFSv4-wide versioning rules. As RFC5661 is currently | establishes NFSv4-wide versioning rules. As RFC 5661 is currently | |||
inconsistent with that document, changes are needed in order to | inconsistent with that document, changes are needed in order to | |||
arrive at a situation in which there would be no need for RFC8178 | arrive at a situation in which there would be no need for RFC 8178 | |||
to update the NFSv4.1 specification. | to update the NFSv4.1 specification. | |||
* Work needs to be done with regard to RFC 8434 [69], which | * Work needs to be done with regard to RFC 8434 [69], which | |||
establishes the requirements for pNFS layout types, which are not | establishes the requirements for parallel NFS (pNFS) layout types, | |||
clearly defined in RFC5661. When that work is done and the | which are not clearly defined in RFC 5661. When that work is done | |||
resulting documents approved, the new NFSv4.1 specification | and the resulting documents approved, the new NFSv4.1 | |||
document will provide a clear set of requirements for layout types | specification document will provide a clear set of requirements | |||
and a description of the file layout type that conforms to those | for layout types and a description of the file layout type that | |||
requirements. Other layout types will have their own | conforms to those requirements. Other layout types will have | |||
specification documents that conforms to those requirements as | their own specification documents that conform to those | |||
well. | requirements as well. | |||
* Work needs to be done to address many errata reports relevant to | * Work needs to be done to address many errata reports relevant to | |||
RFC 5661, other than errata report 2006 [63], which is addressed | RFC 5661, other than errata report 2006 [63], which is addressed | |||
in this document. Addressing that report was not deferrable | in this document. Addressing that report was not deferrable | |||
because of the interaction of the changes suggested there and the | because of the interaction of the changes suggested there and the | |||
newly described handling of state and session migration. | newly described handling of state and session migration. | |||
The errata reports that have been deferred and that will need to | The errata reports that have been deferred and that will need to | |||
be addressed in a later document include reports currently | be addressed in a later document include reports currently | |||
assigned a range of statuses in the errata reporting system | assigned a range of statuses in the errata reporting system, | |||
including reports marked Accepted and those marked Hold For | including reports marked Accepted and those marked Hold For | |||
Document Update because the change was too minor to address | Document Update because the change was too minor to address | |||
immediately. | immediately. | |||
In addition, there is a set of other reports, including at least | In addition, there is a set of other reports, including at least | |||
one in state Rejected, which will need to be addressed in a later | one in state Rejected, that will need to be addressed in a later | |||
document. This will involve making changes to consensus decisions | document. This will involve making changes to consensus decisions | |||
reflected in RFC 5661, in situation in which the working group has | reflected in RFC 5661, in situations in which the working group | |||
decided that the treatment in RFC 5661 is incorrect, and needs to | has decided that the treatment in RFC 5661 is incorrect and needs | |||
be revised to reflect the working group's new consensus and ensure | to be revised to reflect the working group's new consensus and to | |||
compatibility with existing implementations that do not follow the | ensure compatibility with existing implementations that do not | |||
handling described in in RFC 5661. | follow the handling described in RFC 5661. | |||
Note that it is expected that all such errata reports will remain | Note that it is expected that all such errata reports will remain | |||
relevant to implementers and the authors of an eventual | relevant to implementors and the authors of an eventual | |||
rfc5661bis, despite the fact that this document, when approved, | rfc5661bis, despite the fact that this document, when approved, | |||
will obsolete RFC 5661 [65]. | will obsolete RFC 5661 [65]. | |||
* There is a need for a new approach to the description of | * There is a need for a new approach to the description of | |||
internationalization since the current internationalization | internationalization since the current internationalization | |||
section (Section 14) has never been implemented and does not meet | section (Section 14) has never been implemented and does not meet | |||
the needs of the NFSv4 protocol. Possible solutions are to create | the needs of the NFSv4 protocol. Possible solutions are to create | |||
a new internationalization section modeled on that in [67] or to | a new internationalization section modeled on that in [67] or to | |||
create a new document describing internationalization for all | create a new document describing internationalization for all | |||
NFSv4 minor versions and reference that document in the RFCs | NFSv4 minor versions and reference that document in the RFCs | |||
defining both NFSv4.0 and NFSv4.1. | defining both NFSv4.0 and NFSv4.1. | |||
* There is a need for a revised treatment of security in NFSv4.1. | * There is a need for a revised treatment of security in NFSv4.1. | |||
The issues with the existing treatment are discussed in | The issues with the existing treatment are discussed in | |||
Appendix C. | Appendix C. | |||
Until the above work is done, there will not be a consistent set of | Until the above work is done, there will not be a consistent set of | |||
documents providing a description of the NFSv4.1 protocol and any | documents that provides a description of the NFSv4.1 protocol, and | |||
full description would involve documents updating other documents | any full description would involve documents updating other documents | |||
within the specification. The updates applied by RFC8434 [69] and | within the specification. The updates applied by RFC 8434 [69] and | |||
RFC8178 [66] to RFC5661 also apply to this specification, and will | RFC 8178 [66] to RFC 5661 also apply to this specification, and will | |||
apply to any subsequent v4.1 specification until that work is done. | apply to any subsequent v4.1 specification until that work is done. | |||
1.2. The NFS Version 4 Minor Version 1 Protocol | 1.2. The NFS Version 4 Minor Version 1 Protocol | |||
The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | The NFS version 4 minor version 1 (NFSv4.1) protocol is the second | |||
minor version of the NFS version 4 (NFSv4) protocol. The first minor | minor version of the NFS version 4 (NFSv4) protocol. The first minor | |||
version, NFSv4.0, is now described in RFC 7530 [67]. It generally | version, NFSv4.0, is now described in RFC 7530 [67]. It generally | |||
follows the guidelines for minor versioning that are listed in | follows the guidelines for minor versioning that are listed in | |||
Section 10 of RFC 3530 [36]. However, it diverges from guidelines 11 | Section 10 of RFC 3530 [36]. However, it diverges from guidelines 11 | |||
("a client and server that support minor version X must support minor | ("a client and server that support minor version X must support minor | |||
skipping to change at line 579 | skipping to change at line 579 | |||
Server: The Server is the entity responsible for coordinating client | Server: The Server is the entity responsible for coordinating client | |||
access to a set of file systems and is identified by a server | access to a set of file systems and is identified by a server | |||
owner. A server can span multiple network addresses. | owner. A server can span multiple network addresses. | |||
Server Owner: The server owner identifies the server to the client. | Server Owner: The server owner identifies the server to the client. | |||
The server owner consists of a major identifier and a minor | The server owner consists of a major identifier and a minor | |||
identifier. When the client has two connections each to a peer | identifier. When the client has two connections each to a peer | |||
with the same major identifier, the client assumes that both peers | with the same major identifier, the client assumes that both peers | |||
are the same server (the server namespace is the same via each | are the same server (the server namespace is the same via each | |||
connection) and that lock state is sharable across both | connection) and that lock state is shareable across both | |||
connections. When each peer has both the same major and minor | connections. When each peer has both the same major and minor | |||
identifiers, the client assumes that each connection might be | identifiers, the client assumes that each connection might be | |||
associable with the same session. | associable with the same session. | |||
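
The Server Owner comparison described above can be sketched as a small decision function. This is only an illustration of the client-side assumptions stated in the definition, not an implementation taken from the specification; the names `ServerOwner`, `Relation`, and `classify_peers` are hypothetical, and the field types are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # No shared-server assumption can be made from the server owners alone.
    NO_SHARING_ASSUMED = 1
    # Same major identifier: same server, lock state shareable.
    SAME_SERVER = 2
    # Same major and minor identifiers: the connections might also be
    # associable with the same session.
    SAME_SERVER_SESSION_CANDIDATE = 3


@dataclass(frozen=True)
class ServerOwner:
    major_id: bytes  # hypothetical field name for the major identifier
    minor_id: int    # hypothetical field name for the minor identifier


def classify_peers(a: ServerOwner, b: ServerOwner) -> Relation:
    """Return what a client may assume about two peers it is connected to."""
    if a.major_id != b.major_id:
        return Relation.NO_SHARING_ASSUMED
    if a.minor_id == b.minor_id:
        return Relation.SAME_SERVER_SESSION_CANDIDATE
    return Relation.SAME_SERVER
```

Note that the strongest result is still only a candidate: the definition says the connections "might be" associable with the same session, so a client would need further confirmation before relying on it.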
Stable Storage: Stable storage is storage from which data stored by | Stable Storage: Stable storage is storage from which data stored by | |||
an NFSv4.1 server can be recovered without data loss from multiple | an NFSv4.1 server can be recovered without data loss from multiple | |||
power failures (including cascading power failures, that is, | power failures (including cascading power failures, that is, | |||
several power failures in quick succession), operating system | several power failures in quick succession), operating system | |||
failures, and/or hardware failure of components other than the | failures, and/or hardware failure of components other than the | |||
storage medium itself (such as disk, nonvolatile RAM, flash | storage medium itself (such as disk, nonvolatile RAM, flash | |||
skipping to change at line 755 | skipping to change at line 755 | |||
application-specific data with a regular file or directory. NFSv4.1 | application-specific data with a regular file or directory. NFSv4.1 | |||
modifies named attributes relative to NFSv4.0 by tightening the | modifies named attributes relative to NFSv4.0 by tightening the | |||
allowed operations in order to prevent the development of non- | allowed operations in order to prevent the development of non- | |||
interoperable implementations. Named attributes are discussed in | interoperable implementations. Named attributes are discussed in | |||
Section 5.3. | Section 5.3. | |||
1.8.3.3. Multi-Server Namespace | 1.8.3.3. Multi-Server Namespace | |||
NFSv4.1 contains a number of features to allow implementation of | NFSv4.1 contains a number of features to allow implementation of | |||
namespaces that cross server boundaries and that allow and facilitate | namespaces that cross server boundaries and that allow and facilitate | |||
a non-disruptive transfer of support for individual file systems | a nondisruptive transfer of support for individual file systems | |||
between servers. They are all based upon attributes that allow one | between servers. They are all based upon attributes that allow one | |||
file system to specify alternate, additional, and new location | file system to specify alternate, additional, and new location | |||
information that specifies how the client may access that file | information that specifies how the client may access that file | |||
system. | system. | |||
These attributes can be used to provide for individual active file | These attributes can be used to provide for individual active file | |||
systems: | systems: | |||
* Alternate network addresses to access the current file system | * Alternate network addresses to access the current file system | |||
instance. | instance. | |||
skipping to change at line 783 | skipping to change at line 783 | |||
namespace is associated with locations on other servers without there | namespace is associated with locations on other servers without there | |||
being any corresponding file system instance on the current server. | being any corresponding file system instance on the current server. | |||
For example, | For example, | |||
* These attributes may be used with absent file systems to implement | * These attributes may be used with absent file systems to implement | |||
referrals whereby one server may direct the client to a file | referrals whereby one server may direct the client to a file | |||
system provided by another server. This allows extensive multi- | system provided by another server. This allows extensive multi- | |||
server namespaces to be constructed. | server namespaces to be constructed. | |||
* These attributes may be provided when a previously present file | * These attributes may be provided when a previously present file | |||
system becomes absent. This allows non-disruptive migration of | system becomes absent. This allows nondisruptive migration of | |||
file systems to alternate servers. | file systems to alternate servers. | |||
1.8.4. Locking Facilities | 1.8.4. Locking Facilities | |||
As mentioned previously, NFSv4.1 is a single protocol that includes | As mentioned previously, NFSv4.1 is a single protocol that includes | |||
locking facilities. These locking facilities include support for | locking facilities. These locking facilities include support for | |||
many types of locks including a number of sorts of recallable locks. | many types of locks including a number of sorts of recallable locks. | |||
Recallable locks such as delegations allow the client to be assured | Recallable locks such as delegations allow the client to be assured | |||
that certain events will not occur so long as that lock is held. | that certain events will not occur so long as that lock is held. | |||
When circumstances change, the lock is recalled via a callback | When circumstances change, the lock is recalled via a callback | |||
skipping to change at line 908 | skipping to change at line 908 | |||
forms of RPC authentication, AUTH_SYS, had no strong authentication | forms of RPC authentication, AUTH_SYS, had no strong authentication | |||
and required a host-based authentication approach. NFSv4.1 also | and required a host-based authentication approach. NFSv4.1 also | |||
depends on RPC for basic security services and mandates RPC support | depends on RPC for basic security services and mandates RPC support | |||
for a user-based authentication model. The user-based authentication | for a user-based authentication model. The user-based authentication | |||
model has user principals authenticated by a server, and in turn the | model has user principals authenticated by a server, and in turn the | |||
server authenticated by user principals. RPC provides some basic | server authenticated by user principals. RPC provides some basic | |||
security services that are used by NFSv4.1. | security services that are used by NFSv4.1. | |||
2.2.1.1. RPC Security Flavors | 2.2.1.1. RPC Security Flavors | |||
As described in "Authentication", Section 7.2 of [3], RPC security is | As described in "Authentication", Section 7 of [3], RPC security is | |||
encapsulated in the RPC header, via a security or authentication | encapsulated in the RPC header, via a security or authentication | |||
flavor, and information specific to the specified security flavor. | flavor, and information specific to the specified security flavor. | |||
Every RPC header conveys information used to identify and | Every RPC header conveys information used to identify and | |||
authenticate a client and server. As discussed in Section 2.2.1.1.1, | authenticate a client and server. As discussed in Section 2.2.1.1.1, | |||
some security flavors provide additional security services. | some security flavors provide additional security services. | |||
NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This | NFSv4.1 clients and servers MUST implement RPCSEC_GSS. (This | |||
requirement to implement is not a requirement to use.) Other | requirement to implement is not a requirement to use.) Other | |||
flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well. | flavors, such as AUTH_NONE and AUTH_SYS, MAY be implemented as well. | |||
skipping to change at line 1126 | skipping to change at line 1126 | |||
Client identification is encapsulated in the following client owner | Client identification is encapsulated in the following client owner | |||
data type: | data type: | |||
struct client_owner4 { | struct client_owner4 { | |||
verifier4 co_verifier; | verifier4 co_verifier; | |||
opaque co_ownerid<NFS4_OPAQUE_LIMIT>; | opaque co_ownerid<NFS4_OPAQUE_LIMIT>; | |||
}; | }; | |||
The first field, co_verifier, is a client incarnation verifier, | The first field, co_verifier, is a client incarnation verifier, | |||
allowing the server to distinguish successive incarnations (e.g. | allowing the server to distinguish successive incarnations (e.g., | |||
reboots) of the same client. The server will start the process of | reboots) of the same client. The server will start the process of | |||
canceling the client's leased state if co_verifier is different than | canceling the client's leased state if co_verifier is different than | |||
what the server has previously recorded for the identified client (as | what the server has previously recorded for the identified client (as | |||
specified in the co_ownerid field). | specified in the co_ownerid field). | |||
The second field, co_ownerid, is a variable length string that | The second field, co_ownerid, is a variable length string that | |||
uniquely defines the client so that subsequent instances of the same | uniquely defines the client so that subsequent instances of the same | |||
client bear the same co_ownerid with a different verifier. | client bear the same co_ownerid with a different verifier. | |||
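
The roles of the two client_owner4 fields can be illustrated with a minimal server-side sketch. It assumes a simple in-memory table keyed by co_ownerid; the names `ClientRecord`, `ClientTable`, and `identify` are hypothetical, and clearing the state immediately is a simplification of "start the process of canceling". Any further conditions a real server applies are outside this excerpt and are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ClientRecord:
    verifier: bytes                                    # last recorded co_verifier
    leased_state: list = field(default_factory=list)   # placeholder for locks, opens, etc.


class ClientTable:
    """Hypothetical table mapping co_ownerid to the recorded client state."""

    def __init__(self):
        self._records = {}

    def identify(self, co_ownerid: bytes, co_verifier: bytes) -> str:
        record = self._records.get(co_ownerid)
        if record is None:
            # First time this co_ownerid is seen: record its verifier.
            self._records[co_ownerid] = ClientRecord(co_verifier)
            return "new client"
        if record.verifier != co_verifier:
            # Same co_ownerid, different verifier: a new incarnation
            # (e.g., a reboot). Simplification: drop leased state at once.
            record.leased_state.clear()
            record.verifier = co_verifier
            return "new incarnation"
        # Same co_ownerid and same verifier: the same incarnation.
        return "same incarnation"

    def state_of(self, co_ownerid: bytes) -> list:
        return self._records[co_ownerid].leased_state
```

For example, a client that reboots would present the same co_ownerid with a fresh co_verifier, and the sketch would discard whatever state was recorded for the previous incarnation.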
There are several considerations for how the client generates the | There are several considerations for how the client generates the | |||
skipping to change at line 2055 | skipping to change at line 2055 | |||
The backchannel is used for callback requests from server to client, | The backchannel is used for callback requests from server to client, | |||
and carries CB_COMPOUND requests and responses. Whether or not there | and carries CB_COMPOUND requests and responses. Whether or not there | |||
is a backchannel is decided by the client; however, many features of | is a backchannel is decided by the client; however, many features of | |||
NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | NFSv4.1 require a backchannel. NFSv4.1 servers MUST support | |||
backchannels. | backchannels. | |||
Each session has resources for each channel, including separate reply | Each session has resources for each channel, including separate reply | |||
caches (see Section 2.10.6.1). Note that even the backchannel | caches (see Section 2.10.6.1). Note that even the backchannel | |||
requires a reply cache (or, at least, a slot table in order to detect | requires a reply cache (or, at least, a slot table in order to detect | |||
retries) because some callback operations are nonidempotent. | retries) because some callback operations are non-idempotent. | |||
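A minimal sketch of retry detection with a slot table follows. It assumes the per-slot sequence rule described in the reply cache section of the specification: a request that repeats the slot's last sequence ID is a retry, and a request carrying the next sequence ID is new. The class SlotTable, its method names, and the use of ValueError for a misordered request are hypothetical choices for illustration.

```python
class SlotTable:
    """Hypothetical per-channel slot table that remembers the last reply per slot."""

    def __init__(self, slots: int):
        self.seqid = [0] * slots       # last sequence ID executed on each slot
        self.reply = [None] * slots    # cached reply for that sequence ID

    def handle(self, slot: int, seqid: int, execute):
        """Return (reply, was_retry); `execute` runs the request at most once."""
        last = self.seqid[slot]
        if seqid == last and self.reply[slot] is not None:
            # Retry of the last request: answer from the cache, do not re-execute.
            return self.reply[slot], True
        if seqid == (last + 1) % 2**32:
            result = execute()
            self.seqid[slot] = seqid
            self.reply[slot] = result
            return result, False
        raise ValueError("misordered sequence id")
```

Because a retried request is answered from the cache, a non-idempotent callback operation is executed only once even if the request is sent twice.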
2.10.3.1. Association of Connections, Channels, and Sessions | 2.10.3.1. Association of Connections, Channels, and Sessions | |||
Each channel is associated with zero or more transport connections | Each channel is associated with zero or more transport connections | |||
(whether of the same transport protocol or different transport | (whether of the same transport protocol or different transport | |||
protocols). A connection can be associated with one channel or both | protocols). A connection can be associated with one channel or both | |||
channels of a session; the client and server negotiate whether a | channels of a session; the client and server negotiate whether a | |||
connection will carry traffic for one channel or both channels via | connection will carry traffic for one channel or both channels via | |||
the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION | the CREATE_SESSION (Section 18.36) and the BIND_CONN_TO_SESSION | |||
(Section 18.34) operations. When a session is created via | (Section 18.34) operations. When a session is created via | |||
skipping to change at line 2140 ¶ | skipping to change at line 2140 ¶ | |||
implementation, but this can be tailored to the specific situations | implementation, but this can be tailored to the specific situations | |||
in which that recognition is desired. | in which that recognition is desired. | |||
Clients will have occasion to compare the server scope values of | Clients will have occasion to compare the server scope values of | |||
multiple servers under a number of circumstances, each of which will | multiple servers under a number of circumstances, each of which will | |||
be discussed under the appropriate functional section: | be discussed under the appropriate functional section: | |||
* When server owner values received in response to EXCHANGE_ID | * When server owner values received in response to EXCHANGE_ID | |||
operations sent to multiple network addresses are compared for the | operations sent to multiple network addresses are compared for the | |||
purpose of determining the validity of various forms of trunking, | purpose of determining the validity of various forms of trunking, | |||
as described in Section 11.5.2. . | as described in Section 11.5.2. | |||
* When network or server reconfiguration causes the same network | * When network or server reconfiguration causes the same network | |||
address to possibly be directed to different servers, with the | address to possibly be directed to different servers, with the | |||
necessity for the client to determine when lock reclaim should be | necessity for the client to determine when lock reclaim should be | |||
attempted, as described in Section 8.4.2.1. | attempted, as described in Section 8.4.2.1. | |||
When two replies from EXCHANGE_ID, each from two different server | When two replies from EXCHANGE_ID, each from two different server | |||
network addresses, have the same server scope, there are a number of | network addresses, have the same server scope, there are a number of | |||
ways a client can validate that the common server scope is due to two | ways a client can validate that the common server scope is due to two | |||
servers cooperating in a group. | servers cooperating in a group. | |||
skipping to change at line 2184 ¶ | skipping to change at line 2184 ¶ | |||
system involved (e.g. a file system being migrated). | system involved (e.g. a file system being migrated). | |||
2.10.5. Trunking | 2.10.5. Trunking | |||
Trunking is the use of multiple connections between a client and | Trunking is the use of multiple connections between a client and | |||
server in order to increase the speed of data transfer. NFSv4.1 | server in order to increase the speed of data transfer. NFSv4.1 | |||
supports two types of trunking: session trunking and client ID | supports two types of trunking: session trunking and client ID | |||
trunking. | trunking. | |||
In the context of a single server network address, it can be assumed | In the context of a single server network address, it can be assumed | |||
that all connections are accessing the same server and NFSv4.1 | that all connections are accessing the same server, and NFSv4.1 | |||
servers MUST support both forms of trunking. When multiple | servers MUST support both forms of trunking. When multiple | |||
connections use a set of network addresses accessing the same server, | connections use a set of network addresses to access the same server, | |||
the server MUST support both forms of trunking. NFSv4.1 servers in a | the server MUST support both forms of trunking. NFSv4.1 servers in a | |||
clustered configuration MAY allow network addresses for different | clustered configuration MAY allow network addresses for different | |||
servers to use client ID trunking. | servers to use client ID trunking. | |||
Clients may use either form of trunking as long as they do not, when | Clients may use either form of trunking as long as they do not, when | |||
trunking between different server network addresses, violate the | trunking between different server network addresses, violate the | |||
servers' mandates as to the kinds of trunking to be allowed (see | servers' mandates as to the kinds of trunking to be allowed (see | |||
below). With regard to callback channels, the client MUST allow the | below). With regard to callback channels, the client MUST allow the | |||
server to choose among all callback channels valid for a given client | server to choose among all callback channels valid for a given client | |||
ID and MUST support trunking when the connections supporting the | ID and MUST support trunking when the connections supporting the | |||
skipping to change at line 2278 ¶ | skipping to change at line 2278 ¶ | |||
When doing client ID trunking, locking state is shared across | When doing client ID trunking, locking state is shared across | |||
sessions associated with that same client ID. This requires the | sessions associated with that same client ID. This requires the | |||
server to coordinate state across sessions and the client to be | server to coordinate state across sessions and the client to be | |||
able to associate the same locking state with multiple sessions. | able to associate the same locking state with multiple sessions. | |||
It is always possible that, as a result of various sorts of | It is always possible that, as a result of various sorts of | |||
reconfiguration events, eir_server_scope and eir_server_owner values | reconfiguration events, eir_server_scope and eir_server_owner values | |||
may be different on subsequent EXCHANGE_ID requests made to the same | may be different on subsequent EXCHANGE_ID requests made to the same | |||
network address. | network address. | |||
In most cases such reconfiguration events will be disruptive and | In most cases, such reconfiguration events will be disruptive and | |||
indicate that an IP address formerly connected to one server is now | indicate that an IP address formerly connected to one server is now | |||
connected to an entirely different one. | connected to an entirely different one. | |||
Some guidelines on client handling of such situations follow: | Some guidelines on client handling of such situations follow: | |||
* When eir_server_scope changes, the client has no assurance that | * When eir_server_scope changes, the client has no assurance that | |||
any id's it obtained previously (e.g. file handles) can be validly | any IDs that it obtained previously (e.g., filehandles) can be | |||
used on the new server, and, even if the new server accepts them, | validly used on the new server, and, even if the new server | |||
there is no assurance that this is not due to accident. Thus, it | accepts them, there is no assurance that this is not due to | |||
is best to treat all such state as lost/stale although a client | accident. Thus, it is best to treat all such state as lost or | |||
may assume that the probability of inadvertent acceptance is low | stale, although a client may assume that the probability of | |||
and treat this situation as within the next case. | inadvertent acceptance is low and treat this situation as within | |||
the next case. | ||||
* When eir_server_scope remains the same and | * When eir_server_scope remains the same and | |||
eir_server_owner.so_major_id changes, the client can use the | eir_server_owner.so_major_id changes, the client can use the | |||
filehandles it has, consider its locking state lost, and attempt | filehandles it has, consider its locking state lost, and attempt | |||
to reclaim or otherwise re-obtain its locks. It might find that | to reclaim or otherwise re-obtain its locks. It might find that | |||
its file handle is now stale. However, if NFS4ERR_STALE is not | its filehandle is now stale. However, if NFS4ERR_STALE is not | |||
returned, it can proceed to reclaim or otherwise re-obtain its | returned, it can proceed to reclaim or otherwise re-obtain its | |||
open locking state. | open locking state. | |||
* When eir_server_scope and eir_server_owner.so_major_id remain the | * When eir_server_scope and eir_server_owner.so_major_id remain the | |||
same, the client has to use the now-current values of | same, the client has to use the now-current values of | |||
eir_server_owner.so_minor_id in deciding on appropriate forms of | eir_server_owner.so_minor_id in deciding on appropriate forms of | |||
trunking. This may result in connections being dropped or new | trunking. This may result in connections being dropped or new | |||
sessions being created. | sessions being created. | |||
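The three guidelines above can be sketched as a single decision function. This is an illustrative Python sketch only: the type ServerIdentity, the function reaction_to_change, and the returned labels are hypothetical names, and the first case takes the conservative option of treating all state as lost or stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerIdentity:
    scope: bytes     # eir_server_scope
    major_id: bytes  # eir_server_owner.so_major_id
    minor_id: int    # eir_server_owner.so_minor_id


def reaction_to_change(old: ServerIdentity, new: ServerIdentity) -> str:
    """Map a change seen on a later EXCHANGE_ID to the guideline that applies."""
    if old == new:
        return "no_change"
    if old.scope != new.scope:
        # Previously obtained IDs (e.g., filehandles) carry no assurance.
        return "treat_all_state_as_lost_or_stale"
    if old.major_id != new.major_id:
        # Filehandles stay usable; locking state is reclaimed or re-obtained.
        return "keep_filehandles_reclaim_locks"
    # Scope and major ID are unchanged, so only so_minor_id differs.
    return "reevaluate_trunking_with_minor_id"
```

For example, comparing ServerIdentity(b"s", b"m", 1) with ServerIdentity(b"s", b"m", 2) selects the third guideline, since only so_minor_id differs.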
2.10.5.1. Verifying Claims of Matching Server Identity | 2.10.5.1. Verifying Claims of Matching Server Identity | |||
When the server responds using two different connections claiming | When the server responds using two different connections that claim | |||
matching or partially matching eir_server_owner, eir_server_scope, | matching or partially matching eir_server_owner, eir_server_scope, | |||
and eir_clientid values, the client does not have to trust the | and eir_clientid values, the client does not have to trust the | |||
servers' claims. The client may verify these claims before trunking | servers' claims. The client may verify these claims before trunking | |||
traffic in the following ways: | traffic in the following ways: | |||
* For session trunking, clients SHOULD reliably verify if | * For session trunking, clients SHOULD reliably verify if | |||
connections between different network paths are in fact associated | connections between different network paths are in fact associated | |||
with the same NFSv4.1 server and usable on the same session, and | with the same NFSv4.1 server and usable on the same session, and | |||
servers MUST allow clients to perform reliable verification. When | servers MUST allow clients to perform reliable verification. When | |||
a client ID is created, the client SHOULD specify that | a client ID is created, the client SHOULD specify that | |||
skipping to change at line 4138 ¶ | skipping to change at line 4139 ¶ | |||
+===============+==============================================+ | +===============+==============================================+ | |||
| int32_t | typedef int int32_t; | | | int32_t | typedef int int32_t; | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| uint32_t | typedef unsigned int uint32_t; | | | uint32_t | typedef unsigned int uint32_t; | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| int64_t | typedef hyper int64_t; | | | int64_t | typedef hyper int64_t; | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| uint64_t | typedef unsigned hyper uint64_t; | | | uint64_t | typedef unsigned hyper uint64_t; | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| attrlist4 | typedef opaque attrlist4<>; | | | attrlist4 | typedef opaque attrlist4<>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Used for file/directory attributes. | | | | Used for file/directory attributes. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| bitmap4 | typedef uint32_t bitmap4<>; | | | bitmap4 | typedef uint32_t bitmap4<>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Used in attribute array encoding. | | | | Used in attribute array encoding. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| changeid4 | typedef uint64_t changeid4; | | | changeid4 | typedef uint64_t changeid4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Used in the definition of change_info4. | | | | Used in the definition of change_info4. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| clientid4 | typedef uint64_t clientid4; | | | clientid4 | typedef uint64_t clientid4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Shorthand reference to client | | | | Shorthand reference to client | | |||
| | identification. | | | | identification. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| count4 | typedef uint32_t count4; | | | count4 | typedef uint32_t count4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Various count parameters (READ, WRITE, | | | | Various count parameters (READ, WRITE, | | |||
| | COMMIT). | | | | COMMIT). | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| length4 | typedef uint64_t length4; | | | length4 | typedef uint64_t length4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | The length of a byte-range within a file. | | | | The length of a byte-range within a file. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| mode4 | typedef uint32_t mode4; | | | mode4 | typedef uint32_t mode4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Mode attribute data type. | | | | Mode attribute data type. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| nfs_cookie4 | typedef uint64_t nfs_cookie4; | | | nfs_cookie4 | typedef uint64_t nfs_cookie4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Opaque cookie value for READDIR. | | | | Opaque cookie value for READDIR. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; | | | nfs_fh4 | typedef opaque nfs_fh4<NFS4_FHSIZE>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Filehandle definition. | | | | Filehandle definition. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| nfs_ftype4 | enum nfs_ftype4; | | | nfs_ftype4 | enum nfs_ftype4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Various defined file types. | | | | Various defined file types. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| nfsstat4 | enum nfsstat4; | | | nfsstat4 | enum nfsstat4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Return value for operations. | | | | Return value for operations. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| offset4 | typedef uint64_t offset4; | | | offset4 | typedef uint64_t offset4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Various offset designations (READ, WRITE, | | | | Various offset designations (READ, WRITE, | | |||
| | LOCK, COMMIT). | | | | LOCK, COMMIT). | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| qop4 | typedef uint32_t qop4; | | | qop4 | typedef uint32_t qop4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Quality of protection designation in | | | | Quality of protection designation in | | |||
| | SECINFO. | | | | SECINFO. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| sec_oid4 | typedef opaque sec_oid4<>; | | | sec_oid4 | typedef opaque sec_oid4<>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Security Object Identifier. The sec_oid4 | | | | Security Object Identifier. The sec_oid4 | | |||
| | data type is not really opaque. Instead, it | | | | data type is not really opaque. Instead, it | | |||
| | contains an ASN.1 OBJECT IDENTIFIER as used | | | | contains an ASN.1 OBJECT IDENTIFIER as used | | |||
| | by GSS-API in the mech_type argument to | | | | by GSS-API in the mech_type argument to | | |||
| | GSS_Init_sec_context. See [7] for details. | | | | GSS_Init_sec_context. See [7] for details. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| sequenceid4 | typedef uint32_t sequenceid4; | | | sequenceid4 | typedef uint32_t sequenceid4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Sequence number used for various session | | | | Sequence number used for various session | | |||
| | operations (EXCHANGE_ID, CREATE_SESSION, | | | | operations (EXCHANGE_ID, CREATE_SESSION, | | |||
| | SEQUENCE, CB_SEQUENCE). | | | | SEQUENCE, CB_SEQUENCE). | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| seqid4 | typedef uint32_t seqid4; | | | seqid4 | typedef uint32_t seqid4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Sequence identifier used for locking. | | | | Sequence identifier used for locking. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| sessionid4 | typedef opaque | | | sessionid4 | typedef opaque | | |||
| | sessionid4[NFS4_SESSIONID_SIZE]; | | | | sessionid4[NFS4_SESSIONID_SIZE]; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Session identifier. | | | | Session identifier. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| slotid4 | typedef uint32_t slotid4; | | | slotid4 | typedef uint32_t slotid4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Sequencing artifact for various session | | | | Sequencing artifact for various session | | |||
| | operations (SEQUENCE, CB_SEQUENCE). | | | | operations (SEQUENCE, CB_SEQUENCE). | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| utf8string | typedef opaque utf8string<>; | | | utf8string | typedef opaque utf8string<>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | UTF-8 encoding for strings. | | | | UTF-8 encoding for strings. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| utf8str_cis | typedef utf8string utf8str_cis; | | | utf8str_cis | typedef utf8string utf8str_cis; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Case-insensitive UTF-8 string. | | | | Case-insensitive UTF-8 string. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| utf8str_cs | typedef utf8string utf8str_cs; | | | utf8str_cs | typedef utf8string utf8str_cs; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Case-sensitive UTF-8 string. | | | | Case-sensitive UTF-8 string. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| utf8str_mixed | typedef utf8string utf8str_mixed; | | | utf8str_mixed | typedef utf8string utf8str_mixed; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | UTF-8 strings with a case-sensitive prefix | | | | UTF-8 strings with a case-sensitive prefix | | |||
| | and a case-insensitive suffix. | | | | and a case-insensitive suffix. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| component4 | typedef utf8str_cs component4; | | | component4 | typedef utf8str_cs component4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Represents pathname components. | | | | Represents pathname components. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| linktext4 | typedef utf8str_cs linktext4; | | | linktext4 | typedef utf8str_cs linktext4; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Symbolic link contents ("symbolic link" is | | | | Symbolic link contents ("symbolic link" is | | |||
| | defined in an Open Group [11] standard). | | | | defined in an Open Group [11] standard). | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| pathname4 | typedef component4 pathname4<>; | | | pathname4 | typedef component4 pathname4<>; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Represents pathname for fs_locations. | | | | Represents pathname for fs_locations. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
| verifier4 | typedef opaque | | | verifier4 | typedef opaque | | |||
| | verifier4[NFS4_VERIFIER_SIZE]; | | | | verifier4[NFS4_VERIFIER_SIZE]; | | |||
+---------------+----------------------------------------------+ | | | | | |||
| | Verifier used for various operations | | | | Verifier used for various operations | | |||
| | (COMMIT, CREATE, EXCHANGE_ID, OPEN, READDIR, | | | | (COMMIT, CREATE, EXCHANGE_ID, OPEN, READDIR, | | |||
| | WRITE) NFS4_VERIFIER_SIZE is defined as 8. | | | | WRITE) NFS4_VERIFIER_SIZE is defined as 8. | | |||
+---------------+----------------------------------------------+ | +---------------+----------------------------------------------+ | |||
Table 1 | Table 1 | |||
End of Base Data Types | End of Base Data Types | |||
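To make the bitmap4 entry in Table 1 more concrete, the following Python sketch builds a bitmap4 from attribute numbers and encodes it as an XDR variable-length array of uint32_t values. It assumes the bit placement described elsewhere in the specification, in which bit n lives in array element n / 32 at bit position n mod 32. The function names are hypothetical.

```python
import struct


def attrs_to_bitmap4(attr_ids):
    """Return the list of 32-bit words with one bit set per attribute number."""
    words = []
    for attr in attr_ids:
        word, bit = divmod(attr, 32)
        while len(words) <= word:
            words.append(0)
        words[word] |= 1 << bit
    return words


def xdr_encode_bitmap4(words):
    """Encode as an XDR counted array: 4-byte big-endian count, then each word."""
    return struct.pack(">I", len(words)) + b"".join(
        struct.pack(">I", w) for w in words
    )
```

With attribute numbers 0, 1, and 53, the first word holds bits 0 and 1 and the second word holds bit 21, because 53 = 32 + 21.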
3.3. Structured Data Types | 3.3. Structured Data Types | |||
skipping to change at line 5152 ¶ | skipping to change at line 5153 ¶ | |||
REQUIRED and RECOMMENDED attributes are get-only; i.e., they can be | REQUIRED and RECOMMENDED attributes are get-only; i.e., they can be | |||
retrieved via GETATTR but not set via SETATTR. If a client attempts | retrieved via GETATTR but not set via SETATTR. If a client attempts | |||
to set a get-only attribute or get a set-only attribute, the server | to set a get-only attribute or get a set-only attribute, the server | |||
MUST return NFS4ERR_INVAL. | MUST return NFS4ERR_INVAL. | |||
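The get-only and set-only rule can be sketched as a small lookup. The Python sketch below is illustrative only: the table ATTR_ACCESS and the function check_attr_access are hypothetical, the access codes R and W follow the Acc column used in the attribute tables, and only a few attributes are listed.

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22

# Access codes for a few attributes, as given in the Acc column.
ATTR_ACCESS = {
    "supported_attrs": "R",
    "type": "R",
    "time_modify": "R",
    "time_modify_set": "W",
}


def check_attr_access(name: str, operation: str) -> int:
    """Return NFS4_OK if `operation` may touch the attribute, else NFS4ERR_INVAL."""
    if operation not in ("GETATTR", "SETATTR"):
        raise ValueError("operation must be GETATTR or SETATTR")
    needed = "R" if operation == "GETATTR" else "W"
    # "R W" attributes contain both letters and so pass either check.
    return NFS4_OK if needed in ATTR_ACCESS[name] else NFS4ERR_INVAL
```

For instance, a SETATTR on time_modify and a GETATTR on time_modify_set both yield NFS4ERR_INVAL in this sketch, while the reverse pairings succeed.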
5.6. REQUIRED Attributes - List and Definition References | 5.6. REQUIRED Attributes - List and Definition References | |||
The list of REQUIRED attributes appears in Table 4. The meaning of | The list of REQUIRED attributes appears in Table 4. The meaning of | |||
the columns of the table are: | the columns of the table are: | |||
* Name: The name of the attribute. | Name: The name of the attribute. | |||
* Id: The number assigned to the attribute. In the event of | Id: The number assigned to the attribute. In the event of conflicts | |||
conflicts between the assigned number and [10], the latter is | between the assigned number and [10], the latter is likely | |||
likely authoritative, but should be resolved with Errata to this | authoritative, but should be resolved with Errata to this document | |||
document and/or [10]. See [50] for the Errata process. | and/or [10]. See [50] for the Errata process. | |||
* Data Type: The XDR data type of the attribute. | Data Type: The XDR data type of the attribute. | |||
* Acc: Access allowed to the attribute. R means read-only (GETATTR | Acc: Access allowed to the attribute. R means read-only (GETATTR | |||
may retrieve, SETATTR may not set). W means write-only (SETATTR | may retrieve, SETATTR may not set). W means write-only (SETATTR | |||
may set, GETATTR may not retrieve). R W means read/write (GETATTR | may set, GETATTR may not retrieve). R W means read/write (GETATTR | |||
may retrieve, SETATTR may set). | may retrieve, SETATTR may set). | |||
* Defined in: The section of this specification that describes the | Defined in: The section of this specification that describes the | |||
attribute. | attribute. | |||
+====================+====+============+=====+==================+ | +====================+====+============+=====+==================+ | |||
| Name | Id | Data Type | Acc | Defined in: | | | Name | Id | Data Type | Acc | Defined in: | | |||
+====================+====+============+=====+==================+ | +====================+====+============+=====+==================+ | |||
| supported_attrs | 0 | bitmap4 | R | Section 5.8.1.1 | | | supported_attrs | 0 | bitmap4 | R | Section 5.8.1.1 | | |||
+--------------------+----+------------+-----+------------------+ | +--------------------+----+------------+-----+------------------+ | |||
| type | 1 | nfs_ftype4 | R | Section 5.8.1.2 | | | type | 1 | nfs_ftype4 | R | Section 5.8.1.2 | | |||
+--------------------+----+------------+-----+------------------+ | +--------------------+----+------------+-----+------------------+ | |||
| fh_expire_type | 2 | uint32_t | R | Section 5.8.1.3 | | | fh_expire_type | 2 | uint32_t | R | Section 5.8.1.3 | | |||
skipping to change at line 5373 ¶ | skipping to change at line 5374 ¶ | |||
+--------------------+----+----------------+-----+------------------+ | +--------------------+----+----------------+-----+------------------+ | |||
| time_metadata | 52 | nfstime4 | R | Section | | | time_metadata | 52 | nfstime4 | R | Section | | |||
| | | | | 5.8.2.42 | | | | | | | 5.8.2.42 | | |||
+--------------------+----+----------------+-----+------------------+ | +--------------------+----+----------------+-----+------------------+ | |||
| time_modify | 53 | nfstime4 | R | Section | | | time_modify | 53 | nfstime4 | R | Section | | |||
| | | | | 5.8.2.43 | | | | | | | 5.8.2.43 | | |||
+--------------------+----+----------------+-----+------------------+ | +--------------------+----+----------------+-----+------------------+ | |||
| time_modify_set | 54 | settime4 | W | Section | | | time_modify_set | 54 | settime4 | W | Section | | |||
| | | | | 5.8.2.44 | | | | | | | 5.8.2.44 | | |||
+--------------------+----+----------------+-----+------------------+ | +--------------------+----+----------------+-----+------------------+ | |||
| * fs_locations_info4 | | ||||
+-------------------------------------------------------------------+ | ||||
Table 5 | Table 5 | |||
* fs_locations_info4 | ||||
5.8. Attribute Definitions | 5.8. Attribute Definitions | |||
5.8.1. Definitions of REQUIRED Attributes | 5.8.1. Definitions of REQUIRED Attributes | |||
5.8.1.1. Attribute 0: supported_attrs | 5.8.1.1. Attribute 0: supported_attrs | |||
The bit vector that would retrieve all REQUIRED and RECOMMENDED | The bit vector that would retrieve all REQUIRED and RECOMMENDED | |||
attributes that are supported for this object. The scope of this | attributes that are supported for this object. The scope of this | |||
attribute applies to all objects with a matching fsid. | attribute applies to all objects with a matching fsid. | |||
skipping to change at line 8728 ¶ | skipping to change at line 8729 ¶ | |||
within the lease period, it is up to the client to determine which | within the lease period, it is up to the client to determine which | |||
locks have been revoked and which have not. It does this by using | locks have been revoked and which have not. It does this by using | |||
the TEST_STATEID operation on the appropriate set of stateids. Once | the TEST_STATEID operation on the appropriate set of stateids. Once | |||
the set of revoked locks has been determined, the applications can be | the set of revoked locks has been determined, the applications can be | |||
notified, and the invalidated stateids can be freed and lock | notified, and the invalidated stateids can be freed and lock | |||
revocation acknowledged by using FREE_STATEID. | revocation acknowledged by using FREE_STATEID. | |||
8.6. Short and Long Leases | 8.6. Short and Long Leases | |||
When determining the time period for the server lease, the usual | When determining the time period for the server lease, the usual | |||
lease tradeoffs apply. A short lease is good for fast server | lease trade-offs apply. A short lease is good for fast server | |||
recovery at a cost of increased operations to effect lease renewal | recovery at a cost of increased operations to effect lease renewal | |||
(when there are no other operations during the period to effect lease | (when there are no other operations during the period to effect lease | |||
renewal as a side effect). A long lease is certainly kinder and | renewal as a side effect). A long lease is certainly kinder and | |||
gentler to servers trying to handle very large numbers of clients. | gentler to servers trying to handle very large numbers of clients. | |||
The number of extra requests to effect lock renewal drops in inverse | The number of extra requests to effect lock renewal drops in inverse | |||
proportion to the lease time. The disadvantages of a long lease | proportion to the lease time. The disadvantages of a long lease | |||
include the possibility of slower recovery after certain failures. | include the possibility of slower recovery after certain failures. | |||
After server failure, a longer grace period may be required when some | After server failure, a longer grace period may be required when some | |||
clients do not promptly reclaim their locks and do a global | clients do not promptly reclaim their locks and do a global | |||
RECLAIM_COMPLETE. In the event of client failure, the longer period | RECLAIM_COMPLETE. In the event of client failure, the longer period | |||
skipping to change at line 10901 ¶ | skipping to change at line 10902 ¶ | |||
protected by OPEN_DELEGATE_READ delegations and notifications. Thus, | protected by OPEN_DELEGATE_READ delegations and notifications. Thus, | |||
no provision is made for reclaiming directory delegations in the | no provision is made for reclaiming directory delegations in the | |||
event of client or server restart. The client can simply establish a | event of client or server restart. The client can simply establish a | |||
directory delegation in the same fashion as was done initially. | directory delegation in the same fashion as was done initially. | |||
11. Multi-Server Namespace | 11. Multi-Server Namespace | |||
NFSv4.1 supports attributes that allow a namespace to extend beyond | NFSv4.1 supports attributes that allow a namespace to extend beyond | |||
the boundaries of a single server. It is desirable that clients and | the boundaries of a single server. It is desirable that clients and | |||
servers support construction of such multi-server namespaces. Use of | servers support construction of such multi-server namespaces. Use of | |||
such multi-server namespaces is OPTIONAL however, and for many | such multi-server namespaces is OPTIONAL, however, and for many | |||
purposes, single-server namespaces are perfectly acceptable. Use of | purposes, single-server namespaces are perfectly acceptable. Use of | |||
multi-server namespaces can provide many advantages, by separating a | multi-server namespaces can provide many advantages, by separating a | |||
file system's logical position in a namespace from the (possibly | file system's logical position in a namespace from the (possibly | |||
changing) logistical and administrative considerations that result in | changing) logistical and administrative considerations that result in | |||
particular file systems being located on particular servers via a | particular file systems being located on particular servers via a | |||
single network access paths known in advance or determined using DNS. | single network access path known in advance or determined using DNS. | |||
11.1. Terminology | 11.1. Terminology | |||
In this section as a whole (i.e. within all of Section 11), the | In this section as a whole (i.e., within all of Section 11), the | |||
phrase "client ID" always refers to the 64-bit shorthand identifier | phrase "client ID" always refers to the 64-bit shorthand identifier | |||
assigned by the server (a clientid4) and never to the structure which | assigned by the server (a clientid4) and never to the structure that | |||
the client uses to identify itself to the server (called an | the client uses to identify itself to the server (called an | |||
nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1 respectively). | nfs_client_id4 or client_owner in NFSv4.0 and NFSv4.1, respectively). | |||
The opaque identifier within those structures is referred to as a | The opaque identifier within those structures is referred to as a | |||
"client id string". | "client id string". | |||
11.1.1. Terminology Related to Trunking | 11.1.1. Terminology Related to Trunking | |||
It is particularly important to clarify the distinction between | It is particularly important to clarify the distinction between | |||
trunking detection and trunking discovery. The definitions we | trunking detection and trunking discovery. The definitions we | |||
present are applicable to all minor versions of NFSv4, but we will | present are applicable to all minor versions of NFSv4, but we will | |||
focus on how these terms apply to NFS version 4.1. | focus on how these terms apply to NFS version 4.1. | |||
skipping to change at line 10937 ¶ | skipping to change at line 10938 ¶ | |||
network addresses are connected to the same NFSv4 server. The | network addresses are connected to the same NFSv4 server. The | |||
means available to make this determination depends on the protocol | means available to make this determination depends on the protocol | |||
version, and, in some cases, on the client implementation. | version, and, in some cases, on the client implementation. | |||
In the case of NFS version 4.1 and later minor versions, the means | In the case of NFS version 4.1 and later minor versions, the means | |||
of trunking detection are as described in this document and are | of trunking detection are as described in this document and are | |||
available to every client. Two network addresses connected to the | available to every client. Two network addresses connected to the | |||
same server can always be used together to access a particular | same server can always be used together to access a particular | |||
server but cannot necessarily be used together to access a single | server but cannot necessarily be used together to access a single | |||
session. See below for definitions of the terms "server- | session. See below for definitions of the terms "server- | |||
trunkable" and "session-trunkable" | trunkable" and "session-trunkable". | |||
* Trunking discovery is a process by which a client using one | * Trunking discovery is a process by which a client using one | |||
network address can obtain other addresses that are connected to | network address can obtain other addresses that are connected to | |||
the same server. Typically, it builds on a trunking detection | the same server. Typically, it builds on a trunking detection | |||
facility by providing one or more methods by which candidate | facility by providing one or more methods by which candidate | |||
addresses are made available to the client who can then use | addresses are made available to the client, who can then use | |||
trunking detection to appropriately filter them. | trunking detection to appropriately filter them. | |||
Despite the support for trunking detection there was no | Despite the support for trunking detection, there was no | |||
description of trunking discovery provided in RFC5661 [65], making | description of trunking discovery provided in RFC 5661 [65], | |||
it necessary to provide those means in this document. | making it necessary to provide those means in this document. | |||
The combination of a server network address and a particular | The combination of a server network address and a particular | |||
connection type to be used by a connection is referred to as a | connection type to be used by a connection is referred to as a | |||
"server endpoint". Although using different connection types may | "server endpoint". Although using different connection types may | |||
result in different ports being used, the use of different ports by | result in different ports being used, the use of different ports by | |||
multiple connections to the same network address in such cases is not | multiple connections to the same network address in such cases is not | |||
the essence of the distinction between the two endpoints used. This | the essence of the distinction between the two endpoints used. This | |||
is in contrast to the case of port-specific endpoints, in which the | is in contrast to the case of port-specific endpoints, in which the | |||
explicit specification of port numbers within network addresses is | explicit specification of port numbers within network addresses is | |||
used to allow a single server node to support multiple NFS servers. | used to allow a single server node to support multiple NFS servers. | |||
Two network addresses connected to the same server are said to be | Two network addresses connected to the same server are said to be | |||
server-trunkable. Two such addresses support the use of clientid ID | server-trunkable. Two such addresses support the use of client ID | |||
trunking, as described in Section 2.10.5. | trunking, as described in Section 2.10.5. | |||
Two network addresses connected to the same server such that those | Two network addresses connected to the same server such that those | |||
addresses can be used to support a single common session are referred | addresses can be used to support a single common session are referred | |||
to as session-trunkable. Note that two addresses may be server- | to as session-trunkable. Note that two addresses may be server- | |||
trunkable without being session-trunkable and that when two | trunkable without being session-trunkable, and that, when two | |||
connections of different connection types are made to the same | connections of different connection types are made to the same | |||
network address and are based on a single file system location entry | network address and are based on a single file system location entry, | |||
they are always session-trunkable, independent of the connection | they are always session-trunkable, independent of the connection | |||
type, as specified by Section 2.10.5, since their derivation from the | type, as specified by Section 2.10.5, since their derivation from the | |||
same file system location entry together with the identity of their | same file system location entry, together with the identity of their | |||
network addresses assures that both connections are to the same | network addresses, assures that both connections are to the same | |||
server and will return server-owner information allowing session | server and will return server-owner information, allowing session | |||
trunking to be used. | trunking to be used. | |||
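The distinction between server-trunkable and session-trunkable addresses can be sketched as follows. This is a minimal illustration only. It assumes the comparison rules of Section 2.10.5 (matching eir_server_scope and so_major_id for client ID trunking, plus a matching so_minor_id for session trunking) and omits the eir_clientid comparison and the verification steps a real client would perform. ServerIdentity and classify_trunking are hypothetical names, not part of the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerIdentity:
    """Hypothetical holder for identity fields from an EXCHANGE_ID reply."""
    server_scope: bytes  # eir_server_scope
    major_id: bytes      # eir_server_owner.so_major_id
    minor_id: int        # eir_server_owner.so_minor_id


def classify_trunking(a: ServerIdentity, b: ServerIdentity) -> str:
    """Classify two network addresses by the identities they reported."""
    # Different scope or major ID: treat the addresses as different servers.
    if a.server_scope != b.server_scope or a.major_id != b.major_id:
        return "not-trunkable"
    # Same server and same minor ID: a single session may span both.
    if a.minor_id == b.minor_id:
        return "session-trunkable"
    # Same server only: client ID trunking is possible, session trunking is not.
    return "server-trunkable"
```

Under these assumptions, every session-trunkable pair is also server-trunkable; the function simply reports the strongest relationship that holds.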
11.1.2. Terminology Related to File System Location | 11.1.2. Terminology Related to File System Location | |||
Regarding terminology relating to the construction of multi-server | Regarding the terminology that relates to the construction of multi- | |||
namespaces out of a set of local per-server namespaces: | server namespaces out of a set of local per-server namespaces: | |||
* Each server has a set of exported file systems which may be | * Each server has a set of exported file systems that may be | |||
accessed by NFSv4 clients. Typically, this is done by assigning | accessed by NFSv4 clients. Typically, this is done by assigning | |||
each file system a name within the pseudo-fs associated with the | each file system a name within the pseudo-fs associated with the | |||
server, although the pseudo-fs may be dispensed with if there is | server, although the pseudo-fs may be dispensed with if there is | |||
only a single exported file system. Each such file system is part | only a single exported file system. Each such file system is part | |||
of the server's local namespace, and can be considered as a file | of the server's local namespace, and can be considered as a file | |||
system instance within a larger multi-server namespace. | system instance within a larger multi-server namespace. | |||
* The set of all exported file systems for a given server | * The set of all exported file systems for a given server | |||
constitutes that server's local namespace. | constitutes that server's local namespace. | |||
* In some cases, a server will have a namespace more extensive than | * In some cases, a server will have a namespace more extensive than | |||
its local namespace by using features associated with attributes | its local namespace by using features associated with attributes | |||
that provide file system location information. These features, | that provide file system location information. These features, | |||
which allow construction of a multi-server namespace, are all | which allow construction of a multi-server namespace, are all | |||
described in individual sections below and include referrals | described in individual sections below and include referrals | |||
(described in Section 11.5.6), migration (described in | (Section 11.5.6), migration (Section 11.5.5), and replication | |||
Section 11.5.5), and replication (described in Section 11.5.4). | (Section 11.5.4). | |||
* A file system present in a server's pseudo-fs may have multiple | * A file system present in a server's pseudo-fs may have multiple | |||
file system instances on different servers associated with it. | file system instances on different servers associated with it. | |||
All such instances are considered replicas of one another. | All such instances are considered replicas of one another. | |||
Whether such replicas can be used simultaneously is discussed in | Whether such replicas can be used simultaneously is discussed in | |||
Section 11.11.1, while the level of co-ordination between them | Section 11.11.1, while the level of coordination between them | |||
(important when switching between them) is discussed in Sections | (important when switching between them) is discussed in Sections | |||
11.11.2 through 11.11.8 below. | 11.11.2 through 11.11.8 below. | |||
* When a file system is present in a server's pseudo-fs, but there | * When a file system is present in a server's pseudo-fs, but there | |||
is no corresponding local file system, it is said to be "absent". | is no corresponding local file system, it is said to be "absent". | |||
In such cases, all associated instances will be accessed on other | In such cases, all associated instances will be accessed on other | |||
servers. | servers. | |||
Regarding terminology relating to attributes used in trunking | Regarding the terminology that relates to attributes used in trunking | |||
discovery and other multi-server namespace features: | discovery and other multi-server namespace features: | |||
* File system location attributes include the fs_locations and | * File system location attributes include the fs_locations and | |||
fs_locations_info attributes. | fs_locations_info attributes. | |||
* File system location entries provide the individual file system | * File system location entries provide the individual file system | |||
locations within the file system location attributes. Each such | locations within the file system location attributes. Each such | |||
entry specifies a server, in the form of a host name or an | entry specifies a server, in the form of a hostname or an address, | |||
address, and an fs name, which designates the location of the file | and an fs name, which designates the location of the file system | |||
system within the server's local namespace. A file system | within the server's local namespace. A file system location entry | |||
location entry designates a set of server endpoints to which the | designates a set of server endpoints to which the client may | |||
client may establish connections. There may be multiple endpoints | establish connections. There may be multiple endpoints because a | |||
because a host name may map to multiple network addresses and | hostname may map to multiple network addresses and because | |||
because multiple connection types may be used to communicate with | multiple connection types may be used to communicate with a single | |||
a single network address. However, except where an explicit port | network address. However, except where explicit port numbers are | |||
numbers are used to designate a set of server within a single | used to designate a set of servers within a single server node, | |||
server node, all such endpoints MUST designate a way of connecting | all such endpoints MUST designate a way of connecting to a single | |||
to a single server. The exact form of the location entry varies | server. The exact form of the location entry varies with the | |||
with the particular file system location attribute used, as | particular file system location attribute used, as described in | |||
described in Section 11.2. | Section 11.2. | |||
The network addresses used in file system location entries | The network addresses used in file system location entries | |||
typically appear without port number indications and are used to | typically appear without port number indications and are used to | |||
designate a server at one of the standard ports for NFS access, | designate a server at one of the standard ports for NFS access, | |||
e.g., 2049 for TCP, or 20049 for use with RPC-over-RDMA. Port | e.g., 2049 for TCP or 20049 for use with RPC-over-RDMA. Port | |||
numbers may be used in file system location entries to designate | numbers may be used in file system location entries to designate | |||
servers (typically user-level ones) accessed using other port | servers (typically user-level ones) accessed using other port | |||
numbers. In the case where network addresses indicate trunking | numbers. In the case where network addresses indicate trunking | |||
relationships, use of an explicit port number is inappropriate | relationships, the use of an explicit port number is inappropriate | |||
since trunking is a relationship between network addresses. See | since trunking is a relationship between network addresses. See | |||
Section 11.5.2 for details. | Section 11.5.2 for details. | |||
* File system location elements are derived from location entries | * File system location elements are derived from location entries, | |||
and each describes a particular network access path, consisting of | and each describes a particular network access path consisting of | |||
a network address and a location within the server's local | a network address and a location within the server's local | |||
namespace. Such location elements need not appear within a file | namespace. Such location elements need not appear within a file | |||
system location attribute, but the existence of each location | system location attribute, but the existence of each location | |||
element derives from a corresponding location entry. When a | element derives from a corresponding location entry. When a | |||
location entry specifies an IP address there is only a single | location entry specifies an IP address, there is only a single | |||
corresponding location element. File system location entries that | corresponding location element. File system location entries that | |||
contain a host name are resolved using DNS, and may result in one | contain a hostname are resolved using DNS, and may result in one | |||
or more location elements. All location elements consist of a | or more location elements. All location elements consist of a | |||
location address which includes the IP address of an interface to | location address that includes the IP address of an interface to a | |||
a server and an fs name which is the location of the file system | server and an fs name, which is the location of the file system | |||
within the server's local namespace. The fs name can be empty if | within the server's local namespace. The fs name can be empty if | |||
the server has no pseudo-fs and only a single exported file system | the server has no pseudo-fs and only a single exported file system | |||
at the root filehandle. | at the root filehandle. | |||
* Two file system location elements are said to be server-trunkable | * Two file system location elements are said to be server-trunkable | |||
if they specify the same fs name and their location addresses | if they specify the same fs name and their location addresses | |||
are server-trunkable. When the | are server-trunkable. When the | |||
corresponding network paths are used, the client will always be | corresponding network paths are used, the client will always be | |||
able to use client ID trunking, but will only be able to use | able to use client ID trunking, but will only be able to use | |||
session trunking if the paths are also session-trunkable. | session trunking if the paths are also session-trunkable. | |||
* Two file system location elements are said to be session-trunkable | * Two file system location elements are said to be session-trunkable | |||
if they specify the same fs name and their location addresses | if they specify the same fs name and their location addresses | |||
are session-trunkable. When the | are session-trunkable. When the | |||
corresponding network paths are used, the client will be able to | corresponding network paths are used, the client will be able to | |||
use either client ID trunking or session trunking. | use either client ID trunking or session trunking. | |||
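The derivation of location elements from a location entry, as described in the list above, might be sketched as follows. This is an illustrative sketch, not a definitive implementation. LocationElement, expand_entry, and the injected resolve callable are hypothetical names; a real client might back resolve with a DNS lookup (for example, via socket.getaddrinfo). Explicit port numbers are not handled here.

```python
import ipaddress
from typing import Callable, List, NamedTuple


class LocationElement(NamedTuple):
    address: str  # IP address of an interface to a server
    fs_name: str  # location within the server's local namespace


def expand_entry(server: str, fs_name: str,
                 resolve: Callable[[str], List[str]]) -> List[LocationElement]:
    """Derive the location elements for one file system location entry."""
    try:
        ipaddress.ip_address(server)
        addresses = [server]         # an IP address yields a single element
    except ValueError:
        addresses = resolve(server)  # a hostname may yield one or more
    return [LocationElement(addr, fs_name) for addr in addresses]
```

Passing the resolver in as a parameter keeps the sketch independent of any particular name service and makes it easy to exercise without network access.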
Discussion of the term "replica" is complicated by the fact that the | Discussion of the term "replica" is complicated by the fact that the | |||
term was used in RFC5661 [65], with a meaning different from that in | term was used in RFC 5661 [65] with a meaning different from that | |||
this document. In short, in [65] each replica is identified by a | used in this document. In short, in [65] each replica is identified | |||
single network access path while, in the current document a set of | by a single network access path, while in the current document, a set | |||
network access paths which have server-trunkable network addresses | of network access paths that have server-trunkable network addresses | |||
and the same root-relative file system pathname is considered to be a | and the same root-relative file system pathname is considered to be a | |||
single replica with multiple network access paths. | single replica with multiple network access paths. | |||
Each set of server-trunkable location elements defines a set of | Each set of server-trunkable location elements defines a set of | |||
available network access paths to a particular file system. When | available network access paths to a particular file system. When | |||
there are multiple such file systems, each of which contains the same | there are multiple such file systems, each of which contains the | |||
data, these file systems are considered replicas of one another. | same data, these file systems are considered replicas of one another. | |||
Logically, such replication is symmetric, since the fs currently in | Logically, such replication is symmetric, since the fs currently in | |||
use and an alternate fs are replicas of each other. Often, in other | use and an alternate fs are replicas of each other. Often, in other | |||
documents, the term "replica" is not applied to the fs currently in | documents, the term "replica" is not applied to the fs currently in | |||
use, despite the fact that the replication relation is inherently | use, despite the fact that the replication relation is inherently | |||
symmetric. | symmetric. | |||
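The notion of a replica used in this document (a set of network access paths with server-trunkable addresses and the same fs name) can be sketched as a grouping step. This is a minimal sketch under stated assumptions: group_into_replicas is a hypothetical name, and server_of stands in for whatever the client has learned through trunking detection, mapping each address to an opaque server identity.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def group_into_replicas(
    elements: Iterable[Tuple[str, str]],
    server_of: Callable[[str], Hashable],
) -> Dict[Tuple[Hashable, str], List[str]]:
    """Group (address, fs_name) location elements into replicas.

    Elements whose addresses map to the same server identity and that
    share an fs name form one replica with multiple access paths.
    """
    replicas = defaultdict(list)
    for address, fs_name in elements:
        replicas[(server_of(address), fs_name)].append(address)
    return dict(replicas)
```

Each key in the result identifies one replica, and the associated list holds the network addresses through which that replica can be reached.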
11.2. File System Location Attributes | 11.2. File System Location Attributes | |||
NFSv4.1 contains attributes that provide information about how (i.e., | NFSv4.1 contains attributes that provide information about how a | |||
at what network address and namespace position) a given file system | given file system may be accessed (i.e., at what network address and | |||
may be accessed. As a result, file systems in the namespace of one | namespace position). As a result, file systems in the namespace of | |||
server can be associated with one or more instances of that file | one server can be associated with one or more instances of that file | |||
system on other servers. These attributes contain file system | system on other servers. These attributes contain file system | |||
location entries specifying a server address target (either as a DNS | location entries specifying a server address target (either as a DNS | |||
name representing one or more IP addresses or as a specific IP | name representing one or more IP addresses or as a specific IP | |||
address) together with the pathname of that file system within the | address) together with the pathname of that file system within the | |||
associated single-server namespace. | associated single-server namespace. | |||
The fs_locations_info RECOMMENDED attribute allows specification of | The fs_locations_info RECOMMENDED attribute allows specification of | |||
one or more file system instance locations where the data | one or more file system instance locations where the data | |||
corresponding to a given file system may be found. This attribute | corresponding to a given file system may be found. In addition to | |||
provides to the client, in addition to specification of file system | the specification of file system instance locations, this attribute | |||
instance locations, other helpful information such as: | provides helpful information to do the following: | |||
* Information guiding choices among the various file system | * Guide choices among the various file system instances provided | |||
instances provided (e.g., priority for use, writability, currency, | (e.g., priority for use, writability, currency, etc.). | |||
etc.). | ||||
* Information to help the client efficiently effect as seamless a | * Help the client efficiently effect as seamless a transition as | |||
transition as possible among multiple file system instances, when | possible among multiple file system instances, when and if that | |||
and if that should be necessary. | should be necessary. | |||
* Information helping to guide the selection of the appropriate | * Guide the selection of the appropriate connection type to be used | |||
connection type to be used when establishing a connection. | when establishing a connection. | |||
Within the fs_locations_info attribute, each fs_locations_server4 | Within the fs_locations_info attribute, each fs_locations_server4 | |||
entry corresponds to a file system location entry with the fls_server | entry corresponds to a file system location entry with the fls_server | |||
field designating the server, with the location pathname within the | field designating the server and with the location pathname within | |||
server's pseudo-fs given by the fl_rootpath field of the encompassing | the server's pseudo-fs given by the fl_rootpath field of the | |||
fs_locations_item4. | encompassing fs_locations_item4. | |||
The fs_locations attribute defined in NFSv4.0 is also a part of | The fs_locations attribute defined in NFSv4.0 is also a part of | |||
NFSv4.1. This attribute only allows specification of the file system | NFSv4.1. This attribute only allows specification of the file system | |||
locations where the data corresponding to a given file system may be | locations where the data corresponding to a given file system may be | |||
found. Servers SHOULD make this attribute available whenever | found. Servers SHOULD make this attribute available whenever | |||
fs_locations_info is supported, but client use of fs_locations_info | fs_locations_info is supported, but client use of fs_locations_info | |||
is preferable, as it provides more information. | is preferable because it provides more information. | |||
Within the fs_location attribute, each fs_location4 contains a file | Within the fs_locations attribute, each fs_location4 contains a file | |||
system location entry with the server field designating the server | system location entry with the server field designating the server | |||
and the rootpath field giving the location pathname within the | and the rootpath field giving the location pathname within the | |||
server's pseudo-fs. | server's pseudo-fs. | |||
11.3. File System Presence or Absence | 11.3. File System Presence or Absence | |||
A given location in an NFSv4.1 namespace (typically but not | A given location in an NFSv4.1 namespace (typically but not | |||
necessarily a multi-server namespace) can have a number of file | necessarily a multi-server namespace) can have a number of file | |||
system instance locations associated with it (via the fs_locations or | system instance locations associated with it (via the fs_locations or | |||
fs_locations_info attribute). There may also be an actual current | fs_locations_info attribute). There may also be an actual current | |||
skipping to change at line 11294 ¶ | skipping to change at line 11294 ¶ | |||
with an NFS4ERR_MOVED error. | with an NFS4ERR_MOVED error. | |||
* The unavailability of an attribute because of a file system's | * The unavailability of an attribute because of a file system's | |||
absence, even one that is ordinarily REQUIRED, does not result in | absence, even one that is ordinarily REQUIRED, does not result in | |||
any error indication. The set of attributes returned for the root | any error indication. The set of attributes returned for the root | |||
directory of the absent file system in that case is simply | directory of the absent file system in that case is simply | |||
restricted to those actually available. | restricted to those actually available. | |||
11.5. Uses of File System Location Information | 11.5. Uses of File System Location Information | |||
The file system location attributes (i.e. fs_locations and | The file system location attributes (i.e., fs_locations and | |||
fs_locations_info), together with the possibility of absent file | fs_locations_info), together with the possibility of absent file | |||
systems, provide a number of important facilities in providing | systems, provide a number of important facilities for reliable, | |||
reliable, manageable, and scalable data access. | manageable, and scalable data access. | |||
When a file system is present, these attributes can provide | When a file system is present, these attributes can provide the | |||
following: | ||||
* The locations of alternative replicas, to be used to access the | * The locations of alternative replicas to be used to access the | |||
same data in the event of server failures, communications | same data in the event of server failures, communications | |||
problems, or other difficulties that make continued access to the | problems, or other difficulties that make continued access to the | |||
current replica impossible or otherwise impractical. Provision | current replica impossible or otherwise impractical. Provisioning | |||
and use of such alternate replicas is referred to as "replication" | and use of such alternate replicas is referred to as "replication" | |||
and is discussed in Section 11.5.4 below. | and is discussed in Section 11.5.4 below. | |||
* The network address(es) to be used to access the current file | * The network address(es) to be used to access the current file | |||
system instance or replicas of it. Client use of this information | system instance or replicas of it. Client use of this information | |||
is discussed in Section 11.5.2 below. | is discussed in Section 11.5.2 below. | |||
Under some circumstances, multiple replicas may be used | Under some circumstances, multiple replicas may be used | |||
simultaneously to provide higher-performance access to the file | simultaneously to provide higher-performance access to the file | |||
system in question, although the lack of state sharing between | system in question, although the lack of state sharing between | |||
servers may be an impediment to such use. | servers may be an impediment to such use. | |||
When a file system is present and becomes absent, clients can be | When a file system is present but becomes absent, clients can be | |||
given the opportunity to have continued access to their data, using a | given the opportunity to have continued access to their data using a | |||
different replica. In this case, a continued attempt to use the data | different replica. In this case, a continued attempt to use the data | |||
in the now-absent file system will result in an NFS4ERR_MOVED error | in the now-absent file system will result in an NFS4ERR_MOVED error, | |||
and, at that point, the successor replica or set of possible replica | and then the successor replica or set of possible replica choices can | |||
choices can be fetched and used to continue access. Transfer of | be fetched and used to continue access. Transfer of access to the | |||
access to the new replica location is referred to as "migration", and | new replica location is referred to as "migration" and is discussed | |||
is discussed in Section 11.5.4 below. | in Section 11.5.4 below. | |||
Where a file system is currently absent, specification of file system | When a file system is currently absent, specification of file system | |||
location provides a means by which file systems located on one server | location provides a means by which file systems located on one server | |||
can be associated with a namespace defined by another server, thus | can be associated with a namespace defined by another server, thus | |||
allowing a general multi-server namespace facility. A designation of | allowing a general multi-server namespace facility. A designation of | |||
such a remote instance, in place of a file system not previously | such a remote instance, in place of a file system not previously | |||
present, is called a "pure referral" and is discussed in | present, is called a "pure referral" and is discussed in | |||
Section 11.5.6 below. | Section 11.5.6 below. | |||
Because client support for attributes related to file system location | Because client support for attributes related to file system location | |||
is OPTIONAL, a server may choose to take action to hide migration and | is OPTIONAL, a server may choose to take action to hide migration and | |||
referral events from such clients, by acting as a proxy, for example. | referral events from such clients, by acting as a proxy, for example. | |||
The server can determine the presence of client support from the | The server can determine the presence of client support from the | |||
arguments of the EXCHANGE_ID operation (see Section 18.35.3). | arguments of the EXCHANGE_ID operation (see Section 18.35.3). | |||
11.5.1. Combining Multiple Uses in a Single Attribute | 11.5.1. Combining Multiple Uses in a Single Attribute | |||
A file system location attribute will sometimes contain information | A file system location attribute will sometimes contain information | |||
relating to the location of multiple replicas which may be used in | relating to the location of multiple replicas, which may be used in | |||
different ways. | different ways: | |||
* File system location entries that relate to the file system | * File system location entries that relate to the file system | |||
instance currently in use provide trunking information, allowing | instance currently in use provide trunking information, allowing | |||
the client to find additional network addresses by which the | the client to find additional network addresses by which the | |||
instance may be accessed. | instance may be accessed. | |||
* File system location entries that provide information about | * File system location entries that provide information about | |||
replicas to which access is to be transferred. | replicas to which access is to be transferred. | |||
* Other file system location entries that relate to replicas that | * Other file system location entries that relate to replicas that | |||
are available to use in the event that access to the current | are available to use in the event that access to the current | |||
replica becomes unsatisfactory. | replica becomes unsatisfactory. | |||
In order to simplify client handling and allow the best choice of | In order to simplify client handling and to allow the best choice of | |||
replicas to access, the server should adhere to the following | replicas to access, the server should adhere to the following | |||
guidelines. | guidelines: | |||
* All file system location entries that relate to a single file | * All file system location entries that relate to a single file | |||
system instance should be adjacent. | system instance should be adjacent. | |||
* File system location entries that relate to the instance currently | * File system location entries that relate to the instance currently | |||
in use should appear first. | in use should appear first. | |||
* File system location entries that relate to replica(s) to which | * File system location entries that relate to replica(s) to which | |||
migration is occurring should appear before replicas which are | migration is occurring should appear before replicas that are | |||
available for later use if the current replica should become | available for later use if the current replica should become | |||
inaccessible. | inaccessible. | |||
11.5.2. File System Location Attributes and Trunking | 11.5.2. File System Location Attributes and Trunking | |||
Trunking is the use of multiple connections between a client and | Trunking is the use of multiple connections between a client and | |||
server in order to increase the speed of data transfer. A client may | server in order to increase the speed of data transfer. A client may | |||
determine the set of network addresses to use to access a given file | determine the set of network addresses to use to access a given file | |||
system in a number of ways: | system in a number of ways: | |||
* When the name of the server is known to the client, it may use DNS | * When the name of the server is known to the client, it may use DNS | |||
to obtain a set of network addresses to use in accessing the | to obtain a set of network addresses to use in accessing the | |||
server. | server. | |||
* The client may fetch the file system location attribute for the | * The client may fetch the file system location attribute for the | |||
file system. This will provide either the name of the server | file system. This will provide either the name of the server | |||
(which can be turned into a set of network addresses using DNS), | (which can be turned into a set of network addresses using DNS) or | |||
or a set of server-trunkable location entries. Using the latter | a set of server-trunkable location entries. Using the latter | |||
alternative, the server can provide addresses it regards as | alternative, the server can provide addresses it regards as | |||
desirable to use to access the file system in question. Although | desirable to use to access the file system in question. Although | |||
these entries can contain port numbers, these port numbers are not | these entries can contain port numbers, these port numbers are not | |||
used in determining trunking relationships. Once the candidate | used in determining trunking relationships. Once the candidate | |||
addresses have been determined and EXCHANGE_ID done to the proper | addresses have been determined and EXCHANGE_ID done to the proper | |||
server, only the value of the so_major_id field returned by the | server, only the value of the so_major_id field returned by the | |||
servers in question determines whether a trunking relationship | servers in question determines whether a trunking relationship | |||
actually exists. | actually exists. | |||
It should be noted that the client, when it fetches a location | When the client fetches a location attribute for a file system, it | |||
attribute for a file system, may encounter multiple entries for a | should be noted that the client may encounter multiple entries for a | |||
number of reasons, so that, when determining trunking information, it | number of reasons, such that when it determines trunking information, | |||
may have to bypass addresses not trunkable with one already known. | it may have to bypass addresses not trunkable with one already known. | |||
The server can provide location entries that include either names or | The server can provide location entries that include either names or | |||
network addresses. It might use the latter form because of DNS- | network addresses. It might use the latter form because of DNS- | |||
related security concerns or because the set of addresses to be used | related security concerns or because the set of addresses to be used | |||
might require active management by the server. | might require active management by the server. | |||
Location entries used to discover candidate addresses for use in | Location entries used to discover candidate addresses for use in | |||
trunking are subject to change, as discussed in Section 11.5.7 below. | trunking are subject to change, as discussed in Section 11.5.7 below. | |||
The client may respond to such changes by using additional addresses | The client may respond to such changes by using additional addresses | |||
once they are verified or by ceasing to use existing ones. The | once they are verified or by ceasing to use existing ones. The | |||
server can force the client to cease using an address by returning | server can force the client to cease using an address by returning | |||
NFS4ERR_MOVED when that address is used to access a file system. | NFS4ERR_MOVED when that address is used to access a file system. | |||
This allows a transfer of client access which is similar to | This allows a transfer of client access that is similar to migration, | |||
migration, although the same file system instance is accessed | although the same file system instance is accessed throughout. | |||
throughout. | ||||
11.5.3. File System Location Attributes and Connection Type Selection | 11.5.3. File System Location Attributes and Connection Type Selection | |||
Because of the need to support multiple types of connections, clients | Because of the need to support multiple types of connections, clients | |||
face the issue of determining the proper connection type to use when | face the issue of determining the proper connection type to use when | |||
establishing a connection to a given server network address. In some | establishing a connection to a given server network address. In some | |||
cases, this issue can be addressed through the use of the connection | cases, this issue can be addressed through the use of the connection | |||
"step-up" facility described in Section 18.36. However, because | "step-up" facility described in Section 18.36. However, because | |||
there are cases is which that facility is not available, the client | there are cases in which that facility is not available, the client | |||
may have to choose a connection type with no possibility of changing | may have to choose a connection type with no possibility of changing | |||
it within the scope of a single connection. | it within the scope of a single connection. | |||
The two file system location attributes differ as to the information | The two file system location attributes differ as to the information | |||
made available in this regard. Fs_locations provides no information | made available in this regard. The fs_locations attribute provides | |||
to support connection type selection. As a result, clients | no information to support connection type selection. As a result, | |||
supporting multiple connection types would need to attempt to | clients supporting multiple connection types would need to attempt to | |||
establish connections using multiple connection types until the one | establish connections using multiple connection types until the one | |||
preferred by the client is successfully established. | preferred by the client is successfully established. | |||
Fs_locations_info includes a flag, FSLI4TF_RDMA, which, when set, | The fs_locations_info attribute includes a flag, FSLI4TF_RDMA, which, | |||
indicates that RPC-over-RDMA support is available using the specified | when set, indicates that RPC-over-RDMA support is available using the | |||
location entry, by "stepping up" an existing TCP connection to | specified location entry, by "stepping up" an existing TCP connection | |||
include support for RDMA operation. This flag makes it convenient | to include support for RDMA operation. This flag makes it convenient | |||
for a client wishing to use RDMA. When this flag is set, it can | for a client wishing to use RDMA. When this flag is set, it can | |||
establish a TCP connection and then convert that connection to use | establish a TCP connection and then convert that connection to use | |||
RDMA by using the step-up facility. | RDMA by using the step-up facility. | |||
Irrespective of the particular attribute used, when there is no | Irrespective of the particular attribute used, when there is no | |||
indication that a step-up operation can be performed, a client | indication that a step-up operation can be performed, a client | |||
supporting RDMA operation can establish a new RDMA connection and it | supporting RDMA operation can establish a new RDMA connection, and it | |||
can be bound to the session already established by the TCP | can be bound to the session already established by the TCP | |||
connection, allowing the TCP connection to be dropped and the session | connection, allowing the TCP connection to be dropped and the session | |||
converted to further use in RDMA mode, if the server supports that. | converted to further use in RDMA mode, if the server supports that. | |||
11.5.4. File System Replication | 11.5.4. File System Replication | |||
The fs_locations and fs_locations_info attributes provide alternative | The fs_locations and fs_locations_info attributes provide alternative | |||
file system locations, to be used to access data in place of or in | file system locations, to be used to access data in place of or in | |||
addition to the current file system instance. On first access to a | addition to the current file system instance. On first access to a | |||
file system, the client should obtain the set of alternate locations | file system, the client should obtain the set of alternate locations | |||
skipping to change at line 11471 | skipping to change at line 11471 | |||
file system impossible or otherwise impractical, the client can use | file system impossible or otherwise impractical, the client can use | |||
the alternate locations as a way to get continued access to its data. | the alternate locations as a way to get continued access to its data. | |||
The alternate locations may be physical replicas of the (typically | The alternate locations may be physical replicas of the (typically | |||
read-only) file system data supplemented by possible asynchronous | read-only) file system data supplemented by possible asynchronous | |||
propagation of updates. Alternatively, they may provide for the use | propagation of updates. Alternatively, they may provide for the use | |||
of various forms of server clustering in which multiple servers | of various forms of server clustering in which multiple servers | |||
provide alternate ways of accessing the same physical file system. | provide alternate ways of accessing the same physical file system. | |||
How the difference between replicas affects file system transitions | How the difference between replicas affects file system transitions | |||
can be represented within the fs_locations and fs_locations_info | can be represented within the fs_locations and fs_locations_info | |||
attributes and how the client deals with file system transition | attributes, and how the client deals with file system transition | |||
issues will be discussed in detail in later sections. | issues will be discussed in detail in later sections. | |||
Although the location attributes provide some information about the | Although the location attributes provide some information about the | |||
nature of the inter-replica transition, many aspects of the semantics | nature of the inter-replica transition, many aspects of the semantics | |||
of possible asynchronous updates are not currently described by the | of possible asynchronous updates are not currently described by the | |||
protocol, making it necessary that clients using replication to | protocol, which makes it necessary for clients using replication to | |||
switch among replicas undergoing change familiarize themselves with | switch among replicas undergoing change to familiarize themselves | |||
the semantics of the update approach used. Because of this lack of | with the semantics of the update approach used. Because of this lack | |||
specificity, many applications may find use of migration more | of specificity, many applications may find the use of migration more | |||
appropriate, since, in that case, the server, when effecting the | appropriate, since, in that case, the server, when effecting the | |||
transition, has established a point in time such that all updates | transition, has established a point in time such that all updates | |||
made before that can be propagated to the new replica as part of the | made before that can be propagated to the new replica as part of the | |||
migration event. | migration event. | |||
11.5.4.1. File System Trunking Presented as Replication | 11.5.4.1. File System Trunking Presented as Replication | |||
In some situations, a file system location entry may indicate a file | In some situations, a file system location entry may indicate a file | |||
system access path to be used as an alternate location, where | system access path to be used as an alternate location, where | |||
trunking, rather than replication, is to be used. The situations in | trunking, rather than replication, is to be used. The situations in | |||
which this is appropriate are limited to those in which both of the | which this is appropriate are limited to those in which both of the | |||
following are true. | following are true: | |||
* The two file system locations (i.e., the one on which the location | * The two file system locations (i.e., the one on which the location | |||
attribute is obtained and the one specified in the file system | attribute is obtained and the one specified in the file system | |||
location entry) designate the same locations within their | location entry) designate the same locations within their | |||
respective single-server namespaces. | respective single-server namespaces. | |||
* The two server network addresses (i.e., the one being used to | * The two server network addresses (i.e., the one being used to | |||
obtain the location attribute and the one specified in the file | obtain the location attribute and the one specified in the file | |||
system location entry) designate the same server (as indicated by | system location entry) designate the same server (as indicated by | |||
the same value of the so_major_id field of the eir_server_owner | the same value of the so_major_id field of the eir_server_owner | |||
field returned in response to EXCHANGE_ID). | field returned in response to EXCHANGE_ID). | |||
When these conditions hold, operations using both access paths are | When these conditions hold, operations using both access paths are | |||
generally trunked, although, when the attribute fs_locations_info is | generally trunked, although trunking may be disallowed when the | |||
used, trunking may be disallowed: | attribute fs_locations_info is used: | |||
* When the fs_locations_info attribute shows the two entries as not | * When the fs_locations_info attribute shows the two entries as not | |||
having the same simultaneous-use class, trunking is inhibited and | having the same simultaneous-use class, trunking is inhibited, and | |||
the two access paths cannot be used together. | the two access paths cannot be used together. | |||
In this case the two paths can be used serially with no transition | In this case, the two paths can be used serially with no | |||
activity required on the part of the client. In this case, any | transition activity required on the part of the client, and any | |||
transition between access paths is transparent, and the client, in | transition between access paths is transparent. In transferring | |||
transferring access from one to the other, is acting as it would | access from one to the other, the client acts as if communication | |||
in the event that communication is interrupted, with a new | were interrupted, establishing a new connection and possibly a new | |||
connection and possibly a new session being established to | session to continue access to the same file system. | |||
continue access to the same file system. | ||||
* Note that for two such location entries, any information within | * Note that for two such location entries, any information within | |||
the fs_locations_info attribute that indicates the need for | the fs_locations_info attribute that indicates the need for | |||
special transition activity, i.e., the appearance of the two file | special transition activity, i.e., the appearance of the two file | |||
system location entries with different handle, fileid, write- | system location entries with different handle, fileid, write- | |||
verifier, change, and readdir classes, indicates a serious | verifier, change, and readdir classes, indicates a serious | |||
problem. The client, if it allows transition to the file system | problem. The client, if it allows transition to the file system | |||
instance at all, must not treat any transition as a transparent | instance at all, must not treat any transition as a transparent | |||
one. The server SHOULD NOT indicate that these two entries (for | one. The server SHOULD NOT indicate that these two entries (for | |||
the same file system on the same server) belong to different | the same file system on the same server) belong to different | |||
handle, fileid, write-verifier, change, and readdir classes, | handle, fileid, write-verifier, change, and readdir classes, | |||
whether or not the two entries are shown belonging to the same | whether or not the two entries are shown belonging to the same | |||
simultaneous-use class. | simultaneous-use class. | |||
These situations were recognized by [65], even though that document | These situations were recognized by [65], even though that document | |||
made no explicit mention of trunking. | made no explicit mention of trunking: | |||
* It treated the situation that we describe as trunking as one of | * It treated the situation that we describe as trunking as one of | |||
simultaneous use of two distinct file system instances, even | simultaneous use of two distinct file system instances, even | |||
though, in the explanatory framework now used to describe the | though, in the explanatory framework now used to describe the | |||
situation, the case is one in which a single file system is | situation, the case is one in which a single file system is | |||
accessed by two different trunked addresses. | accessed by two different trunked addresses. | |||
* It treated the situation in which two paths are to be used | * It treated the situation in which two paths are to be used | |||
serially as a special sort of "transparent transition". however, | serially as a special sort of "transparent transition". However, | |||
in the descriptive framework now used to categorize transition | in the descriptive framework now used to categorize transition | |||
situations, this is considered a case of a "network endpoint | situations, this is considered a case of a "network endpoint | |||
transition" (see Section 11.9). | transition" (see Section 11.9). | |||
11.5.5. File System Migration | 11.5.5. File System Migration | |||
When a file system is present and becomes inaccessible using the | When a file system is present and becomes inaccessible using the | |||
current access path, the NFSv4.1 protocol provides a means by which | current access path, the NFSv4.1 protocol provides a means by which | |||
clients can be given the opportunity to have continued access to | clients can be given the opportunity to have continued access to | |||
their data. This may involve use of a different access path to the | their data. This may involve using a different access path to the | |||
existing replica or by providing a path to a different replica. The | existing replica or providing a path to a different replica. The new | |||
new access path or the location of the new replica is specified by a | access path or the location of the new replica is specified by a file | |||
file system location attribute. The ensuing migration of access | system location attribute. The ensuing migration of access includes | |||
includes the ability to retain locks across the transition. | the ability to retain locks across the transition. Depending on | |||
Depending on circumstances, this can involve: | circumstances, this can involve: | |||
* The continued use of the existing clientid when accessing the | * The continued use of the existing clientid when accessing the | |||
current replica using a new access path. | current replica using a new access path. | |||
* Use of lock reclaim, taking advantage of a per-fs grace period. | * Use of lock reclaim, taking advantage of a per-fs grace period. | |||
* Use of Transparent State Migration. | * Use of Transparent State Migration. | |||
Typically, a client will be accessing the file system in question, | Typically, a client will be accessing the file system in question, | |||
get an NFS4ERR_MOVED error, and then use a file system location | get an NFS4ERR_MOVED error, and then use a file system location | |||
attribute to determine the new access path for the data. When | attribute to determine the new access path for the data. When | |||
fs_locations_info is used, additional information will be available | fs_locations_info is used, additional information will be available | |||
that will define the nature of the client's handling of the | that will define the nature of the client's handling of the | |||
transition to a new server. | transition to a new server. | |||
In most instances, servers will choose to migrate all clients using a | In most instances, servers will choose to migrate all clients using a | |||
particular file system to a successor replica at the same time to | particular file system to a successor replica at the same time to | |||
avoid cases in which different clients are updating different | avoid cases in which different clients are updating different | |||
replicas. However migration of individual client can be helpful in | replicas. However, migration of an individual client can be helpful | |||
providing load balancing, as long as the replicas in question are | in providing load balancing, as long as the replicas in question are | |||
such that they represent the same data as described in | such that they represent the same data as described in | |||
Section 11.11.8. | Section 11.11.8. | |||
* In the case in which there is no transition between replicas | * In the case in which there is no transition between replicas | |||
(i.e., only a change in access path), there are no special | (i.e., only a change in access path), there are no special | |||
difficulties in using this mechanism to effect load balancing. | difficulties in using this mechanism to effect load balancing. | |||
* In the case in which the two replicas are sufficiently co- | * In the case in which the two replicas are sufficiently coordinated | |||
ordinated as to allow coherent simultaneous access to both by a | as to allow a single client coherent, simultaneous access to both, | |||
single client, there is, in general, no obstacle to use of | there is, in general, no obstacle to the use of migration of | |||
migration of particular clients to effect load balancing. | particular clients to effect load balancing. Generally, such | |||
Generally, such simultaneous use involves co-operation between | simultaneous use involves cooperation between servers to ensure | |||
servers to ensure that locks granted on two co-ordinated replicas | that locks granted on two coordinated replicas cannot conflict and | |||
cannot conflict and can remain effective when transferred to a | can remain effective when transferred to a common replica. | |||
common replica. | ||||
* In the case in which a large set of clients are accessing a file | * In the case in which a large set of clients is accessing a file | |||
system in a read-only fashion, in can be helpful to migrate all | system in a read-only fashion, it can be helpful to migrate all | |||
clients with writable access simultaneously, while using load | clients with writable access simultaneously, while using load | |||
balancing on the set of read-only copies, as long as the rules | balancing on the set of read-only copies, as long as the rules in | |||
appearing in Section 11.11.8, designed to prevent data reversion | Section 11.11.8, which are designed to prevent data reversion, are | |||
are adhered to. | followed. | |||
In other cases, the client might not have sufficient guarantees of | In other cases, the client might not have sufficient guarantees of | |||
data similarity/coherence to function properly (e.g. the data in the | data similarity or coherence to function properly (e.g., the data in | |||
two replicas is similar but not identical), and the possibility that | the two replicas is similar but not identical), and the possibility | |||
different clients are updating different replicas can exacerbate the | that different clients are updating different replicas can exacerbate | |||
difficulties, making use of load balancing in such situations a | the difficulties, making the use of load balancing in such situations | |||
perilous enterprise. | a perilous enterprise. | |||
The protocol does not specify how the file system will be moved | The protocol does not specify how the file system will be moved | |||
between servers or how updates to multiple replicas will be co- | between servers or how updates to multiple replicas will be | |||
ordinated. It is anticipated that a number of different server-to- | coordinated. It is anticipated that a number of different server-to- | |||
server co-ordination mechanisms might be used with the choice left to | server coordination mechanisms might be used, with the choice left to | |||
the server implementer. The NFSv4.1 protocol specifies the method | the server implementer. The NFSv4.1 protocol specifies the method | |||
used to communicate the migration event between client and server. | used to communicate the migration event between client and server. | |||
The new location may be, in the case of various forms of server | In the case of various forms of server clustering, the new location | |||
clustering, another server providing access to the same physical file | may be another server providing access to the same physical file | |||
system. The client's responsibilities in dealing with this | system. The client's responsibilities in dealing with this | |||
transition will depend on whether a switch between replicas has | transition will depend on whether a switch between replicas has | |||
occurred and the means the server has chosen to provide continuity of | occurred and the means the server has chosen to provide continuity of | |||
locking state. These issues will be discussed in detail below. | locking state. These issues will be discussed in detail below. | |||
Although a single successor location is typical, multiple locations | Although a single successor location is typical, multiple locations | |||
may be provided. When multiple locations are provided, the client | may be provided. When multiple locations are provided, the client | |||
will typically use the first one provided. If that is inaccessible | will typically use the first one provided. If that is inaccessible | |||
for some reason, later ones can be used. In such cases the client | for some reason, later ones can be used. In such cases, the client | |||
might consider the transition to the new replica to be a migration | might consider the transition to the new replica to be a migration | |||
event, even though some of the servers involved might not be aware of | event, even though some of the servers involved might not be aware of | |||
the use of the server which was inaccessible. In such a case, a | the use of the server that was inaccessible. In such a case, a | |||
client might lose access to locking state as a result of the access | client might lose access to locking state as a result of the access | |||
transfer. | transfer. | |||
When an alternate location is designated as the target for migration, | When an alternate location is designated as the target for migration, | |||
it must designate the same data (with metadata being the same to the | it must designate the same data (with metadata being the same to the | |||
degree indicated by the fs_locations_info attribute). Where file | degree indicated by the fs_locations_info attribute). Where file | |||
systems are writable, a change made on the original file system must | systems are writable, a change made on the original file system must | |||
be visible on all migration targets. Where a file system is not | be visible on all migration targets. Where a file system is not | |||
writable but represents a read-only copy (possibly periodically | writable but represents a read-only copy (possibly periodically | |||
updated) of a writable file system, similar requirements apply to the | updated) of a writable file system, similar requirements apply to the | |||
skipping to change at line 11683 ¶ | skipping to change at line 11681 ¶ | |||
to different locations as reported to individual clients, in order to | to different locations as reported to individual clients, in order to | |||
adapt to client physical location or to effect load balancing. When | adapt to client physical location or to effect load balancing. When | |||
both read-only and read-write file systems are present, some of the | both read-only and read-write file systems are present, some of the | |||
read-only locations might not be absolutely up-to-date (as they would | read-only locations might not be absolutely up-to-date (as they would | |||
have to be in the case of replication and migration). Servers may | have to be in the case of replication and migration). Servers may | |||
also specify file system locations that include client-substituted | also specify file system locations that include client-substituted | |||
variables so that different clients are referred to different file | variables so that different clients are referred to different file | |||
systems (with different data contents) based on client attributes | systems (with different data contents) based on client attributes | |||
such as CPU architecture. | such as CPU architecture. | |||
When the fs_locations_info attribute is such that that there are | If the fs_locations_info attribute lists multiple possible targets, | |||
multiple possible targets listed, the relationships among them may be | the relationships among them may be important to the client in | |||
important to the client in selecting which one to use. The same | selecting which one to use. The same rules specified in | |||
rules specified in Section 11.5.5 below regarding multiple migration | Section 11.5.5 below regarding multiple migration targets apply to | |||
targets apply to these multiple replicas as well. For example, the | these multiple replicas as well. For example, the client might | |||
client might prefer a writable target on a server that has additional | prefer a writable target on a server that has additional writable | |||
writable replicas to which it subsequently might switch. Note that, | replicas to which it subsequently might switch. Note that, as | |||
as distinguished from the case of replication, there is no need to | distinguished from the case of replication, there is no need to deal | |||
deal with the case of propagation of updates made by the current | with the case of propagation of updates made by the current client, | |||
client, since the current client has not accessed the file system in | since the current client has not accessed the file system in | |||
question. | question. | |||
Use of multi-server namespaces is enabled by NFSv4.1 but is not | Use of multi-server namespaces is enabled by NFSv4.1 but is not | |||
required. The use of multi-server namespaces and their scope will | required. The use of multi-server namespaces and their scope will | |||
depend on the applications used and system administration | depend on the applications used and system administration | |||
preferences. | preferences. | |||
Multi-server namespaces can be established by a single server | Multi-server namespaces can be established by a single server | |||
providing a large set of pure referrals to all of the included file | providing a large set of pure referrals to all of the included file | |||
systems. Alternatively, a single multi-server namespace may be | systems. Alternatively, a single multi-server namespace may be | |||
skipping to change at line 11717 ¶ | skipping to change at line 11715 ¶ | |||
Generally, multi-server namespaces are for the most part uniform, in | Generally, multi-server namespaces are for the most part uniform, in | |||
that the same data made available to one client at a given location | that the same data made available to one client at a given location | |||
in the namespace is made available to all clients at that namespace | in the namespace is made available to all clients at that namespace | |||
location. However, there are facilities provided that allow | location. However, there are facilities provided that allow | |||
different clients to be directed to different sets of data, for | different clients to be directed to different sets of data, for | |||
reasons such as enabling adaptation to such client characteristics as | reasons such as enabling adaptation to such client characteristics as | |||
CPU architecture. These facilities are described in Section 11.17.3. | CPU architecture. These facilities are described in Section 11.17.3. | |||
Note that it is possible, when providing a uniform namespace, to | Note that it is possible, when providing a uniform namespace, to | |||
provide different location entries to different clients, in order to | provide different location entries to different clients in order to | |||
provide each client with a copy of the data physically closest to it, | provide each client with a copy of the data physically closest to it | |||
or otherwise optimize access (e.g. provide load balancing). | or otherwise optimize access (e.g., provide load balancing). | |||
11.5.7. Changes in a File System Location Attribute | 11.5.7. Changes in a File System Location Attribute | |||
Although clients will typically fetch a file system location | Although clients will typically fetch a file system location | |||
attribute when first accessing a file system and when NFS4ERR_MOVED | attribute when first accessing a file system and when NFS4ERR_MOVED | |||
is returned, a client can choose to fetch the attribute periodically, | is returned, a client can choose to fetch the attribute periodically, | |||
in which case the value fetched may change over time. | in which case, the value fetched may change over time. | |||
For clients not prepared to access multiple replicas simultaneously | For clients not prepared to access multiple replicas simultaneously | |||
(see Section 11.11.1), the handling of the various cases of location | (see Section 11.11.1), the handling of the various cases of location | |||
change is as follows: | change is as follows: | |||
* Changes in the list of replicas or in the network addresses | * Changes in the list of replicas or in the network addresses | |||
associated with replicas do not require immediate action. The | associated with replicas do not require immediate action. The | |||
client will typically update its list of replicas to reflect the | client will typically update its list of replicas to reflect the | |||
new information. | new information. | |||
* Additions to the list of network addresses for the current file | * Additions to the list of network addresses for the current file | |||
system instance need not be acted on promptly. However, to | system instance need not be acted on promptly. However, to | |||
prepare for the case in which a migration event occurs | prepare for a subsequent migration event, the client can choose to | |||
subsequently, the client can choose to take note of the new | take note of the new address and then use it whenever it needs to | |||
address and then use it whenever it needs to switch access to a | switch access to a new replica. | |||
new replica. | ||||
* Deletions from the list of network addresses for the current file | * Deletions from the list of network addresses for the current file | |||
system instance do not need to be acted on immediately by ceasing | system instance do not require the client to immediately cease use | |||
use of existing access paths although new connections are not to | of existing access paths, although new connections are not to be | |||
be established on addresses that have been deleted. However, | established on addresses that have been deleted. However, clients | |||
clients can choose to act on such deletions by making preparations | can choose to act on such deletions by preparing for an eventual | |||
for an eventual shift in access which would become unavoidable as | shift in access, which becomes unavoidable as soon as the server | |||
soon as the server indicates that a particular network access path | returns NFS4ERR_MOVED to indicate that a particular network access | |||
is not usable to access the current file system, by returning | path is not usable to access the current file system. | |||
NFS4ERR_MOVED. | ||||
For clients that are prepared to access several replicas | For clients that are prepared to access several replicas | |||
simultaneously, the following additional cases need to be addressed. | simultaneously, the following additional cases need to be addressed. | |||
As in the cases discussed above, changes in the set of replicas need | As in the cases discussed above, changes in the set of replicas need | |||
not be acted upon promptly, although the client has the option of | not be acted upon promptly, although the client has the option of | |||
adjusting its access even in the absence of difficulties that would | adjusting its access even in the absence of difficulties that would | |||
lead to a new replica to be selected. | lead to the selection of a new replica. | |||
* When a new replica is added which may be accessed simultaneously | * When a new replica is added, which may be accessed simultaneously | |||
with one currently in use, the client is free to use the new | with one currently in use, the client is free to use the new | |||
replica immediately. | replica immediately. | |||
* When a replica currently in use is deleted from the list, the | * When a replica currently in use is deleted from the list, the | |||
client need not cease using it immediately. However, since the | client need not cease using it immediately. However, since the | |||
server may subsequently force such use to cease (by returning | server may subsequently force such use to cease (by returning | |||
NFS4ERR_MOVED), clients might decide to limit the need for later | NFS4ERR_MOVED), clients might decide to limit the need for later | |||
state transfer. For example, new opens might be done on other | state transfer. For example, new opens might be done on other | |||
replicas, rather than on one not present in the list. | replicas, rather than on one not present in the list. | |||
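The handling of address-list changes for the current file system instance, as described in the bullets above for clients that use a single replica, can be sketched roughly as follows. This is only an illustration under stated assumptions: the InstanceAddresses record, its field names, and both helper functions are hypothetical and are not defined by NFSv4.1. The sketch records additions for later use and stops opening new connections on deleted addresses, while leaving existing connections alone.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceAddresses:
    """Hypothetical client-side record for the current file system instance."""
    connected: set = field(default_factory=set)           # addresses with live connections
    candidates: set = field(default_factory=set)          # addresses noted for later use
    no_new_connections: set = field(default_factory=set)  # addresses the server has delisted


def apply_address_update(state, old_list, new_list):
    """Reconcile a newly fetched address list with the previously fetched one."""
    old, new = set(old_list), set(new_list)
    added, deleted = new - old, old - new

    # Additions need not be acted on promptly; the client just takes note.
    state.candidates |= added
    # An address that reappears in the list is usable for new connections again.
    state.no_new_connections -= added

    # Deletions: existing connections are kept, but no new ones are opened.
    state.no_new_connections |= deleted
    state.candidates -= deleted
    return added, deleted


def may_open_connection(state, address):
    """True if the client may establish a new connection to this address."""
    return address not in state.no_new_connections
```

Existing connections stay in `connected` until the server returns NFS4ERR_MOVED on them; the sketch deliberately does not model that later step.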
11.6. Trunking without File System Location Information | 11.6. Trunking without File System Location Information | |||
In situations in which a file system is accessed using two server- | In situations in which a file system is accessed using two server- | |||
trunkable addresses (as indicated by the same value of the | trunkable addresses (as indicated by the same value of the | |||
so_major_id field of the eir_server_owner field returned in response | so_major_id field of the eir_server_owner field returned in response | |||
to EXCHANGE_ID), trunked access is allowed even though there might | to EXCHANGE_ID), trunked access is allowed even though there might | |||
not be any location entries specifically indicating the use of | not be any location entries specifically indicating the use of | |||
trunking for that file system. | trunking for that file system. | |||
This situation was recognized by [65], even though that document made | This situation was recognized by [65], although that document made no | |||
no explicit mention of trunking and treated the situation as one of | explicit mention of trunking and treated the situation as one of | |||
simultaneous use of two distinct file system instances, even though, | simultaneous use of two distinct file system instances. In the | |||
in the explanatory framework now used to describe the situation, the | explanatory framework now used to describe the situation, the case is | |||
case is one in which a single file system is accessed by two | one in which a single file system is accessed by two different | |||
different trunked addresses. | trunked addresses. | |||
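The server-trunkability test mentioned at the start of this section can be illustrated with a small sketch. It assumes the client keeps, for each address, the so_major_id it received in the eir_server_owner field of the EXCHANGE_ID response; the ExchangeIdResult type and the server_trunkable function are hypothetical names used only for illustration. The sketch checks only the so_major_id condition named in this section and does not cover any further conditions that other sections place on trunking.

```python
from typing import NamedTuple


class ExchangeIdResult(NamedTuple):
    """Hypothetical subset of an EXCHANGE_ID response kept by the client."""
    address: str
    so_major_id: bytes  # taken from eir_server_owner


def server_trunkable(a, b):
    """Two addresses are treated as server-trunkable when so_major_id matches."""
    return a.so_major_id == b.so_major_id
```

For example, two results carrying the same so_major_id would permit trunked access to a file system even when no location entry mentions trunking for it.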
11.7. Users and Groups in a Multi-server Namespace | 11.7. Users and Groups in a Multi-Server Namespace | |||
As in the case of a single-server environment (see Section 5.9, when | As in the case of a single-server environment (see Section 5.9), when | |||
an owner or group name of the form "id@domain" is assigned to a file, | an owner or group name of the form "id@domain" is assigned to a file, | |||
there is an implicit promise to return that same string when the | there is an implicit promise to return that same string when the | |||
corresponding attribute is interrogated subsequently. In the case of | corresponding attribute is interrogated subsequently. In the case of | |||
a multi-server namespace, that same promise applies even if server | a multi-server namespace, that same promise applies even if server | |||
boundaries have been crossed. Similarly, when the owner attribute of | boundaries have been crossed. Similarly, when the owner attribute of | |||
a file is derived from the security principal which created the file, | a file is derived from the security principal that created the file, | |||
that attribute should have the same value even if the interrogation | that attribute should have the same value even if the interrogation | |||
occurs on a different server from the file creation. | occurs on a different server from the file creation. | |||
Similarly, the set of security principals recognized by all the | Similarly, the set of security principals recognized by all the | |||
participating servers needs to be the same, with each such principal | participating servers needs to be the same, with each such principal | |||
having the same credentials, regardless of the particular server | having the same credentials, regardless of the particular server | |||
being accessed. | being accessed. | |||
In order to meet these requirements, those setting up multi-server | In order to meet these requirements, those setting up multi-server | |||
namespaces will need to limit the servers included so that: | namespaces will need to limit the servers included so that: | |||
* In all cases in which more than a single domain is supported, the | * In all cases in which more than a single domain is supported, the | |||
requirements stated in RFC8000 [31] are to be respected. | requirements stated in RFC 8000 [31] are to be respected. | |||
* All servers support a common set of domains which includes all of | * All servers support a common set of domains that includes all of | |||
the domains clients use and expect to see returned as the domain | the domains clients use and expect to see returned as the domain | |||
portion of an owner or group in the form "id@domain". Note that | portion of an owner or group in the form "id@domain". Note that, | |||
although this set most often consists of a single domain, it is | although this set most often consists of a single domain, it is | |||
possible for multiple domains to be supported. | possible for multiple domains to be supported. | |||
* All servers, for each domain that they support, accept the same | * All servers, for each domain that they support, accept the same | |||
set of user and group ids as valid. | set of user and group ids as valid. | |||
* All servers recognize the same set of security principals. For | * All servers recognize the same set of security principals. For | |||
each principal, the same credential is required, independent of | each principal, the same credential is required, independent of | |||
the server being accessed. In addition, the group membership for | the server being accessed. In addition, the group membership for | |||
each such principal is to be the same, independent of the server | each such principal is to be the same, independent of the server | |||
accessed. | accessed. | |||
Note that there is no requirement in general that the users | Note that there is no requirement in general that the users | |||
corresponding to particular security principals have the same local | corresponding to particular security principals have the same local | |||
representation on each server, even though it is most often the case | representation on each server, even though it is most often the case | |||
that this is so. | that this is so. | |||
When AUTH_SYS is used, the following additional requirements must be | When AUTH_SYS is used, the following additional requirements must be | |||
met: | met: | |||
* Only a single NFSv4 domain can be supported through use of | * Only a single NFSv4 domain can be supported through the use of | |||
AUTH_SYS. | AUTH_SYS. | |||
* The "local" representation of all owners and groups must be the | * The "local" representation of all owners and groups must be the | |||
same on all servers. The word "local" is used here since that is | same on all servers. The word "local" is used here since that is | |||
the way that numeric user and group ids are described in | the way that numeric user and group ids are described in | |||
Section 5.9. However, when AUTH_SYS or stringified numeric owners | Section 5.9. However, when AUTH_SYS or stringified numeric owners | |||
or groups are used, these identifiers are not truly local, since | or groups are used, these identifiers are not truly local, since | |||
they are known to the clients as well as the server. | they are known to the clients as well as to the server. | |||
Similarly, when stringified numeric user and group ids are used, the | Similarly, when stringified numeric user and group ids are used, the | |||
"local" representation of all owners and groups must be the same on | "local" representation of all owners and groups must be the same on | |||
all servers, even when AUTH_SYS is not used. | all servers, even when AUTH_SYS is not used. | |||
11.8. Additional Client-Side Considerations | 11.8. Additional Client-Side Considerations | |||
When clients make use of servers that implement referrals, | When clients make use of servers that implement referrals, | |||
replication, and migration, care should be taken that a user who | replication, and migration, care should be taken that a user who | |||
mounts a given file system that includes a referral or a relocated | mounts a given file system that includes a referral or a relocated | |||
skipping to change at line 11900 ¶ | skipping to change at line 11896 ¶ | |||
How these are dealt with is discussed in Section 11.11. | How these are dealt with is discussed in Section 11.11. | |||
* Those in which access to the current file system instance is | * Those in which access to the current file system instance is | |||
retained, while the network path used to access that instance is | retained, while the network path used to access that instance is | |||
changed. This case is discussed in Section 11.10. | changed. This case is discussed in Section 11.10. | |||
11.10. Effecting Network Endpoint Transitions | 11.10. Effecting Network Endpoint Transitions | |||
The endpoints used to access a particular file system instance may | The endpoints used to access a particular file system instance may | |||
change in a number of ways, as listed below. In each of these cases, | change in a number of ways, as listed below. In each of these cases, | |||
the same fsid, filehandles, stateids, and client IDs are used to | the same fsid, filehandles, stateids, and client IDs are used to | |||
continue access, with a continuity of lock state. In many cases, the | continue access, with a continuity of lock state. In many cases, the | |||
same sessions can also be used. | same sessions can also be used. | |||
The appropriate action depends on the set of replacement addresses | The appropriate action depends on the set of replacement addresses | |||
(i.e. server endpoints which are server-trunkable with one previously | that are available for use (i.e., server endpoints that are server- | |||
being used) which are available for use. | trunkable with one previously being used). | |||
* When use of a particular address is to cease and there is also | * When use of a particular address is to cease, and there is also | |||
another one currently in use which is server-trunkable with it, | another address currently in use that is server-trunkable with it, | |||
requests that would have been issued on the address whose use is | requests that would have been issued on the address whose use is | |||
to be discontinued can be issued on the remaining address(es). | to be discontinued can be issued on the remaining address(es). | |||
When an address is server-trunkable but not session-trunkable with | When an address is server-trunkable but not session-trunkable with | |||
the address whose use is to be discontinued, the request might | the address whose use is to be discontinued, the request might | |||
need to be modified to reflect the fact that a different session | need to be modified to reflect the fact that a different session | |||
will be used. | will be used. | |||
* When use of a particular connection is to cease, as indicated by | * When use of a particular connection is to cease, as indicated by | |||
receiving NFS4ERR_MOVED when using that connection but that | receiving NFS4ERR_MOVED when using that connection, but that | |||
address is still indicated as accessible according to the | address is still indicated as accessible according to the | |||
appropriate file system location entries, it is likely that | appropriate file system location entries, it is likely that | |||
requests can be issued on a new connection of a different | requests can be issued on a new connection of a different | |||
connection type, once that connection is established. Since any | connection type once that connection is established. Since any | |||
two, non-port-specific server endpoints that share a network | two non-port-specific server endpoints that share a network | |||
address are inherently session-trunkable, the client can use | address are inherently session-trunkable, the client can use | |||
BIND_CONN_TO_SESSION to access the existing session using the new | BIND_CONN_TO_SESSION to access the existing session using the new | |||
connection and proceed to access the file system using the new | connection and proceed to access the file system using the new | |||
connection. | connection. | |||
* When there are no potential replacement addresses in use but there | * When there are no potential replacement addresses in use, but | |||
are valid addresses session-trunkable with the one whose use is to | there are valid addresses session-trunkable with the one whose use | |||
be discontinued, the client can use BIND_CONN_TO_SESSION to access | is to be discontinued, the client can use BIND_CONN_TO_SESSION to | |||
the existing session using the new address. Although the target | access the existing session using the new address. Although the | |||
session will generally be accessible, there may be rare situations | target session will generally be accessible, there may be rare | |||
in which that session is no longer accessible, when an attempt is | situations in which that session is no longer accessible when an | |||
made to bind the new connection to it. In this case, the client | attempt is made to bind the new connection to it. In this case, | |||
can create a new session to enable continued access to the | the client can create a new session to enable continued access to | |||
existing instance using the new connection, providing for use of | the existing instance using the new connection, providing for the | |||
existing filehandles, stateids, and client ids while providing | use of existing filehandles, stateids, and client ids while | |||
continuity of locking state. | supplying continuity of locking state. | |||
* When there is no potential replacement address in use and there | * When there is no potential replacement address in use, and there | |||
are no valid addresses session-trunkable with the one whose use is | are no valid addresses session-trunkable with the one whose use is | |||
to be discontinued, other server-trunkable addresses may be used | to be discontinued, other server-trunkable addresses may be used | |||
to provide continued access. Although use of CREATE_SESSION is | to provide continued access. Although the use of CREATE_SESSION | |||
available to provide continued access to the existing instance, | is available to provide continued access to the existing instance, | |||
servers have the option of providing continued access to the | servers have the option of providing continued access to the | |||
existing session through the new network access path in a fashion | existing session through the new network access path in a fashion | |||
similar to that provided by session migration (see Section 11.12). | similar to that provided by session migration (see Section 11.12). | |||
To take advantage of this possibility, clients can perform an | To take advantage of this possibility, clients can perform an | |||
initial BIND_CONN_TO_SESSION, as in the previous case, and use | initial BIND_CONN_TO_SESSION, as in the previous case, and use | |||
CREATE_SESSION only if that fails. | CREATE_SESSION only if that fails. | |||
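The four cases listed above can be read as a decision procedure. The following sketch is one possible way to express it, not a definitive client implementation. The function name, its boolean parameters, and the step strings are hypothetical; only BIND_CONN_TO_SESSION, CREATE_SESSION, and NFS4ERR_MOVED come from the text. The empty result for the final fall-through is an assumption standing for "no endpoint transition is possible", which this section does not cover.

```python
def plan_endpoint_transition(in_use_trunkable,
                             address_still_listed,
                             session_trunkable_available,
                             server_trunkable_available):
    """Return the ordered steps a client might take when an address or
    connection can no longer be used (hypothetical planning helper)."""
    if in_use_trunkable:
        # Another server-trunkable address is already in use.
        return ["reissue requests on a remaining address, "
                "adjusting for a different session if needed"]
    if address_still_listed:
        # NFS4ERR_MOVED was received on one connection, but the address
        # is still listed in the file system location entries.
        return ["open a connection of a different connection type",
                "BIND_CONN_TO_SESSION"]
    if session_trunkable_available or server_trunkable_available:
        # Try to keep the existing session first; create a new one
        # only if binding the new connection to it fails.
        return ["connect to a replacement address",
                "BIND_CONN_TO_SESSION",
                "CREATE_SESSION if the bind fails"]
    # No usable replacement endpoint (assumption: handled elsewhere).
    return []
```

The third and fourth cases share a branch here because the text describes the same client behavior for both: an initial BIND_CONN_TO_SESSION, with CREATE_SESSION used only if that fails.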
11.11. Effecting File System Transitions | 11.11. Effecting File System Transitions | |||
There are a range of situations in which there is a change to be | There are a range of situations in which there is a change to be | |||
skipping to change at line 11970 ¶ | skipping to change at line 11966 ¶ | |||
For reasons explained in that section, most transitions will involve | For reasons explained in that section, most transitions will involve | |||
a transition from a single replica to a corresponding replacement | a transition from a single replica to a corresponding replacement | |||
replica. When effecting replica transition, some types of sharing | replica. When effecting replica transition, some types of sharing | |||
between the replicas may affect handling of the transition as | between the replicas may affect handling of the transition as | |||
described in Sections 11.11.2 through 11.11.8 below. The attribute | described in Sections 11.11.2 through 11.11.8 below. The attribute | |||
fs_locations_info provides helpful information to allow the client to | fs_locations_info provides helpful information to allow the client to | |||
determine the degree of inter-replica sharing. | determine the degree of inter-replica sharing. | |||
With regard to some types of state, the degree of continuity across | With regard to some types of state, the degree of continuity across | |||
the transition depends on the occasion prompting the transition, with | the transition depends on the occasion prompting the transition, with | |||
transitions initiated by the servers (i.e. migration) offering much | transitions initiated by the servers (i.e., migration) offering much | |||
more scope for a non-disruptive transition than cases in which the | more scope for a nondisruptive transition than cases in which the | |||
client on its own shifts its access to another replica (i.e. | client on its own shifts its access to another replica (i.e., | |||
replication). This issue potentially applies to locking state and to | replication). This issue potentially applies to locking state and to | |||
session state, which are dealt with below as follows: | session state, which are dealt with below as follows: | |||
* An introduction to the possible means of providing continuity in | * An introduction to the possible means of providing continuity in | |||
these areas appears in Section 11.11.9 below. | these areas appears in Section 11.11.9 below. | |||
* Transparent State Migration is introduced in Section 11.12. The | * Transparent State Migration is introduced in Section 11.12. The | |||
possible transfer of session state is addressed there as well. | possible transfer of session state is addressed there as well. | |||
* The client handling of transitions, including determining how to | * The client handling of transitions, including determining how to | |||
deal with the various means that the server might take to supply | deal with the various means that the server might take to supply | |||
effective continuity of locking state is discussed in | effective continuity of locking state, is discussed in | |||
Section 11.13. | Section 11.13. | |||
* The servers' (source and destination) responsibilities in | * The source and destination servers' responsibilities in effecting | |||
effecting Transparent Migration of locking and session state are | Transparent State Migration of locking and session state are | |||
discussed in Section 11.14. | discussed in Section 11.14. | |||
11.11.1. File System Transitions and Simultaneous Access | 11.11.1. File System Transitions and Simultaneous Access | |||
The fs_locations_info attribute (described in Section 11.17) may | The fs_locations_info attribute (described in Section 11.17) may | |||
indicate that two replicas may be used simultaneously, although some | indicate that two replicas may be used simultaneously, although some | |||
situations in which such simultaneous access is permitted are more | situations in which such simultaneous access is permitted are more | |||
appropriately described as instances of trunking (see | appropriately described as instances of trunking (see | |||
Section 11.5.4.1). Although situations in which multiple replicas | Section 11.5.4.1). Although situations in which multiple replicas | |||
may be accessed simultaneously are somewhat similar to those in which | may be accessed simultaneously are somewhat similar to those in which | |||
a single replica is accessed by multiple network addresses, there are | a single replica is accessed by multiple network addresses, there are | |||
important differences, since locking state is not shared among | important differences since locking state is not shared among | |||
multiple replicas. | multiple replicas. | |||
Because of this difference in state handling, many clients will not | Because of this difference in state handling, many clients will not | |||
have the ability to take advantage of the fact that such replicas | have the ability to take advantage of the fact that such replicas | |||
represent the same data. Such clients will not be prepared to use | represent the same data. Such clients will not be prepared to use | |||
multiple replicas simultaneously but will access each file system | multiple replicas simultaneously but will access each file system | |||
using only a single replica, although the replica selected might make | using only a single replica, although the replica selected might make | |||
multiple server-trunkable addresses available. | multiple server-trunkable addresses available. | |||
Clients who are prepared to use multiple replicas simultaneously will | Clients who are prepared to use multiple replicas simultaneously can | |||
divide opens among replicas however they choose. Once that choice is | divide opens among replicas however they choose. Once that choice is | |||
made, any subsequent transitions will treat the set of locking state | made, any subsequent transitions will treat the set of locking state | |||
associated with each replica as a single entity. | associated with each replica as a single entity. | |||
For example, if one of the replicas becomes unavailable, access will | For example, if one of the replicas becomes unavailable, access will | |||
be transferred to a different replica, also capable of simultaneous | be transferred to a different replica, which is also capable of | |||
access with the one still in use. | simultaneous access with the one still in use. | |||
When there is no such replica, the transition may be to the replica | When there is no such replica, the transition may be to the replica | |||
already in use. At this point, the client has a choice between | already in use. At this point, the client has a choice between | |||
merging the locking state for the two replicas under the aegis of the | merging the locking state for the two replicas under the aegis of the | |||
sole replica in use or treating these separately, until another | sole replica in use or treating these separately until another | |||
replica capable of simultaneous access presents itself. | replica capable of simultaneous access presents itself. | |||
11.11.2. Filehandles and File System Transitions | 11.11.2. Filehandles and File System Transitions | |||
There are a number of ways in which filehandles can be handled across | There are a number of ways in which filehandles can be handled across | |||
a file system transition. These can be divided into two broad | a file system transition. These can be divided into two broad | |||
classes depending upon whether the two file systems across which the | classes depending upon whether the two file systems across which the | |||
transition happens share sufficient state to effect some sort of | transition happens share sufficient state to effect some sort of | |||
continuity of file system handling. | continuity of file system handling. | |||
skipping to change at line 12162 ¶ | skipping to change at line 12158 ¶ | |||
When the two file systems have consistent change attribute formats, | When the two file systems have consistent change attribute formats, | |||
and this fact is communicated to the client by reporting in the same | and this fact is communicated to the client by reporting in the same | |||
change class, the client may assume a continuity of change attribute | change class, the client may assume a continuity of change attribute | |||
construction and handle this situation just as it would be handled | construction and handle this situation just as it would be handled | |||
without any file system transition. | without any file system transition. | |||
11.11.6. Write Verifiers and File System Transitions | 11.11.6. Write Verifiers and File System Transitions | |||
In a file system transition, the two file systems might be | In a file system transition, the two file systems might be | |||
cooperating in the handling of unstably written data. Clients can | cooperating in the handling of unstably written data. Clients can | |||
determine if this is the case, by seeing if the two file systems | determine if this is the case by seeing if the two file systems | |||
belong to the same write-verifier class. When this is the case, | belong to the same write-verifier class. When this is the case, | |||
write verifiers returned from one system may be compared to those | write verifiers returned from one system may be compared to those | |||
returned by the other and superfluous writes avoided. | returned by the other and superfluous writes can be avoided. | |||
When two file systems belong to different write-verifier classes, any | When two file systems belong to different write-verifier classes, any | |||
verifier generated by one must not be compared to one provided by the | verifier generated by one must not be compared to one provided by the | |||
other. Instead, the two verifiers should be treated as not equal | other. Instead, the two verifiers should be treated as not equal | |||
even when the values are identical. | even when the values are identical. | |||
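The comparison rule above can be sketched as follows. This is a minimal illustration and not part of either document version; the function name `verifiers_match` and the boolean `same_write_verifier_class` are hypothetical, and the client is assumed to have already determined class membership from fs_locations_info.

```python
def verifiers_match(same_write_verifier_class: bool,
                    verifier_a: bytes,
                    verifier_b: bytes) -> bool:
    """Decide whether two write verifiers may be treated as equal.

    verifier_a comes from the source file system and verifier_b from the
    destination (hypothetical parameter names).
    """
    if not same_write_verifier_class:
        # Different write-verifier classes: the values must not be compared,
        # so they are treated as unequal even if the bytes are identical.
        return False
    # Same class: an ordinary byte comparison is meaningful.
    return verifier_a == verifier_b
```

A result of False means the client cannot assume that unstably written data survived the transition and would need to resend it.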
11.11.7. Readdir Cookies and Verifiers and File System Transitions | 11.11.7. READDIR Cookies and Verifiers and File System Transitions | |||
In a file system transition, the two file systems might be consistent | In a file system transition, the two file systems might be consistent | |||
in their handling of READDIR cookies and verifiers. Clients can | in their handling of READDIR cookies and verifiers. Clients can | |||
determine if this is the case, by seeing if the two file systems | determine if this is the case by seeing if the two file systems | |||
belong to the same readdir class. When this is the case, readdir | belong to the same readdir class. When this is the case, readdir | |||
class, READDIR cookies and verifiers from one system will be | class, READDIR cookies, and verifiers from one system will be | |||
recognized by the other and READDIR operations started on one server | recognized by the other, and READDIR operations started on one server | |||
can be validly continued on the other, simply by presenting the | can be validly continued on the other simply by presenting the cookie | |||
cookie and verifier returned by a READDIR operation done on the first | and verifier returned by a READDIR operation done on the first file | |||
file system to the second. | system to the second. | |||
When two file systems belong to different readdir classes, any | When two file systems belong to different readdir classes, any | |||
READDIR cookie and verifier generated by one is not valid on the | READDIR cookie and verifier generated by one is not valid on the | |||
second, and must not be presented to that server by the client. The | second and must not be presented to that server by the client. The | |||
client should act as if the verifier were rejected. | client should act as if the verifier were rejected. | |||
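A client-side sketch of the readdir-class rule follows. It is illustrative only: the function name `next_readdir_args` is hypothetical, and restarting the directory scan from the beginning (cookie 0 with an all-zero verifier) is an assumed way of acting "as if the verifier were rejected", not something this section prescribes.

```python
INITIAL_COOKIE = 0
INITIAL_VERIFIER = bytes(8)  # all-zero 8-byte verifier used for a first READDIR

def next_readdir_args(same_readdir_class: bool,
                      cookie: int,
                      verifier: bytes) -> tuple[int, bytes]:
    """Pick the (cookie, verifier) pair to present to the second file system."""
    if same_readdir_class:
        # Same readdir class: the scan can simply be continued.
        return cookie, verifier
    # Different classes: the old pair must not be presented to the new server.
    # Assumption: the client restarts the directory scan from the beginning.
    return INITIAL_COOKIE, INITIAL_VERIFIER
```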
11.11.8. File System Data and File System Transitions | 11.11.8. File System Data and File System Transitions | |||
When multiple replicas exist and are used simultaneously or in | When multiple replicas exist and are used simultaneously or in | |||
succession by a client, applications using them will normally expect | succession by a client, applications using them will normally expect | |||
that they contain either the same data or data that is consistent | that they contain either the same data or data that is consistent | |||
with the normal sorts of changes that are made by other clients | with the normal sorts of changes that are made by other clients | |||
updating the data of the file system (with metadata being the same to | updating the data of the file system (with metadata being the same to | |||
the degree indicated by the fs_locations_info attribute). However, | the degree indicated by the fs_locations_info attribute). However, | |||
when multiple file systems are presented as replicas of one another, | when multiple file systems are presented as replicas of one another, | |||
the precise relationship between the data of one and the data of | the precise relationship between the data of one and the data of | |||
another is not, as a general matter, specified by the NFSv4.1 | another is not, as a general matter, specified by the NFSv4.1 | |||
protocol. It is quite possible to present as replicas file systems | protocol. It is quite possible to present as replicas file systems | |||
where the data of those file systems is sufficiently different that | where the data of those file systems is sufficiently different that | |||
some applications have problems dealing with the transition between | some applications have problems dealing with the transition between | |||
replicas. The namespace will typically be constructed so that | replicas. The namespace will typically be constructed so that | |||
applications can choose an appropriate level of support, so that in | applications can choose an appropriate level of support, so that in | |||
one position in the namespace a varied set of replicas might be | one position in the namespace, a varied set of replicas might be | |||
listed, while in another only those that are up-to-date would be | listed, while in another, only those that are up-to-date would be | |||
considered replicas. The protocol does define three special cases of | considered replicas. The protocol does define three special cases of | |||
the relationship among replicas to be specified by the server and | the relationship among replicas to be specified by the server and | |||
relied upon by clients: | relied upon by clients: | |||
* When multiple replicas exist and are used simultaneously by a | * When multiple replicas exist and are used simultaneously by a | |||
client (see the FSLIB4_CLSIMUL definition within | client (see the FSLIB4_CLSIMUL definition within | |||
fs_locations_info), they must designate the same data. Where file | fs_locations_info), they must designate the same data. Where file | |||
systems are writable, a change made on one instance must be | systems are writable, a change made on one instance must be | |||
visible on all instances at the same time, regardless of whether | visible on all instances at the same time, regardless of whether | |||
the interrogated instance is the one on which the modification was | the interrogated instance is the one on which the modification was | |||
done. This allows a client to use these replicas simultaneously | done. This allows a client to use these replicas simultaneously | |||
without any special adaptation to the fact that there are multiple | without any special adaptation to the fact that there are multiple | |||
replicas, beyond adapting to the fact that locks obtained on one | replicas, beyond adapting to the fact that locks obtained on one | |||
replica are maintained separately (i.e. under a different client | replica are maintained separately (i.e., under a different client | |||
ID). In this case, locks (whether share reservations or byte- | ID). In this case, locks (whether share reservations or byte- | |||
range locks) and delegations obtained on one replica are | range locks) and delegations obtained on one replica are | |||
immediately reflected on all replicas, in the sense that access | immediately reflected on all replicas, in the sense that access | |||
from all other servers is prevented regardless of the replica | from all other servers is prevented regardless of the replica | |||
used. However, because the servers are not required to treat two | used. However, because the servers are not required to treat two | |||
associated client IDs as representing the same client, it is best | associated client IDs as representing the same client, it is best | |||
to access each file using only a single client ID. | to access each file using only a single client ID. | |||
* When one replica is designated as the successor instance to | * When one replica is designated as the successor instance to | |||
another existing instance after return NFS4ERR_MOVED (i.e., the | another existing instance after the return of NFS4ERR_MOVED (i.e., | |||
case of migration), the client may depend on the fact that all | the case of migration), the client may depend on the fact that all | |||
changes written to stable storage on the original instance are | changes written to stable storage on the original instance are | |||
written to stable storage of the successor (uncommitted writes are | written to stable storage of the successor (uncommitted writes are | |||
dealt with in Section 11.11.6 above). | dealt with in Section 11.11.6 above). | |||
* Where a file system is not writable but represents a read-only | * Where a file system is not writable but represents a read-only | |||
copy (possibly periodically updated) of a writable file system, | copy (possibly periodically updated) of a writable file system, | |||
clients have similar requirements with regard to the propagation | clients have similar requirements with regard to the propagation | |||
of updates. They may need a guarantee that any change visible on | of updates. They may need a guarantee that any change visible on | |||
the original file system instance must be immediately visible on | the original file system instance must be immediately visible on | |||
any replica before the client transitions access to that replica, | any replica before the client transitions access to that replica, | |||
skipping to change at line 12253 ¶ | skipping to change at line 12249 ¶ | |||
transition to a replica, will see any reversion in file system | transition to a replica, will see any reversion in file system | |||
state. The specific means of this guarantee varies based on the | state. The specific means of this guarantee varies based on the | |||
value of the fss_type field that is reported as part of the | value of the fss_type field that is reported as part of the | |||
fs_status attribute (see Section 11.18). Since these file systems | fs_status attribute (see Section 11.18). Since these file systems | |||
are presumed to be unsuitable for simultaneous use, there is no | are presumed to be unsuitable for simultaneous use, there is no | |||
specification of how locking is handled; in general, locks | specification of how locking is handled; in general, locks | |||
obtained on one file system will be separate from those on others. | obtained on one file system will be separate from those on others. | |||
Since these are expected to be read-only file systems, this is not | Since these are expected to be read-only file systems, this is not | |||
likely to pose an issue for clients or applications. | likely to pose an issue for clients or applications. | |||
When none of these special situations apply, there is no basis, | When none of these special situations applies, there is no basis | |||
within the protocol for the client to make assumptions about the | within the protocol for the client to make assumptions about the | |||
contents of a replica file system or its relationship to previous | contents of a replica file system or its relationship to previous | |||
file system instances. Thus switching between nominally identical | file system instances. Thus, switching between nominally identical | |||
read-write file systems would not be possible, because either the | read-write file systems would not be possible because either the | |||
client does not use or the server does not support the | client does not use the fs_locations_info attribute, or the server | |||
fs_locations_info attribute. | does not support it. | |||
11.11.9. Lock State and File System Transitions | 11.11.9. Lock State and File System Transitions | |||
While accessing a file system, clients obtain locks enforced by the | While accessing a file system, clients obtain locks enforced by the | |||
server which may prevent actions by other clients that are | server, which may prevent actions by other clients that are | |||
inconsistent with those locks. | inconsistent with those locks. | |||
When access is transferred between replicas, clients need to be | When access is transferred between replicas, clients need to be | |||
assured that the actions disallowed by holding these locks cannot | assured that the actions disallowed by holding these locks cannot | |||
have occurred during the transition. This can be ensured by the | have occurred during the transition. This can be ensured by the | |||
methods below. Unless at least one of these is implemented, clients | methods below. Unless at least one of these is implemented, clients | |||
will not be assured of continuity of lock possession across a | will not be assured of continuity of lock possession across a | |||
migration event. | migration event: | |||
* Providing the client an opportunity to re-obtain his locks via a | * Providing the client an opportunity to re-obtain his locks via a | |||
per-fs grace period on the destination server, denying all clients | per-fs grace period on the destination server, denying all clients | |||
using the destination file system the opportunity to obtain new | using the destination file system the opportunity to obtain new | |||
locks that conflict which those held by the transferred client as | locks that conflict with those held by the transferred client as | |||
long as that client has not completed its per-fs grace period. | long as that client has not completed its per-fs grace period. | |||
Because the lock reclaim mechanism was originally defined to | Because the lock reclaim mechanism was originally defined to | |||
support server reboot, it implicitly assumes that file handles | support server reboot, it implicitly assumes that filehandles | |||
will, upon reclaim, will be the same as those at open. In the | will, upon reclaim, be the same as those at open. In the case of | |||
case of migration, this requires that source and destination | migration, this requires that source and destination servers use | |||
servers use the same filehandles, as evidenced by using the same | the same filehandles, as evidenced by using the same server scope | |||
server scope (see Section 2.10.4) or by showing this agreement | (see Section 2.10.4) or by showing this agreement using | |||
using fs_locations_info (see Section 11.11.2 above). | fs_locations_info (see Section 11.11.2 above). | |||
Note that such a grace period can be implemented without | Note that such a grace period can be implemented without | |||
interfering with the ability of non-transferred clients to obtain | interfering with the ability of non-transferred clients to obtain | |||
new locks while it is going on. As long as the destination server | new locks while it is going on. As long as the destination server | |||
is aware of the transferred locks, it can distinguish requests to | is aware of the transferred locks, it can distinguish requests to | |||
obtain new locks that conflict with existing locks from those that | obtain new locks that conflict with existing locks from those that | |||
do not, allowing it to treat such client requests without | do not, allowing it to treat such client requests without | |||
reference to the ongoing grace period. | reference to the ongoing grace period. | |||
* Locking state can be transferred as part of the transition by | * Locking state can be transferred as part of the transition by | |||
providing Transparent State Migration as described in | providing Transparent State Migration as described in | |||
Section 11.12. | Section 11.12. | |||
Of these, Transparent State Migration provides the smoother | Of these, Transparent State Migration provides the smoother | |||
experience for clients in that there is no need to go through a | experience for clients in that there is no need to go through a | |||
reclaim process before new locks can be obtained. However, it | reclaim process before new locks can be obtained; however, it | |||
requires a greater degree of inter-server co-ordination. In general, | requires a greater degree of inter-server coordination. In general, | |||
the servers taking part in migration are free to provide either | the servers taking part in migration are free to provide either | |||
facility. However, when the filehandles can differ across the | facility. However, when the filehandles can differ across the | |||
migration event, Transparent State Migration is the only available | migration event, Transparent State Migration is the only available | |||
means of providing the needed functionality. | means of providing the needed functionality. | |||
It should be noted that these two methods are not mutually exclusive | It should be noted that these two methods are not mutually exclusive | |||
and that a server might well provide both. In particular, if there | and that a server might well provide both. In particular, if there | |||
is some circumstance preventing a specific lock from being | is some circumstance preventing a specific lock from being | |||
transferred transparently, the destination server can allow it to be | transferred transparently, the destination server can allow it to be | |||
reclaimed, by implementing a per-fs grace period for the migrated | reclaimed by implementing a per-fs grace period for the migrated file | |||
file system. | system. | |||
11.11.9.1. Security Consideration Related to Reclaiming Lock State | 11.11.9.1. Security Consideration Related to Reclaiming Lock State | |||
after File System Transitions | after File System Transitions | |||
Although it is possible for a client reclaiming state to misrepresent | Although it is possible for a client reclaiming state to misrepresent | |||
its state, in the same fashion as described in Section 8.4.2.1.1, | its state in the same fashion as described in Section 8.4.2.1.1, most | |||
most implementations providing for such reclamation in the case of | implementations providing for such reclamation in the case of file | |||
file system transitions will have the ability to detect such | system transitions will have the ability to detect such | |||
misrepresentations. This limits the ability of unauthenticated | misrepresentations. This limits the ability of unauthenticated | |||
clients to execute denial-of-service attacks in these circumstances. | clients to execute denial-of-service attacks in these circumstances. | |||
Nevertheless, the rules stated in Section 8.4.2.1.1, regarding | Nevertheless, the rules stated in Section 8.4.2.1.1 regarding | |||
principal verification for reclaim requests, apply in this situation | principal verification for reclaim requests apply in this situation | |||
as well. | as well. | |||
Typically, implementations that support file system transitions will | Typically, implementations that support file system transitions will | |||
have extensive information about the locks to be transferred. This | have extensive information about the locks to be transferred. This | |||
is because: | is because of the following: | |||
* Since failure is not involved, there is no need store to locking | * Since failure is not involved, there is no need to store locking | |||
information in persistent storage. | information in persistent storage. | |||
* There is no need, as there is in the failure case, to update | * There is no need, as there is in the failure case, to update | |||
multiple repositories containing locking state to keep them in | multiple repositories containing locking state to keep them in | |||
sync. Instead, there is a one-time communication of locking state | sync. Instead, there is a one-time communication of locking state | |||
from the source to the destination server. | from the source to the destination server. | |||
* Providing this information avoids potential interference with | * Providing this information avoids potential interference with | |||
existing clients using the destination file system, by denying | existing clients using the destination file system by denying them | |||
them the ability to obtain new locks during the grace period. | the ability to obtain new locks during the grace period. | |||
When such detailed locking information, not necessarily including the | When such detailed locking information, not necessarily including the | |||
associated stateids, is available: | associated stateids, is available: | |||
* It is possible to detect reclaim requests that attempt to reclaim | * It is possible to detect reclaim requests that attempt to reclaim | |||
locks that did not exist before the transfer, rejecting them with | locks that did not exist before the transfer, rejecting them with | |||
NFS4ERR_RECLAIM_BAD (Section 15.1.9.4). | NFS4ERR_RECLAIM_BAD (Section 15.1.9.4). | |||
* It is possible, when dealing with non-reclaim requests, to | * It is possible, when dealing with non-reclaim requests, to | |||
determine whether they conflict with existing locks, eliminating | determine whether they conflict with existing locks, eliminating | |||
the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non-reclaim | the need to return NFS4ERR_GRACE (Section 15.1.9.2) on non-reclaim | |||
requests. | requests. | |||
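The two checks listed above can be sketched for a destination server that holds the transferred lock set. This is a simplified model under stated assumptions: only byte-range locks are modeled, a reclaim must match a transferred lock exactly, and the names `ByteRangeLock`, `conflicts`, and `handle_during_grace` are hypothetical. The text does not name the result for a conflicting non-reclaim request, so the sketch uses a generic "DENY" outcome.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ByteRangeLock:
    owner: str
    start: int
    end: int          # inclusive end offset
    exclusive: bool

def conflicts(a: ByteRangeLock, b: ByteRangeLock) -> bool:
    """Locks conflict if owners differ, ranges overlap, and one is exclusive."""
    overlap = a.start <= b.end and b.start <= a.end
    return a.owner != b.owner and overlap and (a.exclusive or b.exclusive)

def handle_during_grace(transferred: set, request: ByteRangeLock,
                        reclaim: bool) -> str:
    """Decide the outcome of a lock request during the per-fs grace period."""
    if reclaim:
        # Reclaims of locks that did not exist before the transfer are rejected.
        return "NFS4_OK" if request in transferred else "NFS4ERR_RECLAIM_BAD"
    if any(conflicts(request, held) for held in transferred):
        # Assumption: generic denial; the specific error is not given here.
        return "DENY"
    # No conflict with transferred locks, so NFS4ERR_GRACE is not needed.
    return "NFS4_OK"
```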
It is possible for implementations of grace periods in connection | It is possible for implementations of grace periods in connection | |||
with file system transitions not to have detailed locking information | with file system transitions not to have detailed locking information | |||
available at the destination server, in which case the security | available at the destination server, in which case, the security | |||
situation is exactly as described in Section 8.4.2.1.1. | situation is exactly as described in Section 8.4.2.1.1. | |||
11.11.9.2. Leases and File System Transitions | 11.11.9.2. Leases and File System Transitions | |||
In the case of lease renewal, the client may not be submitting | In the case of lease renewal, the client may not be submitting | |||
requests for a file system that has been transferred to another | requests for a file system that has been transferred to another | |||
server. This can occur because of the lease renewal mechanism. The | server. This can occur because of the lease renewal mechanism. The | |||
client renews the lease associated with all file systems when | client renews the lease associated with all file systems when | |||
submitting a request on an associated session, regardless of the | submitting a request on an associated session, regardless of the | |||
specific file system being referenced. | specific file system being referenced. | |||
skipping to change at line 12438 ¶ | skipping to change at line 12434 ¶ | |||
new server, the client should fetch the value of lease_time on the | new server, the client should fetch the value of lease_time on the | |||
new (i.e., destination) server, and use it for subsequent locking | new (i.e., destination) server, and use it for subsequent locking | |||
requests. However, the server must respect a grace period of at | requests. However, the server must respect a grace period of at | |||
least as long as the lease_time on the source server, in order to | least as long as the lease_time on the source server, in order to | |||
ensure that clients have ample time to reclaim their lock before | ensure that clients have ample time to reclaim their lock before | |||
potentially conflicting non-reclaimed locks are granted. | potentially conflicting non-reclaimed locks are granted. | |||
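The grace-period constraint above amounts to a simple lower bound. The sketch below is illustrative only; `effective_grace_period` and its parameters are hypothetical names, and all values are assumed to be in seconds.

```python
def effective_grace_period(configured_grace: int, source_lease_time: int) -> int:
    """Return a grace period no shorter than the source server's lease_time."""
    # The destination must wait at least as long as the source's lease_time
    # so that transferred clients have time to reclaim their locks.
    return max(configured_grace, source_lease_time)
```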
11.12. Transferring State upon Migration | 11.12. Transferring State upon Migration | |||
When the transition is a result of a server-initiated decision to | When the transition is a result of a server-initiated decision to | |||
transition access and the source and destination servers have | transition access, and the source and destination servers have | |||
implemented appropriate co-operation, it is possible to: | implemented appropriate cooperation, it is possible to do the | |||
following: | ||||
* Transfer locking state from the source to the destination server, | * Transfer locking state from the source to the destination server | |||
in a fashion similar to that provided by Transparent State | in a fashion similar to that provided by Transparent State | |||
Migration in NFSv4.0, as described in [68]. Server | Migration in NFSv4.0, as described in [68]. Server | |||
responsibilities are described in Section 11.14.2. | responsibilities are described in Section 11.14.2. | |||
* Transfer session state from the source to the destination server. | * Transfer session state from the source to the destination server. | |||
Server responsibilities in effecting such a transfer are described | Server responsibilities in effecting such a transfer are described | |||
in Section 11.14.3. | in Section 11.14.3. | |||
The means by which the client determines which of these transfer | The means by which the client determines which of these transfer | |||
events has occurred are described in Section 11.13. | events has occurred are described in Section 11.13. | |||
11.12.1. Transparent State Migration and pNFS | 11.12.1. Transparent State Migration and pNFS | |||
When pNFS is involved, the protocol is capable of supporting: | When pNFS is involved, the protocol is capable of supporting: | |||
* Migration of the Metadata Server (MDS), leaving the Data Servers | * Migration of the Metadata Server (MDS), leaving the Data Servers | |||
(DS's) in place. | (DSs) in place. | |||
* Migration of the file system as a whole, including the MDS and | * Migration of the file system as a whole, including the MDS and | |||
associated DS's. | associated DSs. | |||
* Replacement of one DS by another. | * Replacement of one DS by another. | |||
* Migration of a pNFS file system to one in which pNFS is not used. | * Migration of a pNFS file system to one in which pNFS is not used. | |||
* Migration of a file system not using pNFS to one in which layouts | * Migration of a file system not using pNFS to one in which layouts | |||
are available. | are available. | |||
Note that migration per se is only involved in the transfer of the | Note that migration, per se, is only involved in the transfer of the | |||
MDS function. Although the servicing of a layout may be transferred | MDS function. Although the servicing of a layout may be transferred | |||
from one data server to another, this is not done using the file system | from one data server to another, this is not done using the file system | |||
location attributes. The MDS can effect such transfers by recalling/ | location attributes. The MDS can effect such transfers by recalling | |||
revoking existing layouts and granting new ones on a different data | or revoking existing layouts and granting new ones on a different | |||
server. | data server. | |||
Migration of the MDS function is directly supported by Transparent | Migration of the MDS function is directly supported by Transparent | |||
State Migration. Layout state will normally be transparently | State Migration. Layout state will normally be transparently | |||
transferred, just as other state is. As a result, Transparent State | transferred, just as other state is. As a result, Transparent State | |||
Migration provides a framework in which, given appropriate inter-MDS | Migration provides a framework in which, given appropriate inter-MDS | |||
data transfer, one MDS can be substituted for another. | data transfer, one MDS can be substituted for another. | |||
Migration of the file system function as a whole can be accomplished | Migration of the file system function as a whole can be accomplished | |||
by recalling all layouts as part of the initial phase of the | by recalling all layouts as part of the initial phase of the | |||
migration process. As a result, IO will be done through the MDS | migration process. As a result, I/O will be done through the MDS | |||
during the migration process, and new layouts can be granted once the | during the migration process, and new layouts can be granted once the | |||
client is interacting with the new MDS. An MDS can also effect this | client is interacting with the new MDS. An MDS can also effect this | |||
sort of transition by revoking all layouts as part of Transparent | sort of transition by revoking all layouts as part of Transparent | |||
State Migration, as long as the client is notified about the loss of | State Migration, as long as the client is notified about the loss of | |||
locking state. | locking state. | |||
In order to allow migration to a file system on which pNFS is not | In order to allow migration to a file system on which pNFS is not | |||
supported, clients need to be prepared for a situation in which | supported, clients need to be prepared for a situation in which | |||
layouts are not available or supported on the destination file system | layouts are not available or supported on the destination file system | |||
and so direct IO requests to the destination server, rather than | and so direct I/O requests to the destination server, rather than | |||
depending on layouts being available. | depending on layouts being available. | |||
Replacement of one DS by another is not addressed by migration as | Replacement of one DS by another is not addressed by migration as | |||
such but can be effected by an MDS recalling layouts for the DS to be | such but can be effected by an MDS recalling layouts for the DS to be | |||
replaced and issuing new ones to be served by the successor DS. | replaced and issuing new ones to be served by the successor DS. | |||
Migration may transfer a file system from a server which does not | Migration may transfer a file system from a server that does not | |||
support pNFS to one which does. In order to properly adapt to this | support pNFS to one that does. In order to properly adapt to this | |||
situation, clients which support pNFS, but function adequately in its | situation, clients that support pNFS, but function adequately in its | |||
absence should check for pNFS support when a file system is migrated | absence, should check for pNFS support when a file system is migrated | |||
and be prepared to use pNFS when support is available on the | and be prepared to use pNFS when support is available on the | |||
destination. | destination. | |||
11.13. Client Responsibilities when Access is Transitioned | 11.13. Client Responsibilities When Access Is Transitioned | |||
For a client to respond to an access transition, it must become aware | For a client to respond to an access transition, it must become aware | |||
of it. The ways in which this can happen are discussed in | of it. The ways in which this can happen are discussed in | |||
Section 11.13.1 which discusses indications that a specific file | Section 11.13.1, which discusses indications that a specific file | |||
system access path has transitioned as well as situations in which | system access path has transitioned as well as situations in which | |||
additional activity is necessary to determine the set of file systems | additional activity is necessary to determine the set of file systems | |||
that have been migrated. Section 11.13.2 goes on to complete the | that have been migrated. Section 11.13.2 goes on to complete the | |||
discussion of how the set of migrated file systems might be | discussion of how the set of migrated file systems might be | |||
determined. Sections 11.13.3 through 11.13.5 discuss how the client | determined. Sections 11.13.3 through 11.13.5 discuss how the client | |||
should deal with each transition it becomes aware of, either directly | should deal with each transition it becomes aware of, either directly | |||
or as a result of migration discovery. | or as a result of migration discovery. | |||
The following terms are used to describe client activities: | The following terms are used to describe client activities: | |||
* "Transition recovery" refers to the process of restoring access to | * "Transition recovery" refers to the process of restoring access to | |||
a file system on which NFS4ERR_MOVED was received. | a file system on which NFS4ERR_MOVED was received. | |||
* "Migration recovery" to that subset of transition recovery which | * "Migration recovery" refers to that subset of transition recovery | |||
applies when the file system has migrated to a different replica. | that applies when the file system has migrated to a different | |||
replica. | ||||
* "Migration discovery" refers to the process of determining which | * "Migration discovery" refers to the process of determining which | |||
file system(s) have been migrated. It is necessary to avoid a | file system(s) have been migrated. It is necessary to avoid a | |||
situation in which leases could expire when a file system is not | situation in which leases could expire when a file system is not | |||
accessed for a long period of time, since a client unaware of the | accessed for a long period of time, since a client unaware of the | |||
migration might be referencing an unmigrated file system and not | migration might be referencing an unmigrated file system and not | |||
renewing the lease associated with the migrated file system. | renewing the lease associated with the migrated file system. | |||
11.13.1. Client Transition Notifications | 11.13.1. Client Transition Notifications | |||
When there is a change in the network access path which a client is | When there is a change in the network access path that a client is to | |||
to use to access a file system, there are a number of related status | use to access a file system, there are a number of related status | |||
indications with which clients need to deal: | indications with which clients need to deal: | |||
* If an attempt is made to use or return a filehandle within a file | * If an attempt is made to use or return a filehandle within a file | |||
system that is no longer accessible at the address previously used | system that is no longer accessible at the address previously used | |||
to access it, the error NFS4ERR_MOVED is returned. | to access it, the error NFS4ERR_MOVED is returned. | |||
Exceptions are made to allow such file handles to be used when | Exceptions are made to allow such filehandles to be used when | |||
interrogating a file system location attribute. This enables a | interrogating a file system location attribute. This enables a | |||
client to determine a new replica's location or a new network | client to determine a new replica's location or a new network | |||
access path. | access path. | |||
This condition continues on subsequent attempts to access the file | This condition continues on subsequent attempts to access the file | |||
system in question. The only way the client can avoid the error | system in question. The only way the client can avoid the error | |||
is to cease accessing the file system in question at its old | is to cease accessing the file system in question at its old | |||
server location and access it instead using a different address at | server location and access it instead using a different address at | |||
which it is now available. | which it is now available. | |||
skipping to change at line 12570 ¶ | skipping to change at line 12568 ¶ | |||
a file system that is no longer accessible on the server at which | a file system that is no longer accessible on the server at which | |||
it was previously available, the response will contain a lease- | it was previously available, the response will contain a lease- | |||
migrated indication, with the SEQ4_STATUS_LEASE_MOVED status bit | migrated indication, with the SEQ4_STATUS_LEASE_MOVED status bit | |||
being set. | being set. | |||
This condition continues until the client acknowledges the | This condition continues until the client acknowledges the | |||
notification by fetching a file system location attribute for the | notification by fetching a file system location attribute for the | |||
file system whose network access path is being changed. When | file system whose network access path is being changed. When | |||
there are multiple such file systems, a location attribute for | there are multiple such file systems, a location attribute for | |||
each such file system needs to be fetched. The location attribute | each such file system needs to be fetched. The location attribute | |||
for all migrated file system needs to be fetched in order to clear | for all migrated file systems needs to be fetched in order to | |||
the condition. Even after the condition is cleared, the client | clear the condition. Even after the condition is cleared, the | |||
needs to respond by using the location information to access the | client needs to respond by using the location information to | |||
file system at its new location to ensure that leases are not | access the file system at its new location to ensure that leases | |||
needlessly expired. | are not needlessly expired. | |||
Unlike the case of NFSv4.0, in which the corresponding conditions are | Unlike NFSv4.0, in which the corresponding conditions are both errors | |||
both errors and thus mutually exclusive, in NFSv4.1 the client can, | and thus mutually exclusive, in NFSv4.1 the client can, and often | |||
and often will, receive both indications on the same request. As a | will, receive both indications on the same request. As a result, | |||
result, implementations need to address the question of how to co- | implementations need to address the question of how to coordinate the | |||
ordinate the necessary recovery actions when both indications arrive | necessary recovery actions when both indications arrive in the | |||
in the response to the same request. It should be noted that when | response to the same request. It should be noted that when | |||
processing an NFSv4 COMPOUND, the server will normally decide whether | processing an NFSv4 COMPOUND, the server will normally decide whether | |||
SEQ4_STATUS_LEASE_MOVED is to be set before it determines which file | SEQ4_STATUS_LEASE_MOVED is to be set before it determines which file | |||
system will be referenced or whether NFS4ERR_MOVED is to be returned. | system will be referenced or whether NFS4ERR_MOVED is to be returned. | |||
Since these indications are not mutually exclusive in NFSv4.1, the | Since these indications are not mutually exclusive in NFSv4.1, the | |||
following combinations are possible results when a COMPOUND is | following combinations are possible results when a COMPOUND is | |||
issued: | issued: | |||
* The COMPOUND status is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED | * The COMPOUND status is NFS4ERR_MOVED, and SEQ4_STATUS_LEASE_MOVED | |||
is asserted. | is asserted. | |||
In this case, transition recovery is required. While it is | In this case, transition recovery is required. While it is | |||
possible that migration discovery is needed in addition, it is | possible that migration discovery is needed in addition, it is | |||
likely that only the accessed file system has transitioned. In | likely that only the accessed file system has transitioned. In | |||
any case, because addressing NFS4ERR_MOVED is necessary to allow | any case, because addressing NFS4ERR_MOVED is necessary to allow | |||
the rejected requests to be processed on the target, dealing with | the rejected requests to be processed on the target, dealing with | |||
it will typically have priority over migration discovery. | it will typically have priority over migration discovery. | |||
* The COMPOUND status is NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED | * The COMPOUND status is NFS4ERR_MOVED, and SEQ4_STATUS_LEASE_MOVED | |||
is clear. | is clear. | |||
In this case, transition recovery is also required. It is clear | In this case, transition recovery is also required. It is clear | |||
that migration discovery is not needed to find file systems that | that migration discovery is not needed to find file systems that | |||
have been migrated other that the one returning NFS4ERR_MOVED. | have been migrated other than the one returning NFS4ERR_MOVED. | |||
Cases in which this result can arise include a referral or a | Cases in which this result can arise include a referral or a | |||
migration for which there is no associated locking state. This | migration for which there is no associated locking state. This | |||
can also arise in cases in which an access path transition other | can also arise in cases in which an access path transition other | |||
than migration occurs within the same server. In such a case, | than migration occurs within the same server. In such a case, | |||
there is no need to set SEQ4_STATUS_LEASE_MOVED, since the lease | there is no need to set SEQ4_STATUS_LEASE_MOVED, since the lease | |||
remains associated with the current server even though the access | remains associated with the current server even though the access | |||
path has changed. | path has changed. | |||
* The COMPOUND status is not NFS4ERR_MOVED and | * The COMPOUND status is not NFS4ERR_MOVED, and | |||
SEQ4_STATUS_LEASE_MOVED is asserted. | SEQ4_STATUS_LEASE_MOVED is asserted. | |||
In this case, no transition recovery activity is required on the | In this case, no transition recovery activity is required on the | |||
file system(s) accessed by the request. However, to prevent | file system(s) accessed by the request. However, to prevent | |||
avoidable lease expiration, migration discovery needs to be done | avoidable lease expiration, migration discovery needs to be done. | |||
* The COMPOUND status is not NFS4ERR_MOVED and | * The COMPOUND status is not NFS4ERR_MOVED, and | |||
SEQ4_STATUS_LEASE_MOVED is clear. | SEQ4_STATUS_LEASE_MOVED is clear. | |||
In this case, neither transition-related activity nor migration | In this case, neither transition-related activity nor migration | |||
discovery is required. | discovery is required. | |||
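The four combinations above can be sketched as a small decision function. This is an illustrative sketch only, not part of the specification: the function name and the action labels are hypothetical, while the numeric values of NFS4ERR_MOVED and SEQ4_STATUS_LEASE_MOVED are those assigned in the NFSv4.1 XDR description. The returned list is ordered so that transition recovery precedes migration discovery, reflecting the priority described for the first case.

```python
NFS4ERR_MOVED = 10019                  # nfsstat4 value from the protocol XDR
SEQ4_STATUS_LEASE_MOVED = 0x00000080   # bit in the SEQUENCE sr_status_flags


def required_actions(compound_status, sr_status_flags):
    """Return the client activities implied by one COMPOUND response.

    The action names are hypothetical labels chosen for this sketch.
    """
    actions = []
    if compound_status == NFS4ERR_MOVED:
        # The accessed file system has transitioned; recovering it comes
        # first so that the rejected requests can be reissued.
        actions.append("transition_recovery")
    if sr_status_flags & SEQ4_STATUS_LEASE_MOVED:
        # Some file system on this server has migrated; find out which.
        actions.append("migration_discovery")
    return actions
```

A caller would still need to suppress any action that is already in progress for the file system or server concerned.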
Note that the specified actions only need to be taken if they are not | Note that the specified actions only need to be taken if they are not | |||
already going on. For example, when NFS4ERR_MOVED is received when | already going on. For example, when NFS4ERR_MOVED is received while | |||
accessing a file system for which transition recovery already going | accessing a file system for which transition recovery is already | |||
on, the client merely waits for that recovery to be completed while | occurring, the client merely waits for that recovery to be completed, | |||
the receipt of SEQ4_STATUS_LEASE_MOVED indication only needs to | while the receipt of the SEQ4_STATUS_LEASE_MOVED indication only | |||
initiate migration discovery for a server if such discovery is not | needs to initiate migration discovery for a server if such discovery | |||
already underway for that server. | is not already underway for that server. | |||
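One way a client might avoid starting duplicate work is to record which recoveries and discoveries are in progress. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, file systems are keyed by a (server, fsid) pair, and no locking is shown, although a multi-threaded client would need to serialize access to these sets.

```python
class TransitionTracker:
    """Tracks in-progress transition recovery and migration discovery."""

    def __init__(self):
        self._recovering = set()    # (server, fsid) pairs under recovery
        self._discovering = set()   # servers with discovery underway

    def begin_recovery(self, server, fsid):
        """Return True if the caller should start recovery, False to wait."""
        key = (server, fsid)
        if key in self._recovering:
            return False
        self._recovering.add(key)
        return True

    def end_recovery(self, server, fsid):
        self._recovering.discard((server, fsid))

    def begin_discovery(self, server):
        """Return True if the caller should start discovery for server."""
        if server in self._discovering:
            return False
        self._discovering.add(server)
        return True

    def end_discovery(self, server):
        self._discovering.discard(server)
```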
The fact that a lease-migrated condition does not result in an error | The fact that a lease-migrated condition does not result in an error | |||
in NFSv4.1 has a number of important consequences. In addition to | in NFSv4.1 has a number of important consequences. In addition to | |||
the fact, discussed above, that the two indications are not mutually | the fact that the two indications are not mutually exclusive, as | |||
exclusive, there are a number of issues that are important in | discussed above, there are a number of issues that are important in | |||
considering implementation of migration discovery, as discussed in | considering implementation of migration discovery, as discussed in | |||
Section 11.13.2. | Section 11.13.2. | |||
Because SEQ4_STATUS_LEASE_MOVED is not an error condition", it is | Because SEQ4_STATUS_LEASE_MOVED is not an error condition, it is | |||
possible for file systems whose access paths have not changed to be | possible for file systems whose access paths have not changed to be | |||
successfully accessed on a given server even though recovery is | successfully accessed on a given server even though recovery is | |||
necessary for other file systems on the same server. As a result, | necessary for other file systems on the same server. As a result, | |||
access can go on while, | access can take place while: | |||
* The migration discovery process is going on for that server. | * The migration discovery process is happening for that server. | |||
* The transition recovery process is going on for other file systems | * The transition recovery process is happening for other file | |||
connected to that server. | systems connected to that server. | |||
11.13.2. Performing Migration Discovery | 11.13.2. Performing Migration Discovery | |||
Migration discovery can be performed in the same context as | Migration discovery can be performed in the same context as | |||
transition recovery, allowing recovery for each migrated file system | transition recovery, allowing recovery for each migrated file system | |||
to be invoked as it is discovered. Alternatively, it may be done in | to be invoked as it is discovered. Alternatively, it may be done in | |||
a separate migration discovery thread, allowing migration discovery | a separate migration discovery thread, allowing migration discovery | |||
to be done in parallel with one or more instances of transition | to be done in parallel with one or more instances of transition | |||
recovery. | recovery. | |||
In either case, because the lease-migrated indication does not result | In either case, because the lease-migrated indication does not result | |||
in an error. other access to file systems on the server can proceed | in an error, other access to file systems on the server can proceed | |||
normally, with the possibility that further such indications will be | normally, with the possibility that further such indications will be | |||
received, raising the issue of how such indications are to be dealt | received, raising the issue of how such indications are to be dealt | |||
with. In general, | with. In general: | |||
* No action needs to be taken for such indications received by any | * No action needs to be taken for such indications received by any | |||
threads performing migration discovery, since continuation of that | threads performing migration discovery, since continuation of that | |||
work will address the issue. | work will address the issue. | |||
* In other cases in which migration discovery is currently being | * In other cases in which migration discovery is currently being | |||
performed, nothing further needs to be done to respond to such | performed, nothing further needs to be done to respond to such | |||
lease migration indications, as long as one can be certain that | lease migration indications, as long as one can be certain that | |||
the migration discovery process would deal with those indications. | the migration discovery process would deal with those indications. | |||
See below for details. | See below for details. | |||
skipping to change at line 12702 ¶ | skipping to change at line 12700 ¶ | |||
migration events may occur at any time, and because a LEASE_MOVED | migration events may occur at any time, and because a LEASE_MOVED | |||
indication may reflect the situation in effect a considerable time | indication may reflect the situation in effect a considerable time | |||
before the indication is received, special care needs to be taken to | before the indication is received, special care needs to be taken to | |||
ensure that LEASE_MOVED indications are not inappropriately ignored. | ensure that LEASE_MOVED indications are not inappropriately ignored. | |||
A useful approach to this issue involves the use of separate | A useful approach to this issue involves the use of separate | |||
externally-visible migration discovery states for each server. | externally-visible migration discovery states for each server. | |||
Separate values could represent the various possible states for the | Separate values could represent the various possible states for the | |||
migration discovery process for a server: | migration discovery process for a server: | |||
* non-operation, in which migration discovery is not being performed | * Non-operation, in which migration discovery is not being | |||
performed. | ||||
* normal operation, in which there is an ongoing scan for migrated | * Normal operation, in which there is an ongoing scan for migrated | |||
file systems. | file systems. | |||
* completion/verification of migration discovery processing, in | * Completion/verification of migration discovery processing, in | |||
which the possible completion of migration discovery processing | which the possible completion of migration discovery processing | |||
needs to be verified. | needs to be verified. | |||
Given that framework, migration discovery processing would proceed as | Given that framework, migration discovery processing would proceed as | |||
follows. | follows: | |||
* While in the normal-operation state, the thread performing | * While in the normal-operation state, the thread performing | |||
discovery would fetch, for successive file systems known to the | discovery would fetch, for successive file systems known to the | |||
client on the server being worked on, a file system location | client on the server being worked on, a file system location | |||
attribute plus the fs_status attribute. | attribute plus the fs_status attribute. | |||
* If the fs_status attribute indicates that the file system is a | * If the fs_status attribute indicates that the file system is a | |||
migrated one (i.e. fss_absent is true and fss_type != | migrated one (i.e., fss_absent is true, and fss_type != | |||
STATUS4_REFERRAL) then a migrated file system has been found. In | STATUS4_REFERRAL), then a migrated file system has been found. In | |||
this situation, it is likely that the fetch of the file system | this situation, it is likely that the fetch of the file system | |||
location attribute has cleared one the file systems contributing | location attribute has cleared one of the file systems | |||
to the lease-migrated indication. | contributing to the lease-migrated indication. | |||
* In cases in which that happened, the thread cannot know whether | * In cases in which that happened, the thread cannot know whether | |||
the lease-migrated indication has been cleared and so it enters | the lease-migrated indication has been cleared, and so it enters | |||
the completion/verification state and proceeds to issue a COMPOUND | the completion/verification state and proceeds to issue a COMPOUND | |||
to see if the LEASE_MOVED indication has been cleared. | to see if the LEASE_MOVED indication has been cleared. | |||
* When the discovery process is in the completion/verification | * When the discovery process is in the completion/verification | |||
state, if other requests get a lease-migrated indication they note | state, if other requests get a lease-migrated indication, they | |||
that it was received. Later, the existence of such indications is | note that it was received. Later, the existence of such | |||
used when the request completes, as described below. | indications is used when the request completes, as described | |||
below. | ||||
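The test applied to the fs_status attribute during the scan can be written as a one-line predicate. This sketch models only the two fields the test uses; the real fs4_status structure carries additional fields. The FsStatus class and the is_migrated name are hypothetical, and the numeric value of STATUS4_REFERRAL is taken from the fs4_status_type enumeration in the protocol XDR.

```python
from dataclasses import dataclass

STATUS4_REFERRAL = 5   # fs4_status_type value from the protocol XDR


@dataclass
class FsStatus:
    """Only the fs_status fields needed for the migration test."""
    fss_absent: bool
    fss_type: int


def is_migrated(status):
    # An absent file system that is not a referral is a migrated one.
    return status.fss_absent and status.fss_type != STATUS4_REFERRAL
```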
When the request used in the completion/verification state completes: | When the request used in the completion/verification state completes: | |||
* If a lease-migrated indication is returned, the discovery | * If a lease-migrated indication is returned, the discovery | |||
continues normally. Note that this is so even if all file systems | continues normally. Note that this is so even if all file systems | |||
have traversed, since new migrations could have occurred while the | have been traversed, since new migrations could have occurred | |||
process was going on. | while the process was going on. | |||
* Otherwise, if there is any record that other requests saw a lease- | * Otherwise, if there is any record that other requests saw a lease- | |||
migrated indication while the request was going on, that record is | migrated indication while the request was occurring, that record | |||
cleared and the verification request retried. The discovery | is cleared, and the verification request is retried. The | |||
process remains in completion/verification state. | discovery process remains in the completion/verification state. | |||
* If there have been no lease-migrated indications, the work of | * If there have been no lease-migrated indications, the work of | |||
migration discovery is considered completed and it enters the non- | migration discovery is considered completed, and it enters the | |||
operating state. Once it enters this state, subsequent lease- | non-operating state. Once it enters this state, subsequent lease- | |||
migrated indication will trigger a new migration discovery | migrated indications will trigger a new migration discovery | |||
process. | process. | |||
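The per-server discovery states and the handling of the verification request can be sketched as a small state machine. This is one possible reading of the description above, not a definitive implementation: the class, method, and return-value names are hypothetical, and the sketch assumes that the record of indications seen by other requests is reset each time the completion/verification state is entered.

```python
from enum import Enum


class DiscoveryState(Enum):
    NON_OPERATION = "non-operation"
    NORMAL_OPERATION = "normal-operation"
    VERIFICATION = "completion/verification"


class MigrationDiscovery:
    """Per-server migration discovery state (hypothetical sketch)."""

    def __init__(self):
        self.state = DiscoveryState.NON_OPERATION
        self.noted_lease_moved = False

    def on_lease_moved(self):
        """Called when a request outside the discovery thread sees LEASE_MOVED."""
        if self.state is DiscoveryState.NON_OPERATION:
            # No discovery underway, so this indication starts one.
            self.state = DiscoveryState.NORMAL_OPERATION
        elif self.state is DiscoveryState.VERIFICATION:
            # Remember it; the verification result alone cannot be trusted.
            self.noted_lease_moved = True
        # In NORMAL_OPERATION the ongoing scan will address the indication.

    def on_migrated_fs_found(self):
        """The scan found a migrated file system and fetched its location."""
        self.state = DiscoveryState.VERIFICATION
        self.noted_lease_moved = False

    def on_verification_result(self, lease_moved):
        """Process the response to the verification COMPOUND."""
        if lease_moved:
            self.state = DiscoveryState.NORMAL_OPERATION
            return "continue_scan"
        if self.noted_lease_moved:
            self.noted_lease_moved = False
            return "retry_verification"
        self.state = DiscoveryState.NON_OPERATION
        return "done"
```

Because a LEASE_MOVED indication can reflect a situation that predates the verification request, the sketch retries verification rather than declaring completion whenever another request reported the indication in the meantime.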
It should be noted that the process described above is not guaranteed | It should be noted that the process described above is not guaranteed | |||
to terminate, as a long series of new migration events might | to terminate, as a long series of new migration events might | |||
continually delay the clearing of the LEASE_MOVED indication. To | continually delay the clearing of the LEASE_MOVED indication. To | |||
prevent unnecessary lease expiration, it is appropriate for clients | prevent unnecessary lease expiration, it is appropriate for clients | |||
to use the discovery of migrations to effect lease renewal | to use the discovery of migrations to effect lease renewal | |||
immediately, rather than waiting for clearing of the LEASE_MOVED | immediately, rather than waiting for the clearing of the LEASE_MOVED | |||
indication when the complete set of migrations is available. | indication when the complete set of migrations is available. | |||
Lease discovery needs to be provided as described above. This | Lease discovery needs to be provided as described above. This | |||
ensures that the client discovers file system migrations soon enough | ensures that the client discovers file system migrations soon enough | |||
to renew its leases on each destination server before they expire. | to renew its leases on each destination server before they expire. | |||
Non-renewal of leases can lead to loss of locking state. While the | Non-renewal of leases can lead to loss of locking state. While the | |||
consequences of such loss can be ameliorated through implementations | consequences of such loss can be ameliorated through implementations | |||
of courtesy locks, servers are under no obligation to do so, and a | of courtesy locks, servers are under no obligation to do so, and a | |||
conflicting lock request may mean that a lock is revoked | conflicting lock request may mean that a lock is revoked | |||
unexpectedly. Clients should be aware of this possibility. | unexpectedly. Clients should be aware of this possibility. | |||
skipping to change at line 12799 ¶ | skipping to change at line 12799 ¶ | |||
State Migration. | State Migration. | |||
During the first phase of this process, the client proceeds to | During the first phase of this process, the client proceeds to | |||
examine file system location entries to find the initial network | examine file system location entries to find the initial network | |||
address it will use to continue access to the file system or its | address it will use to continue access to the file system or its | |||
replacement. For each location entry that the client examines, the | replacement. For each location entry that the client examines, the | |||
process consists of five steps: | process consists of five steps: | |||
1. Performing an EXCHANGE_ID directed at the location address. This | 1. Performing an EXCHANGE_ID directed at the location address. This | |||
operation is used to register the client owner (in the form of a | operation is used to register the client owner (in the form of a | |||
client_owner4) with the server, to obtain a client ID to be use | client_owner4) with the server, to obtain a client ID to be used | |||
subsequently to communicate with it, to obtain that client ID's | subsequently to communicate with it, to obtain that client ID's | |||
confirmation status, and to determine server_owner and scope for | confirmation status, and to determine server_owner and scope for | |||
the purpose of determining if the entry is trunkable with that | the purpose of determining if the entry is trunkable with the | |||
previously being used to access the file system (i.e. that it | address previously being used to access the file system (i.e., | |||
represents another network access path to the same file system | that it represents another network access path to the same file | |||
and can share locking state with it). | system and can share locking state with it). | |||
2. Making an initial determination of whether migration has | 2. Making an initial determination of whether migration has | |||
occurred. The initial determination will be based on whether the | occurred. The initial determination will be based on whether the | |||
EXCHANGE_ID results indicate that the current location element is | EXCHANGE_ID results indicate that the current location element is | |||
server-trunkable with that used to access the file system when | server-trunkable with that used to access the file system when | |||
access was terminated by receiving NFS4ERR_MOVED. If it is, then | access was terminated by receiving NFS4ERR_MOVED. If it is, then | |||
migration has not occurred. In that case, the transition is | migration has not occurred. In that case, the transition is | |||
dealt with, at least initially, as one involving continued access | dealt with, at least initially, as one involving continued access | |||
to the same file system on the same server through a new network | to the same file system on the same server through a new network | |||
address. | address. | |||
3. Obtaining access to existing session state or creating new | 3. Obtaining access to existing session state or creating new | |||
sessions. How this is done depends on the initial determination | sessions. How this is done depends on the initial determination | |||
of whether migration has occurred and can be done as described in | of whether migration has occurred and can be done as described in | |||
Section 11.13.4 below in the case of migration or as described in | Section 11.13.4 below in the case of migration or as described in | |||
Section 11.13.5 below in the case of a network address transfer | Section 11.13.5 below in the case of a network address transfer | |||
without migration. | without migration. | |||
4. Verification of the trunking relationship assumed in step 2 as | 4. Verifying the trunking relationship assumed in step 2 as | |||
discussed in Section 2.10.5.1. Although this step will generally | discussed in Section 2.10.5.1. Although this step will generally | |||
confirm the initial determination, it is possible for | confirm the initial determination, it is possible for | |||
verification to fail with the result that an initial | verification to fail with the result that an initial | |||
determination that a network address shift (without migration) | determination that a network address shift (without migration) | |||
has occurred may be invalidated and migration determined to have | has occurred may be invalidated and migration determined to have | |||
occurred. There is no need to redo step 3 above, since it will | occurred. There is no need to redo step 3 above, since it will | |||
be possible to continue use of the session established already. | be possible to continue use of the session established already. | |||
5. Obtaining access to existing locking state and/or reobtaining it. | 5. Obtaining access to existing locking state and/or re-obtaining | |||
How this is done depends on the final determination of whether | it. How this is done depends on the final determination of | |||
migration has occurred and can be done as described below in | whether migration has occurred and can be done as described below | |||
Section 11.13.4 in the case of migration or as described in | in Section 11.13.4 in the case of migration or as described in | |||
Section 11.13.5 in the case of a network address transfer without | Section 11.13.5 in the case of a network address transfer without | |||
migration. | migration. | |||
Once the initial address has been determined, clients are free to | Once the initial address has been determined, clients are free to | |||
apply an abbreviated process to find additional addresses trunkable | apply an abbreviated process to find additional addresses trunkable | |||
with it (clients may seek session-trunkable or server-trunkable | with it (clients may seek session-trunkable or server-trunkable | |||
addresses depending on whether they support clientid trunking). | addresses depending on whether they support client ID trunking). | |||
During this later phase of the process, further location entries are | During this later phase of the process, further location entries are | |||
examined using the abbreviated procedure specified below: | examined using the abbreviated procedure specified below: | |||
A: Before the EXCHANGE_ID, the fs name of the location entry is | A: Before the EXCHANGE_ID, the fs name of the location entry is | |||
examined and if it does not match that currently being used, the | examined, and if it does not match that currently being used, the | |||
entry is ignored. otherwise, one proceeds as specified by step 1 | entry is ignored. Otherwise, one proceeds as specified by step 1 | |||
above. | above. | |||
B: In the case that the network address is session-trunkable with | B: In the case that the network address is session-trunkable with | |||
one used previously a BIND_CONN_TO_SESSION is used to access that | one used previously, a BIND_CONN_TO_SESSION is used to access | |||
session using the new network address. Otherwise, or if the bind | that session using the new network address. Otherwise, or if the | |||
operation fails, a CREATE_SESSION is done. | bind operation fails, a CREATE_SESSION is done. | |||
C: The verification procedure referred to in step 4 above is used. | C: The verification procedure referred to in step 4 above is used. | |||
However, if it fails, the entry is ignored and the next available | However, if it fails, the entry is ignored and the next available | |||
entry is used. | entry is used. | |||
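The abbreviated procedure in steps A through C can be sketched as follows. This is an illustrative outline only, not part of the protocol: the function name and the three callables (standing in for BIND_CONN_TO_SESSION, CREATE_SESSION, and the step 4 verification) are hypothetical, and the handling of step 1 is not modelled.

```python
from typing import Callable, Optional


def try_additional_entry(
    entry_fs_name: str,
    current_fs_name: str,
    session_trunkable: bool,
    bind_conn_to_session: Callable[[], bool],  # hypothetical: True on success
    create_session: Callable[[], None],        # hypothetical
    verify_trunking: Callable[[], bool],       # hypothetical: step 4 check
) -> Optional[str]:
    """Return how the session was obtained, or None if the entry is ignored."""
    # Step A: ignore entries whose fs name differs from the one in use.
    if entry_fs_name != current_fs_name:
        return None
    # Step B: bind to the existing session when session-trunkable;
    # otherwise, or if the bind fails, create a new session.
    if session_trunkable and bind_conn_to_session():
        how = "bound"
    else:
        create_session()
        how = "created"
    # Step C: if verification fails, the entry is ignored.
    if not verify_trunking():
        return None
    return how
```

A caller would iterate over the remaining location entries and move on to the next available entry whenever None is returned.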
11.13.4. Obtaining Access to Sessions and State after Migration | 11.13.4. Obtaining Access to Sessions and State after Migration | |||
In the event that migration has occurred, migration recovery will | In the event that migration has occurred, migration recovery will | |||
involve determining whether Transparent State Migration has occurred. | involve determining whether Transparent State Migration has occurred. | |||
This decision is made based on the client ID returned by the | This decision is made based on the client ID returned by the | |||
EXCHANGE_ID and the reported confirmation status. | EXCHANGE_ID and the reported confirmation status. | |||
* If the client ID is an unconfirmed client ID not previously known | * If the client ID is an unconfirmed client ID not previously known | |||
to the client, then Transparent State Migration has not occurred. | to the client, then Transparent State Migration has not occurred. | |||
* If the client ID is a confirmed client ID previously known to the | * If the client ID is a confirmed client ID previously known to the | |||
client, then any transferred state would have been merged with an | client, then any transferred state would have been merged with an | |||
existing client ID representing the client to the destination | existing client ID representing the client to the destination | |||
server. In this state merger case, Transparent State Migration | server. In this state merger case, Transparent State Migration | |||
might or might not have occurred and a determination as to whether | might or might not have occurred, and a determination as to | |||
it has occurred is deferred until sessions are established and the | whether it has occurred is deferred until sessions are established | |||
client is ready to begin state recovery. | and the client is ready to begin state recovery. | |||
* If the client ID is a confirmed client ID not previously known to | * If the client ID is a confirmed client ID not previously known to | |||
the client, then the client can conclude that the client ID was | the client, then the client can conclude that the client ID was | |||
transferred as part of Transparent State Migration. In this | transferred as part of Transparent State Migration. In this | |||
transferred client ID case, Transparent State Migration has | transferred client ID case, Transparent State Migration has | |||
occurred although some state might have been lost. | occurred, although some state might have been lost. | |||
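The three cases above amount to a decision on two inputs: the confirmation status reported by EXCHANGE_ID, and whether the client already knew the client ID. A minimal sketch, assuming hypothetical names throughout, might look like this. The text does not address an unconfirmed client ID that the client already knows, so the sketch reports that combination separately rather than guessing.

```python
from enum import Enum


class MigrationStatus(Enum):
    NO_TSM = "Transparent State Migration has not occurred"
    STATE_MERGER = "state merger; determination deferred"
    TRANSFERRED_CLIENT_ID = "client ID transferred; TSM has occurred"
    NOT_COVERED = "combination not described in the text"


def classify_client_id(confirmed: bool, previously_known: bool) -> MigrationStatus:
    """Map the EXCHANGE_ID result onto the cases listed above."""
    if not confirmed and not previously_known:
        return MigrationStatus.NO_TSM
    if confirmed and previously_known:
        return MigrationStatus.STATE_MERGER
    if confirmed and not previously_known:
        return MigrationStatus.TRANSFERRED_CLIENT_ID
    # Unconfirmed but previously known: an assumption of this sketch is
    # that the caller handles this case separately.
    return MigrationStatus.NOT_COVERED
```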
Once the client ID has been obtained, it is necessary to obtain | Once the client ID has been obtained, it is necessary to obtain | |||
access to sessions to continue communication with the new server. In | access to sessions to continue communication with the new server. In | |||
any of the cases in which Transparent State Migration has occurred, | any of the cases in which Transparent State Migration has occurred, | |||
it is possible that a session was transferred as well. To deal with | it is possible that a session was transferred as well. To deal with | |||
that possibility, clients can, after doing the EXCHANGE_ID, issue a | that possibility, clients can, after doing the EXCHANGE_ID, issue a | |||
BIND_CONN_TO_SESSION to connect the transferred session to a | BIND_CONN_TO_SESSION to connect the transferred session to a | |||
connection to the new server. If that fails, it is an indication | connection to the new server. If that fails, it is an indication | |||
that the session was not transferred and that a new session needs to | that the session was not transferred and that a new session needs to | |||
be created to take its place. | be created to take its place. | |||
In some situations, it is possible for a BIND_CONN_TO_SESSION to | In some situations, it is possible for a BIND_CONN_TO_SESSION to | |||
succeed without session migration having occurred. If state merger | succeed without session migration having occurred. If state merger | |||
has taken place then the associated client ID may have already had a | has taken place, then the associated client ID may have already had a | |||
set of existing sessions, with it being possible that the sessionid | set of existing sessions, with it being possible that the sessionid | |||
of a given session is the same as one that might have been migrated. | of a given session is the same as one that might have been migrated. | |||
In that event, a BIND_CONN_TO_SESSION might succeed, even though | In that event, a BIND_CONN_TO_SESSION might succeed, even though | |||
there could have been no migration of the session with that | there could have been no migration of the session with that | |||
sessionid. In such cases, the client will receive sequence errors | sessionid. In such cases, the client will receive sequence errors | |||
when the slot sequence values used are not appropriate on the new | when the slot sequence values used are not appropriate on the new | |||
session. When this occurs, the client can create a new session and | session. When this occurs, the client can create a new session and | |||
cease using the existing one. | cease using the existing one. | |||
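The session handling described above (attempt BIND_CONN_TO_SESSION first, fall back to CREATE_SESSION, and replace a session that later yields sequence errors) can be outlined as below. This is a sketch under stated assumptions: the callables stand in for the real operations, and the function names and string session identifiers are hypothetical.

```python
from typing import Callable, Optional


def obtain_session(
    old_session_id: Optional[str],
    bind_conn_to_session: Callable[[str], bool],  # hypothetical: True on success
    create_session: Callable[[], str],            # hypothetical: returns new id
) -> str:
    """Return the id of the session to use with the new server."""
    # A successful bind suggests the session was transferred, although it
    # can also succeed against a pre-existing session with the same id.
    if old_session_id is not None and bind_conn_to_session(old_session_id):
        return old_session_id
    # A failed bind indicates that the session was not transferred.
    return create_session()


def replace_after_sequence_error(create_session: Callable[[], str]) -> str:
    """On sequence errors after a successful bind, switch to a new session."""
    return create_session()
```

The second function reflects the state merger case, in which a bind can succeed without session migration and the slot sequence values then turn out to be inappropriate.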
Once the client has determined the initial migration status, and | Once the client has determined the initial migration status, and | |||
skipping to change at line 12927 ¶ | skipping to change at line 12927 ¶ | |||
Clients need to deal with the following cases: | Clients need to deal with the following cases: | |||
* In the state merger case, it is possible that the server has not | * In the state merger case, it is possible that the server has not | |||
attempted Transparent State Migration, in which case state may | attempted Transparent State Migration, in which case state may | |||
have been lost without it being reflected in the SEQ4_STATUS bits. | have been lost without it being reflected in the SEQ4_STATUS bits. | |||
To determine whether this has happened, the client can use | To determine whether this has happened, the client can use | |||
TEST_STATEID to check whether the stateids created on the source | TEST_STATEID to check whether the stateids created on the source | |||
server are still accessible on the destination server. Once a | server are still accessible on the destination server. Once a | |||
single stateid is found to have been successfully transferred, the | single stateid is found to have been successfully transferred, the | |||
client can conclude that Transparent State Migration was begun and | client can conclude that Transparent State Migration was begun, | |||
any failure to transport all of the stateids will be reflected in | and any failure to transport all of the stateids will be reflected | |||
the SEQ4_STATUS bits. Otherwise, Transparent State Migration has | in the SEQ4_STATUS bits. Otherwise, Transparent State Migration | |||
not occurred. | has not occurred. | |||
* In a case in which Transparent State Migration has not occurred, | * In a case in which Transparent State Migration has not occurred, | |||
the client can use the per-fs grace period provided by the | the client can use the per-fs grace period provided by the | |||
destination server to reclaim locks that were held on the source | destination server to reclaim locks that were held on the source | |||
server. | server. | |||
* In a case in which Transparent State Migration has occurred, and | * In a case in which Transparent State Migration has occurred, and | |||
no lock state was lost (as shown by SEQ4_STATUS flags), no lock | no lock state was lost (as shown by SEQ4_STATUS flags), no lock | |||
reclaim is necessary. | reclaim is necessary. | |||
* In a case in which Transparent State Migration has occurred, and | * In a case in which Transparent State Migration has occurred, and | |||
some lock state was lost (as shown by SEQ4_STATUS flags), existing | some lock state was lost (as shown by SEQ4_STATUS flags), existing | |||
stateids need to be checked for validity using TEST_STATEID, and | stateids need to be checked for validity using TEST_STATEID, and | |||
reclaim used to re-establish any that were not transferred. | reclaim used to re-establish any that were not transferred. | |||
For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value | For all of the cases above, RECLAIM_COMPLETE with an rca_one_fs value | |||
of TRUE needs to be done before normal use of the file system | of TRUE needs to be done before normal use of the file system, | |||
including obtaining new locks for the file system. This applies even | including obtaining new locks for the file system. This applies even | |||
if no locks were lost and there was no need for any to be reclaimed. | if no locks were lost and there was no need for any to be reclaimed. | |||
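The recovery cases above can be summarized as a small decision procedure. The sketch below is illustrative only: the action strings and function names are hypothetical labels, not protocol elements, and the inputs are assumed to have been derived already from the client ID determination, the SEQ4_STATUS flags, and TEST_STATEID results.

```python
from typing import Iterable, List


def tsm_begun_in_merger_case(stateid_still_valid: Iterable[bool]) -> bool:
    """In the state merger case, one transferred stateid is enough to
    conclude that Transparent State Migration was begun."""
    return any(stateid_still_valid)


def recovery_actions(tsm_occurred: bool, lock_state_lost: bool) -> List[str]:
    """List the client actions for one migrated file system, in order."""
    actions: List[str] = []
    if not tsm_occurred:
        # Reclaim locks during the per-fs grace period.
        actions.append("reclaim locks held on source")
    elif lock_state_lost:
        # Check existing stateids and reclaim those not transferred.
        actions.append("TEST_STATEID on existing stateids")
        actions.append("reclaim locks not transferred")
    # Needed in every case, even when nothing was reclaimed.
    actions.append("RECLAIM_COMPLETE with rca_one_fs TRUE")
    return actions
```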
11.13.5. Obtaining Access to Sessions and State after Network Address | 11.13.5. Obtaining Access to Sessions and State after Network Address | |||
Transfer | Transfer | |||
The case in which there is a transfer to a new network address | The case in which there is a transfer to a new network address | |||
without migration is similar to that described in Section 11.13.4 | without migration is similar to that described in Section 11.13.4 | |||
above in that there is a need to obtain access to needed sessions and | above in that there is a need to obtain access to needed sessions and | |||
locking state. However, the details are simpler and will vary | locking state. However, the details are simpler and will vary | |||
depending on the type of trunking between the address receiving | depending on the type of trunking between the address receiving | |||
NFS4ERR_MOVED and that to which the transfer is to be made | NFS4ERR_MOVED and that to which the transfer is to be made. | |||
To make a session available for use, a BIND_CONN_TO_SESSION should be | To make a session available for use, a BIND_CONN_TO_SESSION should be | |||
used to obtain access to the session previously in use. Only if this | used to obtain access to the session previously in use. Only if this | |||
fails, should a CREATE_SESSION be done. While this procedure mirrors | fails, should a CREATE_SESSION be done. While this procedure mirrors | |||
that in Section 11.13.4 above, there is an important difference in | that in Section 11.13.4 above, there is an important difference in | |||
that preservation of the session is not purely optional but depends | that preservation of the session is not purely optional but depends | |||
on the type of trunking. | on the type of trunking. | |||
Access to appropriate locking state will generally need no actions | Access to appropriate locking state will generally need no actions | |||
beyond access to the session. However, the SEQ4_STATUS bits need to | beyond access to the session. However, the SEQ4_STATUS bits need to | |||
be checked for lost locking state, including the need to reclaim | be checked for lost locking state, including the need to reclaim | |||
locks after a server reboot, since there is always a possibility of | locks after a server reboot, since there is always a possibility of | |||
locking state being lost. | locking state being lost. | |||
11.14. Server Responsibilities Upon Migration | 11.14. Server Responsibilities Upon Migration | |||
In the event of file system migration, when the client connects to | In the event of file system migration, when the client connects to | |||
the destination server, that server needs to be able to provide the | the destination server, that server needs to be able to provide the | |||
client continued to access the files it had open on the source | client continued access to the files it had open on the source | |||
server. There are two ways to provide this: | server. There are two ways to provide this: | |||
* By provision of an fs-specific grace period, allowing the client | * By provision of an fs-specific grace period, allowing the client | |||
the ability to reclaim its locks, in a fashion similar to what | the ability to reclaim its locks, in a fashion similar to what | |||
would have been done in the case of recovery from a server | would have been done in the case of recovery from a server | |||
restart. See Section 11.14.1 for a more complete discussion. | restart. See Section 11.14.1 for a more complete discussion. | |||
* By implementing Transparent State Migration possibly in connection | * By implementing Transparent State Migration possibly in connection | |||
with session migration, the server can provide the client | with session migration, the server can provide the client | |||
immediate access to the state built up on the source server, on | immediate access to the state built up on the source server on the | |||
the destination. | destination server. | |||
These features are discussed separately in Sections 11.14.2 and | These features are discussed separately in Sections 11.14.2 and | |||
11.14.3, which discuss Transparent State Migration and session | 11.14.3, which discuss Transparent State Migration and session | |||
migration respectively. | migration, respectively. | |||
All the features described above can involve transfer of lock-related | All the features described above can involve transfer of lock-related | |||
information between source and destination servers. In some cases, | information between source and destination servers. In some cases, | |||
this transfer is a necessary part of the implementation while in | this transfer is a necessary part of the implementation, while in | |||
other cases it is a helpful implementation aid which servers might or | other cases, it is a helpful implementation aid, which servers might | |||
might not use. The sub-sections below discuss the information which | or might not use. The subsections below discuss the information that | |||
would be transferred but do not define the specifics of the transfer | would be transferred but do not define the specifics of the transfer | |||
protocol. This is left as an implementation choice although | protocol. This is left as an implementation choice, although | |||
standards in this area could be developed at a later time. | standards in this area could be developed at a later time. | |||
11.14.1. Server Responsibilities in Effecting State Reclaim after | 11.14.1. Server Responsibilities in Effecting State Reclaim after | |||
Migration | Migration | |||
In this case, the destination server needs no knowledge of the locks | In this case, the destination server needs no knowledge of the locks | |||
held on the source server. It relies on the clients to accurately | held on the source server. It relies on the clients to accurately | |||
report (via reclaim operations) the locks previously held, and does | report (via reclaim operations) the locks previously held, and does | |||
not allow new locks to be granted on migrated file systems until the | not allow new locks to be granted on migrated file systems until the | |||
grace period expires. Disallowing of new locks applies to all | grace period expires. Disallowing of new locks applies to all | |||
clients accessing these file system, while grace period expiration | clients accessing these file systems, while grace period expiration | |||
occurs for each migrated client independently. | occurs for each migrated client independently. | |||
During this grace period clients have the opportunity to use reclaim | During this grace period, clients have the opportunity to use reclaim | |||
operations to obtain locks for file system objects within the | operations to obtain locks for file system objects within the | |||
migrated file system, in the same way that they do when recovering | migrated file system, in the same way that they do when recovering | |||
from server restart, and the servers typically rely on clients to | from server restart, and the servers typically rely on clients to | |||
accurately report their locks, although they have the option of | accurately report their locks, although they have the option of | |||
subjecting these requests to verification. If the clients only | subjecting these requests to verification. If the clients only | |||
reclaim locks held on the source server, no conflict can arise. Once | reclaim locks held on the source server, no conflict can arise. Once | |||
the client has reclaimed its locks, it indicates the completion of | the client has reclaimed its locks, it indicates the completion of | |||
lock reclamation by performing a RECLAIM_COMPLETE specifying | lock reclamation by performing a RECLAIM_COMPLETE specifying | |||
rca_one_fs as TRUE. | rca_one_fs as TRUE. | |||
While it is not necessary for source and destination servers to co- | While it is not necessary for source and destination servers to | |||
operate to transfer information about locks, implementations are | cooperate to transfer information about locks, implementations are | |||
well-advised to consider transferring the following useful | well advised to consider transferring the following useful | |||
information: | information: | |||
* If information about the set of clients that have locking state | * If information about the set of clients that have locking state | |||
for the transferred file system is made available, the destination | for the transferred file system is made available, the destination | |||
server will be able to terminate the grace period once all such | server will be able to terminate the grace period once all such | |||
clients have reclaimed their locks, allowing normal locking | clients have reclaimed their locks, allowing normal locking | |||
activity to resume earlier than it would have otherwise. | activity to resume earlier than it would have otherwise. | |||
* Locking summary information for individual clients (at various | * Locking summary information for individual clients (at various | |||
possible levels of detail) can detect some instances in which | possible levels of detail) can detect some instances in which | |||
clients do not accurately represent the locks held on the source | clients do not accurately represent the locks held on the source | |||
server. | server. | |||
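The first item above, ending the grace period early once every client with locking state has finished reclaiming, can be sketched as simple set tracking on the destination server. The class and method names are hypothetical, and the sketch assumes the source server has supplied the set of client identifiers by some implementation-specific means.

```python
from typing import Hashable, Iterable, Set


class FsGracePeriod:
    """Tracks reclaim completion for one migrated file system."""

    def __init__(self, clients_with_state: Iterable[Hashable]) -> None:
        # Assumption: this set was made available by the source server.
        self._pending: Set[Hashable] = set(clients_with_state)

    def reclaim_complete(self, client: Hashable) -> None:
        """Record a RECLAIM_COMPLETE (rca_one_fs TRUE) from a client."""
        self._pending.discard(client)

    def can_end_early(self) -> bool:
        """True once all known clients have finished reclaiming."""
        return not self._pending
```

Without the transferred client set, a server has no comparable signal and would wait for the grace period to expire.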
11.14.2. Server Responsibilities in Effecting Transparent State | 11.14.2. Server Responsibilities in Effecting Transparent State | |||
Migration | Migration | |||
The basic responsibility of the source server in effecting | The basic responsibility of the source server in effecting | |||
Transparent State Migration is to make available to the destination | Transparent State Migration is to make available to the destination | |||
server a description of each piece of locking state associated with | server a description of each piece of locking state associated with | |||
the file system being migrated. In addition to client id string and | the file system being migrated. In addition to client id string and | |||
verifier, the source server needs to provide, for each stateid: | verifier, the source server needs to provide for each stateid: | |||
* The stateid including the current sequence value. | * The stateid including the current sequence value. | |||
* The associated client ID. | * The associated client ID. | |||
* The handle of the associated file. | * The handle of the associated file. | |||
* The type of the lock, such as open, byte-range lock, delegation, | * The type of the lock, such as open, byte-range lock, delegation, | |||
or layout. | or layout. | |||
skipping to change at line 13091 ¶ | skipping to change at line 13091 ¶ | |||
that locks revoked soon before or soon after migration are not | that locks revoked soon before or soon after migration are not | |||
inadvertently allowed to be reclaimed in situations in which the | inadvertently allowed to be reclaimed in situations in which the | |||
continuity of lock possession cannot be assured. | continuity of lock possession cannot be assured. | |||
* For locks lost on the source but whose loss has not yet been | * For locks lost on the source but whose loss has not yet been | |||
acknowledged by the client (by using FREE_STATEID), the | acknowledged by the client (by using FREE_STATEID), the | |||
destination must be aware of this loss so that it can deny a | destination must be aware of this loss so that it can deny a | |||
request to reclaim them. | request to reclaim them. | |||
* For locks lost on the destination after the state transfer but | * For locks lost on the destination after the state transfer but | |||
before the client's RECLAIM_COMPLTE is done, the destination | before the client's RECLAIM_COMPLETE is done, the destination | |||
server should note these and not allow them to be reclaimed. | server should note these and not allow them to be reclaimed. | |||
An additional responsibility of the cooperating servers concerns | An additional responsibility of the cooperating servers concerns | |||
situations in which a stateid cannot be transferred transparently | situations in which a stateid cannot be transferred transparently | |||
because it conflicts with an existing stateid held by the client and | because it conflicts with an existing stateid held by the client and | |||
associated with a different file system. In this case there are two | associated with a different file system. In this case, there are two | |||
valid choices: | valid choices: | |||
* Treat the transfer, as in NFSv4.0, as one without Transparent | * Treat the transfer, as in NFSv4.0, as one without Transparent | |||
State Migration. In this case, conflicting locks cannot be | State Migration. In this case, conflicting locks cannot be | |||
granted until the client does a RECLAIM_COMPLETE, after reclaiming | granted until the client does a RECLAIM_COMPLETE, after reclaiming | |||
the locks it had, with the exception of reclaims denied because | the locks it had, with the exception of reclaims denied because | |||
they were attempts to reclaim locks that had been lost. | they were attempts to reclaim locks that had been lost. | |||
* Implement Transparent State Migration, except for the lock with | * Implement Transparent State Migration, except for the lock with | |||
the conflicting stateid. In this case, the client will be aware | the conflicting stateid. In this case, the client will be aware | |||
of a lost lock (through the SEQ4_STATUS flags) and be allowed to | of a lost lock (through the SEQ4_STATUS flags) and be allowed to | |||
reclaim it. | reclaim it. | |||
When transferring state between the source and destination, the | When transferring state between the source and destination, the | |||
issues discussed in Section 7.2 of [68] must still be attended to. | issues discussed in Section 7.2 of [68] must still be attended to. | |||
In this case, the use of NFS4ERR_DELAY may still necessary in | In this case, the use of NFS4ERR_DELAY may still be necessary in | |||
NFSv4.1, as it was in NFSv4.0, to prevent locking state changing | NFSv4.1, as it was in NFSv4.0, to prevent locking state changing | |||
while it is being transferred. See Section 15.1.1.3 for information | while it is being transferred. See Section 15.1.1.3 for information | |||
about appropriate client retry approaches in the event that | about appropriate client retry approaches in the event that | |||
NFS4ERR_DELAY is returned. | NFS4ERR_DELAY is returned. | |||
There are a number of important differences in the NFSv4.1 context: | There are a number of important differences in the NFSv4.1 context: | |||
* The absence of RELEASE_LOCKOWNER means that the one case in which | * The absence of RELEASE_LOCKOWNER means that the one case in which | |||
an operation could not be deferred by use of NFS4ERR_DELAY no | an operation could not be deferred by use of NFS4ERR_DELAY no | |||
longer exists. | longer exists. | |||
* Sequencing of operations is no longer done using owner-based | * Sequencing of operations is no longer done using owner-based | |||
operation sequence numbers. Instead, sequencing is session- | operation sequence numbers. Instead, sequencing is session- | |||
based | based. | |||
As a result, when sessions are not transferred, the techniques | As a result, when sessions are not transferred, the techniques | |||
discussed in Section 7.2 of [68] are adequate and will not be further | discussed in Section 7.2 of [68] are adequate and will not be further | |||
discussed. | discussed. | |||
11.14.3. Server Responsibilities in Effecting Session Transfer | 11.14.3. Server Responsibilities in Effecting Session Transfer | |||
The basic responsibility of the source server in effecting session | The basic responsibility of the source server in effecting session | |||
transfer is to make available to the destination server a description | transfer is to make available to the destination server a description | |||
of the current state of each slot with the session, including: | of the current state of each slot with the session, including the | |||
following: | ||||
* The last sequence value received for that slot. | * The last sequence value received for that slot. | |||
* Whether there is cached reply data for the last request executed | * Whether there is cached reply data for the last request executed | |||
and, if so, the cached reply. | and, if so, the cached reply. | |||
When sessions are transferred, there are a number of issues that pose | When sessions are transferred, there are a number of issues that pose | |||
challenges in terms of making the transferred state unmodifiable | challenges in terms of making the transferred state unmodifiable | |||
during the period it is gathered up and transferred to the | during the period it is gathered up and transferred to the | |||
destination server. | destination server: | |||
* A single session may be used to access multiple file systems, not | * A single session may be used to access multiple file systems, not | |||
all of which are being transferred. | all of which are being transferred. | |||
* Requests made on a session may, even if rejected, affect the state | * Requests made on a session may, even if rejected, affect the state | |||
of the session by advancing the sequence number associated with | of the session by advancing the sequence number associated with | |||
the slot used. | the slot used. | |||
As a result, when the file system state might otherwise be considered | As a result, when the file system state might otherwise be considered | |||
unmodifiable, the client might have any number of in-flight requests, | unmodifiable, the client might have any number of in-flight requests, | |||
each of which is capable of changing session state, which may be of a | each of which is capable of changing session state, which may be of a | |||
number of types: | number of types: | |||
1. Those requests that were processed on the migrating file system, | 1. Those requests that were processed on the migrating file system | |||
before migration began. | before migration began. | |||
2. Those requests which got the error NFS4ERR_DELAY because the file | 2. Those requests that received the error NFS4ERR_DELAY because the | |||
system being accessed was in the process of being migrated. | file system being accessed was in the process of being migrated. | |||
3. Those requests which got the error NFS4ERR_MOVED because the file | 3. Those requests that received the error NFS4ERR_MOVED because the | |||
system being accessed had been migrated. | file system being accessed had been migrated. | |||
4. Those requests that accessed the migrating file system, in order | 4. Those requests that accessed the migrating file system in order | |||
to obtain location or status information. | to obtain location or status information. | |||
5. Those requests that did not reference the migrating file system. | 5. Those requests that did not reference the migrating file system. | |||
It should be noted that the history of any particular slot is likely | It should be noted that the history of any particular slot is likely | |||
to include a number of these request classes. In the case in which a | to include a number of these request classes. In the case in which a | |||
session which is migrated is used by file systems other than the one | session that is migrated is used by file systems other than the one | |||
migrated, requests of class 5 may be common and be the last request | migrated, requests of class 5 may be common and may be the last | |||
processed, for many slots. | request processed for many slots. | |||
Since session state can change even after the locking state has been | Since session state can change even after the locking state has been | |||
fixed as part of the migration process, the session state known to | fixed as part of the migration process, the session state known to | |||
the client could be different from that on the destination server, | the client could be different from that on the destination server, | |||
which necessarily reflects the session state on the source server, at | which necessarily reflects the session state on the source server at | |||
an earlier time. In deciding how to deal with this situation, it is | an earlier time. In deciding how to deal with this situation, it is | |||
helpful to distinguish between two sorts of behavioral consequences | helpful to distinguish between two sorts of behavioral consequences | |||
of the choice of initial sequence ID values. | of the choice of initial sequence ID values: | |||
* The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID | * The error NFS4ERR_SEQ_MISORDERED is returned when the sequence ID | |||
in a request is neither equal to the last one seen for the current | in a request is neither equal to the last one seen for the current | |||
slot nor the next greater one. | slot nor the next greater one. | |||
In view of the difficulty of arriving at a mutually acceptable | In view of the difficulty of arriving at a mutually acceptable | |||
value for the correct last sequence value at the point of | value for the correct last sequence value at the point of | |||
migration, it may be necessary for the server to show some degree | migration, it may be necessary for the server to show some degree | |||
of forbearance, when the sequence ID is one that would be | of forbearance when the sequence ID is one that would be | |||
considered unacceptable if session migration were not involved. | considered unacceptable if session migration were not involved. | |||
* Returning the cached reply for a previously executed request when | * Returning the cached reply for a previously executed request when | |||
the sequence ID in the request matches the last value recorded for | the sequence ID in the request matches the last value recorded for | |||
the slot. | the slot. | |||
In the cases in which an error is returned and there is no | In the cases in which an error is returned and there is no | |||
possibility of any non-idempotent operation having been executed, | possibility of any non-idempotent operation having been executed, | |||
it may not be necessary to adhere to this as strictly as might be | it may not be necessary to adhere to this as strictly as might be | |||
proper if session migration were not involved. For example, the | proper if session migration were not involved. For example, the | |||
fact that the error NFS4ERR_DELAY was returned may not assist the | fact that the error NFS4ERR_DELAY was returned may not assist the | |||
client in any material way, while the fact that NFS4ERR_MOVED was | client in any material way, while the fact that NFS4ERR_MOVED was | |||
returned by the source server may not be relevant when the request | returned by the source server may not be relevant when the request | |||
was reissued, directed to the destination server. | was reissued and directed to the destination server. | |||
An important issue is that the specification needs to take note of | An important issue is that the specification needs to take note of | |||
all potential COMPOUNDs, even if they might be unlikely in practice. | all potential COMPOUNDs, even if they might be unlikely in practice. | |||
For example, a COMPOUND is allowed to access multiple file systems | For example, a COMPOUND is allowed to access multiple file systems | |||
and might perform non-idempotent operations in some of them before | and might perform non-idempotent operations in some of them before | |||
accessing a file system being migrated. Also, a COMPOUND may return | accessing a file system being migrated. Also, a COMPOUND may return | |||
considerable data in the response, before being rejected with | considerable data in the response before being rejected with | |||
NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as | NFS4ERR_DELAY or NFS4ERR_MOVED, and may in addition be marked as | |||
sa_cachethis. However, note that if the client and server adhere to | sa_cachethis. However, note that if the client and server adhere to | |||
the rules in Section 15.1.1.3, there is no possibility of | the rules in Section 15.1.1.3, there is no possibility of | |||
non-idempotent operations being spuriously reissued after receiving | non-idempotent operations being spuriously reissued after receiving | |||
an NFS4ERR_DELAY response. | an NFS4ERR_DELAY response. | |||
To address these issues, a destination server MAY do any of the | To address these issues, a destination server MAY do any of the | |||
following when implementing session transfer. | following when implementing session transfer: | |||
* Avoid enforcing any sequencing semantics for a particular slot | * Avoid enforcing any sequencing semantics for a particular slot | |||
until the client has established the starting sequence for that | until the client has established the starting sequence for that | |||
slot on the destination server. | slot on the destination server. | |||
* For each slot, avoid returning a cached reply returning | * For each slot, avoid returning a cached reply returning | |||
NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established | NFS4ERR_DELAY or NFS4ERR_MOVED until the client has established | |||
the starting sequence for that slot on the destination server. | the starting sequence for that slot on the destination server. | |||
* Until the client has established the starting sequence for a | * Until the client has established the starting sequence for a | |||
particular slot on the destination server, avoid reporting | particular slot on the destination server, avoid reporting | |||
NFS4ERR_SEQ_MISORDERED or returning a cached reply returning | NFS4ERR_SEQ_MISORDERED or returning a cached reply returning | |||
NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of | NFS4ERR_DELAY or NFS4ERR_MOVED, where the reply consists solely of | |||
a series of operations where the response is NFS4_OK until the | a series of operations where the response is NFS4_OK until the | |||
final error. | final error. | |||
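The third option above turns on the shape of the cached reply. A minimal sketch of that test follows, assuming the reply is available as a list of per-operation status values. The Status enumeration, the slot_established flag, and the function name are hypothetical and are not defined by the protocol.

```python
from enum import Enum


class Status(Enum):
    # Only the status names are taken from the text; NFS4ERR_EXIST is
    # included purely as an example of some other error.
    NFS4_OK = "NFS4_OK"
    NFS4ERR_DELAY = "NFS4ERR_DELAY"
    NFS4ERR_MOVED = "NFS4ERR_MOVED"
    NFS4ERR_EXIST = "NFS4ERR_EXIST"


TRANSIENT = {Status.NFS4ERR_DELAY, Status.NFS4ERR_MOVED}


def may_skip_cached_reply(op_statuses, slot_established):
    """Return True if a destination server could decline to replay this reply.

    The reply qualifies only while the client has not yet established the
    starting sequence for the slot, and only if every operation returned
    NFS4_OK up to a final NFS4ERR_DELAY or NFS4ERR_MOVED.
    """
    if slot_established or not op_statuses:
        return False
    *earlier, final = op_statuses
    return final in TRANSIENT and all(s is Status.NFS4_OK for s in earlier)
```

A reply whose earlier operations include any other error is always replayed in this sketch.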
Because of the considerations mentioned above including the rules for | Because of the considerations mentioned above, including the rules | |||
the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the | for the handling of NFS4ERR_DELAY included in Section 15.1.1.3, the | |||
destination server can respond appropriately to SEQUENCE operations | destination server can respond appropriately to SEQUENCE operations | |||
received from the client by adopting the three policies listed below: | received from the client by adopting the three policies listed below: | |||
* Not responding with NFS4ERR_SEQ_MISORDERED for the initial request | * Not responding with NFS4ERR_SEQ_MISORDERED for the initial request | |||
on a slot within a transferred session, since the destination | on a slot within a transferred session because the destination | |||
server cannot be aware of requests made by the client after the | server cannot be aware of requests made by the client after the | |||
server handoff but before the client became aware of the shift. | server handoff but before the client became aware of the shift. | |||
In cases in which NFS4ERR_SEQ_MISORDERED would normally have been | In cases in which NFS4ERR_SEQ_MISORDERED would normally have been | |||
reported, the request is to be processed normally, as a new | reported, the request is to be processed normally as a new | |||
request. | request. | |||
* Replying as it would for a retry whenever the sequence matches | * Replying as it would for a retry whenever the sequence matches | |||
that transferred by the source server, even though this would not | that transferred by the source server, even though this would not | |||
provide retry handling for requests issued after the server | provide retry handling for requests issued after the server | |||
handoff, under the assumption that when such requests are issued | handoff, under the assumption that, when such requests are issued, | |||
they will never be responded to in a state-changing fashion, | they will never be responded to in a state-changing fashion, | |||
making retry support for them unnecessary. | making retry support for them unnecessary. | |||
* Once a non-retry SEQUENCE is received for a given slot, using that | * Once a non-retry SEQUENCE is received for a given slot, using that | |||
as the basis for further sequence checking, with no further | as the basis for further sequence checking, with no further | |||
reference to the sequence value transferred by the source. | reference to the sequence value transferred by the source server. | |||
server. | ||||
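The three policies can be read as a small per-slot state machine. The sketch below is one possible reading, not a normative algorithm. The TransferredSlot type, its field names, and the action strings are hypothetical, and the caller is assumed to store the reply in cached_reply after executing a request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransferredSlot:
    transferred_seqid: int        # last sequence ID reported by the source server
    cached_reply: Optional[str]   # reply cached for the last sequence ID, if any
    established: bool = False     # True once a non-retry SEQUENCE has been seen
    last_seqid: int = 0           # basis for checking once established


def handle_sequence(slot, seqid):
    """Return (action, payload); action is 'replay', 'execute', or 'misordered'."""
    if not slot.established:
        # Policy 2: a match with the transferred value is treated as a retry.
        if seqid == slot.transferred_seqid:
            return ("replay", slot.cached_reply)
        # Policy 1: anything else is processed as a new request, not rejected.
        # Policy 3: this request becomes the basis for later checking.
        slot.established = True
        slot.last_seqid = seqid
        slot.cached_reply = None
        return ("execute", None)
    # Ordinary checking once the starting sequence is established.
    if seqid == slot.last_seqid:
        return ("replay", slot.cached_reply)
    if seqid == (slot.last_seqid + 1) & 0xFFFFFFFF:  # 32-bit wraparound
        slot.last_seqid = seqid
        slot.cached_reply = None
        return ("execute", None)
    return ("misordered", None)
```

Once the slot is established, the value transferred by the source server is never consulted again, which is the point of the third policy.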
11.15. Effecting File System Referrals | 11.15. Effecting File System Referrals | |||
Referrals are effected when an absent file system is encountered and | Referrals are effected when an absent file system is encountered and | |||
one or more alternate locations are made available by the | one or more alternate locations are made available by the | |||
fs_locations or fs_locations_info attributes. The client will | fs_locations or fs_locations_info attributes. The client will | |||
typically get an NFS4ERR_MOVED error, fetch the appropriate location | typically get an NFS4ERR_MOVED error, fetch the appropriate location | |||
information, and proceed to access the file system on a different | information, and proceed to access the file system on a different | |||
server, even though it retains its logical position within the | server, even though it retains its logical position within the | |||
original namespace. Referrals differ from migration events in that | original namespace. Referrals differ from migration events in that | |||
they happen only when the client has not previously referenced the | they happen only when the client has not previously referenced the | |||
file system in question (so there is nothing to transition). | file system in question (so there is nothing to transition). | |||
Referrals can only come into effect when an absent file system is | Referrals can only come into effect when an absent file system is | |||
encountered at its root. | encountered at its root. | |||
The examples given in the sections below are somewhat artificial in | The examples given in the sections below are somewhat artificial in | |||
that an actual client will not typically do a multi-component look | that an actual client will not typically do a multi-component look | |||
up, but will have cached information regarding the upper levels of | up, but will have cached information regarding the upper levels of | |||
the name hierarchy. However, these examples are chosen to make the | the name hierarchy. However, these examples are chosen to make the | |||
required behavior clear and easy to put within the scope of a small | required behavior clear and easy to put within the scope of a small | |||
number of requests, without getting a discussion of the details of | number of requests, without getting into a discussion of the details | |||
how specific clients might choose to cache things. | of how specific clients might choose to cache things. | |||
11.15.1. Referral Example (LOOKUP) | 11.15.1. Referral Example (LOOKUP) | |||
Let us suppose that the following COMPOUND is sent in an environment | Let us suppose that the following COMPOUND is sent in an environment | |||
in which /this/is/the/path is absent from the target server. This | in which /this/is/the/path is absent from the target server. This | |||
may be for a number of reasons. It may be that the file system has | may be for a number of reasons. It may be that the file system has | |||
moved, or it may be that the target server is functioning mainly, or | moved, or it may be that the target server is functioning mainly, or | |||
solely, to refer clients to the servers on which various file systems | solely, to refer clients to the servers on which various file systems | |||
are located. | are located. | |||
skipping to change at line 13731 ¶ | skipping to change at line 13731 ¶ | |||
different write-verifier class from the source. | different write-verifier class from the source. | |||
The specific choices reflect typical implementation patterns for | The specific choices reflect typical implementation patterns for | |||
failover and controlled migration, respectively. Since other choices | failover and controlled migration, respectively. Since other choices | |||
are possible and useful, this information is better obtained by using | are possible and useful, this information is better obtained by using | |||
fs_locations_info. When a server implementation needs to communicate | fs_locations_info. When a server implementation needs to communicate | |||
other choices, it MUST support the fs_locations_info attribute. | other choices, it MUST support the fs_locations_info attribute. | |||
See Section 21 for a discussion on the recommendations for the | See Section 21 for a discussion on the recommendations for the | |||
security flavor to be used by any GETATTR operation that requests the | security flavor to be used by any GETATTR operation that requests the | |||
"fs_locations" attribute. | fs_locations attribute. | |||
11.17. The Attribute fs_locations_info | 11.17. The Attribute fs_locations_info | |||
The fs_locations_info attribute is intended as a more functional | The fs_locations_info attribute is intended as a more functional | |||
replacement for the fs_locations attribute which will continue to | replacement for the fs_locations attribute, which will continue to | |||
exist and be supported. Clients can use it to get a more complete | exist and be supported. Clients can use it to get a more complete | |||
set of data about alternative file system locations, including | set of data about alternative file system locations, including | |||
additional network paths to access replicas in use and additional | additional network paths to access replicas in use and additional | |||
replicas. When the server does not support fs_locations_info, | replicas. When the server does not support fs_locations_info, | |||
fs_locations can be used to get a subset of the data. A server that | fs_locations can be used to get a subset of the data. A server that | |||
supports fs_locations_info MUST support fs_locations as well. | supports fs_locations_info MUST support fs_locations as well. | |||
There is additional data present in fs_locations_info, that is not | There is additional data present in fs_locations_info that is not | |||
available in fs_locations: | available in fs_locations: | |||
* Attribute continuity information. This information will allow a | * Attribute continuity information. This information will allow a | |||
client to select a replica that meets the transparency | client to select a replica that meets the transparency | |||
requirements of the applications accessing the data and to | requirements of the applications accessing the data and to | |||
leverage optimizations due to the server guarantees of attribute | leverage optimizations due to the server guarantees of attribute | |||
continuity (e.g., if the change attribute of a file of the file | continuity (e.g., if the change attribute of a file of the file | |||
system is continuous between multiple replicas, the client does | system is continuous between multiple replicas, the client does | |||
not have to invalidate the file's cache when switching to a | not have to invalidate the file's cache when switching to a | |||
different replica). | different replica). | |||
skipping to change at line 13782 ¶ | skipping to change at line 13782 ¶ | |||
used to implement load-balancing while giving the client the | used to implement load-balancing while giving the client the | |||
entire file system list to be used in case the primary fails. | entire file system list to be used in case the primary fails. | |||
The fs_locations_info attribute is structured similarly to the | The fs_locations_info attribute is structured similarly to the | |||
fs_locations attribute. A top-level structure (fs_locations_info4) | fs_locations attribute. A top-level structure (fs_locations_info4) | |||
contains the entire attribute including the root pathname of the file | contains the entire attribute including the root pathname of the file | |||
system and an array of lower-level structures that define replicas | system and an array of lower-level structures that define replicas | |||
that share a common rootpath on their respective servers. The lower- | that share a common rootpath on their respective servers. The lower- | |||
level structure in turn (fs_locations_item4) contains a specific | level structure in turn (fs_locations_item4) contains a specific | |||
pathname and information on one or more individual network access | pathname and information on one or more individual network access | |||
paths. For that last lowest level, fs_locations_info has an | paths. For that last, lowest level, fs_locations_info has an | |||
fs_locations_server4 structure that contains per-server-replica | fs_locations_server4 structure that contains per-server-replica | |||
information in addition to the file system location entry. This per- | information in addition to the file system location entry. This per- | |||
server-replica information includes a nominally opaque array, | server-replica information includes a nominally opaque array, | |||
fls_info, within which specific pieces of information are located at | fls_info, within which specific pieces of information are located at | |||
the specific indices listed below. | the specific indices listed below. | |||
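The three-level nesting described above can be pictured with the following sketch. The structure names and the fields fls_currency, fls_info, and fls_server come from the text. The fli_* field names are assumed to follow the XDR definition and should be checked against it, and the network_paths helper is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FsLocationsServer4:           # lowest level: one network access path
    fls_currency: int
    fls_info: bytes                 # nominally opaque array of 8-bit items
    fls_server: str


@dataclass
class FsLocationsItem4:             # replicas sharing a common rootpath
    fli_entries: List[FsLocationsServer4]   # field name assumed
    fli_rootpath: List[str]                 # field name assumed


@dataclass
class FsLocationsInfo4:             # top level: the entire attribute
    fli_flags: int                          # field name assumed
    fli_valid_for: int                      # field name assumed
    fli_fs_root: List[str]                  # field name assumed
    fli_items: List[FsLocationsItem4]       # field name assumed


def network_paths(info):
    """Yield (rootpath, server) for every network access path in the attribute."""
    for item in info.fli_items:
        rootpath = "/" + "/".join(item.fli_rootpath)
        for entry in item.fli_entries:
            yield (rootpath, entry.fls_server)
```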
Two fs_locations_server4 entries that are within different | Two fs_locations_server4 entries that are within different | |||
fs_locations_item4 structures are never trunkable, while two entries | fs_locations_item4 structures are never trunkable, while two entries | |||
within the same fs_locations_item4 structure might or might not be | within the same fs_locations_item4 structure might or might not be | |||
trunkable. Two entries that are trunkable will have identical | trunkable. Two entries that are trunkable will have identical | |||
skipping to change at line 13909 ¶ | skipping to change at line 13909 ¶ | |||
The data presented in the fs_locations_info attribute may be obtained | The data presented in the fs_locations_info attribute may be obtained | |||
by the server in any number of ways, including specification by the | by the server in any number of ways, including specification by the | |||
administrator or by current protocols for transferring data among | administrator or by current protocols for transferring data among | |||
replicas and protocols not yet developed. NFSv4.1 only defines how | replicas and protocols not yet developed. NFSv4.1 only defines how | |||
this information is presented by the server to the client. | this information is presented by the server to the client. | |||
11.17.1. The fs_locations_server4 Structure | 11.17.1. The fs_locations_server4 Structure | |||
The fs_locations_server4 structure consists of the following items in | The fs_locations_server4 structure consists of the following items in | |||
addition to the fls_server field which specifies a network address or | addition to the fls_server field, which specifies a network address | |||
set of addresses to be used to access the specified file system. | or set of addresses to be used to access the specified file system. | |||
Note that both of these items (i.e., fls_currency and fls_info) specify | Note that both of these items (i.e., fls_currency and fls_info) specify | |||
attributes of the file system replica and should not be different | attributes of the file system replica and should not be different | |||
when there are multiple fs_locations_server4 structures for the same | when there are multiple fs_locations_server4 structures, each | |||
replica, each specifying a network path to the chosen replica. | specifying a network path to the chosen replica, for the same | |||
replica. | ||||
When these values are different in two fs_locations_server4 | When these values are different in two fs_locations_server4 | |||
structures, a client has no basis for choosing one over the other and | structures, a client has no basis for choosing one over the other and | |||
is best off simply ignoring both entries, whether these entries apply | is best off simply ignoring both entries, whether these entries apply | |||
to migration, replication, or referral. When there are more than two | to migration, replication, or referral. When there are more than two | |||
such entries, majority voting can be used to exclude a single | such entries, majority voting can be used to exclude a single | |||
erroneous entry from consideration. In the case in which trunking | erroneous entry from consideration. In the case in which trunking | |||
information is provided for a replica currently being accessed, the | information is provided for a replica currently being accessed, the | |||
additional trunked addresses can be ignored while access continues on | additional trunked addresses can be ignored while access continues on | |||
the address currently being used, even if the entry corresponding to | the address currently being used, even if the entry corresponding to | |||
skipping to change at line 13955 ¶ | skipping to change at line 13956 ¶ | |||
* The server string (fls_server). For the case of the replica | * The server string (fls_server). For the case of the replica | |||
currently being accessed (via GETATTR), a zero-length string MAY | currently being accessed (via GETATTR), a zero-length string MAY | |||
be used to indicate the current address being used for the RPC | be used to indicate the current address being used for the RPC | |||
call. The fls_server field can also be an IPv4 or IPv6 address, | call. The fls_server field can also be an IPv4 or IPv6 address, | |||
formatted the same way as an IPv4 or IPv6 address in the "server" | formatted the same way as an IPv4 or IPv6 address in the "server" | |||
field of the fs_location4 data type (see Section 11.16). | field of the fs_location4 data type (see Section 11.16). | |||
With the exception of the transport-flag field (at offset | With the exception of the transport-flag field (at offset | |||
FSLI4BX_TFLAGS within the fls_info array), all of this data defined in | FSLI4BX_TFLAGS within the fls_info array), all of this data defined in | |||
this specification applies to the replica specified by the entry, | this specification applies to the replica specified by the entry, | |||
rather that the specific network path used to access it. The | rather than the specific network path used to access it. The | |||
classification of data in extensions to this data is discussed below. | classification of data in extensions to this data is discussed below. | |||
Data within the fls_info array is in the form of 8-bit data items | Data within the fls_info array is in the form of 8-bit data items | |||
with constants giving the offsets within the array of various values | with constants giving the offsets within the array of various values | |||
describing this particular file system instance. This style of | describing this particular file system instance. This style of | |||
definition was chosen, in preference to explicit XDR structure | definition was chosen, in preference to explicit XDR structure | |||
definitions for these values, for a number of reasons. | definitions for these values, for a number of reasons. | |||
* The kinds of data in the fls_info array, representing flags, file | * The kinds of data in the fls_info array, representing flags, file | |||
system classes, and priorities among sets of file systems | system classes, and priorities among sets of file systems | |||
representing the same data, are such that 8 bits provide a quite | representing the same data, are such that 8 bits provide a quite | |||
acceptable range of values. Even where there might be more than | acceptable range of values. Even where there might be more than | |||
256 such file system instances, having more than 256 distinct | 256 such file system instances, having more than 256 distinct | |||
classes or priorities is unlikely. | classes or priorities is unlikely. | |||
* Explicit definition of the various specific data items within XDR | * Explicit definition of the various specific data items within XDR | |||
would limit expandability in that any extension within would | would limit expandability in that any extension within would | |||
require yet another attribute, leading to specification and | require yet another attribute, leading to specification and | |||
implementation clumsiness. In the context of the NFSv4 extension | implementation clumsiness. In the context of the NFSv4 extension | |||
model in effect at the time fs_locations_info was designed (i.e. | model in effect at the time fs_locations_info was designed (i.e., | |||
that described in RFC5661 [65]), this would necessitate a new | that which is described in RFC 5661 [65]), this would necessitate | |||
minor version to effect any Standards Track extension to the data | a new minor version to effect any Standards Track extension to the | |||
in in fls_info. | data in fls_info. | |||
The set of fls_info data is subject to expansion in a future minor | The set of fls_info data is subject to expansion in a future minor | |||
version, or in a Standards Track RFC, within the context of a single | version or in a Standards Track RFC within the context of a single | |||
minor version. The server SHOULD NOT send and the client MUST NOT | minor version. The server SHOULD NOT send and the client MUST NOT | |||
use indices within the fls_info array or flag bits that are not | use indices within the fls_info array or flag bits that are not | |||
defined in Standards Track RFCs. | defined in Standards Track RFCs. | |||
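Access to fls_info by constant offset, together with the rule that a client does not use undefined indices, might look like the sketch below. FSLI4BX_TFLAGS is named in the text. The other index names and all numeric values are assumptions to be checked against the XDR, and returning None for an index past the end of the array is a choice of this sketch rather than something the text specifies.

```python
# Index constants (values assumed for illustration; consult the XDR).
FSLI4BX_GFLAGS = 0
FSLI4BX_TFLAGS = 1
FSLI4BX_CLSIMUL = 2

# Indices this client knows to be defined in Standards Track RFCs.
DEFINED_INDICES = frozenset({FSLI4BX_GFLAGS, FSLI4BX_TFLAGS, FSLI4BX_CLSIMUL})


def read_item(fls_info: bytes, index: int):
    """Return the 8-bit item at a defined index, or None if not supplied."""
    if index not in DEFINED_INDICES:
        # The client must not use indices it does not know to be defined.
        raise ValueError(f"index {index} is not defined for this client")
    if index >= len(fls_info):
        return None  # the server sent a shorter array
    return fls_info[index]  # indexing bytes yields an int in 0..255
```

Because each item is a single byte, no further decoding is needed once the offset is known.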
In light of the new extension model defined in RFC8178 [66] and the | In light of the new extension model defined in RFC 8178 [66] and the | |||
fact that the individual items within fls_info are not explicitly | fact that the individual items within fls_info are not explicitly | |||
referenced in the XDR, the following practices should be followed | referenced in the XDR, the following practices should be followed | |||
when extending or otherwise changing the structure of the data | when extending or otherwise changing the structure of the data | |||
returned in fls_info within the scope of a single minor version. | returned in fls_info within the scope of a single minor version: | |||
* All extensions need to be described by Standards Track documents. | * All extensions need to be described by Standards Track documents. | |||
There is no need for such documents to be marked as updating | There is no need for such documents to be marked as updating RFC | |||
RFC5661 [65] or this document. | 5661 [65] or this document. | |||
* It needs to be made clear whether the information in any added | * It needs to be made clear whether the information in any added | |||
data items applies to the replica specified by the entry or to the | data items applies to the replica specified by the entry or to the | |||
specific network paths specified in the entry. | specific network paths specified in the entry. | |||
* There needs to be a reliable way defined to determine whether the | * There needs to be a reliable way defined to determine whether the | |||
server is aware of the extension. This may be based on the length | server is aware of the extension. This may be based on the length | |||
field of the fls_info array, but it is more flexible to provide | field of the fls_info array, but it is more flexible to provide | |||
fs-scope or server-scope attributes to indicate what extensions | fs-scope or server-scope attributes to indicate what extensions | |||
are provided. | are provided. | |||
skipping to change at line 14084 ¶ | skipping to change at line 14085 ¶ | |||
reasonable. | reasonable. | |||
When this flag is seen as part of a transition into a new file | When this flag is seen as part of a transition into a new file | |||
system, a client might choose to transfer immediately to another | system, a client might choose to transfer immediately to another | |||
replica, or it may reference the current file system and only | replica, or it may reference the current file system and only | |||
transition when a migration event occurs. Similarly, when this | transition when a migration event occurs. Similarly, when this | |||
flag appears as a replica in the referral, clients would likely | flag appears as a replica in the referral, clients would likely | |||
avoid being referred to this instance whenever there is another | avoid being referred to this instance whenever there is another | |||
choice. | choice. | |||
This flag, like the other items within fls_info applies to the | This flag, like the other items within fls_info, applies to the | |||
replica, rather than to a particular path to that replica. When | replica rather than to a particular path to that replica. When it | |||
it appears, a transition to a new replica rather than to a | appears, a transition to a new replica, rather than to a different | |||
different path to the same replica, is indicated. | path to the same replica, is indicated. | |||
* FSLI4GF_SPLIT indicates that when a transition occurs from the | * FSLI4GF_SPLIT indicates that when a transition occurs from the | |||
current file system instance to this one, the replacement may | current file system instance to this one, the replacement may | |||
consist of multiple file systems. In this case, the client has to | consist of multiple file systems. In this case, the client has to | |||
be prepared for the possibility that objects on the same file | be prepared for the possibility that objects on the same file | |||
system before migration will be on different ones after. Note | system before migration will be on different ones after. Note | |||
that FSLI4GF_SPLIT is not incompatible with the file systems | that FSLI4GF_SPLIT is not incompatible with the file systems | |||
belonging to the same fileid class since, if one has a set of | belonging to the same fileid class since, if one has a set of | |||
fileids that are unique within a file system, each subset assigned | fileids that are unique within a file system, each subset assigned | |||
to a smaller file system after migration would not have any | to a smaller file system after migration would not have any | |||
skipping to change at line 14133 ¶ | skipping to change at line 14134 ¶ | |||
the server to determine when the need for emulating two file | the server to determine when the need for emulating two file | |||
systems as one is over. | systems as one is over. | |||
Although it is possible for this flag to be present in the event | Although it is possible for this flag to be present in the event | |||
of referral, it would generally be of little interest to the | of referral, it would generally be of little interest to the | |||
client, since the client is not expected to have information | client, since the client is not expected to have information | |||
regarding the current contents of the absent file system. | regarding the current contents of the absent file system. | |||
The transport-flag field (at byte index FSLI4BX_TFLAGS) contains the | The transport-flag field (at byte index FSLI4BX_TFLAGS) contains the | |||
following bits related to the transport capabilities of the specific | following bits related to the transport capabilities of the specific | |||
network path(s) specified by the entry. | network path(s) specified by the entry: | |||
* FSLI4TF_RDMA indicates that any specified network paths provide | * FSLI4TF_RDMA indicates that any specified network paths provide | |||
NFSv4.1 clients access using an RDMA-capable transport. | NFSv4.1 clients access using an RDMA-capable transport. | |||
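Since the transport-flag byte describes network paths rather than replicas, a client might use it to order the paths to a single replica. The following is a minimal sketch of that idea. The numeric values of FSLI4BX_TFLAGS and FSLI4TF_RDMA are assumed, and the preference for RDMA-capable paths is a hypothetical client policy, not a requirement.

```python
FSLI4BX_TFLAGS = 1    # byte index of the transport-flag field (value assumed)
FSLI4TF_RDMA = 0x01   # bit within that field (value assumed)


def has_rdma(fls_info: bytes) -> bool:
    """True if the entry's transport-flag byte is present with the RDMA bit set."""
    if len(fls_info) <= FSLI4BX_TFLAGS:
        return False
    return bool(fls_info[FSLI4BX_TFLAGS] & FSLI4TF_RDMA)


def order_paths(entries):
    """Order (server, fls_info) pairs with RDMA-capable paths first.

    sorted() is stable, so the server-supplied order is otherwise preserved.
    """
    return sorted(entries, key=lambda entry: not has_rdma(entry[1]))
```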
Attribute continuity and file system identity information are | Attribute continuity and file system identity information are | |||
expressed by defining equivalence relations on the sets of file | expressed by defining equivalence relations on the sets of file | |||
systems presented to the client. Each such relation is expressed as | systems presented to the client. Each such relation is expressed as | |||
a set of file system equivalence classes. For each relation, a file | a set of file system equivalence classes. For each relation, a file | |||
system has an 8-bit class number. Two file systems belong to the | system has an 8-bit class number. Two file systems belong to the | |||
same class if both have identical non-zero class numbers. Zero is | same class if both have identical non-zero class numbers. Zero is | |||
skipping to change at line 14864 ¶ | skipping to change at line 14865 ¶ | |||
Via a notification mechanism (see Section 20.12), device ID to device | Via a notification mechanism (see Section 20.12), device ID to device | |||
address mappings can change over the duration of server operation | address mappings can change over the duration of server operation | |||
without recalling or revoking the layouts that refer to device ID. | without recalling or revoking the layouts that refer to device ID. | |||
The notification mechanism can also delete a device ID, but only if | The notification mechanism can also delete a device ID, but only if | |||
the client has no layouts referring to the device ID. A notification | the client has no layouts referring to the device ID. A notification | |||
of a change to a device ID to device address mapping will immediately | of a change to a device ID to device address mapping will immediately | |||
or eventually invalidate some or all of the device ID's mappings. | or eventually invalidate some or all of the device ID's mappings. | |||
The server MUST support notifications and the client must request | The server MUST support notifications and the client must request | |||
them before they can be used. For further information about the | them before they can be used. For further information about the | |||
notification types Section 20.12. | notification types, see Section 20.12. | |||
12.3. pNFS Operations | 12.3. pNFS Operations | |||
NFSv4.1 has several operations that are needed for pNFS servers, | NFSv4.1 has several operations that are needed for pNFS servers, | |||
regardless of layout type or storage protocol. These operations are | regardless of layout type or storage protocol. These operations are | |||
all sent to a metadata server and summarized here. While pNFS is an | all sent to a metadata server and summarized here. While pNFS is an | |||
OPTIONAL feature, if pNFS is implemented, some operations are | OPTIONAL feature, if pNFS is implemented, some operations are | |||
REQUIRED in order to comply with pNFS. See Section 17. | REQUIRED in order to comply with pNFS. See Section 17. | |||
These are the fore channel pNFS operations: | These are the fore channel pNFS operations: | |||
skipping to change at line 17829 ¶ | skipping to change at line 17830 ¶ | |||
For any of a number of reasons, the replier could not process this | For any of a number of reasons, the replier could not process this | |||
operation in what was deemed a reasonable time. The client should | operation in what was deemed a reasonable time. The client should | |||
wait and then try the request with a new slot and sequence value. | wait and then try the request with a new slot and sequence value. | |||
Some examples of scenarios that might lead to this situation: | Some examples of scenarios that might lead to this situation: | |||
* A server that supports hierarchical storage receives a request to | * A server that supports hierarchical storage receives a request to | |||
process a file that had been migrated. | process a file that had been migrated. | |||
* An operation requires a delegation recall to proceed, so that the | * An operation requires a delegation recall to proceed, but the need | |||
need to wait for this delegation to be recalled and returned makes | to wait for this delegation to be recalled and returned makes | |||
processing this request in a timely fashion impossible. | processing this request in a timely fashion impossible. | |||
* A request is being performed on a session being migrated from | * A request is being performed on a session being migrated from | |||
another server as described in Section 11.14.3, and the lack of | another server as described in Section 11.14.3, and the lack of | |||
full information about the state of the session on the source | full information about the state of the session on the source | |||
makes it impossible to process the request immediately. | makes it impossible to process the request immediately. | |||
In such cases, returning the error NFS4ERR_DELAY allows necessary | In such cases, returning the error NFS4ERR_DELAY allows necessary | |||
preparatory operations to proceed without holding up requester | preparatory operations to proceed without holding up requester | |||
resources such as a session slot. After delaying for a period of time, | resources such as a session slot. After delaying for a period of time, | |||
skipping to change at line 17861 ¶ | skipping to change at line 17862 ¶ | |||
is retried in full with the SEQUENCE operation containing the same | is retried in full with the SEQUENCE operation containing the same | |||
slot and sequence values. In this case, the replier MUST avoid | slot and sequence values. In this case, the replier MUST avoid | |||
returning a response containing NFS4ERR_DELAY as the response to | returning a response containing NFS4ERR_DELAY as the response to | |||
SEQUENCE solely on the basis of its presence in the replay cache. | SEQUENCE solely on the basis of its presence in the replay cache. | |||
If the replier did this, the retries would not be effective as | If the replier did this, the retries would not be effective as | |||
there would be no opportunity for the replier to see whether the | there would be no opportunity for the replier to see whether the | |||
condition that generated the NFS4ERR_DELAY had been rectified | condition that generated the NFS4ERR_DELAY had been rectified | |||
during the interim between the original request and the retry. | during the interim between the original request and the retry. | |||
* If NFS4ERR_DELAY is returned on an operation other than SEQUENCE | * If NFS4ERR_DELAY is returned on an operation other than SEQUENCE | |||
which validly appears as the first operation of a request, | that validly appears as the first operation of a request, the | |||
handling is similar. The request can be retried in full without | handling is similar. The request can be retried in full without | |||
modification. In this case as well, the replier MUST avoid | modification. In this case as well, the replier MUST avoid | |||
returning a response containing NFS4ERR_DELAY as the response to | returning a response containing NFS4ERR_DELAY as the response to | |||
an initial operation of a request solely on the basis of its | an initial operation of a request solely on the basis of its | |||
presence in the replay cache. If the replier did this, the | presence in the replay cache. If the replier did this, the | |||
retries would not be effective as there would be no opportunity | retries would not be effective as there would be no opportunity | |||
for the replier to see whether the condition that generated the | for the replier to see whether the condition that generated the | |||
NFS4ERR_DELAY had been rectified during the interim between the | NFS4ERR_DELAY had been rectified during the interim between the | |||
original request and the retry. | original request and the retry. | |||
* If NFS4ERR_DELAY is returned on an operation other than the first | * If NFS4ERR_DELAY is returned on an operation other than the first | |||
in the request, the request when retried MUST contain a SEQUENCE | in the request, the request when retried MUST contain a SEQUENCE | |||
operation which is different than the original one, with either | operation that is different than the original one, with either the | |||
the bin id or the sequence value different from that in the | bin id or the sequence value different from that in the original | |||
original request. Because requesters do this, there is no need | request. Because requesters do this, there is no need for the | |||
for the replier to take special care to avoid returning an | replier to take special care to avoid returning an NFS4ERR_DELAY | |||
NFS4ERR_DELAY error, obtained from the replay cache. When no non- | error obtained from the replay cache. When no non-idempotent | |||
idempotent operations have been processed before the NFS4ERR_DELAY | operations have been processed before the NFS4ERR_DELAY was | |||
was returned, the requester should retry the request in full, with | returned, the requester should retry the request in full, with the | |||
the only difference from the original request being the | only difference from the original request being the modification | |||
modification to the slot id or sequence value in the reissued | to the slot ID or sequence value in the reissued SEQUENCE | |||
SEQUENCE operation. | operation. | |||
* When NFS4ERR_DELAY is returned on an operation other than the | * When NFS4ERR_DELAY is returned on an operation other than the | |||
first within a request and there has been a non-idempotent | first within a request and there has been a non-idempotent | |||
operation processed before the NFS4ERR_DELAY was returned, | operation processed before the NFS4ERR_DELAY was returned, | |||
reissuing the request as is normally done would incorrectly cause | reissuing the request as is normally done would incorrectly cause | |||
the re-execution of the non-idempotent operation. | the re-execution of the non-idempotent operation. | |||
To avoid this situation, the client should reissue the request | To avoid this situation, the client should reissue the request | |||
without the non-idempotent operation. The request still must use | without the non-idempotent operation. The request still must use | |||
a SEQUENCE operation with either a different slot id or sequence | a SEQUENCE operation with either a different slot ID or sequence | |||
value from the SEQUENCE in the original request. Because this is | value from the SEQUENCE in the original request. Because this is | |||
done, there is no way the replier could avoid spuriously re- | done, there is no way the replier could avoid spuriously re- | |||
executing the non-idempotent operation since the different | executing the non-idempotent operation since the different | |||
SEQUENCE parameters prevent the requester from recognizing that | SEQUENCE parameters prevent the requester from recognizing that | |||
the non-idempotent operation is being retried. | the non-idempotent operation is being retried. | |||
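The retry cases in the list above can be sketched from the requester's side. This is a minimal illustration, not part of the protocol definition: the Op type, its idempotent field, and plan_delay_retry are hypothetical names, and the sketch assumes the requester already knows which operations in the COMPOUND are non-idempotent.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical model of one operation in a COMPOUND request."""
    name: str
    idempotent: bool = True


def plan_delay_retry(ops: List[Op], failed_index: int) -> Tuple[List[Op], bool]:
    """Decide how to reissue a request that received NFS4ERR_DELAY.

    Returns (operations to resend, reuse_sequence_values).
    reuse_sequence_values is True when the retry keeps the original
    slot and sequence values, and False when the SEQUENCE operation
    must use a different slot ID or sequence value.
    """
    if failed_index == 0:
        # NFS4ERR_DELAY on SEQUENCE, or on another operation that validly
        # appears first: retry the request in full without modification.
        return list(ops), True

    # NFS4ERR_DELAY on a later operation: operations before failed_index
    # were already processed, so non-idempotent ones among them are
    # dropped to avoid re-execution. Everything else is resent.
    resend = [
        op for i, op in enumerate(ops)
        if i >= failed_index or op.idempotent
    ]
    return resend, False
```

For example, if a request of SEQUENCE, PUTFH, REMOVE, GETATTR receives NFS4ERR_DELAY on GETATTR, the sketch resends SEQUENCE, PUTFH, GETATTR and signals that the SEQUENCE parameters must change.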
Note that without the ability to return NFS4ERR_DELAY and the | Note that without the ability to return NFS4ERR_DELAY and the | |||
requester's willingness to re-send when receiving it, deadlock might | requester's willingness to re-send when receiving it, deadlock might | |||
result. For example, if a recall is done, and if the delegation | result. For example, if a recall is done, and if the delegation | |||
return or operations preparatory to delegation return are held up by | return or operations preparatory to delegation return are held up by | |||
skipping to change at line 17947 ¶ | skipping to change at line 17948 ¶ | |||
in which the filehandle is a valid filehandle in general but is not | in which the filehandle is a valid filehandle in general but is not | |||
of the appropriate object type for the current operation. | of the appropriate object type for the current operation. | |||
Where the error description indicates a problem with the current or | Where the error description indicates a problem with the current or | |||
saved filehandle, it is to be understood that filehandles are only | saved filehandle, it is to be understood that filehandles are only | |||
checked for the condition if they are implicit arguments of the | checked for the condition if they are implicit arguments of the | |||
operation in question. | operation in question. | |||
15.1.2.1. NFS4ERR_BADHANDLE (Error Code 10001) | 15.1.2.1. NFS4ERR_BADHANDLE (Error Code 10001) | |||
Illegal NFS filehandle for the current server. The current file | Illegal NFS filehandle for the current server. The current | |||
handle failed internal consistency checks. Once accepted as valid | filehandle failed internal consistency checks. Once accepted as | |||
(by PUTFH), no subsequent status change can cause the filehandle to | valid (by PUTFH), no subsequent status change can cause the | |||
generate this error. | filehandle to generate this error. | |||
15.1.2.2. NFS4ERR_FHEXPIRED (Error Code 10014) | 15.1.2.2. NFS4ERR_FHEXPIRED (Error Code 10014) | |||
A current or saved filehandle that is an argument to the current | A current or saved filehandle that is an argument to the current | |||
operation is volatile and has expired at the server. | operation is volatile and has expired at the server. | |||
15.1.2.3. NFS4ERR_ISDIR (Error Code 21) | 15.1.2.3. NFS4ERR_ISDIR (Error Code 21) | |||
The current or saved filehandle designates a directory when the | The current or saved filehandle designates a directory when the | |||
current operation does not allow a directory to be accepted as the | current operation does not allow a directory to be accepted as the | |||
target of this operation. | target of this operation. | |||
15.1.2.4. NFS4ERR_MOVED (Error Code 10019) | 15.1.2.4. NFS4ERR_MOVED (Error Code 10019) | |||
The file system that contains the current filehandle object is not | The file system that contains the current filehandle object is not | |||
present at the server, or is not accessible using the network address | present at the server or is not accessible with the network address | |||
used. It may have been made accessible on a different set of network | used. It may have been made accessible on a different set of network | |||
addresses, relocated or migrated to another server, or it may have | addresses, relocated or migrated to another server, or it may have | |||
never been present. The client may obtain the new file system | never been present. The client may obtain the new file system | |||
location by obtaining the "fs_locations" or "fs_locations_info" | location by obtaining the fs_locations or fs_locations_info attribute | |||
attribute for the current filehandle. For further discussion, refer | for the current filehandle. For further discussion, refer to | |||
to Section 11.3. | Section 11.3. | |||
As with the case of NFS4ERR_DELAY, it is possible that one or more | As with the case of NFS4ERR_DELAY, it is possible that one or more | |||
non-idempotent operations may have been successfully executed within | non-idempotent operations may have been successfully executed within | |||
a COMPOUND before NFS4ERR_MOVED is returned. Because of this, once | a COMPOUND before NFS4ERR_MOVED is returned. Because of this, once | |||
the new location is determined, the original request which received | the new location is determined, the original request that received | |||
the NFS4ERR_MOVED should not be re-executed in full. Instead, the | the NFS4ERR_MOVED should not be re-executed in full. Instead, the | |||
client should send a new COMPOUND, with any successfully executed | client should send a new COMPOUND with any successfully executed non- | |||
non-idempotent operations removed. When the client uses the same | idempotent operations removed. When the client uses the same session | |||
session for the new COMPOUND, its SEQUENCE operation should use a | for the new COMPOUND, its SEQUENCE operation should use a different | |||
different slot id or sequence. | slot ID or sequence. | |||
15.1.2.5. NFS4ERR_NOFILEHANDLE (Error Code 10020) | 15.1.2.5. NFS4ERR_NOFILEHANDLE (Error Code 10020) | |||
The logical current or saved filehandle value is required by the | The logical current or saved filehandle value is required by the | |||
current operation and is not set. This may be a result of a | current operation and is not set. This may be a result of a | |||
malformed COMPOUND operation (i.e., no PUTFH or PUTROOTFH before an | malformed COMPOUND operation (i.e., no PUTFH or PUTROOTFH before an | |||
operation that requires the current filehandle be set). | operation that requires the current filehandle be set). | |||
15.1.2.6. NFS4ERR_NOTDIR (Error Code 20) | 15.1.2.6. NFS4ERR_NOTDIR (Error Code 20) | |||
skipping to change at line 18363 ¶ | skipping to change at line 18364 ¶ | |||
specifying the same scope, whether that scope is global or for the | specifying the same scope, whether that scope is global or for the | |||
same file system in the case of a per-fs RECLAIM_COMPLETE. An | same file system in the case of a per-fs RECLAIM_COMPLETE. An | |||
additional RECLAIM_COMPLETE operation is not necessary and results in | additional RECLAIM_COMPLETE operation is not necessary and results in | |||
this error. | this error. | |||
15.1.9.2. NFS4ERR_GRACE (Error Code 10013) | 15.1.9.2. NFS4ERR_GRACE (Error Code 10013) | |||
This error is returned when the server is in its grace period with | This error is returned when the server is in its grace period with | |||
regard to the file system object for which the lock was requested. | regard to the file system object for which the lock was requested. | |||
In this situation, a non-reclaim locking request cannot be granted. | In this situation, a non-reclaim locking request cannot be granted. | |||
This can occur because either | This can occur because either: | |||
* The server does not have sufficient information about locks that | * The server does not have sufficient information about locks that | |||
might be potentially reclaimed to determine whether the lock could | might be potentially reclaimed to determine whether the lock could | |||
be granted. | be granted. | |||
* The request is made by a client responsible for reclaiming its | * The request is made by a client responsible for reclaiming its | |||
locks that has not yet done the appropriate RECLAIM_COMPLETE | locks that has not yet done the appropriate RECLAIM_COMPLETE | |||
operation, allowing it to proceed to obtain new locks. | operation, allowing it to proceed to obtain new locks. | |||
In the case of a per-fs grace period, there may be clients, (i.e., | In the case of a per-fs grace period, there may be clients (i.e., | |||
those currently using the destination file system) who might be | those currently using the destination file system) who might be | |||
unaware of the circumstances resulting in the initiation of the grace | unaware of the circumstances resulting in the initiation of the grace | |||
period. Such clients need to periodically retry the request until | period. Such clients need to periodically retry the request until | |||
the grace period is over, just as other clients do. | the grace period is over, just as other clients do. | |||
15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033) | 15.1.9.3. NFS4ERR_NO_GRACE (Error Code 10033) | |||
A reclaim of client state was attempted in circumstances in which the | A reclaim of client state was attempted in circumstances in which the | |||
server cannot guarantee that conflicting state has not been provided | server cannot guarantee that conflicting state has not been provided | |||
to another client. This occurs in any of the following situations. | to another client. This occurs in any of the following situations: | |||
* There is no active grace period applying to the file system object | * There is no active grace period applying to the file system object | |||
for which the request was made. | for which the request was made. | |||
* The client making the request has no current role in reclaiming | * The client making the request has no current role in reclaiming | |||
locks. | locks. | |||
* Previous operations have created a situation in which the server | * Previous operations have created a situation in which the server | |||
is not able to determine that a reclaim-interfering edge condition | is not able to determine that a reclaim-interfering edge condition | |||
does not exist. | does not exist. | |||
15.1.9.4. NFS4ERR_RECLAIM_BAD (Error Code 10034) | 15.1.9.4. NFS4ERR_RECLAIM_BAD (Error Code 10034) | |||
The server has determined that a reclaim attempted by the client is | The server has determined that a reclaim attempted by the client is | |||
not valid, i.e. the lock specified as being reclaimed could not | not valid, i.e., the lock specified as being reclaimed could not | |||
possibly have existed before the server restart or file system | possibly have existed before the server restart or file system | |||
migration event. A server is not obliged to make this determination | migration event. A server is not obliged to make this determination | |||
and will typically rely on the client to only reclaim locks that the | and will typically rely on the client to only reclaim locks that the | |||
client was granted prior to restart. However, when a server does | client was granted prior to restart. However, when a server does | |||
have reliable information to enable it to make this determination, | have reliable information to enable it to make this determination, | |||
this error indicates that the reclaim has been rejected as invalid. | this error indicates that the reclaim has been rejected as invalid. | |||
This is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see | This is as opposed to the error NFS4ERR_RECLAIM_CONFLICT (see | |||
Section 15.1.9.5) where the server can only determine that there has | Section 15.1.9.5) where the server can only determine that there has | |||
been an invalid reclaim, but cannot determine which request is | been an invalid reclaim, but cannot determine which request is | |||
invalid. | invalid. | |||
skipping to change at line 21654 ¶ | skipping to change at line 21655 ¶ | |||
18.4.3. DESCRIPTION | 18.4.3. DESCRIPTION | |||
The CREATE operation creates a file object other than an ordinary | The CREATE operation creates a file object other than an ordinary | |||
file in a directory with a given name. The OPEN operation MUST be | file in a directory with a given name. The OPEN operation MUST be | |||
used to create a regular file or a named attribute. | used to create a regular file or a named attribute. | |||
The current filehandle must be a directory: an object of type NF4DIR. | The current filehandle must be a directory: an object of type NF4DIR. | |||
If the current filehandle is an attribute directory (type | If the current filehandle is an attribute directory (type | |||
NF4ATTRDIR), the error NFS4ERR_WRONG_TYPE is returned. If the | NF4ATTRDIR), the error NFS4ERR_WRONG_TYPE is returned. If the | |||
current file handle designates any other type of object, the error | current filehandle designates any other type of object, the error | |||
NFS4ERR_NOTDIR results. | NFS4ERR_NOTDIR results. | |||
The objname specifies the name for the new object. The objtype | The objname specifies the name for the new object. The objtype | |||
determines the type of object to be created: directory, symlink, etc. | determines the type of object to be created: directory, symlink, etc. | |||
If the object type specified is that of an ordinary file, a named | If the object type specified is that of an ordinary file, a named | |||
attribute, or a named attribute directory, the error NFS4ERR_BADTYPE | attribute, or a named attribute directory, the error NFS4ERR_BADTYPE | |||
results. | results. | |||
If an object of the same name already exists in the directory, the | If an object of the same name already exists in the directory, the | |||
server will return the error NFS4ERR_EXIST. | server will return the error NFS4ERR_EXIST. | |||
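The error cases described above for CREATE can be summarized as a small decision function. This is only a sketch: create_precheck is a hypothetical helper, object types are modeled as plain strings, and the order in which the checks are applied is an assumption, since the text does not state a precedence among these errors.

```python
# Object types that CREATE refuses to create, per the description above.
# OPEN must be used for regular files and named attributes.
_BAD_CREATE_TYPES = {"NF4REG", "NF4NAMEDATTR", "NF4ATTRDIR"}


def create_precheck(current_fh_type: str, objtype: str, name_exists: bool) -> str:
    """Return the status CREATE would yield for these inputs (sketch).

    The check order below is an assumption made for illustration.
    """
    if current_fh_type == "NF4ATTRDIR":
        # Current filehandle is an attribute directory.
        return "NFS4ERR_WRONG_TYPE"
    if current_fh_type != "NF4DIR":
        # Current filehandle is any other non-directory object.
        return "NFS4ERR_NOTDIR"
    if objtype in _BAD_CREATE_TYPES:
        # Ordinary file, named attribute, or named attribute directory.
        return "NFS4ERR_BADTYPE"
    if name_exists:
        # An object of the same name is already in the directory.
        return "NFS4ERR_EXIST"
    return "NFS4_OK"
```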
skipping to change at line 22750 ¶ | skipping to change at line 22751 ¶ | |||
* to file. Ordinary OPEN of the | * to file. Ordinary OPEN of the | |||
* specified file by current filehandle. | * specified file by current filehandle. | |||
*/ | */ | |||
case CLAIM_FH: /* new to v4.1 */ | case CLAIM_FH: /* new to v4.1 */ | |||
/* CURRENT_FH: regular file to open */ | /* CURRENT_FH: regular file to open */ | |||
void; | void; | |||
/* | /* | |||
* Like CLAIM_DELEGATE_PREV. Right to file based on a | * Like CLAIM_DELEGATE_PREV. Right to file based on a | |||
* delegation granted to a previous boot | * delegation granted to a previous boot | |||
* instance of the client. File is identified by | * instance of the client. File is identified | |||
* by filehandle. | * by filehandle. | |||
*/ | */ | |||
case CLAIM_DELEG_PREV_FH: /* new to v4.1 */ | case CLAIM_DELEG_PREV_FH: /* new to v4.1 */ | |||
/* CURRENT_FH: file being opened */ | /* CURRENT_FH: file being opened */ | |||
void; | void; | |||
/* | /* | |||
* Like CLAIM_DELEGATE_CUR. Right to file based on | * Like CLAIM_DELEGATE_CUR. Right to file based on | |||
* a delegation granted by the server. | * a delegation granted by the server. | |||
* File is identified by filehandle. | * File is identified by filehandle. | |||
skipping to change at line 22979 ¶ | skipping to change at line 22980 ¶ | |||
| | | and | or EXCLUSIVE4 (SHOULD | | | | | and | or EXCLUSIVE4 (SHOULD | | |||
| | | EXCLUSIVE4 | NOT) | | | | | EXCLUSIVE4 | NOT) | | |||
+-------------+----------+--------------+-----------------------+ | +-------------+----------+--------------+-----------------------+ | |||
| no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | | no | yes | EXCLUSIVE4_1 | EXCLUSIVE4_1 | | |||
+-------------+----------+--------------+-----------------------+ | +-------------+----------+--------------+-----------------------+ | |||
| yes | no | GUARDED4 | GUARDED4 | | | yes | no | GUARDED4 | GUARDED4 | | |||
+-------------+----------+--------------+-----------------------+ | +-------------+----------+--------------+-----------------------+ | |||
| yes | yes | GUARDED4 | GUARDED4 | | | yes | yes | GUARDED4 | GUARDED4 | | |||
+-------------+----------+--------------+-----------------------+ | +-------------+----------+--------------+-----------------------+ | |||
Table 18: Required methods for exclusive create | Table 18: Required Methods for Exclusive Create | |||
If CREATE_SESSION4_FLAG_PERSIST is set in the results of | If CREATE_SESSION4_FLAG_PERSIST is set in the results of | |||
CREATE_SESSION, the reply cache is persistent (see Section 18.36). | CREATE_SESSION, the reply cache is persistent (see Section 18.36). | |||
If the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from | If the EXCHGID4_FLAG_USE_PNFS_MDS flag is set in the results from | |||
EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the | EXCHANGE_ID, the server is a pNFS server (see Section 18.35). If the | |||
client attempts to use EXCLUSIVE4 on a persistent session, or a | client attempts to use EXCLUSIVE4 on a persistent session, or a | |||
session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the | session derived from an EXCHGID4_FLAG_USE_PNFS_MDS client ID, the | |||
server MUST return NFS4ERR_INVAL. | server MUST return NFS4ERR_INVAL. | |||
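The rows of Table 18 that are visible in this excerpt, together with the NFS4ERR_INVAL rule in the paragraph above, can be sketched as a lookup. The names REQUIRED_METHOD and check_exclusive4 are hypothetical. The row for a server with neither a persistent reply cache nor pNFS support is cut off in this excerpt, so it is deliberately left out of the lookup.

```python
from typing import Dict, Tuple

# Keys are (persistent_reply_cache, server_supports_pnfs).
# Only the rows fully visible in this excerpt are encoded.
REQUIRED_METHOD: Dict[Tuple[bool, bool], str] = {
    (False, True): "EXCLUSIVE4_1",
    (True, False): "GUARDED4",
    (True, True): "GUARDED4",
}


def check_exclusive4(persistent_session: bool, pnfs_mds_client_id: bool) -> str:
    """Status for an OPEN that uses EXCLUSIVE4 (sketch).

    persistent_session: CREATE_SESSION4_FLAG_PERSIST was set in the
        CREATE_SESSION results.
    pnfs_mds_client_id: EXCHGID4_FLAG_USE_PNFS_MDS was set in the
        EXCHANGE_ID results for the session's client ID.
    """
    if persistent_session or pnfs_mds_client_id:
        return "NFS4ERR_INVAL"
    # Other checks performed by a real server are outside this sketch.
    return "NFS4_OK"
```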
With persistent sessions, exclusive create semantics are fully | With persistent sessions, exclusive create semantics are fully | |||
skipping to change at line 23564 ¶ | skipping to change at line 23565 ¶ | |||
the object. If none exist, then NFS4ERR_NOENT will be returned. If | the object. If none exist, then NFS4ERR_NOENT will be returned. If | |||
createdir has a value of TRUE and no named attribute directory | createdir has a value of TRUE and no named attribute directory | |||
exists, one is created and its filehandle becomes the current | exists, one is created and its filehandle becomes the current | |||
filehandle. On the other hand, if createdir has a value of TRUE and | filehandle. On the other hand, if createdir has a value of TRUE and | |||
the named attribute directory already exists, no error results and | the named attribute directory already exists, no error results and | |||
the filehandle of the existing directory becomes the current | the filehandle of the existing directory becomes the current | |||
filehandle. The creation of a named attribute directory assumes that | filehandle. The creation of a named attribute directory assumes that | |||
the server has implemented named attribute support in this fashion | the server has implemented named attribute support in this fashion | |||
and is not required to do so by this definition. | and is not required to do so by this definition. | |||
If the current file handle designates an object of type NF4NAMEDATTR | If the current filehandle designates an object of type NF4NAMEDATTR | |||
(a named attribute) or NF4ATTRDIR (a named attribute directory), an | (a named attribute) or NF4ATTRDIR (a named attribute directory), an | |||
error of NFS4ERR_WRONG_TYPE is returned to the client. Named | error of NFS4ERR_WRONG_TYPE is returned to the client. Named | |||
attributes or a named attribute directory MUST NOT have their own | attributes or a named attribute directory MUST NOT have their own | |||
named attributes. | named attributes. | |||
18.17.4. IMPLEMENTATION | 18.17.4. IMPLEMENTATION | |||
If the server does not support named attributes for the current | If the server does not support named attributes for the current | |||
filehandle, an error of NFS4ERR_NOTSUPP will be returned to the | filehandle, an error of NFS4ERR_NOTSUPP will be returned to the | |||
client. | client. | |||
skipping to change at line 24927 ¶ | skipping to change at line 24928 ¶ | |||
+============+===================================+ | +============+===================================+ | |||
| stable | committed | | | stable | committed | | |||
+============+===================================+ | +============+===================================+ | |||
| UNSTABLE4 | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 | | | UNSTABLE4 | FILE_SYNC4, DATA_SYNC4, UNSTABLE4 | | |||
+------------+-----------------------------------+ | +------------+-----------------------------------+ | |||
| DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4 | | | DATA_SYNC4 | FILE_SYNC4, DATA_SYNC4 | | |||
+------------+-----------------------------------+ | +------------+-----------------------------------+ | |||
| FILE_SYNC4 | FILE_SYNC4 | | | FILE_SYNC4 | FILE_SYNC4 | | |||
+------------+-----------------------------------+ | +------------+-----------------------------------+ | |||
Table 20: Valid combinations of the fields | Table 20: Valid Combinations of the Fields | |||
stable in the request and committed in the | Stable in the Request and Committed in the | |||
reply | Reply | |||
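Table 20 amounts to the rule that the committed level in the reply is never weaker than the stable level in the request. A minimal sketch of that check follows. The class name StableHow and the function committed_is_valid are hypothetical; the numeric values follow the stable_how4 enumeration, in which UNSTABLE4 is 0, DATA_SYNC4 is 1, and FILE_SYNC4 is 2.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """Mirrors the stable_how4 values, ordered weakest to strongest."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def committed_is_valid(stable: StableHow, committed: StableHow) -> bool:
    """True if the (stable, committed) pair appears in Table 20."""
    # The server may commit more strongly than requested, never less.
    return committed >= stable
```

A client could use such a check to reject a WRITE reply whose committed value is weaker than what it asked for.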
The final portion of the result is the field writeverf. This field | The final portion of the result is the field writeverf. This field | |||
is the write verifier and is a cookie that the client can use to | is the write verifier and is a cookie that the client can use to | |||
determine whether a server has changed instance state (e.g., server | determine whether a server has changed instance state (e.g., server | |||
restart) between a call to WRITE and a subsequent call to either | restart) between a call to WRITE and a subsequent call to either | |||
WRITE or COMMIT. This cookie MUST be unchanged during a single | WRITE or COMMIT. This cookie MUST be unchanged during a single | |||
instance of the NFSv4.1 server and MUST be unique between instances | instance of the NFSv4.1 server and MUST be unique between instances | |||
of the NFSv4.1 server. If the cookie changes, then the client MUST | of the NFSv4.1 server. If the cookie changes, then the client MUST | |||
assume that any data written with an UNSTABLE4 value for committed | assume that any data written with an UNSTABLE4 value for committed | |||
and an old writeverf in the reply has been lost and will need to be | and an old writeverf in the reply has been lost and will need to be | |||
skipping to change at line 25255 ¶ | skipping to change at line 25256 ¶ | |||
* The attempted BIND_CONN_TO_SESSION with the old SSV should | * The attempted BIND_CONN_TO_SESSION with the old SSV should | |||
succeed. If so, the client re-sends the original SET_SSV. If the | succeed. If so, the client re-sends the original SET_SSV. If the | |||
original SET_SSV was not executed, then the server executes it. | original SET_SSV was not executed, then the server executes it. | |||
If the original SET_SSV was executed but failed, the server will | If the original SET_SSV was executed but failed, the server will | |||
return the SET_SSV from the reply cache. | return the SET_SSV from the reply cache. | |||
18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID | 18.35. Operation 42: EXCHANGE_ID - Instantiate Client ID | |||
The EXCHANGE_ID operation exchanges long-hand client and server | The EXCHANGE_ID operation exchanges long-hand client and server | |||
identifiers (owners), and provides access to a client ID, creating | identifiers (owners) and provides access to a client ID, creating one | |||
one if necessary. This client ID becomes associated with the | if necessary. This client ID becomes associated with the connection | |||
connection on which the operation is done, so that it is available | on which the operation is done, so that it is available when a | |||
when a CREATE_SESSION is done or when the connection is used to issue | CREATE_SESSION is done or when the connection is used to issue a | |||
a request on an existing session associated with the current client. | request on an existing session associated with the current client. | |||
18.35.1. ARGUMENT | 18.35.1. ARGUMENT | |||
const EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001; | const EXCHGID4_FLAG_SUPP_MOVED_REFER = 0x00000001; | |||
const EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002; | const EXCHGID4_FLAG_SUPP_MOVED_MIGR = 0x00000002; | |||
const EXCHGID4_FLAG_BIND_PRINC_STATEID = 0x00000100; | const EXCHGID4_FLAG_BIND_PRINC_STATEID = 0x00000100; | |||
const EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000; | const EXCHGID4_FLAG_USE_NON_PNFS = 0x00010000; | |||
const EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000; | const EXCHGID4_FLAG_USE_PNFS_MDS = 0x00020000; | |||
skipping to change at line 25354 ¶ | skipping to change at line 25355 ¶ | |||
EXCHANGE_ID4resok eir_resok4; | EXCHANGE_ID4resok eir_resok4; | |||
default: | default: | |||
void; | void; | |||
}; | }; | |||
18.35.3. DESCRIPTION | 18.35.3. DESCRIPTION | |||
The client uses the EXCHANGE_ID operation to register a particular | The client uses the EXCHANGE_ID operation to register a particular | |||
client_owner with the server. However, when the client_owner has | client_owner with the server. However, when the client_owner has | |||
already been registered by other means (e.g. Transparent State | already been registered by other means (e.g., Transparent State | |||
Migration), the client may still use EXCHANGE_ID to obtain the client | Migration), the client may still use EXCHANGE_ID to obtain the client | |||
ID assigned previously. | ID assigned previously. | |||
The client ID returned from this operation will be associated with | The client ID returned from this operation will be associated with | |||
the connection on which the EXCHANGE_ID is received and will serve as | the connection on which the EXCHANGE_ID is received and will serve as | |||
a parent object for sessions created by the client on this connection | a parent object for sessions created by the client on this connection | |||
or to which the connection is bound. As a result of using those | or to which the connection is bound. As a result of using those | |||
sessions to make requests involving the creation of state, that state | sessions to make requests involving the creation of state, that state | |||
will become associated with the client ID returned. | will become associated with the client ID returned. | |||
skipping to change at line 25377 ¶ | skipping to change at line 25378 ¶ | |||
returned eir_sequenceid, in creating an associated session using | returned eir_sequenceid, in creating an associated session using | |||
CREATE_SESSION. | CREATE_SESSION. | |||
If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result, | If the flag EXCHGID4_FLAG_CONFIRMED_R is set in the result, | |||
eir_flags, then it is an indication that the registration of the | eir_flags, then it is an indication that the registration of the | |||
client_owner has already occurred and that a further CREATE_SESSION | client_owner has already occurred and that a further CREATE_SESSION | |||
is not needed to confirm it. Of course, subsequent CREATE_SESSION | is not needed to confirm it. Of course, subsequent CREATE_SESSION | |||
operations may be needed for other reasons. | operations may be needed for other reasons. | |||
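As a rough illustration of the check described above, a client might test eir_flags as in the following Python sketch. The value 0x80000000 for EXCHGID4_FLAG_CONFIRMED_R is taken from the protocol's XDR definition and is not shown in this excerpt; the function name is hypothetical.

```python
# Flag value from the NFSv4.1 XDR definition (not shown in this excerpt).
EXCHGID4_FLAG_CONFIRMED_R = 0x80000000


def needs_confirming_create_session(eir_flags: int) -> bool:
    """Return True when the client_owner registration has not yet been
    confirmed, i.e. a CREATE_SESSION is still needed to confirm it.

    Hypothetical helper: it only inspects the CONFIRMED_R bit and says
    nothing about CREATE_SESSION calls needed for other reasons.
    """
    return (eir_flags & EXCHGID4_FLAG_CONFIRMED_R) == 0
```

A result of False means only that confirmation is unnecessary; the client may still issue CREATE_SESSION to obtain additional sessions.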
The value eir_sequenceid is used to establish an initial sequence | The value eir_sequenceid is used to establish an initial sequence | |||
value associate with the client ID returned. In cases in which a | value associated with the client ID returned. In cases in which a | |||
CREATE_SESSION has already been done, there is no need for this | CREATE_SESSION has already been done, there is no need for this | |||
value, since sequencing of such request has already been established | value, since sequencing of such request has already been established, | |||
and the client has no need for this value and will ignore it | and the client has no need for this value and will ignore it. | |||
EXCHANGE_ID MAY be sent in a COMPOUND procedure that starts with | EXCHANGE_ID MAY be sent in a COMPOUND procedure that starts with | |||
SEQUENCE. However, when a client communicates with a server for the | SEQUENCE. However, when a client communicates with a server for the | |||
first time, it will not have a session, so using SEQUENCE will not be | first time, it will not have a session, so using SEQUENCE will not be | |||
possible. If EXCHANGE_ID is sent without a preceding SEQUENCE, then | possible. If EXCHANGE_ID is sent without a preceding SEQUENCE, then | |||
it MUST be the only operation in the COMPOUND procedure's request. | it MUST be the only operation in the COMPOUND procedure's request. | |||
If it is not, the server MUST return NFS4ERR_NOT_ONLY_OP. | If it is not, the server MUST return NFS4ERR_NOT_ONLY_OP. | |||
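The placement rule in the preceding paragraph can be sketched as below. This is a minimal illustration and not a server implementation: operations are represented by name strings, the function name is hypothetical, and every other validation a real server performs on a COMPOUND is omitted.

```python
from typing import Sequence


def check_exchange_id_placement(ops: Sequence[str]) -> str:
    """Apply only the EXCHANGE_ID placement rule to one COMPOUND request.

    `ops` lists the operation names in request order. Status values are
    returned as strings to keep the sketch free of numeric error codes.
    """
    if "EXCHANGE_ID" not in ops:
        return "NFS4_OK"              # rule does not apply
    if ops[0] == "SEQUENCE":
        return "NFS4_OK"              # EXCHANGE_ID MAY follow SEQUENCE
    if len(ops) == 1:
        return "NFS4_OK"              # sole operation, no session needed
    return "NFS4ERR_NOT_ONLY_OP"      # no SEQUENCE and not the only op
```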
The eia_clientowner field is composed of a co_verifier field and a | The eia_clientowner field is composed of a co_verifier field and a | |||
co_ownerid string. As noted in Section 2.4, the co_ownerid | co_ownerid string. As noted in Section 2.4, the co_ownerid | |||
identifies the client, and the co_verifier specifies a particular | identifies the client, and the co_verifier specifies a particular | |||
incarnation of that client. An EXCHANGE_ID sent with a new | incarnation of that client. An EXCHANGE_ID sent with a new | |||
incarnation of the client will lead to the server removing lock state | incarnation of the client will lead to the server removing lock state | |||
of the old incarnation. On the other hand, an EXCHANGE_ID sent with | of the old incarnation. On the other hand, when an EXCHANGE_ID sent | |||
the current incarnation and co_ownerid will, when it does not result | with the current incarnation and co_ownerid does not result in an | |||
in an unrelated error, potentially update an existing client ID's | unrelated error, it will potentially update an existing client ID's | |||
properties, or simply return information about the existing | properties or simply return information about the existing client_id. | |||
client_id. That latter would happen when this operation is done to | The latter would happen when this operation is done to the same | |||
the same server using different network addresses as part of creating | server using different network addresses as part of creating trunked | |||
trunked connections. | connections. | |||
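The roles of co_ownerid and co_verifier described above can be sketched as a three-way comparison. This is a simplification under stated assumptions: the ClientRecord type and the outcome labels are hypothetical, and the sketch ignores the principal, the confirmation state, and the update flag that the full case analysis in Section 18.35.4 takes into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientRecord:
    """Hypothetical, reduced view of a server's client record."""
    co_ownerid: bytes
    co_verifier: bytes
    client_id: int


def classify_exchange_id(existing: Optional[ClientRecord],
                         co_ownerid: bytes,
                         co_verifier: bytes) -> str:
    """Compare an incoming eia_clientowner with the stored record."""
    if existing is None or existing.co_ownerid != co_ownerid:
        return "new_client"        # no record for this owner yet
    if existing.co_verifier != co_verifier:
        # New incarnation: the old incarnation's lock state is removed
        # and a different client ID has to be assigned.
        return "new_incarnation"
    # Same incarnation: update properties or return the existing
    # client ID, as happens when setting up trunked connections.
    return "same_incarnation"
```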
A server MUST NOT provide the same client ID to two different | A server MUST NOT provide the same client ID to two different | |||
incarnations of an eia_clientowner. | incarnations of an eia_clientowner. | |||
In addition to the client ID and sequence ID, the server returns a | In addition to the client ID and sequence ID, the server returns a | |||
server owner (eir_server_owner) and server scope (eir_server_scope). | server owner (eir_server_owner) and server scope (eir_server_scope). | |||
The former field is used in connection with network trunking as | The former field is used in connection with network trunking as | |||
described in Section 2.10.5. The latter field is used to allow | described in Section 2.10.5. The latter field is used to allow | |||
clients to determine when client IDs sent by one server may be | clients to determine when client IDs sent by one server may be | |||
recognized by another in the event of file system migration (see | recognized by another in the event of file system migration (see | |||
skipping to change at line 25799 ¶ | skipping to change at line 25800 ¶ | |||
ssp_num_gss_handles to zero; the client can create more handles | ssp_num_gss_handles to zero; the client can create more handles | |||
with another EXCHANGE_ID call. | with another EXCHANGE_ID call. | |||
Because each SSV RPCSEC_GSS handle shares a common SSV GSS | Because each SSV RPCSEC_GSS handle shares a common SSV GSS | |||
context, there are security considerations specific to this | context, there are security considerations specific to this | |||
situation discussed in Section 2.10.10. | situation discussed in Section 2.10.10. | |||
The seq_window (see Section 5.2.3.1 of RFC 2203 [4]) of each | The seq_window (see Section 5.2.3.1 of RFC 2203 [4]) of each | |||
RPCSEC_GSS handle in spi_handle MUST be the same as the seq_window | RPCSEC_GSS handle in spi_handle MUST be the same as the seq_window | |||
of the RPCSEC_GSS handle used for the credential of the RPC | of the RPCSEC_GSS handle used for the credential of the RPC | |||
request that the EXCHANGE_ID operation was sent as a part of. | request of which the EXCHANGE_ID operation was sent as a part. | |||
+======================+===========================+===============+ | +======================+===========================+===============+ | |||
| Encryption Algorithm | MUST NOT be combined with | SHOULD NOT be | | | Encryption Algorithm | MUST NOT be combined with | SHOULD NOT be | | |||
| | | combined with | | | | | combined with | | |||
+======================+===========================+===============+ | +======================+===========================+===============+ | |||
| id-aes128-CBC | | id-sha384, | | | id-aes128-CBC | | id-sha384, | | |||
| | | id-sha512 | | | | | id-sha512 | | |||
+----------------------+---------------------------+---------------+ | +----------------------+---------------------------+---------------+ | |||
| id-aes192-CBC | id-sha1 | id-sha512 | | | id-aes192-CBC | id-sha1 | id-sha512 | | |||
+----------------------+---------------------------+---------------+ | +----------------------+---------------------------+---------------+ | |||
skipping to change at line 25839 ¶ | skipping to change at line 25840 ¶ | |||
server MUST NOT interpret this implementation identity information in | server MUST NOT interpret this implementation identity information in | |||
a way that affects how the implementation interacts with its peer. | a way that affects how the implementation interacts with its peer. | |||
The client and server are not allowed to depend on the peer's | The client and server are not allowed to depend on the peer's | |||
manifesting a particular allowed behavior based on an implementation | manifesting a particular allowed behavior based on an implementation | |||
identifier but are required to interoperate as specified elsewhere in | identifier but are required to interoperate as specified elsewhere in | |||
the protocol specification. | the protocol specification. | |||
Because it is possible that some implementations might violate the | Because it is possible that some implementations might violate the | |||
protocol specification and interpret the identity information, | protocol specification and interpret the identity information, | |||
implementations MUST provide facilities to allow the NFSv4 client and | implementations MUST provide facilities to allow the NFSv4 client and | |||
server be configured to set the contents of the nfs_impl_id | server to be configured to set the contents of the nfs_impl_id | |||
structures sent to any specified value. | structures sent to any specified value. | |||
18.35.4. IMPLEMENTATION | 18.35.4. IMPLEMENTATION | |||
A server's client record is a 5-tuple: | A server's client record is a 5-tuple: | |||
1. co_ownerid: | 1. co_ownerid: | |||
The client identifier string, from the eia_clientowner structure | The client identifier string, from the eia_clientowner structure | |||
of the EXCHANGE_ID4args structure. | of the EXCHANGE_ID4args structure. | |||
skipping to change at line 26296 ¶ | skipping to change at line 26297 ¶ | |||
ca_maxresponsesize_cached: | ca_maxresponsesize_cached: | |||
Like ca_maxresponsesize, but the maximum size of a reply that | Like ca_maxresponsesize, but the maximum size of a reply that | |||
will be stored in the reply cache (Section 2.10.6.1). For each | will be stored in the reply cache (Section 2.10.6.1). For each | |||
channel, the server MAY decrease this value, but MUST NOT | channel, the server MAY decrease this value, but MUST NOT | |||
increase it. If, in the reply to CREATE_SESSION, the value of | increase it. If, in the reply to CREATE_SESSION, the value of | |||
ca_maxresponsesize_cached of a channel is less than the value | ca_maxresponsesize_cached of a channel is less than the value | |||
of ca_maxresponsesize of the same channel, then this is an | of ca_maxresponsesize of the same channel, then this is an | |||
indication to the requester that it needs to be selective about | indication to the requester that it needs to be selective about | |||
which replies it directs the replier to cache; for example, | which replies it directs the replier to cache; for example, | |||
large replies from nonidempotent operations (e.g., COMPOUND | large replies from non-idempotent operations (e.g., COMPOUND | |||
requests with a READ operation) should not be cached. The | requests with a READ operation) should not be cached. The | |||
requester decides which replies to cache via an argument to the | requester decides which replies to cache via an argument to the | |||
SEQUENCE (the sa_cachethis field, see Section 18.46) or | SEQUENCE (the sa_cachethis field, see Section 18.46) or | |||
CB_SEQUENCE (the csa_cachethis field, see Section 20.9) | CB_SEQUENCE (the csa_cachethis field, see Section 20.9) | |||
operations. After the session is created, if a requester sends | operations. After the session is created, if a requester sends | |||
a request for which the size of the reply would exceed | a request for which the size of the reply would exceed | |||
ca_maxresponsesize_cached, the replier will return | ca_maxresponsesize_cached, the replier will return | |||
NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in | NFS4ERR_REP_TOO_BIG_TO_CACHE, per the description in | |||
Section 2.10.6.4. | Section 2.10.6.4. | |||
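A replier's size check for cached replies might look like the following sketch. It assumes, following Section 2.10.6.4 rather than the text shown here, that the limit is enforced only when the requester asked for the reply to be cached (sa_cachethis or csa_cachethis set to TRUE). The function name is hypothetical and status values are plain strings.

```python
def check_reply_cacheable(reply_size: int,
                          ca_maxresponsesize_cached: int,
                          cachethis: bool) -> str:
    """Decide whether a reply of `reply_size` bytes can be honored.

    Assumption: the cached-size limit applies only when the requester
    set sa_cachethis / csa_cachethis; uncached replies are bounded by
    ca_maxresponsesize instead, which this sketch does not check.
    """
    if cachethis and reply_size > ca_maxresponsesize_cached:
        return "NFS4ERR_REP_TOO_BIG_TO_CACHE"
    return "NFS4_OK"
```

On the requester side, the same comparison suggests leaving the cache flag unset for requests expected to produce large replies, such as large READs.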
skipping to change at line 26382 ¶ | skipping to change at line 26383 ¶ | |||
gcbp_service, i.e., it MUST set the "service" field of the | gcbp_service, i.e., it MUST set the "service" field of the | |||
rpc_gss_cred_t data type in RPCSEC_GSS credential to the value of | rpc_gss_cred_t data type in RPCSEC_GSS credential to the value of | |||
gcbp_service (see "RPC Request Header", Section 5.3.1 of [4]). | gcbp_service (see "RPC Request Header", Section 5.3.1 of [4]). | |||
If the RPCSEC_GSS handle identified by gcbp_handle_from_server | If the RPCSEC_GSS handle identified by gcbp_handle_from_server | |||
does not exist on the server, the server will return | does not exist on the server, the server will return | |||
NFS4ERR_NOENT. | NFS4ERR_NOENT. | |||
Within each element of csa_sec_parms, the fore and back RPCSEC_GSS | Within each element of csa_sec_parms, the fore and back RPCSEC_GSS | |||
contexts MUST share the same GSS context and MUST have the same | contexts MUST share the same GSS context and MUST have the same | |||
seq_window (see Section 5.2.3.1 of RFC2203 [4]). The fore and | seq_window (see Section 5.2.3.1 of RFC 2203 [4]). The fore and | |||
back RPCSEC_GSS context state are independent of each other as far | back RPCSEC_GSS context state are independent of each other as far | |||
as the RPCSEC_GSS sequence number (see the seq_num field in the | as the RPCSEC_GSS sequence number (see the seq_num field in the | |||
rpc_gss_cred_t data type of Sections 5 and 5.3.1 of [4]). | rpc_gss_cred_t data type of Sections 5 and 5.3.1 of [4]). | |||
If an RPCSEC_GSS handle is using the SSV context (see | If an RPCSEC_GSS handle is using the SSV context (see | |||
Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a | Section 2.10.9), then because each SSV RPCSEC_GSS handle shares a | |||
common SSV GSS context, there are security considerations specific | common SSV GSS context, there are security considerations specific | |||
to this situation discussed in Section 2.10.10. | to this situation discussed in Section 2.10.10. | |||
Once the session is created, the first SEQUENCE or CB_SEQUENCE | Once the session is created, the first SEQUENCE or CB_SEQUENCE | |||
skipping to change at line 28598 ¶ | skipping to change at line 28599 ¶ | |||
DESTROY_CLIENTID allows a server to immediately reclaim the resources | DESTROY_CLIENTID allows a server to immediately reclaim the resources | |||
consumed by an unused client ID, and also to forget that it ever | consumed by an unused client ID, and also to forget that it ever | |||
generated the client ID. By forgetting that it ever generated the | generated the client ID. By forgetting that it ever generated the | |||
client ID, the server can safely reuse the client ID on a future | client ID, the server can safely reuse the client ID on a future | |||
EXCHANGE_ID operation. | EXCHANGE_ID operation. | |||
18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished | 18.51. Operation 58: RECLAIM_COMPLETE - Indicates Reclaims Finished | |||
18.51.1. ARGUMENT | 18.51.1. ARGUMENT | |||
<CODE BEGINS> | ||||
struct RECLAIM_COMPLETE4args { | struct RECLAIM_COMPLETE4args { | |||
/* | /* | |||
* If rca_one_fs TRUE, | * If rca_one_fs TRUE, | |||
* | * | |||
* CURRENT_FH: object in | * CURRENT_FH: object in | |||
* file system reclaim is | * file system reclaim is | |||
* complete for. | * complete for. | |||
*/ | */ | |||
bool rca_one_fs; | bool rca_one_fs; | |||
}; | }; | |||
<CODE ENDS> | ||||
18.51.2. RESULTS | 18.51.2. RESULTS | |||
<CODE BEGINS> | ||||
struct RECLAIM_COMPLETE4res { | struct RECLAIM_COMPLETE4res { | |||
nfsstat4 rcr_status; | nfsstat4 rcr_status; | |||
}; | }; | |||
<CODE ENDS> | ||||
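For readers unfamiliar with XDR, the following sketch shows how the two structures above appear on the wire. It relies only on the XDR rule that booleans and enumerations are encoded as 4-byte big-endian integers. The function names are hypothetical, and the sketch covers only the operation-specific bodies, not the operation code or the enclosing COMPOUND.

```python
import struct

NFS4_OK = 0  # nfsstat4 value for success


def encode_reclaim_complete_args(rca_one_fs: bool) -> bytes:
    """XDR-encode RECLAIM_COMPLETE4args: one bool as a 4-byte integer."""
    return struct.pack(">I", 1 if rca_one_fs else 0)


def decode_reclaim_complete_res(data: bytes) -> int:
    """XDR-decode RECLAIM_COMPLETE4res and return rcr_status."""
    (rcr_status,) = struct.unpack(">I", data[:4])
    return rcr_status
```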
18.51.3. DESCRIPTION | 18.51.3. DESCRIPTION | |||
A RECLAIM_COMPLETE operation is used to indicate that the client has | A RECLAIM_COMPLETE operation is used to indicate that the client has | |||
reclaimed all of the locking state that it will recover using | reclaimed all of the locking state that it will recover using | |||
reclaim, when it is recovering state due to either a server restart | reclaim, when it is recovering state due to either a server restart | |||
or the migration of a file system to another server. There are two | or the migration of a file system to another server. There are two | |||
types of RECLAIM_COMPLETE operations: | types of RECLAIM_COMPLETE operations: | |||
* When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. | * When rca_one_fs is FALSE, a global RECLAIM_COMPLETE is being done. | |||
skipping to change at line 28716 ¶ | skipping to change at line 28721 ¶ | |||
When a RECLAIM_COMPLETE is sent, the client effectively acknowledges | When a RECLAIM_COMPLETE is sent, the client effectively acknowledges | |||
any locks not yet reclaimed as lost. This allows the server to re- | any locks not yet reclaimed as lost. This allows the server to re- | |||
enable the client to recover locks if the occurrence of edge | enable the client to recover locks if the occurrence of edge | |||
conditions, as described in Section 8.4.3, had caused the server to | conditions, as described in Section 8.4.3, had caused the server to | |||
disable the client's ability to recover locks. | disable the client's ability to recover locks. | |||
Because previous descriptions of RECLAIM_COMPLETE were not | Because previous descriptions of RECLAIM_COMPLETE were not | |||
sufficiently explicit about the circumstances in which use of | sufficiently explicit about the circumstances in which use of | |||
RECLAIM_COMPLETE with rca_one_fs set to TRUE was appropriate, there | RECLAIM_COMPLETE with rca_one_fs set to TRUE was appropriate, there | |||
have been cases which it has been misused by clients who have issued | have been cases in which it has been misused by clients who have | |||
RECLAIM_COMPLETE with rca_one_fs set to TRUE when it should have not | issued RECLAIM_COMPLETE with rca_one_fs set to TRUE when it should | |||
been. There have also been cases in which servers have, in various | have not been. There have also been cases in which servers have, in | |||
ways, not responded to such misuse as described above, either | various ways, not responded to such misuse as described above, either | |||
ignoring the rca_one_fs setting (treating the operation as a global | ignoring the rca_one_fs setting (treating the operation as a global | |||
RECLAIM_COMPLETE) or ignoring the entire operation. | RECLAIM_COMPLETE) or ignoring the entire operation. | |||
While clients SHOULD NOT misuse this feature and servers SHOULD | While clients SHOULD NOT misuse this feature, and servers SHOULD | |||
respond to such misuse as described above, implementers need to be | respond to such misuse as described above, implementors need to be | |||
aware of the following considerations as they make necessary | aware of the following considerations as they make necessary trade- | |||
tradeoffs between interoperability with existing implementations and | offs between interoperability with existing implementations and | |||
proper support for facilities to allow lock recovery in the event of | proper support for facilities to allow lock recovery in the event of | |||
file system migration. | file system migration. | |||
* When servers have no support for becoming the destination server | * When servers have no support for becoming the destination server | |||
of a file system subject to migration, there is no possibility of | of a file system subject to migration, there is no possibility of | |||
a per-fs RECLAIM_COMPLETE being done legitimately and occurrences | a per-fs RECLAIM_COMPLETE being done legitimately, and occurrences | |||
of it SHOULD be ignored. However, the negative consequences of | of it SHOULD be ignored. However, the negative consequences of | |||
accepting such mistaken use are quite limited as long as the | accepting such mistaken use are quite limited as long as the | |||
client does not issue it before all necessary reclaims are done. | client does not issue it before all necessary reclaims are done. | |||
* When a server might become the destination for a file system being | * When a server might become the destination for a file system being | |||
migrated, inappropriate use of per-fs RECLAIM_COMPLETE is more | migrated, inappropriate use of per-fs RECLAIM_COMPLETE is more | |||
concerning. In the case in which the file system designated is | concerning. In the case in which the file system designated is | |||
not within a per-fs grace period, the per-fs RECLAIM_COMPLETE | not within a per-fs grace period, the per-fs RECLAIM_COMPLETE | |||
SHOULD be ignored, with the negative consequences of accepting it | SHOULD be ignored, with the negative consequences of accepting it | |||
being limited, as in the case in which migration is not supported. | being limited, as in the case in which migration is not supported. | |||
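The two cases above can be summarized in a small decision function. This is a sketch under stated assumptions, not normative behavior: the parameter names and outcome labels are hypothetical, and the final branch (a per-fs RECLAIM_COMPLETE arriving while the file system is within its per-fs grace period) is assumed to be the legitimate case that the server processes.

```python
def handle_reclaim_complete(rca_one_fs: bool,
                            can_be_migration_destination: bool,
                            fs_in_per_fs_grace: bool) -> str:
    """Choose a server-side treatment for a RECLAIM_COMPLETE request."""
    if not rca_one_fs:
        return "process_global"      # global RECLAIM_COMPLETE
    if not can_be_migration_destination:
        return "ignore"              # per-fs form cannot be legitimate
    if not fs_in_per_fs_grace:
        return "ignore"              # no per-fs grace period in effect
    return "process_per_fs"          # assumption: legitimate per-fs use
```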
skipping to change at line 28990 ¶ | skipping to change at line 28995 ¶ | |||
+------------------------------+----------------------------------+ | +------------------------------+----------------------------------+ | |||
| NFS4ERR_TOO_MANY_OPS | | | | NFS4ERR_TOO_MANY_OPS | | | |||
+------------------------------+----------------------------------+ | +------------------------------+----------------------------------+ | |||
| NFS4ERR_REP_TOO_BIG | | | | NFS4ERR_REP_TOO_BIG | | | |||
+------------------------------+----------------------------------+ | +------------------------------+----------------------------------+ | |||
| NFS4ERR_REP_TOO_BIG_TO_CACHE | | | | NFS4ERR_REP_TOO_BIG_TO_CACHE | | | |||
+------------------------------+----------------------------------+ | +------------------------------+----------------------------------+ | |||
| NFS4ERR_REQ_TOO_BIG | | | | NFS4ERR_REQ_TOO_BIG | | | |||
+------------------------------+----------------------------------+ | +------------------------------+----------------------------------+ | |||
Table 24: CB_COMPOUND error returns | Table 24: CB_COMPOUND Error Returns | |||
20. NFSv4.1 Callback Operations | 20. NFSv4.1 Callback Operations | |||
20.1. Operation 3: CB_GETATTR - Get Attributes | 20.1. Operation 3: CB_GETATTR - Get Attributes | |||
20.1.1. ARGUMENT | 20.1.1. ARGUMENT | |||
struct CB_GETATTR4args { | struct CB_GETATTR4args { | |||
nfs_fh4 fh; | nfs_fh4 fh; | |||
bitmap4 attr_request; | bitmap4 attr_request; | |||
skipping to change at line 30102 ¶ | skipping to change at line 30107 ¶ | |||
Relative to previous NFS versions, NFSv4.1 has additional security | Relative to previous NFS versions, NFSv4.1 has additional security | |||
considerations for pNFS (see Sections 12.9 and 13.12), locking and | considerations for pNFS (see Sections 12.9 and 13.12), locking and | |||
session state (see Section 2.10.8.3), and state recovery during grace | session state (see Section 2.10.8.3), and state recovery during grace | |||
period (see Section 8.4.2.1.1). With respect to locking and session | period (see Section 8.4.2.1.1). With respect to locking and session | |||
state, if SP4_SSV state protection is being used, Section 2.10.10 has | state, if SP4_SSV state protection is being used, Section 2.10.10 has | |||
specific security considerations for the NFSv4.1 client and server. | specific security considerations for the NFSv4.1 client and server. | |||
Security considerations for lock reclaim differ between the two | Security considerations for lock reclaim differ between the two | |||
different situations in which state reclaim is to be done. The | different situations in which state reclaim is to be done. The | |||
server failure situation is discussed in Section 8.4.2.1.1 while the | server failure situation is discussed in Section 8.4.2.1.1, while the | |||
per-fs state reclaim done in support of migration/replication is | per-fs state reclaim done in support of migration/replication is | |||
discussed in Section 11.11.9.1. | discussed in Section 11.11.9.1. | |||
The use of the multi-server namespace features described in | The use of the multi-server namespace features described in | |||
Section 11 raises the possibility that requests to determine the set | Section 11 raises the possibility that requests to determine the set | |||
of network addresses corresponding to a given server might be | of network addresses corresponding to a given server might be | |||
interfered with or have their responses modified in flight. In light | interfered with or have their responses modified in flight. In light | |||
of this possibility, the following considerations should be taken | of this possibility, the following considerations should be noted: | |||
note of: | ||||
* When DNS is used to convert server names to addresses and DNSSEC | * When DNS is used to convert server names to addresses and DNSSEC | |||
[29] is not available, the validity of the network addresses | [29] is not available, the validity of the network addresses | |||
returned generally cannot be relied upon. However, when combined | returned generally cannot be relied upon. However, when combined | |||
with a trusted resolver, DNS over TLS [30], and DNS over HTTPS | with a trusted resolver, DNS over TLS [30] and DNS over HTTPS [34] | |||
[34] can also be relied upon to provide valid address resolutions. | can be relied upon to provide valid address resolutions. | |||
In situations in which the validity of the provided addresses | In situations in which the validity of the provided addresses | |||
cannot be relied upon and the client uses RPCSEC_GSS to access the | cannot be relied upon and the client uses RPCSEC_GSS to access the | |||
designated server, it is possible for mutual authentication to | designated server, it is possible for mutual authentication to | |||
discover invalid server addresses as long as the RPCSEC_GSS | discover invalid server addresses as long as the RPCSEC_GSS | |||
implementation used does not use insecure DNS queries to | implementation used does not use insecure DNS queries to | |||
canonicalize the hostname components of the service principal | canonicalize the hostname components of the service principal | |||
names, as explained in [28]. | names, as explained in [28]. | |||
* The fetching of attributes containing file system location | * The fetching of attributes containing file system location | |||
information SHOULD be performed using integrity protection. It is | information SHOULD be performed using integrity protection. It is | |||
important to note here that a client making a request of this sort | important to note here that a client making a request of this sort | |||
without using integrity protection needs be aware of the negative | without using integrity protection needs be aware of the negative | |||
consequences of doing so, which can lead to invalid host names or | consequences of doing so, which can lead to invalid hostnames or | |||
network addresses being returned. These include cases in which | network addresses being returned. These include cases in which | |||
the client is directed to a server under the control of an | the client is directed to a server under the control of an | |||
attacker, who might get access to data written or provide | attacker, who might get access to data written or provide | |||
incorrect values for data read. In light of this, the client | incorrect values for data read. In light of this, the client | |||
needs to recognize that using such returned location information | needs to recognize that using such returned location information | |||
to access an NFSv4 server without use of RPCSEC_GSS (i.e. by | to access an NFSv4 server without use of RPCSEC_GSS (i.e., by | |||
using AUTH_SYS) poses dangers as it can result in the client | using AUTH_SYS) poses dangers as it can result in the client | |||
interacting with such an attacker-controlled server, without any | interacting with such an attacker-controlled server without any | |||
authentication facilities to verify the server's identity. | authentication facilities to verify the server's identity. | |||
* Despite the fact that it is a requirement that implementations | * Despite the fact that it is a requirement that implementations | |||
provide "support" for use of RPCSEC_GSS, it cannot be assumed that | provide "support" for use of RPCSEC_GSS, it cannot be assumed that | |||
use of RPCSEC_GSS is always available between any particular | use of RPCSEC_GSS is always available between any particular | |||
client-server pair. | client-server pair. | |||
* When a client has the network addresses of a server but not the | * When a client has the network addresses of a server but not the | |||
associated host names, that would interfere with its ability to | associated hostnames, that would interfere with its ability to use | |||
use RPCSEC_GSS. | RPCSEC_GSS. | |||
In light of the above, a server SHOULD present file system location | In light of the above, a server SHOULD present file system location | |||
entries that correspond to file systems on other servers using a host | entries that correspond to file systems on other servers using a | |||
name. This would allow the client to interrogate the fs_locations on | hostname. This would allow the client to interrogate the | |||
the destination server to obtain trunking information (as well as | fs_locations on the destination server to obtain trunking information | |||
replica information) using integrity protection, validating the name | (as well as replica information) using integrity protection, | |||
provided while assuring that the response has not been modified in | validating the name provided while assuring that the response has not | |||
flight. | been modified in flight. | |||
When RPCSEC_GSS is not available on a server, the client needs to be | When RPCSEC_GSS is not available on a server, the client needs to be | |||
aware of the fact that the location entries are subject to | aware of the fact that the location entries are subject to | |||
modification in flight and so cannot be relied upon. In the case of | modification in flight and so cannot be relied upon. In the case of | |||
a client being directed to another server after NFS4ERR_MOVED, this | a client being directed to another server after NFS4ERR_MOVED, this | |||
could vitiate the authentication provided by the use of RPCSEC_GSS on | could vitiate the authentication provided by the use of RPCSEC_GSS on | |||
the designated destination server. Even when RPCSEC_GSS | the designated destination server. Even when RPCSEC_GSS | |||
authentication is available on the destination, the server might | authentication is available on the destination, the server might | |||
still properly authenticate as the server to which the client was | still properly authenticate as the server to which the client was | |||
erroneously directed. Without a way to decide whether the server is | erroneously directed. Without a way to decide whether the server is | |||
skipping to change at line 30184 ¶ | skipping to change at line 30188 ¶ | |||
When a file system location attribute is fetched upon connecting with | When a file system location attribute is fetched upon connecting with | |||
an NFS server, it SHOULD, as stated above, be done with integrity | an NFS server, it SHOULD, as stated above, be done with integrity | |||
protection. When this is not possible, it is generally best for the | protection. When this is not possible, it is generally best for the | |||
client to ignore trunking and replica information or simply not fetch | client to ignore trunking and replica information or simply not fetch | |||
the location information for these purposes. | the location information for these purposes. | |||
When location information cannot be verified, it can be subjected to | When location information cannot be verified, it can be subjected to | |||
additional filtering to prevent the client from being inappropriately | additional filtering to prevent the client from being inappropriately | |||
directed. For example, if a range of network addresses can be | directed. For example, if a range of network addresses can be | |||
determined that assures that the servers and clients using AUTH_SYS | determined that assures that the servers and clients using AUTH_SYS | |||
are subject to the appropriate set of constraints (e.g. physical | are subject to the appropriate set of constraints (e.g., physical | |||
network isolation, administrative controls on the operating systems | network isolation, administrative controls on the operating systems | |||
used), then network addresses in the appropriate range can be used | used), then network addresses in the appropriate range can be used | |||
with others discarded or restricted in their use of AUTH_SYS. | with others discarded or restricted in their use of AUTH_SYS. | |||
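The address-range filtering described above can be illustrated with a minimal sketch. It is not part of the protocol specification. The names TRUSTED_RANGES and filter_unverified_locations are hypothetical, the example ranges are documentation prefixes, and the choice to discard hostnames (which cannot be checked against an address range without a further, unverified lookup) is an assumption of this sketch.

```python
import ipaddress

# Hypothetical allow-list: address ranges that an administrator has
# determined to be suitably constrained (e.g., physically isolated).
# The prefixes below are documentation ranges used only for illustration.
TRUSTED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def filter_unverified_locations(addresses, trusted=TRUSTED_RANGES):
    """Return only the location entries that fall inside a trusted range.

    `addresses` is a list of strings taken from location information that
    was fetched without integrity protection. Entries that are not literal
    IP addresses are discarded (an assumption of this sketch).
    """
    kept = []
    for text in addresses:
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            # Not a literal address (e.g., a hostname): discard it.
            continue
        # Membership is False when the address and network versions differ.
        if any(addr in net for net in trusted):
            kept.append(text)
    return kept


if __name__ == "__main__":
    candidates = ["192.0.2.10", "198.51.100.7", "2001:db8::1", "nfs.example.net"]
    print(filter_unverified_locations(candidates))
```

A client applying such a filter would use only the surviving entries with AUTH_SYS. Whether the remaining entries are discarded outright or merely restricted is a local policy decision.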
To summarize considerations regarding the use of RPCSEC_GSS in | To summarize considerations regarding the use of RPCSEC_GSS in | |||
fetching location information, we need to consider the following | fetching location information, we need to consider the following | |||
possibilities for requests to interrogate location information, with | possibilities for requests to interrogate location information, with | |||
interrogation approaches on the referring and destination servers | interrogation approaches on the referring and destination servers | |||
arrived at separately: | arrived at separately: | |||
* The use of integrity protection is RECOMMENDED in all cases, since | * The use of integrity protection is RECOMMENDED in all cases, since | |||
the absence of integrity protection exposes the client to the | the absence of integrity protection exposes the client to the | |||
possibility of the results being modified in transit. | possibility of the results being modified in transit. | |||
* The use of requests issued without RPCSEC_GSS (i.e. using AUTH_SYS | * The use of requests issued without RPCSEC_GSS (i.e., using | |||
which has no provision to avoid modification of data in flight), | AUTH_SYS, which has no provision to avoid modification of data in | |||
while undesirable and a potential security exposure, may not be | flight), while undesirable and a potential security exposure, may | |||
avoidable in all cases. Where the use of the returned information | not be avoidable in all cases. Where the use of the returned | |||
cannot be avoided, it is made subject to filtering as described | information cannot be avoided, it is made subject to filtering as | |||
above to eliminate the possibility that the client would treat an | described above to eliminate the possibility that the client would | |||
invalid address as if it were an NFSv4 server. The specifics will | treat an invalid address as if it were an NFSv4 server. The | |||
vary depending on the degree of network isolation and whether the | specifics will vary depending on the degree of network isolation | |||
request is to the referring or destination servers. | and whether the request is to the referring or destination | |||
servers. | ||||
Even if such requests are not interfered with in flight, it is | Even if such requests are not interfered with in flight, it is | |||
possible for a compromised server to direct the client to use | possible for a compromised server to direct the client to use | |||
inappropriate servers, such as those under the control of the | inappropriate servers, such as those under the control of the | |||
attacker. It is not clear that being directed to such servers | attacker. It is not clear that being directed to such servers | |||
represents a greater threat to the client than the damage that could | represents a greater threat to the client than the damage that could | |||
be done by the compromised server itself. However, it is possible | be done by the compromised server itself. However, it is possible | |||
that some sorts of transient server compromises might be taken | that some sorts of transient server compromises might be exploited to | |||
advantage of to direct a client to a server capable of doing greater | direct a client to a server capable of doing greater damage over a | |||
damage over a longer time. One useful step to guard against this | longer time. One useful step to guard against this possibility is to | |||
possibility is to issue requests to fetch location data using | issue requests to fetch location data using RPCSEC_GSS, even if no | |||
RPCSEC_GSS, even if no mapping to an RPCSEC_GSS principal is | mapping to an RPCSEC_GSS principal is available. In this case, | |||
available. In this case, RPCSEC_GSS would not be used, as it | RPCSEC_GSS would not be used, as it typically is, to identify the | |||
typically is, to identify the client principal to the server, but | client principal to the server, but rather to make sure (via | |||
rather to make sure (via RPCSEC_GSS mutual authentication) that the | RPCSEC_GSS mutual authentication) that the server being contacted is | |||
server being contacted is the one intended. | the one intended. | |||
Similar considerations apply if the threat to be avoided is the | Similar considerations apply if the threat to be avoided is the | |||
redirection of client traffic to inappropriate (i.e. poorly | redirection of client traffic to inappropriate (i.e., poorly | |||
performing) servers. In both cases, there is no reason for the | performing) servers. In both cases, there is no reason for the | |||
information returned to depend on the identity of the client | information returned to depend on the identity of the client | |||
principal requesting it, while the validity of the server | principal requesting it, while the validity of the server | |||
information, which has the capability to affect all client | information, which has the capability to affect all client | |||
principals, is of considerable importance. | principals, is of considerable importance. | |||
22. IANA Considerations | 22. IANA Considerations | |||
This section uses terms that are defined in [62]. | This section uses terms that are defined in [62]. | |||
22.1. IANA Actions Needed | 22.1. IANA Actions | |||
This update does not require any modification of or additions to | This update does not require any modification of, or additions to, | |||
registry entries or registry rules associated with NFSv4.1. However, | registry entries or registry rules associated with NFSv4.1. However, | |||
since this document is intended to obsolete RFC5661, it will be | since this document obsoletes RFC 5661, IANA has updated all registry | |||
necessary for IANA to update all registry entries and registry rules | entries and registry rules references that point to RFC 5661 to point | |||
references that points to RFC5661 to point to this document instead. | to this document instead. | |||
Previous actions by IANA related to NFSv4.1 are listed in the | Previous actions by IANA related to NFSv4.1 are listed in the | |||
remaining subsections of Section 22. | remaining subsections of Section 22. | |||
22.2. Named Attribute Definitions | 22.2. Named Attribute Definitions | |||
IANA created a registry called the "NFSv4 Named Attribute Definitions | IANA created a registry called the "NFSv4 Named Attribute Definitions | |||
Registry". | Registry". | |||
The NFSv4.1 protocol supports the association of a file with zero or | The NFSv4.1 protocol supports the association of a file with zero or | |||
skipping to change at line 30373 ¶ | skipping to change at line 30378 ¶ | |||
22.3.1. Initial Registry | 22.3.1. Initial Registry | |||
The initial registry is in Table 25. Note that the next available | The initial registry is in Table 25. Note that the next available | |||
value is zero. | value is zero. | |||
+=========================+=======+==========+=====+================+ | +=========================+=======+==========+=====+================+ | |||
| Notification Name | Value | RFC | How | Minor Versions | | | Notification Name | Value | RFC | How | Minor Versions | | |||
+=========================+=======+==========+=====+================+ | +=========================+=======+==========+=====+================+ | |||
| NOTIFY_DEVICEID4_CHANGE | 1 | RFC | N | 1 | | | NOTIFY_DEVICEID4_CHANGE | 1 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------+-------+----------+-----+----------------+ | +-------------------------+-------+----------+-----+----------------+ | |||
| NOTIFY_DEVICEID4_DELETE | 2 | RFC | N | 1 | | | NOTIFY_DEVICEID4_DELETE | 2 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------+-------+----------+-----+----------------+ | +-------------------------+-------+----------+-----+----------------+ | |||
Table 25: Initial Device ID Notification Assignments | Table 25: Initial Device ID Notification Assignments | |||
22.3.2. Updating Registrations | 22.3.2. Updating Registrations | |||
The update of a registration will require IESG Approval on the advice | The update of a registration will require IESG Approval on the advice | |||
of a Designated Expert. | of a Designated Expert. | |||
22.4. Object Recall Types | 22.4. Object Recall Types | |||
skipping to change at line 30448 ¶ | skipping to change at line 30453 ¶ | |||
22.4.1. Initial Registry | 22.4.1. Initial Registry | |||
The initial registry is in Table 26. Note that the next available | The initial registry is in Table 26. Note that the next available | |||
value is five. | value is five. | |||
+===============================+=======+======+=====+==========+ | +===============================+=======+======+=====+==========+ | |||
| Recallable Object Type Name | Value | RFC | How | Minor | | | Recallable Object Type Name | Value | RFC | How | Minor | | |||
| | | | | Versions | | | | | | | Versions | | |||
+===============================+=======+======+=====+==========+ | +===============================+=======+======+=====+==========+ | |||
| RCA4_TYPE_MASK_RDATA_DLG | 0 | RFC | N | 1 | | | RCA4_TYPE_MASK_RDATA_DLG | 0 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_WDATA_DLG | 1 | RFC | N | 1 | | | RCA4_TYPE_MASK_WDATA_DLG | 1 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_DIR_DLG | 2 | RFC | N | 1 | | | RCA4_TYPE_MASK_DIR_DLG | 2 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFC | N | 1 | | | RCA4_TYPE_MASK_FILE_LAYOUT | 3 | RFC | N | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFC | L | 1 | | | RCA4_TYPE_MASK_BLK_LAYOUT | 4 | RFC | L | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFC | L | 1 | | | RCA4_TYPE_MASK_OBJ_LAYOUT_MIN | 8 | RFC | L | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
| RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFC | L | 1 | | | RCA4_TYPE_MASK_OBJ_LAYOUT_MAX | 9 | RFC | L | 1 | | |||
| | | 5661 | | | | | | | 8881 | | | | |||
+-------------------------------+-------+------+-----+----------+ | +-------------------------------+-------+------+-----+----------+ | |||
Table 26: Initial Recallable Object Type Assignments | Table 26: Initial Recallable Object Type Assignments | |||
22.4.2. Updating Registrations | 22.4.2. Updating Registrations | |||
The update of a registration will require IESG Approval on the advice | The update of a registration will require IESG Approval on the advice | |||
of a Designated Expert. | of a Designated Expert. | |||
22.5. Layout Types | 22.5. Layout Types | |||
skipping to change at line 30527 ¶ | skipping to change at line 30532 ¶ | |||
minor version of NFSv4 approved, a Designated Expert should | minor version of NFSv4 approved, a Designated Expert should | |||
review the registry to make recommended updates as needed. | review the registry to make recommended updates as needed. | |||
22.5.1. Initial Registry | 22.5.1. Initial Registry | |||
The initial registry is in Table 27. | The initial registry is in Table 27. | |||
+=======================+=======+==========+=====+================+ | +=======================+=======+==========+=====+================+ | |||
| Layout Type Name | Value | RFC | How | Minor Versions | | | Layout Type Name | Value | RFC | How | Minor Versions | | |||
+=======================+=======+==========+=====+================+ | +=======================+=======+==========+=====+================+ | |||
| LAYOUT4_NFSV4_1_FILES | 0x1 | RFC 5661 | N | 1 | | | LAYOUT4_NFSV4_1_FILES | 0x1 | RFC 8881 | N | 1 | | |||
+-----------------------+-------+----------+-----+----------------+ | +-----------------------+-------+----------+-----+----------------+ | |||
| LAYOUT4_OSD2_OBJECTS | 0x2 | RFC 5664 | L | 1 | | | LAYOUT4_OSD2_OBJECTS | 0x2 | RFC 5664 | L | 1 | | |||
+-----------------------+-------+----------+-----+----------------+ | +-----------------------+-------+----------+-----+----------------+ | |||
| LAYOUT4_BLOCK_VOLUME | 0x3 | RFC 5663 | L | 1 | | | LAYOUT4_BLOCK_VOLUME | 0x3 | RFC 5663 | L | 1 | | |||
+-----------------------+-------+----------+-----+----------------+ | +-----------------------+-------+----------+-----+----------------+ | |||
Table 27: Initial Layout Type Assignments | Table 27: Initial Layout Type Assignments | |||
22.5.2. Updating Registrations | 22.5.2. Updating Registrations | |||
skipping to change at line 30675 ¶ | skipping to change at line 30680 ¶ | |||
For assignments made on a Standards Action basis, the point of | For assignments made on a Standards Action basis, the point of | |||
contact is always IESG. | contact is always IESG. | |||
22.6.1.1.1. Initial Registry | 22.6.1.1.1. Initial Registry | |||
The initial registry is in Table 28. | The initial registry is in Table 28. | |||
+========================+==========+==================+ | +========================+==========+==================+ | |||
| Variable Name | RFC | Point of Contact | | | Variable Name | RFC | Point of Contact | | |||
+========================+==========+==================+ | +========================+==========+==================+ | |||
| ${ietf.org:CPU_ARCH} | RFC 5661 | IESG | | | ${ietf.org:CPU_ARCH} | RFC 8881 | IESG | | |||
+------------------------+----------+------------------+ | +------------------------+----------+------------------+ | |||
| ${ietf.org:OS_TYPE} | RFC 5661 | IESG | | | ${ietf.org:OS_TYPE} | RFC 8881 | IESG | | |||
+------------------------+----------+------------------+ | +------------------------+----------+------------------+ | |||
| ${ietf.org:OS_VERSION} | RFC 5661 | IESG | | | ${ietf.org:OS_VERSION} | RFC 8881 | IESG | | |||
+------------------------+----------+------------------+ | +------------------------+----------+------------------+ | |||
Table 28: Initial List of Path Variables | Table 28: Initial List of Path Variables | |||
IANA has created registries for the values of the variable names | IANA has created registries for the values of the variable names | |||
${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}. See Sections 22.6.2 and | ${ietf.org:CPU_ARCH} and ${ietf.org:OS_TYPE}. See Sections 22.6.2 and | |||
22.6.3. | 22.6.3. | |||
For the values of the variable ${ietf.org:OS_VERSION}, no registry is | For the values of the variable ${ietf.org:OS_VERSION}, no registry is | |||
needed as the specifics of the values of the variable will vary with | needed as the specifics of the values of the variable will vary with | |||
skipping to change at line 30792 ¶ | skipping to change at line 30797 ¶ | |||
1997, <https://www.rfc-editor.org/info/rfc2203>. | 1997, <https://www.rfc-editor.org/info/rfc2203>. | |||
[5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos | [5] Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos | |||
Version 5 Generic Security Service Application Program | Version 5 Generic Security Service Application Program | |||
Interface (GSS-API) Mechanism: Version 2", RFC 4121, | Interface (GSS-API) Mechanism: Version 2", RFC 4121, | |||
DOI 10.17487/RFC4121, July 2005, | DOI 10.17487/RFC4121, July 2005, | |||
<https://www.rfc-editor.org/info/rfc4121>. | <https://www.rfc-editor.org/info/rfc4121>. | |||
[6] The Open Group, "Section 3.191 of Chapter 3 of Base | [6] The Open Group, "Section 3.191 of Chapter 3 of Base | |||
Definitions of The Open Group Base Specifications Issue 6 | Definitions of The Open Group Base Specifications Issue 6 | |||
IEEE Std 1003.1, 2004 Edition, HTML Version | IEEE Std 1003.1, 2004 Edition, HTML Version", | |||
(www.opengroup.org), ISBN 1931624232", 2004. | ISBN 1931624232, 2004, <https://www.opengroup.org>. | |||
[7] Linn, J., "Generic Security Service Application Program | [7] Linn, J., "Generic Security Service Application Program | |||
Interface Version 2, Update 1", RFC 2743, | Interface Version 2, Update 1", RFC 2743, | |||
DOI 10.17487/RFC2743, January 2000, | DOI 10.17487/RFC2743, January 2000, | |||
<https://www.rfc-editor.org/info/rfc2743>. | <https://www.rfc-editor.org/info/rfc2743>. | |||
[8] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. | [8] Recio, R., Metzler, B., Culley, P., Hilland, J., and D. | |||
Garcia, "A Remote Direct Memory Access Protocol | Garcia, "A Remote Direct Memory Access Protocol | |||
Specification", RFC 5040, DOI 10.17487/RFC5040, October | Specification", RFC 5040, DOI 10.17487/RFC5040, October | |||
2007, <https://www.rfc-editor.org/info/rfc5040>. | 2007, <https://www.rfc-editor.org/info/rfc5040>. | |||
skipping to change at line 30817 ¶ | skipping to change at line 30822 ¶ | |||
<https://www.rfc-editor.org/info/rfc5403>. | <https://www.rfc-editor.org/info/rfc5403>. | |||
[10] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., | [10] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., | |||
"Network File System (NFS) Version 4 Minor Version 1 | "Network File System (NFS) Version 4 Minor Version 1 | |||
External Data Representation Standard (XDR) Description", | External Data Representation Standard (XDR) Description", | |||
RFC 5662, DOI 10.17487/RFC5662, January 2010, | RFC 5662, DOI 10.17487/RFC5662, January 2010, | |||
<https://www.rfc-editor.org/info/rfc5662>. | <https://www.rfc-editor.org/info/rfc5662>. | |||
[11] The Open Group, "Section 3.372 of Chapter 3 of Base | [11] The Open Group, "Section 3.372 of Chapter 3 of Base | |||
Definitions of The Open Group Base Specifications Issue 6 | Definitions of The Open Group Base Specifications Issue 6 | |||
IEEE Std 1003.1, 2004 Edition, HTML Version | IEEE Std 1003.1, 2004 Edition, HTML Version", | |||
(www.opengroup.org), ISBN 1931624232", 2004. | ISBN 1931624232, 2004, <https://www.opengroup.org>. | |||
[12] Eisler, M., "IANA Considerations for Remote Procedure Call | [12] Eisler, M., "IANA Considerations for Remote Procedure Call | |||
(RPC) Network Identifiers and Universal Address Formats", | (RPC) Network Identifiers and Universal Address Formats", | |||
RFC 5665, DOI 10.17487/RFC5665, January 2010, | RFC 5665, DOI 10.17487/RFC5665, January 2010, | |||
<https://www.rfc-editor.org/info/rfc5665>. | <https://www.rfc-editor.org/info/rfc5665>. | |||
[13] The Open Group, "Section 'read()' of System Interfaces of | [13] The Open Group, "Section 'read()' of System Interfaces of | |||
The Open Group Base Specifications Issue 6 IEEE Std | The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[14] The Open Group, "Section 'readdir()' of System Interfaces | [14] The Open Group, "Section 'readdir()' of System Interfaces | |||
of The Open Group Base Specifications Issue 6 IEEE Std | of The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[15] The Open Group, "Section 'write()' of System Interfaces of | [15] The Open Group, "Section 'write()' of System Interfaces of | |||
The Open Group Base Specifications Issue 6 IEEE Std | The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[16] Hoffman, P. and M. Blanchet, "Preparation of | [16] Hoffman, P. and M. Blanchet, "Preparation of | |||
Internationalized Strings ("stringprep")", RFC 3454, | Internationalized Strings ("stringprep")", RFC 3454, | |||
DOI 10.17487/RFC3454, December 2002, | DOI 10.17487/RFC3454, December 2002, | |||
<https://www.rfc-editor.org/info/rfc3454>. | <https://www.rfc-editor.org/info/rfc3454>. | |||
[17] The Open Group, "Section 'chmod()' of System Interfaces of | [17] The Open Group, "Section 'chmod()' of System Interfaces of | |||
The Open Group Base Specifications Issue 6 IEEE Std | The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[18] International Organization for Standardization, | [18] International Organization for Standardization, | |||
"Information Technology - Universal Multiple-octet coded | "Information Technology - Universal Multiple-octet coded | |||
Character Set (UCS) - Part 1: Architecture and Basic | Character Set (UCS) - Part 1: Architecture and Basic | |||
Multilingual Plane", ISO Standard 10646-1, May 1993. | Multilingual Plane", ISO Standard 10646-1, May 1993. | |||
[19] Alvestrand, H., "IETF Policy on Character Sets and | [19] Alvestrand, H., "IETF Policy on Character Sets and | |||
Languages", BCP 18, RFC 2277, DOI 10.17487/RFC2277, | Languages", BCP 18, RFC 2277, DOI 10.17487/RFC2277, | |||
January 1998, <https://www.rfc-editor.org/info/rfc2277>. | January 1998, <https://www.rfc-editor.org/info/rfc2277>. | |||
[20] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | [20] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | |||
Profile for Internationalized Domain Names (IDN)", | Profile for Internationalized Domain Names (IDN)", | |||
RFC 3491, DOI 10.17487/RFC3491, March 2003, | RFC 3491, DOI 10.17487/RFC3491, March 2003, | |||
<https://www.rfc-editor.org/info/rfc3491>. | <https://www.rfc-editor.org/info/rfc3491>. | |||
[21] The Open Group, "Section 'fcntl()' of System Interfaces of | [21] The Open Group, "Section 'fcntl()' of System Interfaces of | |||
The Open Group Base Specifications Issue 6 IEEE Std | The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[22] The Open Group, "Section 'fsync()' of System Interfaces of | [22] The Open Group, "Section 'fsync()' of System Interfaces of | |||
The Open Group Base Specifications Issue 6 IEEE Std | The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[23] The Open Group, "Section 'getpwnam()' of System Interfaces | [23] The Open Group, "Section 'getpwnam()' of System Interfaces | |||
of The Open Group Base Specifications Issue 6 IEEE Std | of The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[24] The Open Group, "Section 'unlink()' of System Interfaces | [24] The Open Group, "Section 'unlink()' of System Interfaces | |||
of The Open Group Base Specifications Issue 6 IEEE Std | of The Open Group Base Specifications Issue 6 IEEE Std | |||
1003.1, 2004 Edition, HTML Version (www.opengroup.org), | 1003.1, 2004 Edition, HTML Version", ISBN 1931624232, | |||
ISBN 1931624232", 2004. | 2004, <https://www.opengroup.org>. | |||
[25] Schaad, J., Kaliski, B., and R. Housley, "Additional | [25] Schaad, J., Kaliski, B., and R. Housley, "Additional | |||
Algorithms and Identifiers for RSA Cryptography for use in | Algorithms and Identifiers for RSA Cryptography for use in | |||
the Internet X.509 Public Key Infrastructure Certificate | the Internet X.509 Public Key Infrastructure Certificate | |||
and Certificate Revocation List (CRL) Profile", RFC 4055, | and Certificate Revocation List (CRL) Profile", RFC 4055, | |||
DOI 10.17487/RFC4055, June 2005, | DOI 10.17487/RFC4055, June 2005, | |||
<https://www.rfc-editor.org/info/rfc4055>. | <https://www.rfc-editor.org/info/rfc4055>. | |||
[26] National Institute of Standards and Technology, | [26] National Institute of Standards and Technology, | |||
"Cryptographic Algorithm Object Registration", URL | "Cryptographic Algorithm Object Registration", November | |||
http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/ | 2007, | |||
algorithms.html, November 2007. | <http://csrc.nist.gov/groups/ST/crypto_apps_infra/csor/ | |||
algorithms.html>. | ||||
[27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC) | [27] Adamson, A. and N. Williams, "Remote Procedure Call (RPC) | |||
Security Version 3", RFC 7861, DOI 10.17487/RFC7861, | Security Version 3", RFC 7861, DOI 10.17487/RFC7861, | |||
November 2016, <https://www.rfc-editor.org/info/rfc7861>. | November 2016, <https://www.rfc-editor.org/info/rfc7861>. | |||
[28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The | [28] Neuman, C., Yu, T., Hartman, S., and K. Raeburn, "The | |||
Kerberos Network Authentication Service (V5)", RFC 4120, | Kerberos Network Authentication Service (V5)", RFC 4120, | |||
DOI 10.17487/RFC4120, July 2005, | DOI 10.17487/RFC4120, July 2005, | |||
<https://www.rfc-editor.org/info/rfc4120>. | <https://www.rfc-editor.org/info/rfc4120>. | |||
skipping to change at line 30962 ¶ | skipping to change at line 30968 ¶ | |||
[38] Eisler, M., "LIPKEY - A Low Infrastructure Public Key | [38] Eisler, M., "LIPKEY - A Low Infrastructure Public Key | |||
Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847, | Mechanism Using SPKM", RFC 2847, DOI 10.17487/RFC2847, | |||
June 2000, <https://www.rfc-editor.org/info/rfc2847>. | June 2000, <https://www.rfc-editor.org/info/rfc2847>. | |||
[39] Eisler, M., "NFS Version 2 and Version 3 Security Issues | [39] Eisler, M., "NFS Version 2 and Version 3 Security Issues | |||
and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", | and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5", | |||
RFC 2623, DOI 10.17487/RFC2623, June 1999, | RFC 2623, DOI 10.17487/RFC2623, June 1999, | |||
<https://www.rfc-editor.org/info/rfc2623>. | <https://www.rfc-editor.org/info/rfc2623>. | |||
[40] Juszczak, C., "Improving the Performance and Correctness | [40] Juszczak, C., "Improving the Performance and Correctness | |||
of an NFS Server", USENIX Conference Proceedings , June | of an NFS Server", USENIX Conference Proceedings, June | |||
1990. | 1990. | |||
[41] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced | [41] Reynolds, J., Ed., "Assigned Numbers: RFC 1700 is Replaced | |||
by an On-line Database", RFC 3232, DOI 10.17487/RFC3232, | by an On-line Database", RFC 3232, DOI 10.17487/RFC3232, | |||
January 2002, <https://www.rfc-editor.org/info/rfc3232>. | January 2002, <https://www.rfc-editor.org/info/rfc3232>. | |||
[42] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", | [42] Srinivasan, R., "Binding Protocols for ONC RPC Version 2", | |||
RFC 1833, DOI 10.17487/RFC1833, August 1995, | RFC 1833, DOI 10.17487/RFC1833, August 1995, | |||
<https://www.rfc-editor.org/info/rfc1833>. | <https://www.rfc-editor.org/info/rfc1833>. | |||
[43] Werme, R., "RPC XID Issues", USENIX Conference | [43] Werme, R., "RPC XID Issues", USENIX Conference | |||
Proceedings , February 1996. | Proceedings, February 1996. | |||
[44] Nowicki, B., "NFS: Network File System Protocol | [44] Nowicki, B., "NFS: Network File System Protocol | |||
specification", RFC 1094, DOI 10.17487/RFC1094, March | specification", RFC 1094, DOI 10.17487/RFC1094, March | |||
1989, <https://www.rfc-editor.org/info/rfc1094>. | 1989, <https://www.rfc-editor.org/info/rfc1094>. | |||
[45] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly | [45] Bhide, A., Elnozahy, E. N., and S. P. Morgan, "A Highly | |||
Available Network Server", USENIX Conference Proceedings , | Available Network Server", USENIX Conference Proceedings, | |||
January 1991. | January 1991. | |||
[46] Halevy, B., Welch, B., and J. Zelenka, "Object-Based | [46] Halevy, B., Welch, B., and J. Zelenka, "Object-Based | |||
Parallel NFS (pNFS) Operations", RFC 5664, | Parallel NFS (pNFS) Operations", RFC 5664, | |||
DOI 10.17487/RFC5664, January 2010, | DOI 10.17487/RFC5664, January 2010, | |||
<https://www.rfc-editor.org/info/rfc5664>. | <https://www.rfc-editor.org/info/rfc5664>. | |||
[47] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS | [47] Black, D., Fridella, S., and J. Glasgow, "Parallel NFS | |||
(pNFS) Block/Volume Layout", RFC 5663, | (pNFS) Block/Volume Layout", RFC 5663, | |||
DOI 10.17487/RFC5663, January 2010, | DOI 10.17487/RFC5663, January 2010, | |||
skipping to change at line 31003 | skipping to change at line 31009 | |||
[48] Callaghan, B., "WebNFS Client Specification", RFC 2054, | [48] Callaghan, B., "WebNFS Client Specification", RFC 2054, | |||
DOI 10.17487/RFC2054, October 1996, | DOI 10.17487/RFC2054, October 1996, | |||
<https://www.rfc-editor.org/info/rfc2054>. | <https://www.rfc-editor.org/info/rfc2054>. | |||
[49] Callaghan, B., "WebNFS Server Specification", RFC 2055, | [49] Callaghan, B., "WebNFS Server Specification", RFC 2055, | |||
DOI 10.17487/RFC2055, October 1996, | DOI 10.17487/RFC2055, October 1996, | |||
<https://www.rfc-editor.org/info/rfc2055>. | <https://www.rfc-editor.org/info/rfc2055>. | |||
[50] IESG, "IESG Processing of RFC Errata for the IETF Stream", | [50] IESG, "IESG Processing of RFC Errata for the IETF Stream", | |||
July 2008, <http://www.ietf.org/IESG/STATEMENTS/iesg- | July 2008, | |||
statement-07-30-2008.txt>. | <https://www.ietf.org/about/groups/iesg/statements/ | |||
processing-rfc-errata/>. | ||||
[51] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- | [51] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- | |||
Hashing for Message Authentication", RFC 2104, | Hashing for Message Authentication", RFC 2104, | |||
DOI 10.17487/RFC2104, February 1997, | DOI 10.17487/RFC2104, February 1997, | |||
<https://www.rfc-editor.org/info/rfc2104>. | <https://www.rfc-editor.org/info/rfc2104>. | |||
[52] Shepler, S., "NFS Version 4 Design Considerations", | [52] Shepler, S., "NFS Version 4 Design Considerations", | |||
RFC 2624, DOI 10.17487/RFC2624, June 1999, | RFC 2624, DOI 10.17487/RFC2624, June 1999, | |||
<https://www.rfc-editor.org/info/rfc2624>. | <https://www.rfc-editor.org/info/rfc2624>. | |||
[53] The Open Group, "Protocols for Interworking: XNFS, Version | [53] The Open Group, "Protocols for Interworking: XNFS, Version | |||
3W, ISBN 1-85912-184-5", February 1998. | 3W", ISBN 1-85912-184-5, February 1998. | |||
[54] Floyd, S. and V. Jacobson, "The Synchronization of | [54] Floyd, S. and V. Jacobson, "The Synchronization of | |||
Periodic Routing Messages", IEEE/ACM Transactions on | Periodic Routing Messages", IEEE/ACM Transactions on | |||
Networking 2(2), pp. 122-136, April 1994. | Networking, 2(2), pp. 122-136, April 1994. | |||
[55] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., | [55] Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., | |||
and E. Zeidner, "Internet Small Computer Systems Interface | and E. Zeidner, "Internet Small Computer Systems Interface | |||
(iSCSI)", RFC 3720, DOI 10.17487/RFC3720, April 2004, | (iSCSI)", RFC 3720, DOI 10.17487/RFC3720, April 2004, | |||
<https://www.rfc-editor.org/info/rfc3720>. | <https://www.rfc-editor.org/info/rfc3720>. | |||
[56] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version | [56] Snively, R., "Fibre Channel Protocol for SCSI, 2nd Version | |||
(FCP-2)", ANSI/INCITS 350-2003, October 2003. | (FCP-2)", ANSI/INCITS, 350-2003, October 2003. | |||
[57] Weber, R.O., "Object-Based Storage Device Commands (OSD)", | [57] Weber, R.O., "Object-Based Storage Device Commands (OSD)", | |||
ANSI/INCITS 400-2004, July 2004, | ANSI/INCITS, 400-2004, July 2004, | |||
<http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. | <http://www.t10.org/ftp/t10/drafts/osd/osd-r10.pdf>. | |||
[58] Carns, P. H., Ligon III, W. B., Ross, R. B., and R. | [58] Carns, P. H., Ligon III, W. B., Ross, R. B., and R. | |||
Thakur, "PVFS: A Parallel File System for Linux | Thakur, "PVFS: A Parallel File System for Linux | |||
Clusters.", Proceedings of the 4th Annual Linux Showcase | Clusters.", Proceedings of the 4th Annual Linux Showcase | |||
and Conference , 2000. | and Conference, 2000. | |||
[59] The Open Group, "The Open Group Base Specifications Issue | [59] The Open Group, "The Open Group Base Specifications Issue | |||
6, IEEE Std 1003.1, 2004 Edition", 2004. | 6, IEEE Std 1003.1, 2004 Edition", 2004, | |||
<https://www.opengroup.org>. | ||||
[60] Callaghan, B., "NFS URL Scheme", RFC 2224, | [60] Callaghan, B., "NFS URL Scheme", RFC 2224, | |||
DOI 10.17487/RFC2224, October 1997, | DOI 10.17487/RFC2224, October 1997, | |||
<https://www.rfc-editor.org/info/rfc2224>. | <https://www.rfc-editor.org/info/rfc2224>. | |||
[61] Chiu, A., Eisler, M., and B. Callaghan, "Security | [61] Chiu, A., Eisler, M., and B. Callaghan, "Security | |||
Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755, | Negotiation for WebNFS", RFC 2755, DOI 10.17487/RFC2755, | |||
January 2000, <https://www.rfc-editor.org/info/rfc2755>. | January 2000, <https://www.rfc-editor.org/info/rfc2755>. | |||
[62] Narten, T. and H. Alvestrand, "Guidelines for Writing an | [62] Narten, T. and H. Alvestrand, "Guidelines for Writing an | |||
IANA Considerations Section in RFCs", RFC 5226, | IANA Considerations Section in RFCs", RFC 5226, | |||
DOI 10.17487/RFC5226, May 2008, | DOI 10.17487/RFC5226, May 2008, | |||
<https://www.rfc-editor.org/info/rfc5226>. | <https://www.rfc-editor.org/info/rfc5226>. | |||
[63] Eisler, M., "Errata 2006 for RFC 5661", January 2010, | [63] RFC Errata, Erratum ID 2006, RFC 5661, | |||
<https://www.rfc-editor.org/errata_search.php?eid=2006>. | <https://www.rfc-editor.org/errata/eid2006>. | |||
[64] Spasojevic, M. and M. Satayanarayanan, "An Empirical Study | [64] Spasojevic, M. and M. Satayanarayanan, "An Empirical Study | |||
of a Wide-Area Distributed File System", May 1996, | of a Wide-Area Distributed File System", May 1996, | |||
<https://www.cs.cmu.edu/~satya/docdir/spasojevic-tocs-afs- | <https://www.cs.cmu.edu/~satya/docdir/spasojevic-tocs-afs- | |||
measurement-1996.pdf>. | measurement-1996.pdf>. | |||
[65] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., | [65] Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., | |||
"Network File System (NFS) Version 4 Minor Version 1 | "Network File System (NFS) Version 4 Minor Version 1 | |||
Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, | Protocol", RFC 5661, DOI 10.17487/RFC5661, January 2010, | |||
<https://www.rfc-editor.org/info/rfc5661>. | <https://www.rfc-editor.org/info/rfc5661>. | |||
skipping to change at line 31094 | skipping to change at line 31102 | |||
[70] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an | [70] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an | |||
Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May | Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May | |||
2014, <https://www.rfc-editor.org/info/rfc7258>. | 2014, <https://www.rfc-editor.org/info/rfc7258>. | |||
[71] Rescorla, E. and B. Korver, "Guidelines for Writing RFC | [71] Rescorla, E. and B. Korver, "Guidelines for Writing RFC | |||
Text on Security Considerations", BCP 72, RFC 3552, | Text on Security Considerations", BCP 72, RFC 3552, | |||
DOI 10.17487/RFC3552, July 2003, | DOI 10.17487/RFC3552, July 2003, | |||
<https://www.rfc-editor.org/info/rfc3552>. | <https://www.rfc-editor.org/info/rfc3552>. | |||
Appendix A. Need for this Update | Appendix A. The Need for This Update | |||
This document includes an explanation of how clients and servers are | This document includes an explanation of how clients and servers are | |||
to determine the particular network access paths to be used to access | to determine the particular network access paths to be used to access | |||
a file system. This includes describing how changes to the specific | a file system. This includes descriptions of how to handle changes | |||
replica to be used or to the set of addresses to be used to access it | to the specific replica to be used or to the set of addresses to be | |||
are to be dealt with, and how transfers of responsibility that need | used to access it, and how to deal transparently with transfers of | |||
to be made can be dealt with transparently. This includes cases in | responsibility that need to be made. This includes cases in which | |||
which there is a shift between one replica and another and those in | there is a shift between one replica and another and those in which | |||
which different network access paths are used to access the same | different network access paths are used to access the same replica. | |||
replica. | ||||
As a result of the following problems in RFC5661 [65], it is | As a result of the following problems in RFC 5661 [65], it was | |||
necessary to provide the specific updates which are made by this | necessary to provide the specific updates that are made by this | |||
document. These updates are described in Appendix B | document. These updates are described in Appendix B. | |||
* RFC5661 [65], while it dealt with situations in which various | * RFC 5661 [65], while it dealt with situations in which various | |||
forms of clustering allowed co-ordination of the state assigned by | forms of clustering allowed coordination of the state assigned by | |||
co-operating servers to be used, made no provisions for | cooperating servers to be used, made no provisions for Transparent | |||
Transparent State Migration. Within NFSv4.0, Transparent | State Migration. Within NFSv4.0, Transparent State Migration was | |||
Migration was first explained clearly in RFC7530 [67] and | first explained clearly in RFC 7530 [67] and corrected and | |||
corrected and clarified by RFC7931 [68]. No corresponding | clarified by RFC 7931 [68]. No corresponding explanation for | |||
explanation for NFSv4.1 had been provided. | NFSv4.1 had been provided. | |||
* Although NFSv4.1 was defined with a clear definition of how | * Although NFSv4.1 provided a clear definition of how trunking | |||
trunking detection was to be done, there was no clear | detection was to be done, there was no clear specification of how | |||
specification of how trunking discovery was to be done, despite | trunking discovery was to be done, despite the fact that the | |||
the fact that the specification clearly indicated that this | specification clearly indicated that this information could be | |||
information could be made available via the file system location | made available via the file system location attributes. | |||
attributes. | ||||
* Because the existence of multiple network access paths to the same | * Because the existence of multiple network access paths to the same | |||
file system was dealt with as if there were multiple replicas, | file system was dealt with as if there were multiple replicas, | |||
issues relating to transitions between replicas could never be | issues relating to transitions between replicas could never be | |||
clearly distinguished from trunking-related transitions between | clearly distinguished from trunking-related transitions between | |||
the addresses used to access a particular file system instance. | the addresses used to access a particular file system instance. | |||
As a result, in situations in which both migration and trunking | As a result, in situations in which both migration and trunking | |||
configuration changes were involved, neither of these could be | configuration changes were involved, neither of these could be | |||
clearly dealt with and the relationship between these two features | clearly dealt with, and the relationship between these two | |||
was not seriously addressed. | features was not seriously addressed. | |||
* Because use of two network access paths to the same file system | * Because use of two network access paths to the same file system | |||
instance (i.e. trunking) was often treated as if two replicas were | instance (i.e., trunking) was often treated as if two replicas | |||
involved, it was considered that two replicas were being used | were involved, it was considered that two replicas were being used | |||
simultaneously. As a result, the treatment of replicas being used | simultaneously. As a result, the treatment of replicas being used | |||
simultaneously in RFC5661 [65] was not clear as it covered the two | simultaneously in RFC 5661 [65] was not clear, as it covered the | |||
distinct cases of a single file system instance being accessed by | two distinct cases of a single file system instance being accessed | |||
two different network access paths and two replicas being accessed | by two different network access paths and two replicas being | |||
simultaneously, with the limitations of the latter case not being | accessed simultaneously, with the limitations of the latter case | |||
clearly laid out. | not being clearly laid out. | |||
The majority of the consequences of these issues are dealt with by | The majority of the consequences of these issues are dealt with by | |||
presenting in Section 11 a replacement for Section 11 of RFC 5661 | presenting in Section 11 a replacement for Section 11 of RFC 5661 | |||
[65]. This replacement modifies existing sub-sections within that | [65]. This replacement modifies existing subsections within that | |||
section and adds new ones, as described in Appendix B.1. Also, some | section and adds new ones as described in Appendix B.1. Also, some | |||
existing sections are deleted. These changes were made in order to: | existing sections were deleted. These changes were made in order to | |||
do the following: | ||||
* Reorganize the description so that the case of two network access | * Reorganize the description so that the case of two network access | |||
paths to the same file system instance needs to be distinguished | paths to the same file system instance is distinguished clearly | |||
clearly from the case of two different replicas since, in the | from the case of two different replicas since, in the former case, | |||
former case, locking state is shared and there also can be sharing | locking state is shared and there also can be sharing of session | |||
of session state. | state. | |||
* Provide a clear statement regarding the desirability of | * Provide a clear statement regarding the desirability of | |||
transparent transfer of state between replicas together with a | transparent transfer of state between replicas together with a | |||
recommendation that either that or a single-fs grace period be | recommendation that either transparent transfer or a single-fs | |||
provided. | grace period be provided. | |||
* Specifically delineate how such transfers are to be dealt with by | * Specifically delineate how a client is to handle such transfers, | |||
the client, taking into account the differences from the treatment | taking into account the differences from the treatment in [68] | |||
in [68] made necessary by the major protocol changes made in | made necessary by the major protocol changes to NFSv4.1. | |||
NFSv4.1. | ||||
* Provide discussion of the relationship between transparent state | * Discuss the relationship between transparent state transfer and | |||
transfer and Parallel NFS (pNFS). | Parallel NFS (pNFS). | |||
* Provide clarification of the fs_locations_info attribute in order | * Clarify the fs_locations_info attribute in order to specify which | |||
to specify which portions of the information provided apply to a | portions of the provided information apply to a specific network | |||
specific network access path and which to the replica which that | access path and which apply to the replica that the path is used | |||
path is used to access. | to access. | |||
In addition, there are also updates to other sections of RFC5661 | In addition, other sections of RFC 5661 [65] were updated to correct | |||
[65], where the consequences of the incorrect assumptions underlying | the consequences of the incorrect assumptions underlying the | |||
the current treatment of multi-server namespace issues also needed to | treatment of multi-server namespace issues. These are described in | |||
be corrected. These are to be dealt with as described in Appendices | Appendices B.2 through B.4. | |||
B.2 through B.4. | ||||
* A revised introductory section regarding multi-server namespace | * A revised introductory section regarding multi-server namespace | |||
facilities is provided. | facilities is provided. | |||
* A more realistic treatment of server scope is provided, which | * A more realistic treatment of server scope is provided. This | |||
reflects the more limited co-ordination of locking state adopted | treatment reflects the more limited coordination of locking state | |||
by servers actually sharing a common server scope. | adopted by servers actually sharing a common server scope. | |||
* Some confusing text regarding changes in server_owner has been | * Some confusing text regarding changes in server_owner has been | |||
clarified. | clarified. | |||
* The description of some existing errors has been modified to more | * The description of some existing errors has been modified to more | |||
clearly explain certain errors situations to reflect the existence | clearly explain certain error situations to reflect the existence | |||
of trunking and the possible use of fs-specific grace periods. | of trunking and the possible use of fs-specific grace periods. | |||
For details, see Appendix B.3. | For details, see Appendix B.3. | |||
* New descriptions of certain existing operations are provided, | * New descriptions of certain existing operations are provided, | |||
either because the existing treatment did not account for | either because the existing treatment did not account for | |||
situations that would arise in dealing with transparent state | situations that would arise in dealing with Transparent State | |||
migration, or because some types of reclaim issues were not | Migration, or because some types of reclaim issues were not | |||
adequately dealt with in the context of fs-specific grace periods. | adequately dealt with in the context of fs-specific grace periods. | |||
For details, see Appendix B.2. | For details, see Appendix B.2. | |||
Appendix B. Changes in this Update | Appendix B. Changes in This Update | |||
B.1. Revisions Made to Section 11 of RFC5661 | B.1. Revisions Made to Section 11 of RFC 5661 | |||
A number of areas needed to be revised or extended, in many case | A number of areas have been revised or extended, in many cases | |||
replacing existing sub-sections within Section 11 of RFC 5661 [65]: | replacing subsections within Section 11 of RFC 5661 [65]: | |||
* New introductory material, including a terminology section, | * New introductory material, including a terminology section, | |||
replaces the existing material in RFC5661 [65] ranging from the | replaces the material in RFC 5661 [65], ranging from the start of | |||
start of the existing Section 11 up to and including the existing | the original Section 11 up to and including Section 11.1. The new | |||
Section 11.1. The new material starts at the beginning of | material starts at the beginning of Section 11 and continues | |||
Section 11 and continues through 11.2 below. | through 11.2. | |||
* A significant reorganization of the material in the existing | * A significant reorganization of the material in Sections 11.4 and | |||
Sections 11.4 and 11.5 of RFC 5661 [65]) is necessary. The | 11.5 of RFC 5661 [65] was necessary. The reasons for the | |||
reasons for the reorganization of these sections into a single | reorganization of these sections into a single section with | |||
section with multiple subsections are discussed in Appendix B.1.1 | multiple subsections are discussed in Appendix B.1.1 below. This | |||
below. This replacement appears as Section 11.5 below. | replacement appears as Section 11.5. | |||
New material relating to the handling of the file system location | New material relating to the handling of the file system location | |||
attributes is contained in Sections 11.5.1 and 11.5.7 below. | attributes is contained in Sections 11.5.1 and 11.5.7. | |||
* A new section describing requirements for user and group handling | * A new section describing requirements for user and group handling | |||
within a multi-server namespace has been added as Section 11.7 | within a multi-server namespace has been added as Section 11.7. | |||
* A major replacement for the existing Section 11.7 of RFC 5661 [65] | * A major replacement for Section 11.7 of RFC 5661 [65], entitled | |||
entitled "Effecting File System Transitions", will appear as | "Effecting File System Transitions", appears as Sections 11.9 | |||
Sections 11.9 through 11.14. The reasons for the reorganization | through 11.14. The reasons for the reorganization of this section | |||
of this section into multiple sections are discussed in | into multiple sections are discussed in Appendix B.1.2. | |||
Appendix B.1.2. | ||||
* A replacement for the existing Section 11.10 of RFC 5661 [65] | * A replacement for Section 11.10 of RFC 5661 [65], entitled "The | |||
entitled "The Attribute fs_locations_info", will appear as | Attribute fs_locations_info", appears as Section 11.17, with | |||
Section 11.17, with Appendix B.1.3 describing the differences | Appendix B.1.3 describing the differences between the new section | |||
between the new section and the treatment within [65]. A revised | and the treatment within [65]. A revised treatment was necessary | |||
treatment is necessary because the existing treatment did not make | because the original treatment did not make clear how the added | |||
clear how the added attribute information relates to the case of | attribute information relates to the case of trunked paths to the | |||
trunked paths to the same replica. These issues were not | same replica. These issues were not addressed in RFC 5661 [65] | |||
addressed in RFC5661 [65] where the concepts of a replica and a | where the concepts of a replica and a network path used to access | |||
network path used to access a replica were not clearly | a replica were not clearly distinguished. | |||
distinguished. | ||||
B.1.1. Re-organization of Sections 11.4 and 11.5 of RFC5661 | B.1.1. Reorganization of Sections 11.4 and 11.5 of RFC 5661 | |||
Previously, issues related to the fact that multiple location entries | Previously, issues related to the fact that multiple location entries | |||
directed the client to the same file system instance were dealt with | directed the client to the same file system instance were dealt with | |||
in a separate Section 11.5 of RFC 5661 [65]. Because of the new | in Section 11.5 of RFC 5661 [65]. Because of the new treatment of | |||
treatment of trunking, these issues now belong within Section 11.5 | trunking, these issues now belong within Section 11.5. | |||
below. | ||||
In this new section, trunking is dealt with in Section 11.5.2 | In this new section, trunking is covered in Section 11.5.2 together | |||
together with the other uses of file system location information | with the other uses of file system location information described in | |||
described in Sections 11.5.3 through 11.5.6. | Sections 11.5.3 through 11.5.6. | |||
As a result, Section 11.5 which will replace Section 11.4 of RFC 5661 | As a result, Section 11.5, which replaces Section 11.4 of RFC 5661 | |||
[65] is substantially different than the section it replaces in that | [65], is substantially different than the section it replaces in that | |||
some existing sections will be replaced by corresponding sections | some original sections have been replaced by corresponding sections | |||
below while, at the same time, new sections will be added, resulting | as described below, while new sections have been added: | |||
in a replacement containing some renumbered sections, as follows: | ||||
* The material in Section 11.5, exclusive of subsections, replaces | * The material in Section 11.5, exclusive of subsections, replaces | |||
the material in Section 11.4 of RFC 5661 [65] exclusive of | the material in Section 11.4 of RFC 5661 [65] exclusive of | |||
subsections. | subsections. | |||
* Section 11.5.1 is a new first subsection of the overall section. | * Section 11.5.1 is the new first subsection of the overall section. | |||
* Section 11.5.2 is a new second subsection of the overall section. | * Section 11.5.2 is the new second subsection of the overall | |||
section. | ||||
* Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in | * Each of the Sections 11.5.4, 11.5.5, and 11.5.6 replaces (in | |||
order) one of the corresponding Sections 11.4.1, 11.4.2, and | order) one of the corresponding Sections 11.4.1, 11.4.2, and | |||
11.4.3 of RFC 5661 [65]. 11.4.4, and 11.4.5. | 11.4.3 of RFC 5661 [65]. 11.4.4, and 11.4.5. | |||
* Section 11.5.7 is a new final subsection of the overall section. | * Section 11.5.7 is the new final subsection of the overall section. | |||
B.1.2. Re-organization of Material Dealing with File System Transitions | B.1.2. Reorganization of Material Dealing with File System Transitions | |||
The material relating to file system transition, previously contained | The material relating to file system transition, previously contained | |||
in Section 11.7 of RFC 5661 [65] has been reorganized and augmented | in Section 11.7 of RFC 5661 [65] has been reorganized and augmented | |||
as described below: | as described below: | |||
* Because there can be a shift of the network access paths used to | * Because there can be a shift of the network access paths used to | |||
access a file system instance without any shift between replicas, | access a file system instance without any shift between replicas, | |||
a new Section 11.9 distinguishes between those cases in which | a new Section 11.9 distinguishes between those cases in which | |||
there is a shift between distinct replicas and those involving a | there is a shift between distinct replicas and those involving a | |||
shift in network access paths with no shift between replicas. | shift in network access paths with no shift between replicas. | |||
As a result, a new Section 11.10 deals with network address | As a result, the new Section 11.10 deals with network address | |||
transitions while the bulk of the former Section 11.7 of RFC 5661 | transitions, while the bulk of the original Section 11.7 of RFC | |||
[65] is extensively modified as reflected in Section 11.11 which | 5661 [65] has been extensively modified as reflected in | |||
is now limited to cases in which there is a shift between two | Section 11.11, which is now limited to cases in which there is a | |||
different sets of replicas. | shift between two different sets of replicas. | |||
* The additional Section 11.12 discusses the case in which a shift | * The additional Section 11.12 discusses the case in which a shift | |||
to a different replica is made and state is transferred to allow | to a different replica is made and state is transferred to allow | |||
the client the ability to have continued access to its accumulated | the client the ability to have continued access to its accumulated | |||
locking state on the new server. | locking state on the new server. | |||
* The additional Section 11.13 discusses the client's response to | * The additional Section 11.13 discusses the client's response to | |||
access transitions and how it determines whether migration has | access transitions, how it determines whether migration has | |||
occurred, and how it gets access to any transferred locking and | occurred, and how it gets access to any transferred locking and | |||
session state. | session state. | |||
* The additional Section 11.14 discusses the responsibilities of the | * The additional Section 11.14 discusses the responsibilities of the | |||
source and destination servers when transferring locking and | source and destination servers when transferring locking and | |||
session state. | session state. | |||
This re-organization has caused a renumbering of the sections within | This reorganization has caused a renumbering of the sections within | |||
Section 11 of [65] as described below: | Section 11 of [65] as described below: | |||
* The new Sections 11.9 and 11.10 have resulted in existing sections | * The new Sections 11.9 and 11.10 have resulted in the renumbering | |||
with these numbers to be renumbered. | of existing sections with these numbers. | |||
* Section 11.7 of [65] will be substantially modified and appear as | * Section 11.7 of [65] has been substantially modified and appears | |||
Section 11.11. The necessary modifications reflect the fact that | as Section 11.11. The necessary modifications reflect the fact | |||
this section will only deal with transitions between replicas | that this section only deals with transitions between replicas, | |||
while transitions between network addresses are dealt with in | while transitions between network addresses are dealt with in | |||
other sections. Details of the reorganization are described later | other sections. Details of the reorganization are described later | |||
in this section. | in this section. | |||
* The additional Sections 11.12, 11.13, and 11.14 have been added. | * Sections 11.12, 11.13, and 11.14 have been added. | |||
* Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [65] now | * Consequently, Sections 11.8, 11.9, 11.10, and 11.11 in [65] now | |||
appear as Sections 11.13, 11.14, 11.15, and 11.16, respectively. | appear as Sections 11.15, 11.16, 11.17, and 11.18, respectively. | |||
As part of this general re-organization, Section 11.7 of RFC 5661 | As part of this general reorganization, Section 11.7 of RFC 5661 [65] | |||
[65] will be modified as described below: | has been modified as described below: | |||
* Sections 11.7 and 11.7.1 of RFC 5661 [65] are to be replaced by | * Sections 11.7 and 11.7.1 of RFC 5661 [65] have been replaced by | |||
Sections 11.11 and 11.11.1, respectively. | Sections 11.11 and 11.11.1, respectively. | |||
* Section 11.7.2 of RFC 5661 (and included subsections) are to be | * Section 11.7.2 of RFC 5661 (and included subsections) has been | |||
deleted. | deleted. | |||
* Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661 | * Sections 11.7.3, 11.7.4, 11.7.5, 11.7.5.1, and 11.7.6 of RFC 5661 | |||
[65] are to be replaced by Sections 11.11.2, 11.11.3, 11.11.4, | [65] have been replaced by Sections 11.11.2, 11.11.3, 11.11.4, | |||
11.11.4.1, and 11.11.5 respectively in this document. | 11.11.4.1, and 11.11.5 respectively in this document. | |||
* Section 11.7.7 of RFC 5661 [65] is to be replaced by | * Section 11.7.7 of RFC 5661 [65] has been replaced by | |||
Section 11.11.9 This sub-section has been moved to the end of the | Section 11.11.9. This subsection has been moved to the end of the | |||
section dealing with file system transitions. | section dealing with file system transitions. | |||
* Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [65] are to be | * Sections 11.7.8, 11.7.9, and 11.7.10 of RFC 5661 [65] have been | |||
replaced by Sections 11.11.6, 11.11.7, and 11.11.8 respectively in | replaced by Sections 11.11.6, 11.11.7, and 11.11.8 respectively in | |||
this document. | this document. | |||
B.1.3. Updates to treatment of fs_locations_info | B.1.3. Updates to the Treatment of fs_locations_info | |||
Various elements of the fs_locations_info attribute contain | Various elements of the fs_locations_info attribute contain | |||
information that applies to either a specific file system replica or | information that applies to either a specific file system replica or | |||
to a network path or set of network paths used to access such a | to a network path or set of network paths used to access such a | |||
replica. The existing treatment of fs_locations info (Section 11.10 | replica. The original treatment of fs_locations_info (Section 11.10 | |||
of RFC 5661 [65]) does not clearly distinguish these cases, in part | of RFC 5661 [65]) did not clearly distinguish these cases, in part | |||
because the document did not clearly distinguish replicas from the | because the document did not clearly distinguish replicas from the | |||
paths used to access them. | paths used to access them. | |||
In addition, special clarification needed to be provided with regard | In addition, special clarification has been provided with regard to | |||
to the following fields: | the following fields: | |||
* With regard to the handling of FSLI4GF_GOING, it needs to be made | * With regard to the handling of FSLI4GF_GOING, it was clarified | |||
clear that this only applies to the unavailability of a replica | that this only applies to the unavailability of a replica rather | |||
rather than to a path to access a replica. | than to a path to access a replica. | |||
* In describing the appropriate value for a server to use for | * In describing the appropriate value for a server to use for | |||
fli_valid_for, it needs to be made clear that there is no need for | fli_valid_for, it was clarified that there is no need for the | |||
the client to frequently fetch the fs_locations_info value to be | client to frequently fetch the fs_locations_info value to be | |||
prepared for shifts in trunking patterns. | prepared for shifts in trunking patterns. | |||
* Clarification of the rules for extensions to the fls_info needs to | * Clarification of the rules for extensions to the fls_info has been | |||
be provided. The existing treatment reflects the extension model | provided. The original treatment reflected the extension model | |||
in effect at the time RFC5661 [65] was written, and needed to be | that was in effect at the time RFC 5661 [65] was written, but has | |||
updated in accordance with the extension model described in | been updated in accordance with the extension model described in | |||
RFC8178 [66]. | RFC 8178 [66]. | |||
B.2. Revisions Made to Operations in RFC5661 | B.2. Revisions Made to Operations in RFC 5661 | |||
Revised descriptions were needed to address issues that arose in | Descriptions have been revised to address issues that arose in | |||
effecting necessary changes to multi-server namespace features. | effecting necessary changes to multi-server namespace features. | |||
* The existing treatment of EXCHANGE_ID (Section 13.35 of RFC 5661 | * The treatment of EXCHANGE_ID (Section 18.35 of RFC 5661 [65]) | |||
[65]) assumes that client IDs cannot be created/ confirmed other | assumed that client IDs cannot be created/confirmed other than by | |||
than by the EXCHANGE_ID and CREATE_SESSION operations. Also, the | the EXCHANGE_ID and CREATE_SESSION operations. Also, the | |||
necessary use of EXCHANGE_ID in recovery from migration and | necessary use of EXCHANGE_ID in recovery from migration and | |||
related situations is not addressed clearly. A revised treatment | related situations was not clearly addressed. A revised treatment | |||
of EXCHANGE_ID is necessary and it appears in Section 18.35 while | of EXCHANGE_ID was necessary, and it appears in Section 18.35, | |||
the specific differences between it and the treatment within [65] | while the specific differences between it and the treatment within | |||
are explained in Appendix B.2.1 below. | [65] are explained in Appendix B.2.1 below. | |||
* The existing treatment of RECLAIM_COMPLETE in Section 18.51 of RFC | * The treatment of RECLAIM_COMPLETE in Section 18.51 of RFC 5661 | |||
5661 [65]) is not sufficiently clear about the purpose and use of | [65] was not sufficiently clear about the purpose and use of the | |||
the rca_one_fs and how the server is to deal with inappropriate | rca_one_fs and how the server was to deal with inappropriate | |||
values of this argument. Because the resulting confusion raises | values of this argument. Because the resulting confusion raised | |||
interoperability issues, a new treatment of RECLAIM_COMPLETE is | interoperability issues, a new treatment of RECLAIM_COMPLETE was | |||
necessary and it appears in Section 18.51 below while the specific | necessary, and it appears in Section 18.51, while the specific | |||
differences between it and the treatment within RFC5661 [65] are | differences between it and the treatment within RFC 5661 [65] are | |||
discussed in Appendix B.2.2 below. In addition, the definitions | discussed in Appendix B.2.2 below. In addition, the definitions | |||
of the reclaim-related errors receive an updated treatment in | of the reclaim-related errors have received an updated treatment | |||
Section 15.1.9 to reflect the fact that there are multiple | in Section 15.1.9 to reflect the fact that there are multiple | |||
contexts for lock reclaim operations. | contexts for lock reclaim operations. | |||
B.2.1. Revision to Treatment of EXCHANGE_ID | B.2.1. Revision of Treatment of EXCHANGE_ID | |||
There are a number of issues in the original treatment of EXCHANGE_ID | There was a number of issues in the original treatment of EXCHANGE_ID | |||
(in RFC5661 [65]) that cause problems for Transparent State Migration | in RFC 5661 [65] that caused problems for Transparent State Migration | |||
and for the transfer of access between different network access paths | and for the transfer of access between different network access paths | |||
to the same file system instance. | to the same file system instance. | |||
These issues arise from the fact that this treatment was written, | These issues arose from the fact that this treatment was written: | |||
* Assuming that a client ID can only become known to a server by | * Assuming that a client ID can only become known to a server by | |||
having been created by executing an EXCHANGE_ID, with confirmation | having been created by executing an EXCHANGE_ID, with confirmation | |||
of the ID only possible by execution of a CREATE_SESSION. | of the ID only possible by execution of a CREATE_SESSION. | |||
* Considering the interactions between a client and a server only | * Considering the interactions between a client and a server only | |||
occurring on a single network address | occurring on a single network address. | |||
As these assumptions have become invalid in the context of | As these assumptions have become invalid in the context of | |||
Transparent State Migration and active use of trunking, the treatment | Transparent State Migration and active use of trunking, the treatment | |||
has been modified in several respects. | has been modified in several respects: | |||
* It had been assumed that an EXCHANGED_ID executed when the server | * It had been assumed that an EXCHANGE_ID executed when the server | |||
is already aware of a given client instance must be either | is already aware of a given client instance must be either | |||
updating associated parameters (e.g. with respect to callbacks) or | updating associated parameters (e.g., with respect to callbacks) | |||
a lingering retransmission to deal with a previously lost reply. | or a lingering retransmission to deal with a previously lost | |||
As a result, any slot sequence returned by that operation would be | reply. As a result, any slot sequence returned by that operation | |||
of no use. The existing treatment goes so far as to say that it | would be of no use. The original treatment went so far as to say | |||
"MUST NOT" be used, although this usage is not in accord with [1]. | that it "MUST NOT" be used, although this usage was not in accord | |||
This created a difficulty when an EXCHANGE_ID is done after | with [1]. This created a difficulty when an EXCHANGE_ID is done | |||
Transparent State Migration since that slot sequence would need to | after Transparent State Migration since that slot sequence would | |||
be used in a subsequent CREATE_SESSION. | need to be used in a subsequent CREATE_SESSION. | |||
In the updated treatment, CREATE_SESSION is a way that client IDs | In the updated treatment, CREATE_SESSION is a way that client IDs | |||
are confirmed but it is understood that other ways are possible. | are confirmed, but it is understood that other ways are possible. | |||
The slot sequence can be used as needed and cases in which it | The slot sequence can be used as needed, and cases in which it | |||
would be of no use are appropriately noted. | would be of no use are appropriately noted. | |||
* It was assumed that the only functions of EXCHANGE_ID were to | * It had been assumed that the only functions of EXCHANGE_ID were to | |||
inform the server of the client, create the client ID, and | inform the server of the client, to create the client ID, and to | |||
communicate it to the client. When multiple simultaneous | communicate it to the client. When multiple simultaneous | |||
connections are involved, as often happens when trunking, that | connections are involved, as often happens when trunking, that | |||
treatment was inadequate in that it ignored the role of | treatment was inadequate in that it ignored the role of | |||
EXCHANGE_ID in associating the client ID with the connection on | EXCHANGE_ID in associating the client ID with the connection on | |||
which it was done, so that it could be used by a subsequent | which it was done, so that it could be used by a subsequent | |||
CREATE_SESSION, whose parameters do not include an explicit | CREATE_SESSION whose parameters do not include an explicit client | |||
client ID. | ID. | |||
The new treatment explicitly discusses the role of EXCHANGE_ID in | The new treatment explicitly discusses the role of EXCHANGE_ID in | |||
associating the client ID with the connection so it can be used by | associating the client ID with the connection so it can be used by | |||
CREATE_SESSION and in associating a connection with an existing | CREATE_SESSION and in associating a connection with an existing | |||
session. | session. | |||
The new treatment can be found in Section 18.35 above. It supersedes | The new treatment can be found in Section 18.35 above. It supersedes | |||
the treatment in Section 18.35 of RFC 5661 [65]. | the treatment in Section 18.35 of RFC 5661 [65]. | |||
B.2.2. Revision to Treatment of RECLAIM_COMPLETE | B.2.2. Revision of Treatment of RECLAIM_COMPLETE | |||
The following changes were made to the treatment of RECLAIM_COMPLETE | The following changes were made to the treatment of RECLAIM_COMPLETE | |||
in RFC5661 [65] to arrive at the treatment in Section 18.51. | in RFC 5661 [65] to arrive at the treatment in Section 18.51: | |||
* In a number of places the text is made more explicit about the | * In a number of places, the text was made more explicit about the | |||
purpose of rca_one_fs and its connection to file system migration. | purpose of rca_one_fs and its connection to file system migration. | |||
* There is a discussion of situations in which particular forms of | * There is a discussion of situations in which particular forms of | |||
RECLAIM_COMPLETE would need to be done. | RECLAIM_COMPLETE would need to be done. | |||
* There is a discussion of interoperability issues that result from | * There is a discussion of interoperability issues between | |||
implementations that may have arisen due to the lack of clarity of | implementations that may have arisen due to the lack of clarity of | |||
the previous treatment of RECLAIM_COMPLETE. | the previous treatment of RECLAIM_COMPLETE. | |||
B.3. Revisions Made to Error Definitions in RFC5661 | B.3. Revisions Made to Error Definitions in RFC 5661 | |||
The new handling of various situations required revisions of some | The new handling of various situations required revisions to some | |||
existing error definition: | existing error definitions: | |||
* Because of the need to appropriately address trunking-related | * Because of the need to appropriately address trunking-related | |||
issues, some uses of the term "replica" in RFC5661 [65] have | issues, some uses of the term "replica" in RFC 5661 [65] became | |||
become problematic since a shift in network access paths was | problematic because a shift in network access paths was considered | |||
considered to be a shift to a different replica. As a result, the | to be a shift to a different replica. As a result, the original | |||
existing definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC | definition of NFS4ERR_MOVED (in Section 15.1.2.4 of RFC 5661 [65]) | |||
5661 [65]) needs to be updated to reflect the different handling | was updated to reflect the different handling of unavailability of | |||
of unavailability of a particular fs via a specific network | a particular fs via a specific network address. | |||
address. | ||||
Since such a situation is no longer considered to constitute | Since such a situation is no longer considered to constitute | |||
unavailability of a file system instance, the description needs to | unavailability of a file system instance, the description has been | |||
change even though the set of circumstances in which it is to be | changed, even though the set of circumstances in which it is to be | |||
returned remain the same. The new paragraph explicitly recognizes | returned remains the same. The new paragraph explicitly | |||
that a different network address might be used, while the previous | recognizes that a different network address might be used, while | |||
description, misleadingly, treated this as a shift between two | the previous description, misleadingly, treated this as a shift | |||
replicas while only a single file system instance might be | between two replicas while only a single file system instance | |||
involved. The updated description appears in Section 15.1.2.4 | might be involved. The updated description appears in | |||
below. | Section 15.1.2.4. | |||
* Because of the need to accommodate use of fs-specific grace | * Because of the need to accommodate the use of fs-specific grace | |||
periods, it is necessary to clarify some of the error definitions | periods, it was necessary to clarify some of the definitions of | |||
of reclaim-related errors in Section 15 of RFC 5661 [65], so the | reclaim-related errors in Section 15 of RFC 5661 [65] so that the | |||
text applies properly to reclaims for all types of grace periods. | text applies properly to reclaims for all types of grace periods. | |||
The updated descriptions appear within Section 15.1.9 below. | The updated descriptions appear within Section 15.1.9. | |||
* Because of the need to provide the clarifications in errata report | * Because of the need to provide the clarifications in errata report | |||
2006 [63] and to adapt these to properly explain the interaction | 2006 [63] and to adapt these to properly explain the interaction | |||
of NFS4ERR_DELAY with the replay cache, a revised description of | of NFS4ERR_DELAY with the replay cache, a revised description of | |||
NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report, | NFS4ERR_DELAY appears in Section 15.1.1.3. This errata report, | |||
unlike many other RFC5661 errata reports, is addressed in this | unlike many other RFC 5661 errata reports, is addressed in this | |||
document because of the extensive use of NFS4ERR_DELAY in | document because of the extensive use of NFS4ERR_DELAY in | |||
connection with state migration and session migration. | connection with state migration and session migration. | |||
B.4. Other Revisions Made to RFC5661 | B.4. Other Revisions Made to RFC 5661 | |||
Beside the major reworking of Section 11 of RFC 5661 [65] and the | Besides the major reworking of Section 11 of RFC 5661 [65] and the | |||
associated revisions to existing operations and errors, there are a | associated revisions to existing operations and errors, there were a | |||
number of related changes that are necessary: | number of related changes that were necessary: | |||
* The summary that appeared in Section 1.7.3.3 of RFC 5661 [65] was | * The summary in Section 1.7.3.3 of RFC 5661 [65] was revised to | |||
revised to reflect the changes made in the revised Section 11 | reflect the changes made to Section 11 above. The updated summary | |||
above. The updated summary appears as Section 1.8.3.3 above. | appears as Section 1.8.3.3 above. | |||
* The discussion of server scope which appeared in Section 2.10.4 of | * The discussion of server scope in Section 2.10.4 of RFC 5661 [65] | |||
RFC 5661 [65] needed to be replaced, since the previous text | was replaced since it appeared to require a level of inter-server | |||
appears to require a level of inter-server co-ordination | coordination incompatible with its basic function of avoiding the | |||
incompatible with its basic function of avoiding the need for a | need for a globally uniform means of assigning server_owner | |||
globally uniform means of assigning server_owner values. A | values. A revised treatment appears in Section 2.10.4. | |||
revised treatment appears in Section 2.10.4. | ||||
* The discussion of trunking which appeared in Section 2.10.5 of RFC | * The discussion of trunking in Section 2.10.5 of RFC 5661 [65] was | |||
5661 [65] needed to be revised, to more clearly explain the | revised to more clearly explain the multiple types of trunking | |||
multiple types of trunking support and how the client can be made | support and how the client can be made aware of the existing | |||
aware of the existing trunking configuration. In addition, while | trunking configuration. In addition, while the last paragraph | |||
the last paragraph (exclusive of sub-sections) of that section, | (exclusive of subsections) of that section dealing with | |||
dealing with server_owner changes, is literally true, it has been | server_owner changes was literally true, it had been a source of | |||
a source of confusion. Since the existing paragraph can be read | confusion. Since the original paragraph could be read as | |||
as suggesting that such changes be dealt with non-disruptively, | suggesting that such changes be handled nondisruptively, the issue | |||
the issue needs to be clarified in the revised section, which | was clarified in the revised Section 2.10.5. | |||
appears in Section 2.10.5. | ||||
Appendix C. Security Issues that Need to be Addressed | Appendix C. Security Issues That Need to Be Addressed | |||
The following issues in the treatment of security within the NFSv4.1 | The following issues in the treatment of security within the NFSv4.1 | |||
specification need to be addressed: | specification need to be addressed: | |||
* The Security Considerations Section of RFC5661 [65] is not written | * The Security Considerations Section of RFC 5661 [65] was not | |||
in accord with RFC3552 [71] (also BCP72). Of particular concern | written in accordance with RFC 3552 (BCP 72) [71]. Of particular | |||
is the fact that the section does not contain a threat analysis. | concern was the fact that the section did not contain a threat | |||
analysis. | ||||
* Initial analysis of the existing security issues with NFSv4.1 has | * Initial analysis of the existing security issues with NFSv4.1 has | |||
made it likely that a revised Security Considerations Section for | made it likely that a revised Security Considerations section for | |||
the existing protocol (one containing a threat analysis) would be | the existing protocol (one containing a threat analysis) would be | |||
likely to conclude that NFSv4.1 does not meet the goal of secure | likely to conclude that NFSv4.1 does not meet the goal of secure | |||
use on the internet. | use on the Internet. | |||
The Security Considerations Section of this document (in Section 21) | The Security Considerations section of this document (Section 21) has | |||
has not been thoroughly revised to correct the difficulties mentioned | not been thoroughly revised to correct the difficulties mentioned | |||
above. Instead, it has been modified to take proper account of | above. Instead, it has been modified to take proper account of | |||
issues related to the multi-server namespace features discussed in | issues related to the multi-server namespace features discussed in | |||
Section 11, leaving the incomplete discussion and security weaknesses | Section 11, leaving the incomplete discussion and security weaknesses | |||
pretty much as they were. | pretty much as they were. | |||
The following major security issues need to be addressed in a | The following major security issues need to be addressed in a | |||
satisfactory fashion before an updated Security Considerations | satisfactory fashion before an updated Security Considerations | |||
section can be published as part of a bis document for NFSv4.1: | section can be published as part of a bis document for NFSv4.1: | |||
* The continued use of AUTH_SYS and the security exposures it | * The continued use of AUTH_SYS and the security exposures it | |||
creates needs to be addressed. Addressing this issue must not be | creates need to be addressed. Addressing this issue must not be | |||
limited to the questions of whether the designation of this as | limited to the questions of whether the designation of this as | |||
OPTIONAL was justified and whether it should be changed. | OPTIONAL was justified and whether it should be changed. | |||
In any event, it may not be possible, at this point, to correct | In any event, it may not be possible at this point to correct the | |||
the security problems created by continued use of AUTH_SYS simply | security problems created by continued use of AUTH_SYS simply by | |||
by revising this designation. | revising this designation. | |||
* The lack of attention within the protocol to the possibility of | * The lack of attention within the protocol to the possibility of | |||
pervasive monitoring attacks such as those described in RFC7258 | pervasive monitoring attacks such as those described in RFC 7258 | |||
[70] (also BCP188). | [70] (also BCP 188). | |||
In that connection, the use of CREATE_SESSION without privacy | In that connection, the use of CREATE_SESSION without privacy | |||
protection needs to be addressed as it exposes the session ID to | protection needs to be addressed as it exposes the session ID to | |||
view by an attacker. This is worrisome as this is precisely the | view by an attacker. This is worrisome as this is precisely the | |||
type of protocol artifact alluded to in RFC7258, which can enable | type of protocol artifact alluded to in RFC 7258, which can enable | |||
further mischief on the part of the attacker as it enables denial- | further mischief on the part of the attacker as it enables denial- | |||
of-service attacks which can be executed effectively with only a | of-service attacks that can be executed effectively with only a | |||
single, normally low-value, credential, even when RPCSEC_GSS | single, normally low-value, credential, even when RPCSEC_GSS | |||
authentication is in use. | authentication is in use. | |||
* The lack of effective use of privacy and integrity, even where the | * The lack of effective use of privacy and integrity, even where the | |||
infrastructure to support use of RPCSEC_GSS in present, needs to | infrastructure to support use of RPCSEC_GSS is present, needs to | |||
be addressed. | be addressed. | |||
In light of the security exposures that this situation creates, it | In light of the security exposures that this situation creates, it | |||
is not enough to define a protocol that could, with the provision | is not enough to define a protocol that could address this problem | |||
of sufficient resources, address the problem. Instead, what is | with the provision of sufficient resources. Instead, what is | |||
needed is a way to provide the necessary security, with very | needed is a way to provide the necessary security with very | |||
limited performance costs and without requiring security | limited performance costs and without requiring security | |||
infrastructure that experience has shown is difficult for many | infrastructure, which experience has shown is difficult for many | |||
clients and servers to provide. | clients and servers to provide. | |||
In trying to provide a major security upgrade for a deployed protocol | In trying to provide a major security upgrade for a deployed protocol | |||
such as NFSv4.1, the working group, and the internet community is | such as NFSv4.1, the working group and the Internet community are | |||
likely to find itself dealing with a number of considerations such as | likely to find themselves dealing with a number of considerations | |||
the following: | such as the following: | |||
* The need to accommodate existing deployments of existing protocols | * The need to accommodate existing deployments of protocols | |||
as specified previously in existing Proposed Standards. | specified previously in existing Proposed Standards. | |||
* The difficulty of effecting changes to existing interoperating | * The difficulty of effecting changes to existing, interoperating | |||
implementations. | implementations. | |||
* The difficulty of making changes to NFSv4 protocols other than | * The difficulty of making changes to NFSv4 protocols other than | |||
those in the form of OPTIONAL extensions. | those in the form of OPTIONAL extensions. | |||
* The tendency of those responsible for existing NFSv4 deployments | * The tendency of those responsible for existing NFSv4 deployments | |||
to ignore security flaws in the context of local area networks | to ignore security flaws in the context of local area networks | |||
under the mistaken impression that network isolation provides, in | under the mistaken impression that network isolation provides, in | |||
and of itself, isolation from all potential attackers. | and of itself, isolation from all potential attackers. | |||
Given that the difficulties mentioned above apply to minor version | Given that the above-mentioned difficulties apply to minor version | |||
zero as well, it may make sense to deal with these security issues in | zero as well, it may make sense to deal with these security issues in | |||
a common document applying to all NFSv4 minor versions. If that | a common document that applies to all NFSv4 minor versions. If that | |||
approach is taken the, Security Considerations section of an eventual | approach is taken, the Security Considerations section of an eventual | |||
NFSv4.1 bis document would reference that common document and the | NFSv4.1 bis document would reference that common document, and the | |||
defining RFCs for other minor versions might do so as well. | defining RFCs for other minor versions might do so as well. | |||
Acknowledgments | Acknowledgments | |||
Acknowledgments for this Update | Acknowledgments for This Update | |||
The authors wish to acknowledge the important role of Andy Adamson of | The authors wish to acknowledge the important role of Andy Adamson of | |||
NetApp in clarifying the need for trunking discovery functionality, | NetApp in clarifying the need for trunking discovery functionality, | |||
and exploring the role of the file system location attributes in | and exploring the role of the file system location attributes in | |||
providing the necessary support. | providing the necessary support. | |||
The authors wish to thank Tom Haynes of Hammerspace for drawing our | The authors wish to thank Tom Haynes of Hammerspace for drawing our | |||
attention to the fact that internationalization and security might | attention to the fact that internationalization and security might | |||
best be handled in documents dealing with such protocol issues as | best be handled in documents dealing with such protocol issues as | |||
they apply to all NFSv4 minor versions. | they apply to all NFSv4 minor versions. | |||
The authors also wish to acknowledge the work of Xuan Qi of Oracle | The authors also wish to acknowledge the work of Xuan Qi of Oracle | |||
with NFSv4.1 client and server prototypes of transparent state | with NFSv4.1 client and server prototypes of Transparent State | |||
migration functionality. | Migration functionality. | |||
The authors wish to thank others that brought attention to important | The authors wish to thank others that brought attention to important | |||
issues. The comments of Trond Myklebust of Primary Data related to | issues. The comments of Trond Myklebust of Primary Data related to | |||
trunking helped to clarify the role of DNS in trunking discovery. | trunking helped to clarify the role of DNS in trunking discovery. | |||
Rick Macklem's comments brought attention to problems in the handling | Rick Macklem's comments brought attention to problems in the handling | |||
of the per-fs version of RECLAIM_COMPLETE. | of the per-fs version of RECLAIM_COMPLETE. | |||
The authors wish to thank Olga Kornievskaia of NetApp for her helpful | The authors wish to thank Olga Kornievskaia of NetApp for her helpful | |||
review comments. | review comments. | |||
Acknowledgments for RFC5661 | Acknowledgments for RFC 5661 | |||
The initial text for the SECINFO extensions was edited by Mike | The initial text for the SECINFO extensions was edited by Mike | |||
Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl | Eisler with contributions from Peng Dai, Sergey Klyushin, and Carl | |||
Burnett. | Burnett. | |||
The initial text for the SESSIONS extensions was edited by Tom | The initial text for the SESSIONS extensions was edited by Tom | |||
Talpey, Spencer Shepler, and Jon Bauman with contributions from Charles | Talpey, Spencer Shepler, and Jon Bauman with contributions from Charles | |||
Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak, | Antonelli, Brent Callaghan, Mike Eisler, John Howard, Chet Juszczak, | |||
Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk, and Mark | Trond Myklebust, Dave Noveck, John Scott, Mike Stolarchuk, and Mark | |||
Wittle. | Wittle. | |||
skipping to change at line 31690 ¶ | skipping to change at line 31690 ¶ | |||
The initial text for the parallel NFS support was edited by Brent | The initial text for the parallel NFS support was edited by Brent | |||
Welch and Garth Goodson. Additional authors for those documents were | Welch and Garth Goodson. Additional authors for those documents were | |||
Benny Halevy, David Black, and Andy Adamson. Additional input came | Benny Halevy, David Black, and Andy Adamson. Additional input came | |||
from the informal group that contributed to the construction of the | from the informal group that contributed to the construction of the | |||
initial pNFS drafts; specific acknowledgment goes to Gary Grider, | initial pNFS drafts; specific acknowledgment goes to Gary Grider, | |||
Peter Corbett, Dave Noveck, Peter Honeyman, and Stephen Fridella. | Peter Corbett, Dave Noveck, Peter Honeyman, and Stephen Fridella. | |||
Fredric Isaman found several errors in draft versions of the ONC RPC | Fredric Isaman found several errors in draft versions of the ONC RPC | |||
XDR description of the NFSv4.1 protocol. | XDR description of the NFSv4.1 protocol. | |||
Audrey Van Belleghem provided, in numerous ways, essential co- | Audrey Van Belleghem provided, in numerous ways, essential | |||
ordination and management of the process of editing the specification | coordination and management of the process of editing the | |||
documents. | specification documents. | |||
Richard Jernigan gave feedback on the file layout's striping pattern | Richard Jernigan gave feedback on the file layout's striping pattern | |||
design. | design. | |||
Several formal inspection teams were formed to review various areas | Several formal inspection teams were formed to review various areas | |||
of the protocol. All the inspections found significant errors and | of the protocol. All the inspections found significant errors and | |||
room for improvement. NFSv4.1's inspection teams were: | room for improvement. NFSv4.1's inspection teams were: | |||
* ACLs, with the following inspectors: Sam Falkner, Bruce Fields, | * ACLs, with the following inspectors: Sam Falkner, Bruce Fields, | |||
Rahul Iyer, Saadia Khan, Dave Noveck, Lisa Week, Mario Wurzl, and | Rahul Iyer, Saadia Khan, Dave Noveck, Lisa Week, Mario Wurzl, and | |||
skipping to change at line 31778 ¶ | skipping to change at line 31778 ¶ | |||
Phone: +1-781-768-5347 | Phone: +1-781-768-5347 | |||
Email: dnoveck@netapp.com | Email: dnoveck@netapp.com | |||
Charles Lever | Charles Lever | |||
Oracle Corporation | Oracle Corporation | |||
1015 Granger Avenue | 1015 Granger Avenue | |||
Ann Arbor, MI 48104 | Ann Arbor, MI 48104 | |||
United States of America | United States of America | |||
Phone: +1 248 614 5091 | Phone: +1-248-614-5091 | |||
Email: chuck.lever@oracle.com | Email: chuck.lever@oracle.com | |||
End of changes. 491 change blocks. | ||||
985 lines changed or deleted | 985 lines changed or added | |||