INTERNET DRAFT Michiel B. de Jong Document: draft-dejong-remotestorage-00 (independent) F. Kooman Intended Status: Proposed Standard SURFnet Expires: 8 June 2013 5 December 2012 remotestorage Abstract This draft describes a protocol by which client-side applications, running inside a web browser, can communicate with a data storage server that is hosted on a different domain name. This way, the provider of a web application need not also play the role of data storage provider. The protocol supports storing, retrieving, and removing individual documents, as well as listing the contents of an individual directory, and access control is based on bearer tokens. Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on 9 June 2013. Copyright Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. de Jong [Page 1] Internet-Draft remotestorage December 2012 Table of Contents 1. Introduction...................................................2 2. Terminology....................................................2 3. Storage model..................................................3 4. Requests.......................................................3 5. Response codes.................................................4 6. Versioning.....................................................5 7. CORS headers...................................................5 8. Session description............................................5 9. Bearer tokens and access control...............................6 10. Application-first bearer token issuance........................6 11. Storage-first bearer token issuance............................7 12. Security Considerations........................................8 13. IANA Considerations............................................9 14. Acknowledgments................................................9 15. References.....................................................9 15.1. Normative References......................................9 15.2. Informative References....................................9 16. Authors' addresses............................................10 1. Introduction Many services for data storage are available over the internet. This specification describes a vendor-independent interface for such services. It is based on https, CORS and bearer tokens. The metaphor for addressing data on the storage is that of folders containing documents and subfolders. The actions the interface exposes are: * GET a folder: retrieve the names and current versions of the documents and subfolders currently contained by the folder * GET a document: retrieve its content type, current version, and contents * PUT a document: store a new version, its content type, and contents, conditional on the current version * DELETE a document: remove it from the storage, conditional on the current version The exact details of these four actions are described in this specification. 2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this de Jong [Page 2] Internet-Draft remotestorage December 2012 document are to be interpreted as described in RFC 2119 [WORDS]. "SHOULD" and "SHOULD NOT" are appropriate when valid exceptions to a general requirement are known to exist or appear to exist, and it is infeasible or impractical to enumerate all of them. However, they should not be interpreted as permitting implementors to fail to implement the general requirement when such failure would result in interoperability failure. 3. Storage model The server stores data in nodes that form a tree structure. Internal nodes are called 'folders' and leaf nodes are called 'documents'. For a folder, the server stores references to nodes contained in the folder, and it should be able to produce a list of them, with for each contained item: * item name * item type (folder or document) * current version For a document, the server stores, and should be able to produce: * content type * content * current version 4. Requests Client-to-server requests SHOULD be made over https [HTTPS]. The root folder of the storage tree is represented by the URL '/'. Subsequently, if is the URL of a folder, then the URL of an item contained in it is for a document, or '/' for a folder. Item names MAY contain a-z, A-Z, 0-9, %, -, _. A successful GET request to a folder SHOULD be responded to with a JSON document (content type 'application/json'), representing a map in which contained documents appear as entries to , and contained folders appear as entries '/' to , for instance: { "abc": 1234567890123, "def/": 1234567890456 } Empty folders are treated as non-existing, and therefore GET requests to them SHOULD be responded to with a 404 response, and an de Jong [Page 3] Internet-Draft remotestorage December 2012 empty folder MUST NOT be listed as an item in its parent folder. Also, folders SHOULD be created silently, as necessary to contain newly added items. This way, PUT and DELETE requests only need to be made to documents, and folder management becomes an implicit result. A successful GET request to a document SHOULD be responded to with the full document contents in the body, the document's content type in a 'Content-Type' header, and the document's current version in an 'ETag' header. A successful PUT request to a document MUST result in: * the request body being stored as the document's new content, * parent and further ancestor folders being silently created as necessary, with the document (name and version) being added to its parent folder, and each folder added to its subsequent parent, * the value of its Content-Type header being stored as the document's new content type, * the current server time, in the form of milliseconds since 0:00 UCT, 1 January, 1970 being stored as the new version of the document itself, as well as of its parent folder and further ancestor folders. The response MUST contain an ETag header, with the document's new version (milliseconds since the beginning of 1970) as its value. A successful DELETE request to a document MUST result in the deletion of that document from the storage, and from its parent folder. If the parent folder is left empty by this, then it MUST also be removed, and so on for ancestor folders. A successful OPTIONS request SHOULD be responded to as described in the CORS section below. 5. Response codes The following responses SHOULD be given in the indicated cases, in order of preference, and SHOULD be recognized by the client: * 500 if an internal server error occurs, * 420 if the client makes too frequent requests or is suspected of malicious activity, * 400 for all malformed requests (e.g. foreign characters in the path or unrecognized http verb, etcetera), as well as for all PUT and DELETE requests to folders, * 401 for all requests that don't have a bearer token with sufficient permissions, * 404 for all DELETE and GET requests to nodes that do not exist de Jong [Page 4] Internet-Draft remotestorage December 2012 on the storage, * 304 for a conditional GET request whose condition fails (see "Versioning" below), * 409 for a conditional PUT or DELETE request whose condition fails (see "Versioning" below), * 200 for all successful requests, including PUT and DELETE, Clients SHOULD also handle the case where a response takes too long to arrive, or where no response is received at all. 6. Versioning The current version of a document is the 13-digit decimal number representing the number of milliseconds between 00:00 UCT, 1 January 1970, and the last time its content or content type were set or changed successfully. The current version of a folder is the highest version of any of the items it contains. All successful requests MUST return an 'ETag' header with, in the case of GET, the current version, in the case of PUT, the new version, and in case of DELETE, the version that was deleted. PUT and DELETE requests MAY have an 'If-Unmodified-Since' request header, and MUST fail with a 409 response code if that doesn't match the document's current version. GET requests MAY have an 'If-Modified-Since' header, and SHOULD be responded to with a 304 if that matches the document or folder's current version. 7. CORS headers All responses MUST carry CORS headers [CORS]. The server MUST also reply to OPTIONS requests as per CORS. For GET requests, a wildcard origin MAY be returned, but for PUT and DELETE requests, the response MUST echo back the Origin header sent by the client. 8. Session description The information that a client needs to receive in order to be able to connect to a server SHOULD reach the client as described in the 'bearer token issuance' sections below. It consists of: * , consisting of 'https://' followed by a server host, and optionally a server port and a path prefix as per [IRI]. Examples: * 'https://example.com' (host only) * 'https://example.com:8080' (host and port) * 'https://example.com/some?path/to/storage' (host, port and path prefix; note there is no trailing slash) * as per [OAUTH]. The token SHOULD be hard to guess and SHOULD NOT be reused from one client to another. It can however be reused in subsequent interactions with the same de Jong [Page 5] Internet-Draft remotestorage December 2012 client, as long as that client is still trusted. Example: * 'ofb24f1ac3973e70j6vts19qr9v2eei' * , always 'draft-dejong-remotestorage-00' for this version of the specification. The client can make its requests using https with CORS and bearer tokens, to the URL that is the concatenation of with '/' plus one or more '/' strings indicating a path in the folder tree, followed by zero or one strings, indicating a document. For example, if is "https://storage.example.com/bob", then to retrieve the folder contents of the /public/documents/ folder, or to retrieve a 'draft.txt' document from that folder, the client would make requests to, respectively: * https://storage.example.com/bob/public/documents/ * https://storage.example.com/bob/public/documents/draft.txt 9. Bearer tokens and access control A bearer token represents one or more access scopes. These access scopes are represented as strings of the form , where the string SHOULD be lower-case alphanumerical, other than the reserved word 'public', and can be ':r' or ':rw'. The access the bearer token gives is the sum of its access scopes, with each access scope representing the following permissions: 'root:rw') any request, 'root:r') any GET request, ':rw') any requests to paths that start with '/' '/' or '/public/' '/', ':r') any GET requests to paths that start with '/' '/' or '/public/' '/', As a special exceptions, GET requests to a document (but not a folder) whose path starts with '/public/' are always allowed. They, as well as OPTIONS requests, can be made without a bearer token. All other requests should present a bearer token with sufficient access scope, using a header of the following form: Authorization: Bearer 10. Application-first bearer token issuance To make a remotestorage server available as 'the remotestorage of at ', exactly one link of the following format SHOULD be added to the webfinger record [WEBFINGER] of at : de Jong [Page 6] Internet-Draft remotestorage December 2012 { href: , rel: "remotestorage", type: , properties: { 'auth-method': "http://tools.ietf.org/html/rfc6749#section-4.2", 'auth-endpoint': } } Here and are as per "Session description" above, and SHOULD be a URL where an OAuth2 implicit-grant flow dialog [OAUTH] is be presented, so the user can supply her credentials (how, is out of scope), and allow or reject a request by the connecting application to obtain a bearer token for a certain list of access scopes. The server SHOULD NOT expire bearer tokens unless they are revoked, and MAY require the user to register applications as OAuth clients before first use; if no client registration is required, then the server MAY ignore the client_id parameter in favour of relying on the redirect_uri parameter for client identification. 11. Storage-first bearer token issuance The provider MAY also present a dashboard to the user, where she has some way to add open web app manifests [MANIFEST]. Adding a manifest to the dashboard is considered equivalent to clicking 'accept' in the dialog of the application-first flow. Removing one is considered equivalent to revoking its access token. As an equivalent to OAuth's 'scope' parameter, a 'remotestorage' field SHOULD be present in the root of such an application manifest document, as a JSON array of strings, each string being one access scope of the form . When the user gestures she wants to use a certain application whose manifest is present on the dashboard, the dashboard SHOULD redirect to the application or open it in a new window. To mimic coming back from the OAuth dialog, it MAY add 'access_token' and 'scope' parameters to the URL fragment. Regardless of whether 'access_token' and 'scope' are specified, it SHOULD add a 'remotestorage' parameter to the URL fragment, with a value of the form '@' . When the application detects this parameter, it SHOULD resolve the webfinger record for at and extract the and information. If no access_token was given, then the application SHOULD also extract the information from webfinger, and continue de Jong [Page 7] Internet-Draft remotestorage December 2012 as per application-first bearer token issuance. Note that whereas a remotestorage server SHOULD offer support of the application-first flow with webfinger and OAuth, it MAY choose not to support the storage-first flow, provided that users will easily remember their '@' webfinger address at that provider. Applications SHOULD, however, support both flows, which means checking the URL for a 'remotestorage' parameter, but giving the user a way to specify her webfinger address if there is none. If a server provides an application manifest dashboard, then it SHOULD merge the list of applications there with the list of issued access tokens as specified by OAuth into one list. Also, the interface for revoking an access token as specified by OAuth SHOULD coincide with removing an application from the dashboard. 12. Security Considerations To prevent man-in-the-middle attacks, the use of https instead of http is important for both the interface itself and all end-points involved in webfinger, OAuth, and (if present) the storage-first application launch dashboard. A malicious party could link to an application, but specifying a remotestorage user address that it controls, thus tricking the user into using a trusted application to send sensitive data to the wrong remotestorage server. To mitigate this, applications SHOULD clearly display to which remotestorage server they are sending the user's data. Applications could request scopes that the user did not intend to give access to. The user SHOULD always be prompted to carefully review which scopes an application is requesting. An application may upload malicious html pages and then trick the user into visiting them, or upload malicious client-side scripts, that take advantage of being hosted on the user's domain name. The origin on which the remotestorage server has its interface SHOULD therefore NOT be used for anything else, and the user SHOULD be warned not to visit any web pages on that origin. In particular, the OAuth dialog and launch dashboard or token revokation interface SHOULD be on a different origin than the remotestorage interface. Where the use of bearer tokens is impractical, a user may choose to store documents on hard-to-guess URLs whose path after starts with '/public/', while sharing this URL only with the intended audience. That way, only parties who know the document's hard-to-guess URL, can access it. The server SHOULD therefore make an effort to detect and stop brute-force attacks that attempt to guess the location of such documents. de Jong [Page 8] Internet-Draft remotestorage December 2012 The server SHOULD also detect and stop denial-of-service attacks that aim to overwhelm its interface with too much traffic. 13. IANA Considerations This document registers the 'remotestorage' link relation. 14. Acknowledgements The authors would like to thank everybody who contributed to the development of this protocol, including Kenny Bentley, Javier Diaz, Daniel Groeber, Bjarni Runar, Jan Wildeboer, Charles Schultz, Peter Svensson, Valer Mischenko, Michiel Leenaars, Jan-Christoph Borchardt, Garret Alfert, Sebastian Kippe, Max Wiehle, Melvin Carvalho, Martin Stadler, Geoffroy Couprie, Niklas Cathor, Marco Stahl, James Coglan, Ken Eucker, Daniel Brolund, elf Pavlik, Nick Jennings, and Markus Sabadello, among many others. 15. References 15.1. Normative References [WORDS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [IRI] Duerst, M., "Internationalized Resource Identifiers (IRIs)", RFC 3987, January 2005. [WEBFINGER] Jones, Paul E., Salguerio, Gonzalo, and Smarr, Joseph, "WebFinger", draft-ietf-appsawg-webfinger-07, Work in Progress [OAUTH] "Section 4.2: Implicit Grant", in: Hardt, D. (ed), "The OAuth 2.0 Authorization Framework", RFC6749, October 2012. 15.2. Informative References [HTTPS] Rescorla, E., "HTTP Over TLS", RFC2818, May 2000. [CORS] van Kesteren, Anne (ed), "Cross-Origin Resource Sharing -- W3C Working Draft 3 April 2012", http://www.w3.org/TR/2012/WD-cors-20120403/CORS, April 2012. [MANIFEST] Mozilla Developer Network (ed), "App manifest -- Revision de Jong [Page 9] Internet-Draft remotestorage December 2012 330541", https://developer.mozilla.org/en- US/docs/Apps/Manifest$revision/330541, November 2012. 16. Authors' addresses Michiel B. de Jong (independent) Email: michiel@michielbdejong.com F. Kooman SURFnet bv Postbus 19035 3501 DA Utrecht The Netherlands Email: Francois.Kooman@surfnet.nl de Jong [Page 10]