1. `Reading a file`_
2. `Writing/Uploading a File`_
3. `Creating a New Directory`_
- 4. `Get Information About A File Or Directory (as JSON)`_
- 5. `Attaching an existing File or Directory by its read- or write-cap`_
- 6. `Adding multiple files or directories to a parent directory at once`_
- 7. `Deleting a File or Directory`_
+ 4. `Getting Information About A File Or Directory (as JSON)`_
+ 5. `Attaching an Existing File or Directory by its read- or write-cap`_
+ 6. `Adding Multiple Files or Directories to a Parent Directory at Once`_
+ 7. `Unlinking a File or Directory`_
6. `Browser Operations: Human-Oriented Interfaces`_
1. `Viewing A Directory (as HTML)`_
2. `Viewing/Downloading a File`_
- 3. `Get Information About A File Or Directory (as HTML)`_
+ 3. `Getting Information About A File Or Directory (as HTML)`_
4. `Creating a Directory`_
5. `Uploading a File`_
6. `Attaching An Existing File Or Directory (by URI)`_
- 7. `Deleting A Child`_
+ 7. `Unlinking A Child`_
8. `Renaming A Child`_
9. `Other Utilities`_
10. `Debugging and Testing Features`_
7. `Other Useful Pages`_
8. `Static Files in /public_html`_
-9. `Safety and security issues -- names vs. URIs`_
+9. `Safety and Security Issues -- Names vs. URIs`_
10. `Concurrency Issues`_
+11. `Access Blacklist`_
+
Enabling the web-API port
=========================
option to the 'tahoe create-node' command. By default, the node listens on
port 3456, on the loopback (127.0.0.1) interface.
+
Basic Concepts: GET, PUT, DELETE, POST
======================================
operations are required to have no side-effects.
PUT is used to upload new objects into the filesystem, or to replace an
-existing object. DELETE it used to delete objects from the filesystem. Both
-PUT and DELETE are required to be idempotent: performing the same operation
-multiple times must have the same side-effects as only performing it once.
+existing link or the contents of a mutable file. DELETE is used to unlink
+objects from directories. Both PUT and DELETE are required to be idempotent:
+performing the same operation multiple times must have the same side-effects
+as only performing it once.
POST is used for more complicated actions that cannot be expressed as a GET,
PUT, or DELETE. POST operations can be thought of as a method call: sending
some message to the object referenced by the URL. In Tahoe, POST is also used
for operations that must be triggered by an HTML form (including upload and
-delete), because otherwise a regular web browser has no way to accomplish
+unlinking), because otherwise a regular web browser has no way to accomplish
these tasks. In general, everything that can be done with a PUT or DELETE can
also be done with a POST.
expected to use the RESTful interface described above. The second is a human
using a standard web browser to work with the filesystem. This user is given
a series of HTML pages with links to download files, and forms that use POST
-actions to upload, rename, and delete files.
+actions to upload, rename, and unlink files.
When an error occurs, the HTTP response code will be set to an appropriate
400-series code (like 404 Not Found for an unknown childname, or 400 Bad Request
``text/*``, or text/html (or if there is no Accept header), HTML tracebacks will
be generated.
+
URLs
====
Also note that the filenames inside upload POST forms are interpreted using
whatever character set was provided in the conventional '_charset' field, and
defaults to UTF-8 if not otherwise specified. The JSON representation of each
-directory contains native unicode strings. Tahoe directories are specified to
-contain unicode filenames, and cannot contain binary strings that are not
+directory contains native Unicode strings. Tahoe directories are specified to
+contain Unicode filenames, and cannot contain binary strings that are not
representable as such.
All Tahoe operations that refer to existing files or directories must include
the security properties of Tahoe caps to be extended across the web-API
interface.
+
Slow Operations, Progress, and Cancelling
=========================================
handles. Instead, they emit line-oriented status results immediately. Client
code can cancel the operation by simply closing the HTTP connection.
+
Programmatic Operations
=======================
that use HTTP to communicate with a Tahoe node. A later section describes
operations that are intended for web browsers.
+
Reading A File
--------------
"Browser Operations", for details on how to modify these URLs for that
purpose.
+
Writing/Uploading A File
------------------------
To use the /uri/$FILECAP form, $FILECAP must be a write-cap for a mutable file.
In the /uri/$DIRCAP/[SUBDIRS../]FILENAME form, if the target file is a
- writeable mutable file, that file's contents will be overwritten in-place. If
- it is a read-cap for a mutable file, an error will occur. If it is an
- immutable file, the old file will be discarded, and a new one will be put in
- its place.
-
- When creating a new file, if "mutable=true" is in the query arguments, the
- operation will create a mutable file instead of an immutable one.
+ writeable mutable file, that file's contents will be overwritten
+ in-place. If it is a read-cap for a mutable file, an error will occur.
+ If it is an immutable file, the old file will be discarded, and a new
+ one will be put in its place. If the target file is a writeable mutable
+ file, you may also specify an "offset" parameter -- a byte offset that
+ determines where in the mutable file the data from the HTTP request
+ body is placed. This operation is relatively efficient for MDMF mutable
+ files, and is relatively inefficient (but still supported) for SDMF
+ mutable files. If no offset parameter is specified, then the entire
+ file is replaced with the data from the HTTP request body. For an
+ immutable file, the "offset" parameter is not valid.
+
+ When creating a new file, you can control the type of file created by
+ specifying a format= argument in the query string. format=mdmf creates an MDMF
+ mutable file. format=sdmf creates an SDMF mutable file. format=chk creates an
+ immutable file. The value of the format argument is case-insensitive. For
+ compatibility with previous versions of Tahoe-LAFS, the webapi will also
+ accept a mutable=true argument in the query string. If mutable=true is given,
+ then the new file will be mutable, and its format will be the default mutable
+ file format, as configured on the Tahoe-LAFS node hosting the webapi server.
+ Use of mutable=true is discouraged; new code should use format= instead of
+ mutable=true. If neither format= nor mutable=true is given, the
+ newly-created file will be immutable.
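The argument rules above can be sketched as a small resolver. This is only an illustration, not the node's actual code: the function name, the ``default_mutable_format`` parameter (standing in for the node's configured default), the precedence of format= over mutable=true when both are present, and the rejection of unknown values are all assumptions.

```python
def resolve_file_format(query, default_mutable_format="sdmf"):
    """Return "chk", "sdmf", or "mdmf" for a new file, given query arguments.

    `query` is a plain dict of query-string arguments (hypothetical helper).
    """
    fmt = query.get("format")
    if fmt is not None:
        fmt = fmt.lower()  # the value of format= is case-insensitive
        if fmt not in ("chk", "sdmf", "mdmf"):
            # Assumption: unknown values are treated as an error.
            raise ValueError("unknown format: %r" % fmt)
        return fmt
    if query.get("mutable") == "true":
        # Legacy argument: fall back to the node's default mutable format.
        return default_mutable_format
    # Neither argument given: the new file is immutable.
    return "chk"
```

For instance, ``resolve_file_format({"format": "MDMF"})`` yields ``"mdmf"``, and an empty dict yields ``"chk"``.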
This returns the file-cap of the resulting file. If a new file was created
by this method, the HTTP response code (as dictated by rfc2616) will be set
attach the file into the filesystem. No directories will be modified by
this operation. The file-cap is returned as the body of the HTTP response.
- If "mutable=true" is in the query arguments, the operation will create a
- mutable file, and return its write-cap in the HTTP respose. The default is
- to create an immutable file, returning the read-cap as a response.
+ This method accepts format= and mutable=true as query string arguments, and
+ interprets those arguments in the same way as the linked forms of PUT
+ described immediately above.
Creating A New Directory
------------------------
filesystem. The "PUT" operation is provided for backwards compatibility:
new code should use POST.
+ This supports a format= argument in the query string. The format=
+ argument, if specified, controls the format of the directory. format=mdmf
+ indicates that the directory should be stored as an MDMF file; format=sdmf
+ indicates that the directory should be stored as an SDMF file. The value of
+ the format= argument is case-insensitive. If no format= argument is
+ given, the directory's format is determined by the default mutable file
+ format, as configured on the Tahoe-LAFS node responding to the request.
+
``POST /uri?t=mkdir-with-children``
Create a new directory, populated with a set of child nodes, and return its
write-cap as the HTTP response body. The new directory is not attached to
any other directory: the returned write-cap is the only reference to it.
+ The format of the directory can be controlled with the format= argument in
+ the query string, as described above.
+
Initial children are provided as the body of the POST form (this is more
efficient than doing separate mkdir and set_children operations). If the
body is empty, the new directory will be empty. If not empty, the body will
form submissions, since the body is not formatted this way. Doing so will
cause a server error as the lower-level code misparses the request body.
- Child file names should each be expressed as a unicode string, then used as
+ Child file names should each be expressed as a Unicode string, then used as
keys of the dictionary. The dictionary should then be converted into JSON,
and the resulting string encoded into UTF-8. This UTF-8 bytestring should
then be used as the POST body.
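A minimal sketch of the encoding steps just described: Unicode names as dictionary keys, conversion to JSON, then UTF-8 encoding. The helper name is hypothetical, the child description mirrors the shape of the t=json output as an assumption, and the cap is a placeholder rather than a real one.

```python
import json

def encode_children(children):
    """Serialize a {unicode name: child description} dict to a UTF-8 body."""
    # ensure_ascii=False keeps non-ASCII names as real characters, which
    # are then encoded to UTF-8 bytes for use as the POST body.
    return json.dumps(children, ensure_ascii=False).encode("utf-8")

# Hypothetical child entry; "URI:CHK:placeholder" is not a real cap.
children = {
    "fianc\u00e9e.txt": ["filenode", {"ro_uri": "URI:CHK:placeholder"}],
}
body = encode_children(children)
```

The resulting ``body`` is a bytestring, and decoding it as UTF-8 and parsing it as JSON gives back the original dictionary.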
If the final directory is created, it will be empty.
+ This accepts a format= argument in the query string, which controls the
+ format of the named target directory, if it does not already exist. format=
+ is interpreted in the same way as in the POST /uri?t=mkdir form. Note that
+ format= only controls the format of the named target directory;
+ intermediate directories, if created, are created based on the default
+ mutable type, as configured on the Tahoe-LAFS server responding to the
+ request.
+
This operation will return an error if a blocking file is present at any of
the parent names, preventing the server from creating the necessary parent
directory; or if it would require changing an immutable directory.
intermediate mutable directories as necessary. If the final directory is
created, it will be populated with initial children from the POST request
body, as described above.
+
+ This accepts a format= argument in the query string, which controls the
+ format of the target directory, if the target directory is created as part
+ of the operation. format= is interpreted in the same way as in the POST
+ /uri?t=mkdir-with-children operation. Note that format= only controls the
+ format of the named target directory; intermediate directories, if created,
+ are created using the default mutable type setting, as configured on the
+ Tahoe-LAFS server responding to the request.
This operation will return an error if a blocking file is present at any of
the parent names, preventing the server from creating the necessary parent
Create a new empty mutable directory and attach it to the given existing
directory. This will create additional intermediate directories as necessary.
+ This accepts a format= argument in the query string, which controls the
+ format of the named target directory, if it does not already exist. format=
+ is interpreted in the same way as in the POST /uri?t=mkdir form. Note that
+ format= only controls the format of the named target directory;
+ intermediate directories, if created, are created based on the default
+ mutable type, as configured on the Tahoe-LAFS server responding to the
+ request.
+
This operation will return an error if a blocking file is present at any of
the parent names, preventing the server from creating the necessary parent
directory, or if it would require changing any immutable directory.
Like /uri/$DIRCAP/[SUBDIRS../]?t=mkdir&name=NAME, but the new directory will
be populated with initial children via the POST request body. This command
will create additional intermediate mutable directories as necessary.
-
+
+ This accepts a format= argument in the query string, which controls the
+ format of the target directory, if the target directory is created as part
+ of the operation. format= is interpreted in the same way as in the POST
+ /uri?t=mkdir-with-children operation. Note that format= only controls the
+ format of the named target directory; intermediate directories, if created,
+ are created using the default mutable type setting, as configured on the
+ Tahoe-LAFS server responding to the request.
+
This operation will return an error if a blocking file is present at any of
the parent names, preventing the server from creating the necessary parent
directory; or if it would require changing an immutable directory; or if
This operation will return an error if the parent directory is immutable,
or already has a child named NAME.
-Get Information About A File Or Directory (as JSON)
----------------------------------------------------
+
+Getting Information About A File Or Directory (as JSON)
+-------------------------------------------------------
``GET /uri/$FILECAP?t=json``
"ro_uri": file_uri,
"verify_uri": verify_uri,
"size": bytes,
- "mutable": false
+ "mutable": false,
+ "format": "chk"
} ]
If it is a capability to a directory followed by a path from that directory
"verify_uri": verify_uri,
"size": bytes,
"mutable": false,
+ "format": "chk",
"metadata": {
"ctime": 1202777696.7564139,
"mtime": 1202777696.7564139,
"ro_uri": read_only_uri,
"verify_uri": verify_uri,
"mutable": true,
+ "format": "sdmf",
"children": {
"foo.txt": [ "filenode", {
"ro_uri": uri,
time.
-Attaching an existing File or Directory by its read- or write-cap
+Attaching an Existing File or Directory by its read- or write-cap
-----------------------------------------------------------------
``PUT /uri/$DIRCAP/[SUBDIRS../]CHILDNAME?t=uri``
would result in granting the cap's write authority to holders of the
directory read cap.
-Adding multiple files or directories to a parent directory at once
+
+Adding Multiple Files or Directories to a Parent Directory at Once
------------------------------------------------------------------
``POST /uri/$DIRCAP/[SUBDIRS..]?t=set_children``
backward compatibility should continue to use "set_children".
-Deleting a File or Directory
-----------------------------
+Unlinking a File or Directory
+-----------------------------
``DELETE /uri/$DIRCAP/[SUBDIRS../]CHILDNAME``
be modified.
Note that this does not actually delete the file or directory that the name
- points to from the tahoe grid -- it only removes the named reference from
+ points to from the Tahoe grid -- it only unlinks the named reference from
this directory. If there are other names in this directory or in other
directories that point to the resource, then it will remain accessible
through those paths. Even if all names pointing to this object are removed
This method returns the file- or directory- cap of the object that was just
removed.
+
Browser Operations: Human-oriented interfaces
=============================================
specified by using <input type="hidden"> elements. For clarity, the
descriptions below display the most significant arguments as URL query args.
+
Viewing A Directory (as HTML)
-----------------------------
browser, which contains HREF links to all files and directories reachable
from this directory. These HREF links do not have a t= argument, meaning
that a human who follows them will get pages also meant for a human. It also
- contains forms to upload new files, and to delete files and directories.
- Those forms use POST methods to do their job.
+ contains forms to upload new files, and to unlink files and directories
+ from their parent directory. Those forms use POST methods to do their job.
+
Viewing/Downloading a File
--------------------------
this form can *only* be used with file caps; it is an error to use a
directory cap after the /named/ prefix.
-Get Information About A File Or Directory (as HTML)
----------------------------------------------------
+
+Getting Information About A File Or Directory (as HTML)
+-------------------------------------------------------
``GET /uri/$FILECAP?t=info``
* access caps (URIs): verify-cap, read-cap, write-cap (for mutable objects)
* check/verify/repair form
* deep-check/deep-size/deep-stats/manifest (for directories)
- * replace-conents form (for mutable files)
+ * replace-contents form (for mutable files)
+
Creating a Directory
--------------------
/uri/$DIRCAP page. There is a "create directory" button on the Welcome page
to invoke this action.
+ This accepts a format= argument in the query string. Refer to the
+ documentation of the PUT /uri?t=mkdir operation in `Creating A
+ New Directory`_ for information on the behavior of the format= argument.
+
If "redirect_to_result=true" is not provided (or is given a value of
"false"), then the HTTP response body will simply be the write-cap of the
new directory.
This creates a new empty directory as a child of the designated SUBDIR. This
will create additional intermediate directories as necessary.
+ This accepts a format= argument in the query string. Refer to the
+ documentation of POST /uri/$DIRCAP/[SUBDIRS../]?t=mkdir&name=CHILDNAME in
+ `Creating A New Directory`_ for information on the behavior of the format=
+ argument.
+
If a "when_done=URL" argument is provided, the HTTP response will cause the
web browser to redirect to the given URL. This provides a convenient way to
return the browser to the directory that was just modified. Without a
about which storage servers were used for the upload, how long each
operation took, etc.
- If a "mutable=true" argument is provided, the operation will create a
- mutable file, and the response body will contain the write-cap instead of
- the upload results page. The default is to create an immutable file,
- returning the upload results page as a response.
-
+ This accepts format= and mutable=true query string arguments. Refer to
+ `Writing/Uploading A File`_ for information on the behavior of format= and
+ mutable=true.
``POST /uri/$DIRCAP/[SUBDIRS../]?t=upload``
/uri/$DIRCAP/[SUBDIRS../]", it is likely that the parent directory will
already exist.
- If a "mutable=true" argument is provided, any new file that is created will
- be a mutable file instead of an immutable one. <input type="checkbox"
- name="mutable" /> will give the user a way to set this option.
+ This accepts format= and mutable=true query string arguments. Refer to
+ `Writing/Uploading A File`_ for information on the behavior of format= and
+ mutable=true.
If a "when_done=URL" argument is provided, the HTTP response will cause the
web browser to redirect to the given URL. This provides a convenient way to
the "PUT /uri/$FILECAP" form, but uses a POST for the benefit of HTML forms
in a web browser.
+
Attaching An Existing File Or Directory (by URI)
------------------------------------------------
This accepts the same replace= argument as POST t=upload.
-Deleting A Child
-----------------
+
+Unlinking A Child
+-----------------
``POST /uri/$DIRCAP/[SUBDIRS../]?t=delete&name=CHILDNAME``
+``POST /uri/$DIRCAP/[SUBDIRS../]?t=unlink&name=CHILDNAME``
+
This instructs the node to remove a child object (file or subdirectory) from
the given directory, which must be mutable. Note that the entire subtree is
unlinked from the parent. Unlike deleting a subdirectory in a UNIX local
into the subtree will see that the child subdirectories are not modified by
this operation. Only the link from the given directory to its child is severed.
+ In Tahoe-LAFS v1.9.0 and later, t=unlink can be used as a synonym for t=delete.
+ If interoperability with older web-API servers is required, t=delete should
+ be used.
+
+
Renaming A Child
----------------
If the object is an immutable file, this will return the same value as
t=uri.
+
Debugging and Testing Features
------------------------------
was untraversable, since the manifest entry is emitted to the HTTP response
body before the child is traversed.
+
Other Useful Pages
==================
prettier front-end to the rest of the Tahoe web-API.
-Safety and security issues -- names vs. URIs
+Safety and Security Issues -- Names vs. URIs
============================================
Summary: use explicit file- and dir- caps whenever possible, to reduce the
directory) is found by following this name (or sequence of names) when my
request reaches the server". Use URIs if you want "this particular object".
+
Concurrency Issues
==================
Coordination Directive" sections of `mutable.rst <../specifications/mutable.rst>`_.
+Access Blacklist
+================
+
+Gateway nodes may find it necessary to prohibit access to certain files. The
+web-API has a facility to block access to filecaps by their storage index,
+returning a 403 "Forbidden" error instead of the original file.
+
+This blacklist is recorded in $NODEDIR/access.blacklist, and contains one
+blocked file per line. Comment lines (starting with ``#``) are ignored. Each
+line consists of the storage-index (in the usual base32 format as displayed
+by the "More Info" page, or by the "tahoe debug dump-cap" command), followed
+by whitespace, followed by a reason string, which will be included in the 403
+error message. This could hold a URL to a page that explains why the file is
+blocked, for example.
+
+So for example, if you found a need to block access to a file with filecap
+``URI:CHK:n7r3m6wmomelk4sep3kw5cvduq:os7ijw5c3maek7pg65e5254k2fzjflavtpejjyhshpsxuqzhcwwq:3:20:14861``,
+you could do the following::
+
+ tahoe debug dump-cap URI:CHK:n7r3m6wmomelk4sep3kw5cvduq:os7ijw5c3maek7pg65e5254k2fzjflavtpejjyhshpsxuqzhcwwq:3:20:14861
+ -> storage index: whpepioyrnff7orecjolvbudeu
+ echo "whpepioyrnff7orecjolvbudeu my puppy told me to" >>$NODEDIR/access.blacklist
+ tahoe restart $NODEDIR
+ tahoe get URI:CHK:n7r3m6wmomelk4sep3kw5cvduq:os7ijw5c3maek7pg65e5254k2fzjflavtpejjyhshpsxuqzhcwwq:3:20:14861
+ -> error, 403 Access Prohibited: my puppy told me to
+
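The line format described above (storage index, whitespace, reason string, with ``#`` comment lines ignored) could be parsed roughly as follows. This is a sketch, not the node's parser: the function name is hypothetical, and skipping blank lines and tolerating a missing reason are assumptions.

```python
def parse_blacklist(text):
    """Map storage index -> reason string for each entry line."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines (assumption) and comment lines are skipped
        # Split on the first run of whitespace: storage index, then reason.
        parts = line.split(None, 1)
        storage_index = parts[0]
        reason = parts[1] if len(parts) > 1 else ""
        entries[storage_index] = reason
    return entries

sample = "# blocked files\nwhpepioyrnff7orecjolvbudeu my puppy told me to\n"
entries = parse_blacklist(sample)
```

With the sample text, ``entries`` maps the single storage index to its reason string.
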
+The ``access.blacklist`` file will be checked each time a file or directory
+is accessed: the file's ``mtime`` is used to decide whether it needs to be
+reloaded. Therefore no node restart is necessary when creating the initial
+blacklist, nor when adding second, third, or additional entries to the list.
+When modifying the file, be careful to update it atomically, otherwise a
+request may arrive while the file is only halfway written, and the partial
+file may be incorrectly parsed.
+
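One common way to get an atomic update is to write a complete copy to a temporary file in the same directory and rename it over the original. The sketch below assumes that approach; the function name and the UTF-8 encoding are assumptions. Note that the replacement file takes on the temporary file's restrictive permissions.

```python
import os
import tempfile

def append_blacklist_entry(path, storage_index, reason):
    """Add one entry by writing a full new file and renaming it into place."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            existing = f.read()
    except FileNotFoundError:
        existing = ""  # creating the initial blacklist
    if existing and not existing.endswith("\n"):
        existing += "\n"
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".access.blacklist.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(existing + "%s %s\n" % (storage_index, reason))
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Calling it twice on a fresh path leaves a file with two complete lines, one per entry.
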
+The blacklist is applied to all access paths (including FTP, SFTP, and CLI
+operations), not just the web-API. The blacklist also applies to directories.
+If a directory is blacklisted, the gateway will refuse access to both that
+directory and any child files/directories underneath it, when accessed via
+"DIRCAP/SUBDIR/FILENAME"-style URLs. Users who go directly to the child
+file/dir will bypass the blacklist.
+
+The node will log the storage index (SI) of the file being blocked, and the
+reason string, into the ``logs/twistd.log`` file.
+
+
.. [1] URLs and HTTP and UTF-8, Oh My
HTTP does not provide a mechanism to specify the character set used to
- encode non-ascii names in URLs (rfc2396#2.1). We prefer the convention that
- the filename= argument shall be a URL-encoded UTF-8 encoded unicode object.
+ encode non-ASCII names in URLs
+ (`RFC3986#2.1 <http://tools.ietf.org/html/rfc3986#section-2.1>`_).
+ We prefer the convention that the ``filename=`` argument shall be a
+ URL-escaped UTF-8 encoded Unicode string.
For example, suppose we want to provoke the server into using a filename of
- "f i a n c e-acute e" (i.e. F I A N C U+00E9 E). The UTF-8 encoding of this
- is 0x66 0x69 0x61 0x6e 0x63 0xc3 0xa9 0x65 (or "fianc\xC3\xA9e", as python's
- repr() function would show). To encode this into a URL, the non-printable
- characters must be escaped with the urlencode '%XX' mechansim, giving us
- "fianc%C3%A9e". Thus, the first line of the HTTP request will be "GET
- /uri/CAP...?save=true&filename=fianc%C3%A9e HTTP/1.1". Not all browsers
- provide this: IE7 uses the Latin-1 encoding, which is fianc%E9e.
+ "f i a n c e-acute e" (i.e. f i a n c U+00E9 e). The UTF-8 encoding of this
+ is 0x66 0x69 0x61 0x6e 0x63 0xc3 0xa9 0x65 (or "fianc\\xC3\\xA9e", as Python's
+ ``repr()`` function would show). To encode this into a URL, the non-printable
+ characters must be escaped with the urlencode ``%XX`` mechanism, giving
+ us "fianc%C3%A9e". Thus, the first line of the HTTP request will be
+ "``GET /uri/CAP...?save=true&filename=fianc%C3%A9e HTTP/1.1``". Not all
+ browsers provide this: IE7 by default uses the Latin-1 encoding, which is
+ "fianc%E9e" (although it has a configuration option to send URLs as UTF-8).
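The two escapings mentioned above can be reproduced with Python's standard ``urllib.parse.quote``. This is only an illustration of the encoding difference, not code taken from Tahoe; the variable names are made up for the example.

```python
from urllib.parse import quote

filename = "fianc\u00e9e"

# Percent-escape the UTF-8 bytes of the name (quote() defaults to UTF-8).
utf8_escaped = quote(filename)

# Percent-escape the Latin-1 bytes instead, as IE7 does by default.
latin1_escaped = quote(filename, encoding="latin-1")

# First line of the HTTP request under the UTF-8 convention.
request_line = "GET /uri/CAP...?save=true&filename=%s HTTP/1.1" % utf8_escaped
```

Here ``utf8_escaped`` is ``fianc%C3%A9e`` and ``latin1_escaped`` is ``fianc%E9e``, matching the two forms discussed in the text.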
The response header will need to indicate a non-ASCII filename. The actual
mechanism to do this is not clear. For ASCII filenames, the response header
Content-Disposition: attachment; filename="english.txt"
- If Tahoe were to enforce the utf-8 convention, it would need to decode the
- URL argument into a unicode string, and then encode it back into a sequence
+ If Tahoe were to enforce the UTF-8 convention, it would need to decode the
+ URL argument into a Unicode string, and then encode it back into a sequence
of bytes when creating the response header. One possibility would be to use
- unencoded utf-8. Developers suggest that IE7 might accept this::
+ unencoded UTF-8. Developers suggest that IE7 might accept this::
#1: Content-Disposition: attachment; filename="fianc\xC3\xA9e"
(note, the last four bytes of that line, not including the newline, are
`RFC2231#4 <http://tools.ietf.org/html/rfc2231#section-4>`_
(dated 1997): suggests that the following might work, and
`some developers have reported <http://markmail.org/message/dsjyokgl7hv64ig3>`_
- that it is supported by firefox (but not IE7)::
+ that it is supported by Firefox (but not IE7)::
#2: Content-Disposition: attachment; filename*=utf-8''fianc%C3%A9e
However this is contrary to the examples in the email thread listed above.
Developers report that IE7 (when it is configured for UTF-8 URL encoding,
- which is not the default in asian countries), will accept::
+ which is not the default in Asian countries), will accept::
#4: Content-Disposition: attachment; filename=fianc%C3%A9e
However, for maximum compatibility, Tahoe simply copies bytes from the URL
- into the response header, rather than enforcing the utf-8 convention. This
+ into the response header, rather than enforcing the UTF-8 convention. This
means it does not try to decode the filename from the URL argument, nor does
it encode the filename into the response header.