From: Brian Warner
Date: Tue, 3 Jun 2008 06:03:21 +0000 (-0700)
Subject: docs: update webapi.txt with write-coordination issues, add TODO note to recovery...
X-Git-Tag: allmydata-tahoe-1.1.0~57
X-Git-Url: https://git.rkrishnan.org/components/%22news.html/reliability?a=commitdiff_plain;h=01469433ef2732dfe0f4810f7faa1c80e8fd01f1;p=tahoe-lafs%2Ftahoe-lafs.git

docs: update webapi.txt with write-coordination issues, add TODO note to recovery section of mutable.txt
---

diff --git a/docs/mutable.txt b/docs/mutable.txt
index 44910d94..6e9727b6 100644
--- a/docs/mutable.txt
+++ b/docs/mutable.txt
@@ -78,10 +78,18 @@ versions of the file that different parties are trying to establish as the
 one true current contents. Each simultaneous writer counts as a "competing
 version", as does the previous version of the file. If the count "S" of these
 competing versions is larger than N/k, then the file runs the risk of being
-lost completely. If at least one of the writers remains running after the
-collision is detected, it will attempt to recover, but if S>(N/k) and all
+lost completely. [TODO] If at least one of the writers remains running after
+the collision is detected, it will attempt to recover, but if S>(N/k) and all
 writers crash after writing a few shares, the file will be lost.

+Note that Tahoe uses serialization internally to make sure that a single
+Tahoe node will not perform simultaneous modifications to a mutable file. It
+accomplishes this by using a weakref cache of the MutableFileNode (so that
+there will never be two distinct MutableFileNodes for the same file), and by
+forcing all mutable file operations to obtain a per-node lock before they
+run. The Prime Coordination Directive therefore applies to inter-node
+conflicts, not intra-node ones.
+

 == Small Distributed Mutable Files ==

diff --git a/docs/webapi.txt b/docs/webapi.txt
index 7dff96fa..281452bc 100644
--- a/docs/webapi.txt
+++ b/docs/webapi.txt
@@ -1,14 +1,13 @@
 = The Tahoe REST-ful Web API =

-This document has six sections:
-
-1. the basic API for how to programmatically control your tahoe node
-2. convenience methods
-3. safety and security issues
-4. features for controlling your tahoe node from a standard web browser
-5. debugging and testing features
-6. XML-RPC (coming soon)
+1. Enabling the web-API port
+2. Basic Concepts: GET, PUT, DELETE, POST
+3. URLs, Machine-Oriented Interfaces
+4. Browser Operations: Human-Oriented Interfaces
+5. Welcome / Debug / Status pages
+6. Safety and security issues -- names vs. URIs
+7. Concurrency Issues

 == Enabling the web-API port ==

@@ -800,7 +799,7 @@ GET / (introducer status)
  clients over time.


-3. safety and security issues -- names vs. URIs
+== safety and security issues -- names vs. URIs ==

 Summary: use explicit file- and dir- caps whenever possible, to reduce the
 potential for surprises when the virtual drive is changed while you aren't
@@ -844,3 +843,45 @@ parent directory, so it isn't any harder to use the URI for this purpose.
 In general, use names if you want "whatever object (whether file or
 directory) is found by following this name (or sequence of names) when my
 request reaches the server". Use URIs if you want "this particular object".
+
+== Concurrency Issues ==
+
+Tahoe uses both mutable and immutable files. Mutable files can be created
+explicitly by doing an upload with ?mutable=true added, or implicitly by
+creating a new directory (since a directory is just a special way to
+interpret a given mutable file).
+
+Mutable files suffer from the same consistency-vs-availability tradeoff that
+all distributed data storage systems face. It is not possible to
+simultaneously achieve perfect consistency and perfect availability in the
+face of network partitions (servers being unreachable or faulty).
+
+Tahoe tries to achieve a reasonable compromise, but there is a basic rule in
+place, known as the Prime Coordination Directive: "Don't Do That". What this
+means is that if write-access to a mutable file is available to several
+parties, then those parties are responsible for coordinating their activities
+to avoid multiple simultaneous updates. This could be achieved by having
+these parties talk to each other and using some sort of locking mechanism, or
+by serializing all changes through a single writer.
+
+The consequences of performing uncoordinated writes can vary. Some of the
+writers may lose their changes, as somebody else wins the race condition. In
+many cases the file will be left in an "unhealthy" state, meaning that there
+are not as many redundant shares as we would like (reducing the reliability
+of the file against server failures). In the worst case, the file can be left
+in such an unhealthy state that no version is recoverable, even the old ones.
+It is this small possibility of data loss that prompts us to issue the Prime
+Coordination Directive.
+
+Tahoe nodes implement internal serialization to make sure that a single Tahoe
+node cannot conflict with itself. For example, it is safe to issue two
+directory modification requests to a single tahoe node's webapi server at the
+same time, because the Tahoe node will internally delay one of them until
+after the other has finished being applied. (This feature was introduced in
+Tahoe-1.1; back with Tahoe-1.0 the web client was responsible for serializing
+web requests themselves).
+
+For more details, please see the "Consistency vs Availability" and "The Prime
+Coordination Directive" sections of mutable.txt, in the same directory as
+this file.
+
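
Illustrative note (not part of the commit): the intra-node serialization described in both hunks, a weakref cache that yields one node object per file plus a per-node lock, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not Tahoe's implementation. The class names `MutableFileNode` and `NodeCache`, the methods `modify` and `get_node`, and the cap string are hypothetical stand-ins, and it uses plain threads with `threading.Lock` where the real node's concurrency model may differ.

```python
import threading
import weakref


class MutableFileNode:
    """Hypothetical stand-in for a mutable-file node (not Tahoe's class)."""

    def __init__(self, cap):
        self.cap = cap
        self.contents = b""
        self._lock = threading.Lock()  # the per-node lock

    def modify(self, modifier):
        # Each read-modify-write cycle holds the node's lock, so two
        # operations on the same node can never interleave.
        with self._lock:
            self.contents = modifier(self.contents)
            return self.contents


class NodeCache:
    """Hands out at most one live MutableFileNode per cap (assumed key)."""

    def __init__(self):
        # Entries vanish automatically once nobody else holds the node.
        self._nodes = weakref.WeakValueDictionary()
        self._cache_lock = threading.Lock()

    def get_node(self, cap):
        with self._cache_lock:
            node = self._nodes.get(cap)
            if node is None:
                node = MutableFileNode(cap)
                self._nodes[cap] = node
            return node


def run_concurrent_appends(cache, cap, writers=2, appends=200):
    """Start several writers against the same cap and return the node."""
    node = cache.get_node(cap)

    def worker():
        # Every writer looks the node up again and gets the same object.
        same_node = cache.get_node(cap)
        for _ in range(appends):
            same_node.modify(lambda old: old + b"x")

    threads = [threading.Thread(target=worker) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return node


if __name__ == "__main__":
    cache = NodeCache()
    node = run_concurrent_appends(cache, "URI:hypothetical-cap")
    print(len(node.contents))
```

Because every caller inside one process receives the same node object and must take its lock, no append is lost. Nothing in this sketch coordinates writers running in separate nodes, which is the inter-node case the Prime Coordination Directive leaves to the parties holding write access.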