What makes up a Tahoe "grid"? The rough answer is a fairly-stable set of
storage servers.

The read- and write-caps that point to files and directories are scoped to a
particular set of servers. The Tahoe peer-selection and erasure-coding
algorithms provide high availability as long as there is significant overlap
between the servers that were used for upload and the servers that are
available for subsequent download. When new peers are added, the shares will
get spread out in the search space, so clients must work harder to download
their files. When peers are removed, shares are lost, and file health is
threatened. Repair bandwidth must be used to generate new shares, so cost
increases with the rate of server departure. If servers leave the grid too
quickly, repair may not be able to keep up, and files will be lost.

So to get long-term stability, we need that peer set to remain fairly stable.
A peer which joins the grid needs to stick around for a while.

The current Tahoe read-cap format doesn't admit the existence of multiple
grids. In fact, the "URI:" prefix implies that these cap strings are
universal: it suggests that this string (plus some protocol definition) is
completely sufficient to recover the file.

However, there are a variety of reasons why we may want to have more than one
Tahoe grid in the world:

* scaling: there are a variety of problems that are likely to be encountered
  as we attempt to grow a Tahoe grid from a few dozen servers to a few
  thousand, some of which are easier to deal with than others. Maintaining
  connections to servers and keeping up-to-date on the locations of servers
  is one issue. There are design improvements that can work around these,
  but they will take time, and we may not want to wait for that work to be
  done. Being able to deploy multiple grids may be the best way to get a
  large number of clients using Tahoe at once.

* managing quality of storage, storage allocation: the members of a
  friendnet may want to restrict access to storage space to just each other,
  and may want to run their grid without involving any external coordination

* commercial goals: a company using Tahoe may want to restrict access to
  storage space to just their customers

* protocol upgrades, development: new and experimental versions of the Tahoe
  software may need to be deployed and analyzed in isolation from the grid
  that clients are using for active storage

So if we define a grid to be a set of storage servers, then two distinct
grids will have two distinct sets of storage servers. Clients are free to use
whichever grid they like (and have permission to use); however, each time
they upload a file, they must choose a specific grid to put it in. Clients
can upload the same file to multiple grids in two separate upload operations.

== Grid IDs in URIs ==

Each URI needs to be scoped to a specific grid, to avoid confusion ("I looked
for URI123 and it said File Not Found.. oh, which grid did you upload that
into?"). To accomplish this, the URI will contain a "grid identifier" that
references a specific Tahoe grid. The grid ID is shorthand for a relatively
stable set of storage servers.

To make the URIs actually Universal, there must be a way to get from the grid
ID to the actual grid. This document defines a protocol by which a client
that wants to download a file from a previously-unknown grid will be able to
locate and connect to that grid.

== Grid ID specification ==

The Grid ID is a string, using a fairly limited character set: alphanumerics
plus possibly a few others. It can be very short: a grid ID of just "0" can
be used. The grid ID will be copied into the cap string for every file that
is uploaded to that grid, so there is pressure to keep them short.

The cap format needs to be able to distinguish the grid ID from the rest of
the cap. This could be expressed in DNS-style dot notation; for example, the
directory write-cap with a write-key of "0ZrD.." that lives on grid ID "foo"
could be expressed as "D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo" .

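As a concrete illustration of this dot notation, here is a minimal sketch
(in Python) of how a client might split such a dotted cap string into its
cap body and grid ID. The exact cap format is still undecided; the function
name and the error handling here are illustrative assumptions, not part of
the specification.

  def split_cap(dotted_cap):
      """Return (cap_body, grid_id) for a cap like 'D0ZrD....foo'."""
      # The grid ID is everything after the last dot; the rest is the cap.
      cap_body, _, grid_id = dotted_cap.rpartition(".")
      if not cap_body:
          raise ValueError("cap string does not contain a grid ID suffix")
      return cap_body, grid_id

  # split_cap("D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo")
  #   -> ("D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm", "foo")
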
* design goals: non-word-breaking, double-click-pasteable, maybe
  human-readable (do humans need to know which grid is being used? probably
  not)
* does not need to be Secure (i.e. long and unguessable), but we must
  analyze the sorts of DoS attack that can result if it is not (and even
  if it is)
* does not need to be human-memorable, although that may assist debugging
  and discussion ("my file is on grid 4, where is yours?")
* *does* need to be unique, but the total number of grids is fairly small
  (counted in the hundreds or thousands rather than millions or billions)
  and we can afford to coordinate the use of short names. Folks who don't
  like coordination can pick a largeish random string.

Each announcement that a Storage Server publishes (to introducers) will
include its grid ID. If a server participates in multiple grids, it will make
multiple announcements, each with a single grid ID. Clients will be able to
ask an introducer for information about all storage servers that participate
in a specific grid.

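For illustration, a per-grid announcement might look like the following
sketch. The field names ("service", "gridid", "furl") and the FURL value are
hypothetical, not the actual Tahoe announcement schema; the point is simply
that a server in two grids publishes one record per grid, and clients filter
by grid ID.

  # Two announcements from the same server, one per grid it participates in.
  announcements = [
      {"service": "storage", "gridid": "foo",
       "furl": "pb://tubid@host:1234/storage"},
      {"service": "storage", "gridid": "bar",
       "furl": "pb://tubid@host:1234/storage"},
  ]

  def servers_for_grid(announcements, gridid):
      """Return the announcements an introducer would hand to a client that
      asked about one specific grid."""
      return [a for a in announcements if a["gridid"] == gridid]
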
Clients are likely to have a default grid ID, to which they upload files. If
a client is adding a file to a directory that lives in a different grid, they
may upload the file to that other grid instead of their default.

== Getting from a Grid ID to a grid ==

When a client decides to download a file, it starts by unpacking the cap and
extracting the grid ID.

Then it attempts to connect to at least one introducer for that grid, by
performing the following steps:

 hash $GRIDID (with some tag) to get a long base32-encoded string: $HASH

 GET http://tahoe-$HASH.com/introducer/gridid/$GRIDID

the results should be a JSON-encoded list of introducer FURLs

for extra redundancy, if that query fails, perform the following additional
queries:

 GET http://tahoe-$HASH.net/introducer/gridid/$GRIDID
 GET http://tahoe-$HASH.org/introducer/gridid/$GRIDID
 GET http://tahoe-$HASH.tv/introducer/gridid/$GRIDID
 GET http://tahoe-$HASH.info/introducer/gridid/$GRIDID

 GET http://grids.tahoe-lafs.org/introducer/gridid/$GRIDID

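The whole lookup can be sketched with nothing but the Python standard
library. The tag string, the choice of SHA-256, and the exact JSON shape are
assumptions (the text above only says to hash the grid ID with some tag,
base32-encode it, and fetch a JSON list of introducer FURLs); this version
queries every fallback and merges the results, which is the DoS mitigation
suggested below.

  import hashlib, base64, json
  from urllib.request import urlopen

  TAG = b"tahoe-gridid-v1"   # hypothetical tag string

  def gridid_hash(gridid):
      # hash $GRIDID (with some tag) to get a long base32-encoded string
      digest = hashlib.sha256(TAG + b":" + gridid.encode("ascii")).digest()
      return base64.b32encode(digest).decode("ascii").rstrip("=").lower()

  def find_introducers(gridid, tlds=("com", "net", "org", "tv", "info")):
      h = gridid_hash(gridid)
      urls = ["http://tahoe-%s.%s/introducer/gridid/%s" % (h, tld, gridid)
              for tld in tlds]
      urls.append("http://grids.tahoe-lafs.org/introducer/gridid/%s" % gridid)
      furls = set()
      for url in urls:
          try:
              furls.update(json.load(urlopen(url)))   # JSON list of FURLs
          except Exception:
              continue   # that fallback failed; try the next one
      return sorted(furls)
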
The first few introducers should be able to announce other introducers, via
the distributed gossip-based introduction scheme of #68.

Some properties of this scheme:

* claiming a grid ID is cheap: a single domain name registration (in an
  uncontested namespace), and a simple web server. allmydata.com can publish
  introducer FURLs for grids that don't want to register their own domain.

* lookup is at least as robust as DNS. By using benevolent public services
  like tahoe-grids.allmydata.com, reliability can be increased further. The
  HTTP fetch can return a list of every known server node, all of which can
  act as introduction points.

* not secure: anyone who can interfere with DNS lookups (or claims
  tahoe-$HASH.com before you do) can cause clients to connect to their
  servers instead of yours. This admits a moderate DoS attack against
  download availability. Performing multiple queries (to .net, .org, etc)
  and merging the results may mitigate this (you'll get their servers *and*
  your servers; the download search will be slower but is still likely to
  succeed). It may admit an upload DoS attack as well, or an upload
  file-reliability attack (trick you into uploading to unreliable servers),
  depending upon how the "server selection policy" (see below) is
  implemented.

Once the client is connected to an introducer, it will see if there is a
Helper who is willing to assist with the upload or download. (For download,
this might reduce the number of connections that the grid's storage servers
must deal with). If not, it asks the introducers for storage servers, and
connects to them directly.

== Controlling Access ==

The introducers are not used to enforce access control. Instead, a system of
public keys is used.

There are a few kinds of access control that we might want to implement:

* protect storage space: only let authorized clients upload/consume storage
* protect download bandwidth: only give shares to authorized clients
* protect share reliability: only upload shares to "good" servers

The first two are implemented by the servers, to protect their resources. The
last is implemented by the client, to avoid uploading shares to unreliable
servers (specifically, to maximize the utility of the client's limited upload
bandwidth: there's no problem with putting shares on unreliable peers per se,
but it is a problem if doing so means the client won't put a share on a more
reliable server).

The first limitation (protect storage space) will be implemented by public
keys and signed "storage authority" certificates. The client will present
some credentials to the storage server to convince it that the client
deserves the space. When storage servers are in this mode, they will have a
certificate that names a public key, and any credentials that can demonstrate
a path from that key will be accepted. This scheme is described in
docs/proposed/old-accounts-pubkey.txt .

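A structural sketch of that check, under stated assumptions: verify_sig() is
a placeholder for a real public-key signature verification (the standard
library has none), and the certificate fields below are illustrative rather
than the format defined in docs/proposed/old-accounts-pubkey.txt. The server
accepts the client only if a chain of signed delegations leads from its
configured root key to the client's key.

  def verify_sig(signer_pubkey, signed_bytes, signature):
      # Placeholder: substitute a real signature check (e.g. ECDSA/Ed25519).
      raise NotImplementedError

  def client_is_authorized(root_pubkey, cert_chain, client_pubkey):
      """cert_chain is a list of dicts, each delegating storage authority
      from its signer to cert["subject_pubkey"], starting from the root."""
      signer = root_pubkey
      for cert in cert_chain:
          if not verify_sig(signer, cert["body"], cert["signature"]):
              return False
          signer = cert["subject_pubkey"]
      return signer == client_pubkey
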
The second limitation is unexplored. The read-cap does not currently contain
any notion of who must pay for the bandwidth incurred.

The third limitation (only upload to "good" servers), when enabled, is
implemented by a "server selection policy" on the client side, which defines
which server credentials will be accepted. This is just like the first
limitation in reverse. Before clients consider including a server in their
peer selection algorithm, they check the credentials, and ignore any that do
not satisfy the policy.

This means that a client may not wish to upload anything to "foreign grids",
because they have no promise of reliability. The reasons that a client might
want to upload to a foreign grid need to be examined: reliability may not be
important, or it might be good enough to upload the file to the client's
home grid instead.

The server selection policy is intended to be fairly open-ended: we can
imagine a policy that says "upload to any server that has a good reputation
among group X", or more complicated schemes that require less and less
centralized management. One important and simple scheme is to have a list of
acceptable keys: a friendnet with 5 members would include 5 such keys in
each policy, enabling every member to use the services of the others,
without having a single central manager with unilateral control over the
definition of the group.

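The friendnet case reduces to a set-membership test. In the sketch below the
announcement field name ("server_pubkey") and the key values are
illustrative assumptions; the policy itself is just "accept a server only if
its key appears in my list".

  FRIENDNET_KEYS = {"pubkey-alice", "pubkey-bob", "pubkey-carol",
                    "pubkey-dave", "pubkey-erin"}

  def acceptable_servers(announcements, policy_keys=FRIENDNET_KEYS):
      """Filter announced servers down to the ones this client is willing
      to include in its peer-selection algorithm."""
      return [a for a in announcements
              if a.get("server_pubkey") in policy_keys]
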
To implement these access controls, each client needs to be configured with
the following:

* home grid ID (used to find introducers, helpers, storage servers)
* storage authority (certificate to enable uploads)
* server selection policy (identify good/reliable servers)

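Bundled together, the client-side state might look like the following
sketch. None of these key names are real tahoe.cfg settings; they merely
restate the three items above in one place.

  client_config = {
      "home_gridid": "foo",               # used to find introducers, servers
      "storage_authority_cert": "...",    # certificate enabling uploads
      "server_selection_policy": {
          "accept_keys": ["pubkey-alice", "pubkey-bob"],
      },
  }
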
If the server selection policy indicates centralized control (i.e. there is
some single key X which is used to sign the credentials for all "good"
servers), then this could be built into the grid ID. By using the base32
hash of the pubkey as the grid ID, clients would only need to be configured
with two things: the grid ID and their storage authority. In this case, the
introducer would provide the pubkey, and the client would compare the hashes
to make sure they match. This is analogous to how a TubID is used in a FURL.

Such grids would have significantly larger grid IDs, 24 characters or more.

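A sketch of that verification, assuming SHA-256 and lowercase base32 (both
assumptions; the document does not fix the hash or the truncation length):
the client derives the expected grid ID from the pubkey the introducer hands
it, and refuses to proceed on a mismatch, just as a Foolscap client checks
the TubID embedded in a FURL.

  import hashlib, base64

  def gridid_from_pubkey(pubkey_bytes, length=24):
      digest = hashlib.sha256(pubkey_bytes).digest()
      b32 = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
      return b32[:length]

  def check_introducer_pubkey(configured_gridid, announced_pubkey):
      expected = gridid_from_pubkey(announced_pubkey, len(configured_gridid))
      if expected != configured_gridid:
          raise ValueError("introducer pubkey does not match the grid ID")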