docs/proposed/GridID.txt

   1 = Grid Identifiers =
   2
   3 What makes up a Tahoe "grid"? The rough answer is a fairly-stable set of
   4 Storage Servers.
   5
   6 The read- and write- caps that point to files and directories are scoped to a
   7 particular set of servers. The Tahoe peer-selection and erasure-coding
   8 algorithms provide high availability as long as there is significant overlap
   9 between the servers that were used for upload and the servers that are
  10 available for subsequent download. When new peers are added, the shares will
  11 get spread out in the search space, so clients must work harder to download
  12 their files. When peers are removed, shares are lost, and file health is
  13 threatened. Repair bandwidth must be used to generate new shares, so cost
  14 increases with the rate of server departure. If servers leave the grid too
  15 quickly, repair may not be able to keep up, and files will be lost.
  16
  17 So to get long-term stability, we need that peer set to remain fairly stable.
  18 A peer which joins the grid needs to stick around for a while.
  19
  20 == Multiple Grids ==
  21
  22 The current Tahoe read-cap format doesn't admit the existence of multiple
  23 grids. In fact, the "URI:" prefix implies that these cap strings are
  24 universal: it suggests that this string (plus some protocol definition) is
  25 completely sufficient to recover the file.
  26
  27 However, there are a variety of reasons why we may want to have more than one
  28 Tahoe grid in the world:
  29
  30  * scaling: there are a variety of problems that are likely to be encountered
  31    as we attempt to grow a Tahoe grid from a few dozen servers to a few
  32    thousand, some of which are easier to deal with than others. Maintaining
  33    connections to servers and keeping up-to-date on the locations of servers
  34    is one issue. There are design improvements that can work around these,
  35    but they will take time, and we may not want to wait for that work to be
  36    done. Being able to deploy multiple grids may be the best way to get a
  37    large number of clients using Tahoe at once.
  38
  39  * managing quality of storage, storage allocation: the members of a
  40    friendnet may want to restrict access to storage space to just each other,
  41    and may want to run their grid without involving any external coordination
  42
  43  * commercial goals: a company using Tahoe may want to restrict access to
  44    storage space to just their customers
  45
  46  * protocol upgrades, development: new and experimental versions of the Tahoe
  47    software may need to be deployed and analyzed in isolation from the grid
  48    that clients are using for active storage
  49
  50 So if we define a grid to be a set of storage servers, then two distinct
  51 grids will have two distinct sets of storage servers. Clients are free to use
  52 whichever grid they like (and have permission to use); however, each time they
  53 upload a file, they must choose a specific grid to put it in. Clients can
  54 upload the same file to multiple grids in two separate upload operations.
  55
  56 == Grid IDs in URIs ==
  57
  58 Each URI needs to be scoped to a specific grid, to avoid confusion ("I looked
  59 for URI123 and it said File Not Found.. oh, which grid did you upload that
  60 into?"). To accomplish this, the URI will contain a "grid identifier" that
  61 references a specific Tahoe grid. The grid ID is shorthand for a relatively
  62 stable set of storage servers.
  63
  64 To make the URIs actually Universal, there must be a way to get from the grid
  65 ID to the actual grid. This document defines a protocol by which a client
  66 that wants to download a file from a previously-unknown grid will be able to
  67 locate and connect to that grid.
  68
  69 == Grid ID specification ==
  70
  71 The Grid ID is a string drawn from a fairly limited character set:
  72 alphanumerics plus possibly a few others. It can be very short: a grid ID of
  73 just "0" can be used. The grid ID will be copied into the cap string for every
  74 file that is uploaded to that grid, so there is pressure to keep it short.
  75
  76 The cap format needs to be able to distinguish the grid ID from the rest of
  77 the cap. This could be expressed in DNS-style dot notation: for example, the
  78 directory write-cap with a write-key of "0ZrD.." that lives on grid ID "foo"
  79 could be expressed as "D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo" .
  80
  81  * design goals: non-word-breaking, double-click-pasteable, maybe
  82    human-readable (do humans need to know which grid is being used? probably
  83    not).
  84  * does not need to be Secure (i.e. long and unguessable), but we must
  85    analyze the sorts of DoS attack that can result if it is not (and even
  86    if it is)
  87  * does not need to be human-memorable, although that may assist debugging
  88    and discussion ("my file is on grid 4, where is yours?")
  89  * *does* need to be unique, but the total number of grids is fairly small
  90    (counted in the hundreds or thousands rather than millions or billions)
  91    and we can afford to coordinate the use of short names. Folks who don't
  92    like coordination can pick a largeish random string.
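
As a rough illustration of the dot-notation idea above, the following Python
sketch splits a cap string into its body and its grid ID. The function and
class names are hypothetical, and the sketch assumes the grid ID is purely
ASCII alphanumeric and is always the part after the last dot; the real cap
format is not yet settled.

```python
from typing import NamedTuple


class ScopedCap(NamedTuple):
    """A cap string split into its grid-independent body and its grid ID."""
    body: str
    grid_id: str


def split_cap(cap: str) -> ScopedCap:
    # Assumed format: "<cap body>.<grid ID>". Split at the LAST dot so the
    # body itself is free to contain dots.
    body, sep, grid_id = cap.rpartition(".")
    if not sep or not body or not grid_id:
        raise ValueError("cap has no grid ID suffix: %r" % (cap,))
    # Assumption: grid IDs are ASCII alphanumerics only.
    if not (grid_id.isascii() and grid_id.isalnum()):
        raise ValueError("grid ID must be alphanumeric: %r" % (grid_id,))
    return ScopedCap(body, grid_id)
```

With this sketch, split_cap("D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo") yields
the body "D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm" and the grid ID "foo", and a
string with no dot is rejected.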
  93
  94 Each announcement that a Storage Server publishes (to introducers) will
  95 include its grid id. If a server participates in multiple grids, it will make
  96 multiple announcements, each with a single grid id. Clients will be able to
  97 ask an introducer for information about all storage servers that participate
  98 in a specific grid.
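
One way to picture the announcement rule is an introducer-side index keyed by
grid ID. This is only a sketch: the Announcement and IntroducerIndex names
are hypothetical, the FURL strings are opaque placeholders, and a real
introducer would also have to handle expiry and authentication.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Announcement:
    server_furl: str  # how to reach the storage server (opaque placeholder)
    grid_id: str      # exactly one grid ID per announcement


class IntroducerIndex:
    """Hypothetical in-memory index of announcements, grouped by grid ID."""

    def __init__(self) -> None:
        self._by_grid: Dict[str, List[Announcement]] = defaultdict(list)

    def publish(self, ann: Announcement) -> None:
        # A server in several grids publishes one announcement per grid.
        if ann not in self._by_grid[ann.grid_id]:
            self._by_grid[ann.grid_id].append(ann)

    def servers_for(self, grid_id: str) -> List[str]:
        # Clients ask for every server that participates in one grid.
        return [a.server_furl for a in self._by_grid.get(grid_id, [])]
```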
  99
 100 Clients are likely to have a default grid id, to which they upload files. If
 101 a client is adding a file to a directory that lives in a different grid, they
 102 may upload the file to that other grid instead of their default.
 103
 104 == Getting from a Grid ID to a grid ==
 105
 106 When a client decides to download a file, it starts by unpacking the cap and
 107 extracting the grid ID.
 108
 109 Then it attempts to connect to at least one introducer for that grid, by
 110 leveraging DNS:
 111
  112  hash $GRIDID (with some tag) to get a long base32-encoded string: $HASH
 113
 114  GET http://tahoe-$HASH.com/introducer/gridid/$GRIDID
 115
 116  the results should be a JSON-encoded list of introducer FURLs
 117
 118  for extra redundancy, if that query fails, perform the following additional
 119  queries:
 120
 121   GET http://tahoe-$HASH.net/introducer/gridid/$GRIDID
 122   GET http://tahoe-$HASH.org/introducer/gridid/$GRIDID
 123   GET http://tahoe-$HASH.tv/introducer/gridid/$GRIDID
 124   GET http://tahoe-$HASH.info/introducer/gridid/$GRIDID
 125    etc.
 126   GET http://grids.tahoe-lafs.org/introducer/gridid/$GRIDID
 127
 128  The first few introducers should be able to announce other introducers, via
 129  the distributed gossip-based introduction scheme of #68.
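
The lookup steps above can be sketched as follows. Several details here are
assumptions, since this document does not pin them down: SHA-256 as the hash,
the tag string, lowercase unpadded base32, and a TLD list containing only the
suffixes named above. The sketch only builds the URLs to try, in order; it
performs no network I/O.

```python
import base64
import hashlib
from typing import List

TAG = b"tahoe-gridid-v1:"  # hypothetical tag; the real tag is unspecified
TLDS = ["com", "net", "org", "tv", "info"]  # the suffixes named in the text
FALLBACK_HOST = "grids.tahoe-lafs.org"


def gridid_hash(grid_id: str) -> str:
    # Assumption: SHA-256 over tag + grid ID, encoded as lowercase base32
    # with the "=" padding removed so the result is usable in a hostname.
    digest = hashlib.sha256(TAG + grid_id.encode("ascii")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def introducer_urls(grid_id: str) -> List[str]:
    """Return the URLs to query, in order, for a grid's introducer list."""
    # Assumes the grid ID is alphanumeric, so it needs no URL quoting.
    h = gridid_hash(grid_id)
    path = "/introducer/gridid/" + grid_id
    urls = ["http://tahoe-%s.%s%s" % (h, tld, path) for tld in TLDS]
    urls.append("http://%s%s" % (FALLBACK_HOST, path))
    return urls
```

Under these assumptions the hash is 52 characters long, so the "tahoe-$HASH"
label is 58 characters, which fits within the 63-character limit on a single
DNS label.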
 130
 131 Properties:
 132
 133  * claiming a grid ID is cheap: a single domain name registration (in an
 134    uncontested namespace), and a simple web server. allmydata.com can publish
 135    introducer FURLs for grids that don't want to register their own domain.
 136
 137  * lookup is at least as robust as DNS. By using benevolent public services
 138    like tahoe-grids.allmydata.com, reliability can be increased further. The
 139    HTTP fetch can return a list of every known server node, all of which can
 140    act as introducers.
 141
 142  * not secure: anyone who can interfere with DNS lookups (or claims
 143    tahoe-$HASH.com before you do) can cause clients to connect to their
 144    servers instead of yours. This admits a moderate DoS attack against
 145    download availability. Performing multiple queries (to .net, .org, etc)
 146    and merging the results may mitigate this (you'll get their servers *and*
 147    your servers; the download search will be slower but is still likely to
 148    succeed). It may admit an upload DoS attack as well, or an upload
 149    file-reliability attack (trick you into uploading to unreliable servers)
 150    depending upon how the "server selection policy" (see below) is
 151    implemented.
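
The "perform multiple queries and merge the results" mitigation might look
like the sketch below. It assumes each response body is a JSON-encoded list
of FURL strings, as described above, and that a failed query is represented
as None; the function name is hypothetical. Malformed responses are skipped
rather than treated as fatal, so one hostile or broken domain cannot block
the lookup.

```python
import json
from typing import Iterable, List, Optional


def merge_introducer_lists(responses: Iterable[Optional[str]]) -> List[str]:
    """Merge per-domain responses into one de-duplicated list of FURLs."""
    merged: List[str] = []
    seen = set()
    for body in responses:
        if body is None:  # this query failed; try the next response
            continue
        try:
            furls = json.loads(body)
        except ValueError:  # not valid JSON
            continue
        if not isinstance(furls, list):
            continue
        for furl in furls:
            # Keep first-seen order, drop duplicates and non-string entries.
            if isinstance(furl, str) and furl not in seen:
                seen.add(furl)
                merged.append(furl)
    return merged
```

Note that merging only helps availability: an attacker's introducers still
end up in the list, so the download search gets slower, as noted above.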
 152
  153 Once the client is connected to an introducer, it will see if there is a
  154 Helper who is willing to assist with the upload or download. (For download,
  155 this might reduce the number of connections that the grid's storage servers
  156 must deal with.) If not, it asks the introducers for storage servers and
  157 connects to them directly.
 158
 159 == Controlling Access ==
 160
 161 The introducers are not used to enforce access control. Instead, a system of
  162 public keys is used.
 163
 164 There are a few kinds of access control that we might want to implement:
 165
 166  * protect storage space: only let authorized clients upload/consume storage
 167  * protect download bandwidth: only give shares to authorized clients
 168  * protect share reliability: only upload shares to "good" servers
 169
  170 The first two are implemented by the server, to protect its resources. The
 171 last is implemented by the client, to avoid uploading shares to unreliable
 172 servers (specifically, to maximize the utility of the client's limited upload
 173 bandwidth: there's no problem with putting shares on unreliable peers per se,
 174 but it is a problem if doing so means the client won't put a share on a more
 175 reliable peer).
 176
 177 The first limitation (protect storage space) will be implemented by public
 178 keys and signed "storage authority" certificates. The client will present
 179 some credentials to the storage server to convince it that the client
 180 deserves the space. When storage servers are in this mode, they will have a
 181 certificate that names a public key, and any credentials that can demonstrate
 182 a path from that key will be accepted. This scheme is described in
 183 docs/proposed/old-accounts-pubkey.txt .
 184
 185 The second limitation is unexplored. The read-cap does not currently contain
 186 any notion of who must pay for the bandwidth incurred.
 187
 188 The third limitation (only upload to "good" servers), when enabled, is
 189 implemented by a "server selection policy" on the client side, which defines
 190 which server credentials will be accepted. This is just like the first
 191 limitation in reverse. Before clients consider including a server in their
  192 peer selection algorithm, they check its credentials, and ignore any server
  193 whose credentials do not satisfy the policy.
 194
 195 This means that a client may not wish to upload anything to "foreign grids",
 196 because they have no promise of reliability. The reasons that a client might
 197 want to upload to a foreign grid need to be examined: reliability may not be
 198 important, or it might be good enough to upload the file to the client's
 199 "home grid" instead.
 200
 201 The server selection policy is intended to be fairly open-ended: we can
 202 imagine a policy that says "upload to any server that has a good reputation
 203 among group X", or more complicated schemes that require less and less
 204 centralized management. One important and simple scheme is to simply have a
 205 list of acceptable keys: a friendnet with 5 members would include 5 such keys
 206 in each policy, enabling every member to use the services of the others,
 207 without having a single central manager with unilateral control over the
 208 definition of the group.
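
The simple "list of acceptable keys" scheme could be sketched like this. All
names are hypothetical, keys are represented as opaque strings, and the
sketch assumes that signature verification of each server's credentials has
already happened elsewhere; it only shows the filtering step that runs before
peer selection.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Server:
    name: str
    signer_key: str  # key that vouches for this server (opaque placeholder)


@dataclass(frozen=True)
class KeyListPolicy:
    """Accept any server vouched for by one of a fixed set of keys."""
    acceptable_keys: FrozenSet[str]

    def accepts(self, server: Server) -> bool:
        return server.signer_key in self.acceptable_keys


def eligible_servers(policy: KeyListPolicy,
                     servers: Iterable[Server]) -> List[Server]:
    # Servers that fail the policy never reach the peer selection algorithm.
    return [s for s in servers if policy.accepts(s)]
```

In a five-member friendnet, each member's policy would hold the same five
keys, so no single member controls who counts as part of the group.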
 209
 210 == Closed Grids ==
 211
 212 To implement these access controls, each client needs to be configured with
 213 three things:
 214
 215  * home grid ID (used to find introducers, helpers, storage servers)
 216  * storage authority (certificate to enable uploads)
 217  * server selection policy (identify good/reliable servers)
 218
 219 If the server selection policy indicates centralized control (i.e. there is
 220 some single key X which is used to sign the credentials for all "good"
  221 servers), then this could be built into the grid ID. By using the base32
 222 hash of the pubkey as the grid ID, clients would only need to be configured
 223 with two things: the grid ID, and their storage authority. In this case, the
 224 introducer would provide the pubkey, and the client would compare the hashes
 225 to make sure they match. This is analogous to how a TubID is used in a FURL.
 226
 227 Such grids would have significantly larger grid IDs, 24 characters or more.
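
A minimal sketch of the hash comparison, under assumptions this document does
not fix: SHA-256 as the hash, truncation to 15 bytes (which encodes to
exactly 24 base32 characters), and lowercase output. The public key is
treated as an opaque byte string and the function names are hypothetical.

```python
import base64
import hashlib


def grid_id_from_pubkey(pubkey: bytes) -> str:
    # Assumption: SHA-256 truncated to 15 bytes. 15 bytes is 120 bits, which
    # base32 encodes as exactly 24 characters with no padding.
    digest = hashlib.sha256(pubkey).digest()[:15]
    return base64.b32encode(digest).decode("ascii").lower()


def pubkey_matches_grid_id(pubkey: bytes, grid_id: str) -> bool:
    # The client recomputes the hash of the introducer-supplied pubkey and
    # compares it with the grid ID it was configured with.
    return grid_id_from_pubkey(pubkey) == grid_id
```

A client configured with only the grid ID and its storage authority would
call pubkey_matches_grid_id() on the key supplied by the introducer, and
refuse to use that key as its server selection policy if the check fails.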