.. -*- coding: utf-8-with-signature-unix; fill-column: 77 -*-

There are several ways you could use Tahoe-LAFS as a key-value store.

Looking only at things that are *already implemented*, there are three
options: immutable files, mutable files, and directories.

This is spelled "`PUT /uri`_" in the API.

Note: the user (client code) of this API does not get to choose the key!
The key is determined programmatically using secure hash functions and
encryption of the value and of the optional "added convergence secret".

This is spelled "`GET /uri/$FILECAP`_" in the API. "$FILECAP" is the
key.

For details, see "immutable files" in `performance.rst`_, but in summary:
the performance is not great but not bad.

`performance.rst`_ doesn't mention that if the size of the A-byte immutable
file is less than or equal to `55 bytes`_ then the performance cost is much
smaller, because the value gets packed into the key. Added a ticket:
`#2226`_.

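As a rough illustration of the two immutable-file operations described above, the following sketch builds (but does not send) the corresponding HTTP requests using only the Python standard library. The gateway address ``http://127.0.0.1:3456`` and the helper names ``build_put`` and ``build_get`` are assumptions made for this example, not something the text above specifies.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a Tahoe-LAFS web gateway is reachable at this address.
GATEWAY = "http://127.0.0.1:3456"


def build_put(value: bytes, gateway: str = GATEWAY) -> Request:
    """key <- put(value): PUT /uri, with the value as the request body."""
    return Request(f"{gateway}/uri", data=value, method="PUT")


def build_get(filecap: str, gateway: str = GATEWAY) -> Request:
    """value <- get(key): GET /uri/$FILECAP.

    The cap is percent-encoded so that characters such as ':' are
    safe to place in the URL path.
    """
    return Request(f"{gateway}/uri/{quote(filecap, safe='')}", method="GET")
```

With a running gateway, passing these objects to ``urllib.request.urlopen`` would perform the calls. Note that the caller supplies only the value to ``build_put``; the key comes back from the node and is then handed to ``build_get`` unchanged.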
This is spelled "`PUT /uri?format=mdmf`_".

Note: again, the key cannot be chosen by the user! The key is
determined programmatically using secure hash functions and RSA public
key cryptography.

This is spelled "`GET /uri/$FILECAP`_". Again, the "$FILECAP" is the
key. This is the same API as for getting the value from an immutable file,
above. Whether the value you get this way is immutable (i.e. it will
always be the same value) or mutable (i.e. an authorized person can
change what value you get when you read) depends on the type of the
key.

Again, for details, see "mutable files" in `performance.rst`_ (and
`these tickets`_ about how that doc is incomplete), but in summary, the
performance of the create() operation is *terrible*! (It involves
generating a 2048-bit RSA key pair.) The performance of the set and get
operations is probably merely not great but not bad.

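On the wire, creating a mutable file differs from the immutable upload only in its query string. A minimal sketch of that request, again only constructed and not sent; the gateway address and the name ``build_create_mutable`` are assumptions for illustration. Reading the value back uses the same ``GET /uri/$FILECAP`` request as for immutable files.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a Tahoe-LAFS web gateway is reachable at this address.
GATEWAY = "http://127.0.0.1:3456"


def build_create_mutable(initial_value: bytes = b"",
                         gateway: str = GATEWAY) -> Request:
    """create(): PUT /uri?format=mdmf, with the initial value as the body."""
    query = urlencode({"format": "mdmf"})
    return Request(f"{gateway}/uri?{query}", data=initial_value, method="PUT")
```

Because the node has to generate an RSA key pair to serve this request, an application that needs many keys would likely want to avoid issuing one such call per key.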
* directory ← create()

  This is spelled "`POST /uri?t=mkdir`_".

  `performance.rst`_ does not mention directories (`#2228`_), but in order
  to understand the performance of directories you have to understand how
  they are implemented. Mkdir creates a new mutable file, exactly the
  same, and with exactly the same performance, as the "create() mutable"
  operation.

* set(directory, key, value)

  This is spelled "`PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". "$DIRCAP"
  is the directory, "FILENAME" is the key. The value is the body of the
  HTTP PUT request. The part about "[SUBDIRS../]" in there is for
  optional nesting, which you can ignore here.

  This way, you *do* get to choose the key to be whatever you want (an
  arbitrary Unicode string).

  To understand the performance of ``PUT /uri/$directory/$key``,
  understand that this proceeds in two steps: first it uploads the value
  as an immutable file, exactly the same as "put(value)" from the
  immutable API above. So right there you've already paid exactly the
  same cost as if you had used that API. Then after it has finished
  uploading that, and it has the immutable file cap from that operation
  in hand, it downloads the entire current directory, changes it to
  include the mapping from key to the immutable file cap, and re-uploads
  the entire directory. So that has a cost which is easy to understand:
  you have to download and re-upload the entire directory, which is the
  entire set of mappings from user-chosen keys (Unicode strings) to
  immutable file caps. Each entry in the directory occupies something on
  the order of 300 bytes.

  So the "set()" call from this directory-based API obviously has much
  worse performance than the equivalent "set()" calls from the
  immutable-file-based API or the mutable-file-based API. This is not
  necessarily worse overall than the performance of the
  mutable-file-based API if you take into account the cost of the
  necessary create() calls.

* value ← get(directory, key)

  This is spelled "`GET /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". As above,
  "$DIRCAP" is the directory, "FILENAME" is the key.

  The performance of this is determined by the fact that it first
  downloads the entire directory, then finds the immutable filecap for
  the given key, then does a GET on that immutable filecap. So again,
  it is strictly worse than using the immutable file API (about twice
  as bad, if the directory size is similar to the value size).

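The cost reasoning for the directory-based set() and get() calls can be put into numbers with a back-of-the-envelope model. The sketch below counts only payload bytes moved between client and grid, using the figure of roughly 300 bytes per directory entry quoted above. It ignores erasure-coding expansion, protocol overhead and round-trip latency, and it treats the directory as the same size before and after the update; the function names are invented for this example.

```python
# Rough per-entry size quoted in the text ("on the order of 300 bytes").
ENTRY_BYTES = 300


def directory_bytes(num_entries: int) -> int:
    """Approximate size of a directory holding num_entries mappings."""
    return num_entries * ENTRY_BYTES


def set_cost_bytes(value_size: int, num_entries: int) -> int:
    """Upload the value, then download and re-upload the whole directory."""
    return value_size + 2 * directory_bytes(num_entries)


def get_cost_bytes(value_size: int, num_entries: int) -> int:
    """Download the whole directory, then download the value itself."""
    return directory_bytes(num_entries) + value_size


# A 1,000-byte value in a directory of 1,000 entries (about 300,000 bytes):
print(set_cost_bytes(1_000, 1_000))  # 601000
print(get_cost_bytes(1_000, 1_000))  # 301000
```

In this model the directory dominates as soon as it holds more than a handful of entries, and a get() of a value about as large as the directory moves about twice the bytes of a plain immutable-file get.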
What about ways to use LAFS as a key-value store that are not yet
implemented? Well, Zooko has lots of ideas about ways to extend Tahoe-LAFS to
support different kinds of storage APIs or better performance. One that he
thinks is pretty promising is just the Keep It Simple, Stupid idea of "store a
sqlite db in a Tahoe-LAFS mutable". ☺

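To make the "sqlite db in a mutable file" idea a little more concrete, here is a purely illustrative sketch of its client-side half: packing a key-value table into a single SQLite file image (the bytes that would become the value of one mutable file) and looking a key up again from such an image. Nothing here talks to Tahoe-LAFS; the table layout and the function names are assumptions for this example.

```python
import os
import sqlite3
import tempfile
from typing import Dict, Optional


def kv_to_sqlite_bytes(pairs: Dict[str, bytes]) -> bytes:
    """Write the pairs into a fresh SQLite file and return the file's bytes."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)  # sqlite3 opens the (empty) file itself
    try:
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")
            conn.executemany(
                "INSERT INTO kv (key, value) VALUES (?, ?)", list(pairs.items())
            )
            conn.commit()
        finally:
            conn.close()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


def lookup(db_bytes: bytes, key: str) -> Optional[bytes]:
    """Restore a SQLite file image to disk and fetch one value from it."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(db_bytes)
        conn = sqlite3.connect(path)
        try:
            row = conn.execute(
                "SELECT value FROM kv WHERE key = ?", (key,)
            ).fetchone()
        finally:
            conn.close()
        return None if row is None else row[0]
    finally:
        os.remove(path)
```

Under this scheme a single create() would cover the whole store, and each update would re-upload the database image rather than a Tahoe-LAFS directory.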
.. _PUT /uri: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _GET /uri/$FILECAP: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#viewing-downloading-a-file

.. _55 bytes: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/upload.py?rev=196bd583b6c4959c60d3f73cdcefc9edda6a38ae#L1504

.. _PUT /uri?format=mdmf: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _performance.rst: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst

.. _#2226: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2226

.. _these tickets: https://tahoe-lafs.org/trac/tahoe-lafs/query?status=assigned&status=new&status=reopened&keywords=~doc&description=~performance.rst&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone&order=priority

.. _POST /uri?t=mkdir: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _#2228: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2228

.. _PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _GET /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#reading-a-file