= Performance costs for some common operations =

1. Publishing an A-byte immutable file
2. Publishing an A-byte mutable file
3. Downloading B bytes of an A-byte immutable file
4. Downloading B bytes of an A-byte mutable file
5. Modifying B bytes of an A-byte mutable file
6. Inserting/Removing B bytes in an A-byte mutable file
7. Adding an entry to an A-entry directory
8. Listing an A-entry directory
9. Performing a file-check on an A-byte file
10. Performing a file-verify on an A-byte file
11. Repairing an A-byte file (mutable or immutable)

== Publishing an A-byte immutable file ==

memory footprint: N/k*128KiB

notes: An immutable file upload requires an additional I/O pass over the entire
source file before the upload process can start, since convergent
encryption derives the encryption key in part from the contents of the
source file.
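
The extra I/O pass described above can be sketched as follows. This is a
minimal illustration only: the function names and the plain SHA-256
derivation are assumptions made for this example and are not the key
derivation that Tahoe-LAFS actually uses. The point is that the key depends
on the whole plaintext, so the file must be read once to derive the key and
a second time to encrypt and upload it.

```python
import hashlib
import io

CHUNK = 128 * 1024  # read size for both passes (illustrative choice)


def derive_convergent_key(f):
    """Pass 1: hash the entire plaintext to derive a 16-byte key.

    Hypothetical derivation for illustration; not Tahoe-LAFS's real scheme.
    """
    hasher = hashlib.sha256()
    bytes_read = 0
    while True:
        chunk = f.read(CHUNK)
        if not chunk:
            break
        hasher.update(chunk)
        bytes_read += len(chunk)
    return hasher.digest()[:16], bytes_read


def two_pass_upload(f):
    """Return (key, total_bytes_read) for a simulated two-pass upload."""
    key, pass1 = derive_convergent_key(f)
    f.seek(0)  # pass 2 has to start again from the beginning
    pass2 = 0
    while True:
        segment = f.read(CHUNK)
        if not segment:
            break
        # A real uploader would encrypt, encode, and send the segment here.
        pass2 += len(segment)
    return key, pass1 + pass2


data = b"x" * 300_000
key_a, total_read = two_pass_upload(io.BytesIO(data))
key_b, _ = two_pass_upload(io.BytesIO(data))
print(total_read == 2 * len(data))  # the source is read twice
print(key_a == key_b)               # same content yields the same key
```

Because only one chunk is held at a time in each pass, the extra pass costs
I/O time rather than memory.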

== Publishing an A-byte mutable file ==

memory footprint: N/k*A
cpu: O(A) + a large constant for RSA keypair generation

notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that
it publishes to a grid. This takes up to 1 or 2 seconds on a

Part of the process of encrypting, encoding, and uploading a
mutable file to a Tahoe-LAFS grid requires that the entire file
be in memory at once. For larger files, this may cause
Tahoe-LAFS to have an unacceptably large memory footprint (at
least when uploading a mutable file).
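
To make the difference between the two footprint formulas concrete, here is
a small sketch that evaluates N/k*A (mutable) next to N/k*128KiB
(immutable). The function names are made up for this example, and the
values k=3 and N=10 are assumed purely for illustration (they correspond to
a 3-of-10 encoding); substitute your own grid's parameters.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB
SEGMENT_SIZE = 128 * KiB


def mutable_publish_footprint(file_size, k=3, n=10):
    """Approximate memory (bytes) for a mutable upload: N/k * A."""
    return n / k * file_size


def immutable_publish_footprint(k=3, n=10, segment_size=SEGMENT_SIZE):
    """Approximate memory (bytes) for an immutable upload: N/k * 128KiB."""
    return n / k * segment_size


for size in (1 * MiB, 100 * MiB, 1 * GiB):
    mutable = mutable_publish_footprint(size)
    immutable = immutable_publish_footprint()
    print(f"A = {size / MiB:8.0f} MiB: "
          f"mutable ~ {mutable / MiB:8.1f} MiB, "
          f"immutable ~ {immutable / MiB:.2f} MiB")
```

Under these assumptions the mutable footprint grows linearly with the file
size, while the immutable footprint stays constant because only one segment
is processed at a time.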

== Downloading B bytes of an A-byte immutable file ==

memory footprint: 128KiB

notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary range
of an immutable file, only the 128-KiB segments that overlap the
requested range will be downloaded.

(Earlier versions would download from the beginning of the file up
until the end of the requested range, and then continue to download
the rest of the file even after the request was satisfied.)
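
The segment arithmetic behind a ranged read can be sketched like this. The
helper name is hypothetical, and the byte count covers plaintext segment
data only; it ignores erasure-coding expansion, hash trees, and protocol
overhead.

```python
SEGMENT_SIZE = 128 * 1024  # 128 KiB


def overlapping_segments(offset, length, file_size, segment_size=SEGMENT_SIZE):
    """Return (first_segment, last_segment, bytes_fetched) for a read of
    `length` bytes starting at `offset`, or None if nothing is read."""
    if length <= 0 or offset < 0 or offset >= file_size:
        return None
    end = min(offset + length, file_size)      # exclusive end of the range
    first = offset // segment_size
    last = (end - 1) // segment_size
    # The final segment of a file may be shorter than segment_size.
    fetched = min((last + 1) * segment_size, file_size) - first * segment_size
    return first, last, fetched


# A 1000-byte read that straddles the boundary between segments 0 and 1
# of a 10 MiB file touches two whole segments.
print(overlapping_segments(131000, 1000, 10 * 1024 * 1024))  # (0, 1, 262144)
```

Even a very small read therefore costs at least one full segment, and a
read that crosses a segment boundary costs two.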

== Downloading B bytes of an A-byte mutable file ==

notes: As currently implemented, mutable files must be downloaded in
their entirety before any part of them can be read. We are
exploring fixes for this; see ticket #393 for more information.

== Modifying B bytes of an A-byte mutable file ==

memory footprint: N/k*A

notes: If you upload a changed version of a mutable file that you
earlier put onto your grid with, say, 'tahoe put --mutable',
Tahoe-LAFS will replace the old file with the new file on the
grid, rather than attempting to modify only those portions of the
file that have changed. Modifying a file in this manner is
essentially uploading the file over again, except that it re-uses
the existing RSA keypair instead of generating a new one.

== Inserting/Removing B bytes in an A-byte mutable file ==

memory footprint: N/k*A

notes: Modifying any part of a mutable file in Tahoe-LAFS requires that
the entire file be downloaded, modified, held in memory while it is
encrypted and encoded, and then re-uploaded. A future version of the
mutable file layout ("LDMF") may provide efficient inserts and
deletes. Note that this sort of modification is mostly used internally
for directories, and isn't something that the WUI, CLI, or other
interfaces will do -- instead, they will simply overwrite the file to
be modified, as described in "Modifying B bytes of an A-byte mutable
file".
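
The download, modify, re-upload cycle can be sketched as below. The
`download` and `upload` callables and the in-memory `FakeStore` are
hypothetical stand-ins for this example, not Tahoe-LAFS APIs. The sketch
shows that the bytes moved depend on A, the size of the file, and not on B,
the size of the change.

```python
class FakeStore:
    """In-memory stand-in for a mutable file on a grid (illustration only)."""

    def __init__(self, contents):
        self.contents = contents

    def download(self):
        return self.contents

    def upload(self, new_contents):
        self.contents = new_contents


def insert_bytes(download, upload, offset, new_bytes):
    """Insert new_bytes at offset; return (bytes_downloaded, bytes_uploaded)."""
    contents = download()                       # the whole file comes down
    modified = contents[:offset] + new_bytes + contents[offset:]
    upload(modified)                            # the whole file goes back up
    return len(contents), len(modified)


store = FakeStore(b"a" * 1000)
down, up = insert_bytes(store.download, store.upload, 500, b"b" * 10)
print(down, up)  # 1000 1010
```

The uploaded count here is plaintext only; on a real grid the encoded
shares would be larger by roughly the N/k expansion factor.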

== Adding an entry to an A-entry directory ==

memory footprint: N/k*A

notes: In Tahoe-LAFS, directories are implemented as specialized mutable
files. So adding an entry to a directory is essentially adding B
(actually, 300-330) bytes somewhere in an existing mutable file.

== Listing an A-entry directory ==

memory footprint: N/k*A

notes: Listing a directory requires that the mutable file storing the
directory be downloaded from the grid. So listing an A-entry
directory requires downloading a (roughly) 330 * A byte mutable
file, since each directory entry is about 300-330 bytes in size.
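
As a quick worked example of the estimate above, the following sketch
turns an entry count into an approximate size range for the directory's
backing file. The helper name is made up for this example, and the 300 and
330 byte bounds are the per-entry figures quoted in the notes.

```python
def directory_size_range(entries, low=300, high=330):
    """Return (min_bytes, max_bytes) for a directory with `entries` entries,
    using the 300-330 bytes-per-entry figure."""
    return entries * low, entries * high


for count in (10, 1000, 100_000):
    smallest, largest = directory_size_range(count)
    print(f"{count:>7} entries: about {smallest:,} to {largest:,} bytes")

print(directory_size_range(1000))  # (300000, 330000)
```

Since the whole backing file must be downloaded to list the directory, this
range is also a rough lower bound on the data fetched per listing.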

== Performing a file-check on an A-byte file ==

network: O(S), where S is the number of servers on your grid
memory footprint: negligible

notes: To check a file, Tahoe-LAFS queries all the servers that it knows
about. Note that neither of these costs depends directly on the size
of the file. This is relatively inexpensive compared to the verify
and repair operations.

== Performing a file-verify on an A-byte file ==

memory footprint: N/k*128KiB

notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext
shares that were originally uploaded to the grid and integrity-checks
them. For well-behaved grids, this is likely to be more
expensive than downloading an A-byte file, since only a fraction
of these shares are necessary to recover the file.
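
A rough comparison of the data moved by a verify and by an ordinary
download is sketched below. It assumes a k-of-N erasure code in which each
share is about A/k bytes, that all N shares are still present, and that
hash-tree and protocol overhead can be ignored. The function names and the
3-of-10 values are illustrative assumptions.

```python
def share_size(file_size, k):
    """Approximate size of one share: A / k."""
    return file_size / k


def download_bytes(file_size, k):
    """An ordinary download needs only k shares: about A bytes."""
    return k * share_size(file_size, k)


def verify_bytes(file_size, k, n):
    """A verify fetches all N shares: about N/k * A bytes."""
    return n * share_size(file_size, k)


A = 300 * 1024 * 1024  # a 300 MiB file
k, n = 3, 10           # assumed encoding parameters
ratio = verify_bytes(A, k, n) / download_bytes(A, k)
print(f"verify moves about {ratio:.2f}x the data of a download")
```

Under these assumptions the ratio is simply N/k, which is why a verify
costs more than a download whenever N is larger than k.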

== Repairing an A-byte file (mutable or immutable) ==

network: variable; up to around O(A)
memory footprint: from 128KiB to (1+N/k)*128KiB

notes: To repair a file, Tahoe-LAFS downloads the file, then generates and
uploads the missing shares in the same way as when it initially uploads
the file. So, depending on how many shares are missing, this can be about
as expensive as uploading the file in the first place.
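
The memory range and a rough network estimate for a repair can be worked
out as in the sketch below. The memory bounds come straight from the
formula above. The network estimate is an assumption made for this
example: it counts one download of about A bytes plus one share of about
A/k bytes uploaded for each missing share, and ignores all overhead. The
function names and the 3-of-10 values are illustrative.

```python
SEGMENT_SIZE = 128 * 1024  # 128 KiB


def repair_memory_range(k, n, segment_size=SEGMENT_SIZE):
    """Return (low, high) memory bounds in bytes: 128KiB to (1+N/k)*128KiB."""
    return segment_size, (1 + n / k) * segment_size


def repair_network_estimate(file_size, k, missing_shares):
    """Assumed model: download ~A bytes, then upload ~A/k per missing share."""
    return file_size + missing_shares * file_size / k


low, high = repair_memory_range(k=3, n=10)
print(f"memory: {low / 1024:.0f} KiB to {high / 1024:.1f} KiB")

A = 30 * 1024 * 1024  # a 30 MiB file
for missing in (1, 4, 7):
    moved = repair_network_estimate(A, k=3, missing_shares=missing)
    print(f"{missing} missing share(s): about {moved / A:.2f} x A bytes moved")
```

In this model the cost grows with the number of missing shares, which
matches the note that a badly damaged file can cost about as much to repair
as it did to upload.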