From: Brian Warner
Date: Thu, 14 Feb 2008 00:40:45 +0000 (-0700)
Subject: docs/dirnodes.txt: add notes on dirnode sizes
X-Git-Tag: allmydata-tahoe-0.8.0~61
X-Git-Url: https://git.rkrishnan.org/pf/content/en/service/rgr-080307.php?a=commitdiff_plain;h=269d4bcded2e9ab1255a8048989bb34c2f021963;p=tahoe-lafs%2Ftahoe-lafs.git

docs/dirnodes.txt: add notes on dirnode sizes
---

diff --git a/docs/dirnodes.txt b/docs/dirnodes.txt
index 5aa3fbda..adc8fcab 100644
--- a/docs/dirnodes.txt
+++ b/docs/dirnodes.txt
@@ -157,6 +157,52 @@ other users who have read-only access to 'foo' will be unable to decrypt its
 rwcap slot, this limits those users to read-only access to 'bar' as well,
 thus providing the transitive readonlyness that we desire.
 
+=== Dirnode sizes, mutable-file initial read sizes ===
+
+How big are dirnodes? When reading dirnode data out of mutable files, how
+large should our initial read be? If we guess exactly, we can read a dirnode
+in a single round-trip, and update one in two RTT. If we guess too high,
+we'll waste some amount of bandwidth. If we guess low, we need to make a
+second pass to get the data (or the encrypted privkey, for writes), which
+will cost us at least another RTT.
+
+Assuming child names are between 10 and 99 characters long, how long are the
+various pieces of a dirnode?
+
+ netstring(name) ~= 4+len(name)
+ chk-cap = 97 (for 4-char filesizes)
+ dir-rw-cap = 88
+ dir-ro-cap = 91
+ netstring(cap) = 4+len(cap)
+ encrypted(cap) = 16+cap+32
+ JSON({}) = 2
+ JSON({ctime=float,mtime=float}): 57
+ netstring(metadata) = 4+57 = 61
+
+so a CHK entry is:
+ 5+ 4+len(name) + 4+97 + 5+16+97+32 + 4+57
+And a 15-byte filename gives a 336-byte entry. When the entry points at a
+subdirectory instead of a file, the entry is a little bit smaller. So an
+empty directory uses 0 bytes, a directory with one child uses about 336
+bytes, a directory with two children uses about 672, etc.
+
+When the dirnode data is encoded using our default 3-of-10, that means we
+get 112ish bytes of data in each share per child.
+
+The pubkey, signature, and hashes form the first 935ish bytes of the
+container, then comes our data, then about 1216 bytes of encprivkey. So if we
+read the first:
+
+ 1kB: we get 65 bytes of dirnode data : only empty directories
+ 1kiB: 89 bytes of dirnode data : maybe one short-named subdir
+ 2kB: 1065 bytes: about 9 entries
+ 3kB: 2065 bytes: about 18 entries, or 7.5 entries plus the encprivkey
+ 4kB: 3065 bytes: about 27 entries, or about 16.5 plus the encprivkey
+
+So we've written the code to do an initial read of 2kB from each share when
+we read the mutable file, which should give good performance (one RTT) for
+small directories.
+
 
 == Design Goals, redux ==
 
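
The sizing arithmetic in the added section can be checked with a short
script. This is only a sketch built from the approximate figures quoted in
the notes (97-byte CHK caps, 57 bytes of metadata, a 935ish-byte header,
about 1216 bytes of encprivkey, 3-of-10 encoding). The function and constant
names are hypothetical and are not taken from Tahoe's code, and the two
leading 5s in the entry sum are copied from the notes without interpreting
what they count.

```python
# Sketch of the dirnode size arithmetic. Every constant is an approximate
# figure quoted in the notes; every name here is hypothetical.

CHK_CAP = 97        # chk-cap, for 4-char filesizes
METADATA = 57       # JSON({ctime=float,mtime=float})
HEADER = 935        # pubkey, signature, and hashes ("935ish")
ENCPRIVKEY = 1216   # approximate size of the encrypted private key
K = 3               # shares needed to decode, from the default 3-of-10


def chk_entry_size(name_len):
    """Bytes used by one CHK child entry, term by term as in the notes."""
    return (5                           # leading 5, as written in the notes
            + 4 + name_len              # netstring(name)
            + 4 + CHK_CAP               # netstring(cap)
            + 5 + 16 + CHK_CAP + 32     # 5 (as in the notes) + encrypted(cap)
            + 4 + METADATA)             # netstring(metadata)


def per_share_bytes(name_len=15, k=K):
    """Bytes each share holds per child: the entry is split k ways."""
    return chk_entry_size(name_len) / k


def entries_in_read(read_size, name_len=15, with_privkey=False):
    """How many child entries fit in an initial read of read_size bytes.

    with_privkey=True also reserves room for the encprivkey, which a
    writer needs in addition to the dirnode data.
    """
    data = read_size - HEADER
    if with_privkey:
        data -= ENCPRIVKEY
    return max(data, 0) / per_share_bytes(name_len)


if __name__ == "__main__":
    print("entry size for a 15-byte name:", chk_entry_size(15))
    print("bytes per share per child:", per_share_bytes())
    for size in (1000, 1024, 2000, 3000, 4000):
        print(size,
              round(entries_in_read(size), 1),
              round(entries_in_read(size, with_privkey=True), 1))
```

Under these assumptions a 2000-byte read holds roughly nine and a half
15-byte-name entries, which matches the "about 9 entries" row above. Changing
`name_len` shows how quickly longer child names eat into that budget.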