From: Brian Warner
Date: Sun, 10 Jun 2007 03:32:34 +0000 (-0700)
Subject: thingA.txt has finally been renamed
X-Git-Tag: allmydata-tahoe-0.3.0~4
X-Git-Url: https://git.rkrishnan.org/provisioning?a=commitdiff_plain;h=dac76b508cf311a098d849ef0e5790001f623da2;p=tahoe-lafs%2Ftahoe-lafs.git

thingA.txt has finally been renamed
---

diff --git a/docs/URI-extension.txt b/docs/URI-extension.txt
new file mode 100644
index 00000000..0b434809
--- /dev/null
+++ b/docs/URI-extension.txt
@@ -0,0 +1,62 @@
+
+"URI Extension Block"
+
+This block is a bencoded dictionary. All buckets hold an identical copy. The
+hash of the serialized data is kept in the URI.
+
+The download process must obtain a valid copy of this data before any
+decoding can take place. The download process must also obtain other data
+before incremental validation can be performed. Full-file validation (for
+clients who do not wish to do incremental validation) can be performed solely
+with the data from this block.
+
+At the moment, this data block contains the following keys (and an estimate
+on their sizes):
+
+ size                5
+ segment_size        7
+ num_segments        2
+ needed_shares       2
+ total_shares        3
+
+ codec_name          3
+ codec_params        5+1+2+1+3=12
+ tail_codec_params   12
+
+ share_root_hash     32 (binary) or 52 (base32-encoded) each
+ fileid
+ plaintext_root_hash
+ verifierid
+ crypttext_root_hash
+
+Some pieces are needed elsewhere (size should be visible without pulling the
+block, the Tahoe3 algorithm needs total_shares to find the right peers, all
+peer selection algorithms need needed_shares to ask a minimal set of peers).
+Some pieces are arguably redundant but are convenient to have present
+(test_encode.py makes use of num_segments).
+
+fileid/verifierid need to be renamed 'plaintext_hash' and 'crypttext_hash'
+respectively.
+
+The rule for this data block is that it should be a constant size for all
+files, regardless of file size. Therefore hash trees (which have a size that
+depends linearly upon the number of segments) are stored elsewhere in the
+bucket, with only the hash tree root stored in this data block.
+
+This block will be serialized as follows:
+
+ assert that all keys match ^[a-zA-z_\-]+$
+ sort all the keys lexicographically
+ for k in keys:
+  write("%s:" % k)
+  write(netstring(data[k]))
+
+
+Serialized size:
+
+ dense binary (but decimal) packing: 160+46=206
+ including 'key:' (185) and netstring (6*3+7*4=46) on values: 231
+ including 'key:%d\n' (185+13=198) and printable values (46+5*52=306)=504
+
+We'll go with the 231-sized block, and provide a tool to dump it as text if
+we really want one.
diff --git a/docs/thingA.txt b/docs/thingA.txt
deleted file mode 100644
index 0b434809..00000000
--- a/docs/thingA.txt
+++ /dev/null
@@ -1,62 +0,0 @@
-
-"URI Extension Block"
-
-This block is a bencoded dictionary. All buckets hold an identical copy. The
-hash of the serialized data is kept in the URI.
-
-The download process must obtain a valid copy of this data before any
-decoding can take place. The download process must also obtain other data
-before incremental validation can be performed. Full-file validation (for
-clients who do not wish to do incremental validation) can be performed solely
-with the data from this block.
-
-At the moment, this data block contains the following keys (and an estimate
-on their sizes):
-
- size                5
- segment_size        7
- num_segments        2
- needed_shares       2
- total_shares        3
-
- codec_name          3
- codec_params        5+1+2+1+3=12
- tail_codec_params   12
-
- share_root_hash     32 (binary) or 52 (base32-encoded) each
- fileid
- plaintext_root_hash
- verifierid
- crypttext_root_hash
-
-Some pieces are needed elsewhere (size should be visible without pulling the
-block, the Tahoe3 algorithm needs total_shares to find the right peers, all
-peer selection algorithms need needed_shares to ask a minimal set of peers).
-Some pieces are arguably redundant but are convenient to have present
-(test_encode.py makes use of num_segments).
-
-fileid/verifierid need to be renamed 'plaintext_hash' and 'crypttext_hash'
-respectively.
-
-The rule for this data block is that it should be a constant size for all
-files, regardless of file size. Therefore hash trees (which have a size that
-depends linearly upon the number of segments) are stored elsewhere in the
-bucket, with only the hash tree root stored in this data block.
-
-This block will be serialized as follows:
-
- assert that all keys match ^[a-zA-z_\-]+$
- sort all the keys lexicographically
- for k in keys:
-  write("%s:" % k)
-  write(netstring(data[k]))
-
-
-Serialized size:
-
- dense binary (but decimal) packing: 160+46=206
- including 'key:' (185) and netstring (6*3+7*4=46) on values: 231
- including 'key:%d\n' (185+13=198) and printable values (46+5*52=306)=504
-
-We'll go with the 231-sized block, and provide a tool to dump it as text if
-we really want one.
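
Note on the serialization rule in the renamed file: the pseudocode ("assert
that all keys match ...", sort, then write "key:" followed by a netstring of
the value) might be sketched in Python as below. This is an illustration
under stated assumptions, not the code Tahoe actually ships. The function
names netstring and pack_uri_extension are hypothetical, every value is
assumed to be a byte string already (integers rendered as decimal digits),
and the key pattern uses A-Z on the assumption that the "A-z" in the text is
a typo.

```python
import re

# Assumed key pattern: the document writes ^[a-zA-z_\-]+$, which is taken
# here to mean letters, underscore and hyphen only (A-Z rather than A-z).
KEY_PATTERN = re.compile(r"[a-zA-Z_\-]+")


def netstring(value: bytes) -> bytes:
    """Encode value as a netstring: decimal length, colon, payload, comma."""
    return b"%d:%s," % (len(value), value)


def pack_uri_extension(data: dict) -> bytes:
    """Serialize a dict of str keys to bytes values, keys in sorted order."""
    pieces = []
    for key in sorted(data):
        # fullmatch rejects keys with trailing newlines or other characters
        if not KEY_PATTERN.fullmatch(key):
            raise ValueError("illegal key: %r" % (key,))
        pieces.append(key.encode("ascii") + b":")
        pieces.append(netstring(data[key]))
    return b"".join(pieces)


if __name__ == "__main__":
    block = pack_uri_extension({"size": b"12345", "codec_name": b"crs"})
    print(block)  # b'codec_name:3:crs,size:5:12345,'
```

Sorting the keys before writing makes the output deterministic, which
matters because the hash of the serialized block is what the URI stores.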