computed to help validate the data afterwards (providing the "identification"
property). All of these pieces, plus information about the file's size and
the number of shares into which it has been distributed, are put into the
-"CHK" uri. The storage index is derived by hashing the read key, so it does
-not need to be physically present in the URI.
+"CHK" URI. The storage index is derived by hashing the read key (using a
+tagged SHA-256 hash truncated to 128 bits), so it does not need to be
+physically present in the URI.
The current format for CHK URIs is the concatenation of the following
strings:
Nodeid = StringConstraint(maxLength=20,
minLength=20) # binary format 20-byte SHA1 hash
FURL = StringConstraint(1000)
-StorageIndex = StringConstraint(32)
+StorageIndex = StringConstraint(16)
URI = StringConstraint(300) # kind of arbitrary
MAX_BUCKETS = 200 # per peer
ShareData = StringConstraint(400000) # 1MB segment / k=3 = 334kB
u = IFileURI(newuri)
self.failUnless(isinstance(u, uri.CHKFileURI))
self.failUnless(isinstance(u.storage_index, str))
- self.failUnlessEqual(len(u.storage_index), 32)
+ self.failUnlessEqual(len(u.storage_index), 16)
self.failUnless(isinstance(u.key, str))
self.failUnlessEqual(len(u.key), 16)
self.failUnlessEqual(u.size, size)
self._encoder.set_encryption_key(key)
storage_index = hashutil.storage_index_chk_hash(key)
assert isinstance(storage_index, str)
- # TODO: is there any point to having the SI be longer than the key?
- # There's certainly no extra entropy to be had..
- assert len(storage_index) == 32 # SHA-256
+ # There's no point in having the SI be longer than the key, so we
+ # specify that it is truncated to 128 bits, the same length as the AES key.
+ assert len(storage_index) == 16 # SHA-256 truncated to 128 bits
self._storage_index = storage_index
log.msg(" upload storage_index is [%s]" % (idlib.b2a(storage_index,)))
self.storage_index = hashutil.storage_index_chk_hash(self.key)
assert isinstance(self.storage_index, str)
- assert len(self.storage_index) == 32 # sha256 hash
+ assert len(self.storage_index) == 16 # SHA-256 hash truncated to 128 bits
self.uri_extension_hash = idlib.a2b(uri_extension_hash_s)
assert isinstance(self.uri_extension_hash, str)
return SHA256.new(netstring(tag))
def storage_index_chk_hash(data):
- return tagged_hash("allmydata_CHK_storage_index_v1", data)
+ # storage index is truncated to 128 bits (16 bytes). We're only hashing a
+ # 16-byte value to get it, so there's no point in using a larger value.
+ return tagged_hash("allmydata_CHK_storage_index_v1", data)[:16]
def block_hash(data):
return tagged_hash("allmydata_encoded_subshare_v1", data)
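For reviewers who want to check the new length by hand, the following is a minimal standalone sketch of the truncated storage-index derivation. It is not the patched module: it uses Python 3 and `hashlib` in place of the `SHA256` object the patch uses, and it assumes that `netstring` produces the conventional `<length>:<data>,` encoding and that `tagged_hash` hashes the netstring-encoded tag followed by the data. Neither helper body appears in this diff, so both are assumptions.

```python
import hashlib


def netstring(s: bytes) -> bytes:
    # Assumption: conventional netstring framing, "<length>:<data>,".
    return b"%d:%s," % (len(s), s)


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    # Assumption: SHA-256 over the netstring-encoded tag followed by the data.
    return hashlib.sha256(netstring(tag) + data).digest()


def storage_index_chk_hash(key: bytes) -> bytes:
    # Tag string and [:16] truncation are taken from the patch:
    # keep only the first 128 bits of the 256-bit digest.
    return tagged_hash(b"allmydata_CHK_storage_index_v1", key)[:16]


if __name__ == "__main__":
    key = bytes(16)  # placeholder 16-byte AES key, for illustration only
    storage_index = storage_index_chk_hash(key)
    print(len(key), len(storage_index))
```

Because the input key carries only 128 bits, truncating the digest to 16 bytes loses no entropy; it only makes the storage index the same length as the key.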