From: Brian Warner Date: Mon, 22 Apr 2013 06:01:25 +0000 (-0700) Subject: docs/nodekeys.rst: explain what node keys are for, with compatibility notes X-Git-Tag: allmydata-tahoe-1.10.0b2~6 X-Git-Url: https://git.rkrishnan.org/components/com_hotproperty/%22doc.html/COPYING.GPL?a=commitdiff_plain;h=0ba5058f6a743ebd8a2c6669e3a535ae4afe522c;p=tahoe-lafs%2Ftahoe-lafs.git docs/nodekeys.rst: explain what node keys are for, with compatibility notes --- diff --git a/docs/nodekeys.rst b/docs/nodekeys.rst new file mode 100644 index 00000000..6ea5eb39 --- /dev/null +++ b/docs/nodekeys.rst @@ -0,0 +1,133 @@ +======================= +Node Keys in Tahoe-LAFS +======================= + +"Node Keys" are cryptographic signing/verifying keypairs used to +identify Tahoe-LAFS nodes (client-only and client+server). The private +signing key is stored in NODEDIR/private/node.privkey , and is used to +sign the announcements that are distributed to all nodes by the +Introducer. The public verifying key is used to identify the sending +node from those other systems: it is displayed as a "Node ID" that looks +like "v0-abc234xyz567..", which ends with a long base32-encoded string. + +These node keys were introduced in the 1.10 release (April 2013), as +part of ticket #466. In previous releases, announcements were unsigned, +and nodes were identified by their Foolscap "Tub ID" (a somewhat shorter +base32 string, with no "v0-" prefix). + +Why Announcements Are Signed +---------------------------- + +All nodes (both client-only and client+server) publish announcements to +the Introducer, which then relays them to all other nodes. These +announcements contain information about the publishing node's nickname, +how to reach the node, what services it offers, and what version of code +it is running. + +The new private node key is used to sign these announcements, preventing +the Introducer from modifying their contents en-route. This will enable +future versions of Tahoe-LAFS to use other forms of introduction +(gossip, multiple introducers) without weakening the security model. + +The Node ID is useful as a handle with which to talk about a node. For +example, when clients eventually gain the ability to control which +storage servers they are willing to use (#467), the configuration file +might simply include a list of Node IDs for the approved servers. + +TubIDs are currently also suitable for this job, but they depend upon +having a Foolscap connection to the server. Since our goal is to move +away from Foolscap towards a simpler (faster and more portable) +protocol, we want to reduce our dependence upon TubIDs. Node IDs and +Ed25519 signatures can be used for non-Foolscap non-SSL based protocols. + +How The Node ID Is Computed +--------------------------- + +The long-form Node ID is the Ed25519 public verifying key, 256 bits (32 +bytes) long, base32-encoded, with a "v0-" prefix appended, and the +trailing "=" padding removed, like so: + + v0-rlj3jnxqv4ee5rtpyngvzbhmhuikjfenjve7j5mzmfcxytwmyf6q + +The Node ID is displayed in this long form on the node's front Welcome +page, and on the Introducer's status page. In most other places +(share-placement lists, file health displays), the "short form" is used +instead. This is simply the first 8 characters of the base32 portion, +frequently enclosed in square brackets, like this: + + [rlj3jnxq] + +In contrast, old-style TubIDs are usually displayed with just 6 base32 +characters. + +Version Compatibility, Fallbacks For Old Versions +------------------------------------------------- + +Since Tahoe-LAFS 1.9 does not know about signed announcements, 1.10 +includes backwards-compatibility code to allow old and new versions to +interoperate. There are three relevant participants: the node publishing +an announcement, the Introducer which relays them, and the node +receiving the (possibly signed) announcement. + +When a 1.10 node connects to an old Introducer (version 1.9 or earlier), +it sends downgraded non-signed announcements. It likewise accepts +non-signed announcements from the Introducer. The non-signed +announcements use TubIDs to identify the sending node. The new 1.10 +Introducer, when it connects to an old node, downgrades any signed +announcements to non-signed ones before delivery. + +As a result, the only way to receive signed announcements is for all +three systems to be running the new 1.10 code. In a grid with a mixture +of old and new nodes, if the Introducer is old, then all nodes will see +unsigned TubIDs. If the Introducer is new, then nodes will see signed +Node IDs whenever possible. + +Share Placement +--------------- + +Tahoe-LAFS uses a "permuted ring" algorithm to decide where to place +shares for any given file. For each potential server, it uses that +server's "permutation seed" to compute a pseudo-random but deterministic +location on a ring, then walks the ring in clockwise order, asking each +server in turn to hold a share until all are placed. When downloading a +file, the servers are accessed in the same order. This minimizes the +number of queries that must be done to download a file, and tolerates +"churn" (nodes being added and removed from the grid) fairly well. + +This property depends upon server nodes having a stable permutation +seed. If a server's permutation seed were to change, it would +effectively wind up at a randomly selected place on the permuted ring. +Downloads would still complete, but clients would spend more time asking +other servers before querying the correct one. + +In the old 1.9 code, the permutation-seed was always equal to the TubID. +In 1.10, servers include their permutation-seed as part of their +announcement. To improve stability for existing grids, if an old server +(one with existing shares) is upgraded to run the 1.10 codebase, it will +use its old TubID as its permutation-seed. When a new empty server runs +the 1.10 code, it will use its Node ID instead. In both cases, once the +node has picked a permutation-seed, it will continue using that value +forever. + +To be specific, when a node wakes up running the 1.10 code, it will look +for a recorded NODEDIR/permutation-seed file, and use its contents if +present. If that file does not exist, it creates it (with the TubID if +it has any shares, otherwise with the Node ID), and uses the contents as +the permutation-seed. + +There is one unfortunate consequence of this pattern. If new 1.10 server +is created in a grid that has an old client, or has a new client but an +old Introducer, then that client will see downgraded non-signed +announcements, and thus will first upload shares with the TubID-based +permutation-seed. Later, when the client and/or Introducer is upgraded, +the client will start seeing signed announcements with the NodeID-based +permutation-seed, and will then look for shares in the wrong place. This +will hurt performance in a large grid, but should not affect +reliability. This effect shouldn't even be noticeable in grids for which +the number of servers is close to the "N" shares.total number (e.g. +where num-servers < 3*N). And the as-yet-unimplemented "share +rebalancing" feature should repair the misplacement. + +If you wish to avoid this effect, try to upgrade both Introducers and +clients at about the same time. (Upgrading servers does not matter: they +will continue to use the old permutation-seed).