From 8143183e3973378637584ff51e486974fd7fc0f9 Mon Sep 17 00:00:00 2001 From: Zooko O'Whielacronx Date: Thu, 14 Oct 2010 22:29:13 -0700 Subject: [PATCH] docs: convert all .txt docs to .rst thanks to Ravi Pinjala fixes #1225 --- CREDITS | 4 + NEWS | 6 + docs/{architecture.txt => architecture.rst} | 112 ++-- docs/backdoors.rst | 51 ++ docs/backdoors.txt | 25 - docs/{backupdb.txt => backupdb.rst} | 113 ++-- docs/configuration.rst | 572 ++++++++++++++++++ docs/configuration.txt | 529 ---------------- docs/{debian.txt => debian.rst} | 58 +- ...esystem-notes.txt => filesystem-notes.rst} | 16 +- ...-collection.txt => garbage-collection.rst} | 159 ++--- docs/{helper.txt => helper.rst} | 94 +-- ...t => how_to_make_a_tahoe-lafs_release.rst} | 0 docs/{known_issues.txt => known_issues.rst} | 103 ++-- docs/{logging.txt => logging.rst} | 120 ++-- docs/performance.rst | 162 +++++ docs/performance.txt | 139 ----- docs/stats.rst | 337 +++++++++++ docs/stats.txt | 276 --------- 19 files changed, 1569 insertions(+), 1307 deletions(-) rename docs/{architecture.txt => architecture.rst} (92%) create mode 100644 docs/backdoors.rst delete mode 100644 docs/backdoors.txt rename docs/{backupdb.txt => backupdb.rst} (82%) create mode 100644 docs/configuration.rst delete mode 100644 docs/configuration.txt rename docs/{debian.txt => debian.rst} (83%) rename docs/{filesystem-notes.txt => filesystem-notes.rst} (60%) rename docs/{garbage-collection.txt => garbage-collection.rst} (76%) rename docs/{helper.txt => helper.rst} (73%) rename docs/{how_to_make_a_tahoe-lafs_release.txt => how_to_make_a_tahoe-lafs_release.rst} (100%) rename docs/{known_issues.txt => known_issues.rst} (71%) rename docs/{logging.txt => logging.rst} (79%) create mode 100644 docs/performance.rst delete mode 100644 docs/performance.txt create mode 100644 docs/stats.rst delete mode 100644 docs/stats.txt diff --git a/CREDITS b/CREDITS index 34ce6ed5..6c6674e5 100644 --- a/CREDITS +++ b/CREDITS @@ -122,3 +122,7 @@ D: fix layout 
issue and server version numbers in WUI N: Jacob Lyles E: jacob.lyles@gmail.com D: fixed bug in WUI with Python 2.5 and a system clock set far in the past + +N: Ravi Pinjala +E: ravi@p-static.net +D: converted docs from .txt to .rst diff --git a/NEWS b/NEWS index dbbfb8a8..42f81cbb 100644 --- a/NEWS +++ b/NEWS @@ -1,5 +1,11 @@ User visible changes in Tahoe-LAFS. -*- outline; coding: utf-8 -*- +* Release 1.8.1 (coming) + +** Documentation + + - All .txt documents have been converted to .rst format (#1225) + * Release 1.8.0 (2010-09-23) ** New Features diff --git a/docs/architecture.txt b/docs/architecture.rst similarity index 92% rename from docs/architecture.txt rename to docs/architecture.rst index 604b5947..cea67cad 100644 --- a/docs/architecture.txt +++ b/docs/architecture.rst @@ -1,19 +1,22 @@ -= Tahoe-LAFS Architecture = - -1. Overview -2. The Key-Value Store -3. File Encoding -4. Capabilities -5. Server Selection -6. Swarming Download, Trickling Upload -7. The Filesystem Layer -8. Leases, Refreshing, Garbage Collection -9. File Repairer -10. Security -11. Reliability - - -== Overview == +======================= +Tahoe-LAFS Architecture +======================= + +1. `Overview`_ +2. `The Key-Value Store`_ +3. `File Encoding`_ +4. `Capabilities`_ +5. `Server Selection`_ +6. `Swarming Download, Trickling Upload`_ +7. `The Filesystem Layer`_ +8. `Leases, Refreshing, Garbage Collection`_ +9. `File Repairer`_ +10. `Security`_ +11. `Reliability`_ + + +Overview +======== (See the docs/specifications directory for more details.) @@ -40,10 +43,13 @@ Allmydata.com uses it for a backup service: the application periodically copies files from the local disk onto the decentralized filesystem. We later provide read-only access to those files, allowing users to recover them. There are several other applications built on top of the Tahoe-LAFS -filesystem (see the RelatedProjects page of the wiki for a list). 
+filesystem (see the `RelatedProjects +`_ page of the +wiki for a list). -== The Key-Value Store == +The Key-Value Store +=================== The key-value store is implemented by a grid of Tahoe-LAFS storage servers -- user-space processes. Tahoe-LAFS storage clients communicate with the storage @@ -76,7 +82,8 @@ For future releases, we have plans to decentralize introduction, allowing any server to tell a new client about all the others. -== File Encoding == +File Encoding +============= When a client stores a file on the grid, it first encrypts the file. It then breaks the encrypted file into small segments, in order to reduce the memory @@ -117,7 +124,8 @@ turn them into segments of ciphertext, use the decryption key to convert that into plaintext, then emit the plaintext bytes to the output target. -== Capabilities == +Capabilities +============ Capabilities to immutable files represent a specific set of bytes. Think of it like a hash function: you feed in a bunch of bytes, and you get out a @@ -142,20 +150,23 @@ to retrieve a set of bytes, and then you can use it to validate ("identify") that these potential bytes are indeed the ones that you were looking for. The "key-value store" layer doesn't include human-meaningful names. -Capabilities sit on the "global+secure" edge of Zooko's Triangle[1]. They are +Capabilities sit on the "global+secure" edge of `Zooko's Triangle`_. They are self-authenticating, meaning that nobody can trick you into accepting a file that doesn't match the capability you used to refer to that file. The filesystem layer (described below) adds human-meaningful names atop the key-value layer. +.. _`Zooko's Triangle`: http://en.wikipedia.org/wiki/Zooko%27s_triangle -== Server Selection == + +Server Selection +================ When a file is uploaded, the encoded shares are sent to some servers. But to which ones? The "server selection" algorithm is used to make this choice. 
The storage index is used to consistently-permute the set of all servers nodes -(by sorting them by HASH(storage_index+nodeid)). Each file gets a different +(by sorting them by ``HASH(storage_index+nodeid)``). Each file gets a different permutation, which (on average) will evenly distribute shares among the grid and avoid hotspots. Each server has announced its available space when it connected to the introducer, and we use that available space information to @@ -254,7 +265,7 @@ times), if possible. significantly hurt reliability (sometimes the permutation resulted in most of the shares being dumped on a single node). - Another algorithm (known as "denver airport"[2]) uses the permuted hash to + Another algorithm (known as "denver airport" [#naming]_) uses the permuted hash to decide on an approximate target for each share, then sends lease requests via Chord routing. The request includes the contact information of the uploading node, and asks that the node which eventually accepts the lease @@ -263,8 +274,17 @@ times), if possible. the same approach. This allows nodes to avoid maintaining a large number of long-term connections, at the expense of complexity and latency. +.. [#naming] all of these names are derived from the location where they were + concocted, in this case in a car ride from Boulder to DEN. To be + precise, "Tahoe 1" was an unworkable scheme in which everyone who holds + shares for a given file would form a sort of cabal which kept track of + all the others, "Tahoe 2" is the first-100-nodes in the permuted hash + described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1") + was the abandoned ring-with-many-hands approach. + -== Swarming Download, Trickling Upload == +Swarming Download, Trickling Upload +=================================== Because the shares being downloaded are distributed across a large number of nodes, the download process will pull from many of them at the same time. 
The @@ -295,7 +315,8 @@ in the same facility, so the helper-to-storage-server bandwidth is huge. See "helper.txt" for details about the upload helper. -== The Filesystem Layer == +The Filesystem Layer +==================== The "filesystem" layer is responsible for mapping human-meaningful pathnames (directories and filenames) to pieces of data. The actual bytes inside these @@ -325,7 +346,8 @@ links to spaces that are shared with specific other users, and other spaces that are globally visible. -== Leases, Refreshing, Garbage Collection == +Leases, Refreshing, Garbage Collection +====================================== When a file or directory in the virtual filesystem is no longer referenced, the space that its shares occupied on each storage server can be freed, @@ -346,7 +368,8 @@ See docs/garbage-collection.txt for further information, and how to configure garbage collection. -== File Repairer == +File Repairer +============= Shares may go away because the storage server hosting them has suffered a failure: either temporary downtime (affecting availability of the file), or a @@ -403,19 +426,20 @@ to other nodes. in client behavior. -== Security == +Security +======== The design goal for this project is that an attacker may be able to deny service (i.e. 
prevent you from recovering a file that was uploaded earlier) but can accomplish none of the following three attacks: - 1) violate confidentiality: the attacker gets to view data to which you have - not granted them access - 2) violate integrity: the attacker convinces you that the wrong data is - actually the data you were intending to retrieve - 3) violate unforgeability: the attacker gets to modify a mutable file or - directory (either the pathnames or the file contents) to which you have - not given them write permission +1) violate confidentiality: the attacker gets to view data to which you have + not granted them access +2) violate integrity: the attacker convinces you that the wrong data is + actually the data you were intending to retrieve +3) violate unforgeability: the attacker gets to modify a mutable file or + directory (either the pathnames or the file contents) to which you have + not given them write permission Integrity (the promise that the downloaded data will match the uploaded data) is provided by the hashes embedded in the capability (for immutable files) or @@ -467,7 +491,8 @@ normal web site, using username and password to give a user access to her capabilities). -== Reliability == +Reliability +=========== File encoding and peer-node selection parameters can be adjusted to achieve different goals. Each choice results in a number of properties; there are @@ -532,16 +557,3 @@ decisions: this tool may help you evaluate different expansion factors and view the disk consumption of each. It is also acquiring some sections with availability/reliability numbers, as well as preliminary cost analysis data. This tool will continue to evolve as our analysis improves. - ------------------------------- - -[1]: http://en.wikipedia.org/wiki/Zooko%27s_triangle - -[2]: all of these names are derived from the location where they were - concocted, in this case in a car ride from Boulder to DEN. 
To be - precise, "Tahoe 1" was an unworkable scheme in which everyone who holds - shares for a given file would form a sort of cabal which kept track of - all the others, "Tahoe 2" is the first-100-nodes in the permuted hash - described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1") - was the abandoned ring-with-many-hands approach. - diff --git a/docs/backdoors.rst b/docs/backdoors.rst new file mode 100644 index 00000000..ad12610e --- /dev/null +++ b/docs/backdoors.rst @@ -0,0 +1,51 @@ +====================== +Statement on Backdoors +====================== + +October 5, 2010 + +The New York Times has recently reported that the current U.S. administration +is proposing a bill that would apparently, if passed, require communication +systems to facilitate government wiretapping and access to encrypted data: + + http://www.nytimes.com/2010/09/27/us/27wiretap.html (login required; username/password pairs available at http://www.bugmenot.com/view/nytimes.com). + +Commentary by the Electronic Frontier Foundation +(https://www.eff.org/deeplinks/2010/09/government-seeks ), Peter Suderman / +Reason (http://reason.com/blog/2010/09/27/obama-administration-frustrate ), +Julian Sanchez / Cato Institute +(http://www.cato-at-liberty.org/designing-an-insecure-internet/ ). + +The core Tahoe developers promise never to change Tahoe-LAFS to facilitate +government access to data stored or transmitted by it. Even if it were +desirable to facilitate such access—which it is not—we believe it would not be +technically feasible to do so without severely compromising Tahoe-LAFS' +security against other attackers. There have been many examples in which +backdoors intended for use by government have introduced vulnerabilities +exploitable by other parties (a notable example being the Greek cellphone +eavesdropping scandal in 2004/5). RFCs 1984 and 2804 elaborate on the +security case against such backdoors. 
+ +Note that since Tahoe-LAFS is open-source software, forks by people other than +the current core developers are possible. In that event, we would try to +persuade any such forks to adopt a similar policy. + +The following Tahoe-LAFS developers agree with this statement: + +David-Sarah Hopwood + +Zooko Wilcox-O'Hearn + +Brian Warner + +Kevan Carstensen + +Frédéric Marti + +Jack Lloyd + +François Deppierraz + +Yu Xue + +Marc Tooley diff --git a/docs/backdoors.txt b/docs/backdoors.txt deleted file mode 100644 index c08e26c9..00000000 --- a/docs/backdoors.txt +++ /dev/null @@ -1,25 +0,0 @@ -Statement on Backdoors - -October 5, 2010 - -The New York Times has recently reported that the current U.S. administration is proposing a bill that would apparently, if passed, require communication systems to facilitate government wiretapping and access to encrypted data: - - http://www.nytimes.com/2010/09/27/us/27wiretap.html (login required; username/password pairs available at http://www.bugmenot.com/view/nytimes.com). - -Commentary by the Electronic Frontier Foundation (https://www.eff.org/deeplinks/2010/09/government-seeks ), Peter Suderman / Reason (http://reason.com/blog/2010/09/27/obama-administration-frustrate ), Julian Sanchez / Cato Institute (http://www.cato-at-liberty.org/designing-an-insecure-internet/ ). - -The core Tahoe developers promise never to change Tahoe-LAFS to facilitate government access to data stored or transmitted by it. Even if it were desirable to facilitate such access—which it is not—we believe it would not be technically feasible to do so without severely compromising Tahoe-LAFS' security against other attackers. There have been many examples in which backdoors intended for use by government have introduced vulnerabilities exploitable by other parties (a notable example being the Greek cellphone eavesdropping scandal in 2004/5). RFCs 1984 and 2804 elaborate on the security case against such backdoors. 
- -Note that since Tahoe-LAFS is open-source software, forks by people other than the current core developers are possible. In that event, we would try to persuade any such forks to adopt a similar policy. - -The following Tahoe-LAFS developers agree with this statement: - -David-Sarah Hopwood -Zooko Wilcox-O'Hearn -Brian Warner -Kevan Carstensen -Frédéric Marti -Jack Lloyd -François Deppierraz -Yu Xue -Marc Tooley diff --git a/docs/backupdb.txt b/docs/backupdb.rst similarity index 82% rename from docs/backupdb.txt rename to docs/backupdb.rst index 7e4842fa..b91a8e4d 100644 --- a/docs/backupdb.txt +++ b/docs/backupdb.rst @@ -1,6 +1,14 @@ -= The Tahoe BackupDB = +================== +The Tahoe BackupDB +================== -== Overview == +1. `Overview`_ +2. `Schema`_ +3. `Upload Operation`_ +4. `Directory Operations`_ + +Overview +======== To speed up backup operations, Tahoe maintains a small database known as the "backupdb". This is used to avoid re-uploading files which have already been uploaded recently. @@ -33,46 +41,48 @@ actually provides sqlite3 rather than sqlite2), but on old distributions such as Debian etch (4.0 "oldstable") or Ubuntu Edgy (6.10) the "python-pysqlite2" package won't work, but the "sqlite3-dev" package will. -== Schema == - -The database contains the following tables: - -CREATE TABLE version -( - version integer # contains one row, set to 1 -); - -CREATE TABLE local_files -( - path varchar(1024), PRIMARY KEY -- index, this is os.path.abspath(fn) - size integer, -- os.stat(fn)[stat.ST_SIZE] - mtime number, -- os.stat(fn)[stat.ST_MTIME] - ctime number, -- os.stat(fn)[stat.ST_CTIME] - fileid integer -); - -CREATE TABLE caps -( - fileid integer PRIMARY KEY AUTOINCREMENT, - filecap varchar(256) UNIQUE -- URI:CHK:... 
-); - -CREATE TABLE last_upload -( - fileid INTEGER PRIMARY KEY, - last_uploaded TIMESTAMP, - last_checked TIMESTAMP -); - -CREATE TABLE directories -( - dirhash varchar(256) PRIMARY KEY, - dircap varchar(256), - last_uploaded TIMESTAMP, - last_checked TIMESTAMP -); - -== Upload Operation == +Schema +====== + +The database contains the following tables:: + + CREATE TABLE version + ( + version integer # contains one row, set to 1 + ); + + CREATE TABLE local_files + ( + path varchar(1024), PRIMARY KEY -- index, this is os.path.abspath(fn) + size integer, -- os.stat(fn)[stat.ST_SIZE] + mtime number, -- os.stat(fn)[stat.ST_MTIME] + ctime number, -- os.stat(fn)[stat.ST_CTIME] + fileid integer + ); + + CREATE TABLE caps + ( + fileid integer PRIMARY KEY AUTOINCREMENT, + filecap varchar(256) UNIQUE -- URI:CHK:... + ); + + CREATE TABLE last_upload + ( + fileid INTEGER PRIMARY KEY, + last_uploaded TIMESTAMP, + last_checked TIMESTAMP + ); + + CREATE TABLE directories + ( + dirhash varchar(256) PRIMARY KEY, + dircap varchar(256), + last_uploaded TIMESTAMP, + last_checked TIMESTAMP + ); + +Upload Operation +================ The upload process starts with a pathname (like ~/.emacs) and wants to end up with a file-cap (like URI:CHK:...). @@ -82,12 +92,16 @@ The first step is to convert the path to an absolute form is not present in this table, the file must be uploaded. The upload process is: - 1. record the file's size, creation time, and modification time - 2. upload the file into the grid, obtaining an immutable file read-cap - 3. add an entry to the 'caps' table, with the read-cap, to get a fileid - 4. add an entry to the 'last_upload' table, with the current time - 5. add an entry to the 'local_files' table, with the fileid, the path, - and the local file's size/ctime/mtime +1. record the file's size, creation time, and modification time + +2. upload the file into the grid, obtaining an immutable file read-cap + +3. 
add an entry to the 'caps' table, with the read-cap, to get a fileid + +4. add an entry to the 'last_upload' table, with the current time + +5. add an entry to the 'local_files' table, with the fileid, the path, + and the local file's size/ctime/mtime If the path *is* present in 'local_files', the easy-to-compute identifying information is compared: file size and ctime/mtime. If these differ, the file @@ -140,7 +154,8 @@ unmodified, and the "tahoe backup" command will not copy the new contents into the grid. The --no-timestamps can be used to disable this optimization, forcing every byte of the file to be hashed and encoded. -== Directory Operations == +Directory Operations +==================== Once the contents of a directory are known (a filecap for each file, and a dircap for each directory), the backup process must find or create a tahoe diff --git a/docs/configuration.rst b/docs/configuration.rst new file mode 100644 index 00000000..1e7fcb95 --- /dev/null +++ b/docs/configuration.rst @@ -0,0 +1,572 @@ +======================== +Configuring a Tahoe node +======================== + +1. `Overall Node Configuration`_ +2. `Client Configuration`_ +3. `Storage Server Configuration`_ +4. `Running A Helper`_ +5. `Running An Introducer`_ +6. `Other Files in BASEDIR`_ +7. `Other files`_ +8. `Backwards Compatibility Files`_ +9. `Example`_ + +A Tahoe node is configured by writing to files in its base directory. These +files are read by the node when it starts, so each time you change them, you +need to restart the node. + +The node also writes state to its base directory, so it will create files on +its own. + +This document contains a complete list of the config files that are examined +by the client node, as well as the state files that you'll observe in its +base directory. 
+ +The main file is named 'tahoe.cfg', which is an ".INI"-style configuration +file (parsed by the Python stdlib 'ConfigParser' module: "[name]" section +markers, lines with "key.subkey: value", rfc822-style continuations). There +are other files that contain information which does not easily fit into this +format. The 'tahoe create-node' or 'tahoe create-client' command will create +an initial tahoe.cfg file for you. After creation, the node will never modify +the 'tahoe.cfg' file: all persistent state is put in other files. + +The item descriptions below use the following types: + +boolean + one of (True, yes, on, 1, False, off, no, 0), case-insensitive + +strports string + a Twisted listening-port specification string, like "tcp:80" + or "tcp:3456:interface=127.0.0.1". For a full description of + the format, see + http://twistedmatrix.com/documents/current/api/twisted.application.strports.html + +FURL string + a Foolscap endpoint identifier, like + pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm + + +Overall Node Configuration +========================== + +This section controls the network behavior of the node overall: which ports +and IP addresses are used, when connections are timed out, etc. This +configuration is independent of the services that the node is offering: the +same controls are used for client and introducer nodes. + +If your node is behind a firewall or NAT device and you want other clients to +connect to it, you'll need to open a port in the firewall or NAT, and specify +that port number in the tub.port option. If behind a NAT, you *may* need to +set the tub.location option described below. + +:: + + [node] + + nickname = (UTF-8 string, optional) + + This value will be displayed in management tools as this node's "nickname". + If not provided, the nickname will be set to "". This string + shall be a UTF-8 encoded unicode string. 
+ + web.port = (strports string, optional) + + This controls where the node's webserver should listen, providing filesystem + access and node status as defined in webapi.txt . This file contains a + Twisted "strports" specification such as "3456" or + "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client' + commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this + is overridable by the "--webport" option. You can make it use SSL by writing + "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead. + + If this is not provided, the node will not run a web server. + + web.static = (string, optional) + + This controls where the /static portion of the URL space is served. The + value is a directory name (~username is allowed, and non-absolute names are + interpreted relative to the node's basedir) which can contain HTML and other + files. This can be used to serve a javascript-based frontend to the Tahoe + node, or other services. + + The default value is "public_html", which will serve $BASEDIR/public_html . + With the default settings, http://127.0.0.1:3456/static/foo.html will serve + the contents of $BASEDIR/public_html/foo.html . + + tub.port = (integer, optional) + + This controls which port the node uses to accept Foolscap connections from + other nodes. If not provided, the node will ask the kernel for any available + port. The port will be written to a separate file (named client.port or + introducer.port), so that subsequent runs will re-use the same port. + + tub.location = (string, optional) + + In addition to running as a client, each Tahoe node also runs as a server, + listening for connections from other Tahoe clients. The node announces its + location by publishing a "FURL" (a string with some connection hints) to the + Introducer. The string it publishes can be found in + $BASEDIR/private/storage.furl . The "tub.location" configuration controls + what location is published in this announcement. 
+ + If you don't provide tub.location, the node will try to figure out a useful + one by itself, by using tools like 'ifconfig' to determine the set of IP + addresses on which it can be reached from nodes both near and far. It will + also include the TCP port number on which it is listening (either the one + specified by tub.port, or whichever port was assigned by the kernel when + tub.port is left unspecified). + + You might want to override this value if your node lives behind a firewall + that is doing inbound port forwarding, or if you are using other proxies + such that the local IP address or port number is not the same one that + remote clients should use to connect. You might also want to control this + when using a Tor proxy to avoid revealing your actual IP address through the + Introducer announcement. + + The value is a comma-separated string of host:port location hints, like + this: + + 123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098 + + A few examples: + + Emulate default behavior, assuming your host has IP address 123.45.67.89 + and the kernel-allocated port number was 8098: + + tub.port = 8098 + tub.location = 123.45.67.89:8098,127.0.0.1:8098 + + Use a DNS name so you can change the IP address more easily: + + tub.port = 8098 + tub.location = tahoe.example.com:8098 + + Run a node behind a firewall (which has an external IP address) that has + been configured to forward port 7912 to our internal node's port 8098: + + tub.port = 8098 + tub.location = external-firewall.example.com:7912 + + Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode + (i.e. we can make outbound connections, but other nodes will not be able to + connect to us). The literal 'unreachable.example.org' will not resolve, but + will serve as a reminder to human observers that this node cannot be + reached. "Don't call us.. 
we'll call you": + + tub.port = 8098 + tub.location = unreachable.example.org:0 + + Run a node behind a Tor proxy, and make the server available as a Tor + "hidden service". (this assumes that other clients are running their node + with torsocks, such that they are prepared to connect to a .onion address). + The hidden service must first be configured in Tor, by giving it a local + port number and then obtaining a .onion name, using something in the torrc + file like: + + HiddenServiceDir /var/lib/tor/hidden_services/tahoe + HiddenServicePort 29212 127.0.0.1:8098 + + once Tor is restarted, the .onion hostname will be in + /var/lib/tor/hidden_services/tahoe/hostname . Then set up your tahoe.cfg + like: + + tub.port = 8098 + tub.location = ualhejtq2p7ohfbb.onion:29212 + + Most users will not need to set tub.location . + + Note that the old 'advertised_ip_addresses' file from earlier releases is no + longer supported. Tahoe 1.3.0 and later will ignore this file. + + log_gatherer.furl = (FURL, optional) + + If provided, this contains a single FURL string which is used to contact a + 'log gatherer', which will be granted access to the logport. This can be + used by centralized storage meshes to gather operational logs in a single + place. Note that when an old-style BASEDIR/log_gatherer.furl file exists + (see 'Backwards Compatibility Files', below), both are used. (for most other + items, the separate config file overrides the entry in tahoe.cfg) + + timeout.keepalive = (integer in seconds, optional) + timeout.disconnect = (integer in seconds, optional) + + If timeout.keepalive is provided, it is treated as an integral number of + seconds, and sets the Foolscap "keepalive timer" to that value. For each + connection to another node, if nothing has been heard for a while, we will + attempt to provoke the other end into saying something. The duration of + silence that passes before sending the PING will be between KT and 2*KT. 
+ This is mainly intended to keep NAT boxes from expiring idle TCP sessions, + but also gives TCP's long-duration keepalive/disconnect timers some traffic + to work with. The default value is 240 (i.e. 4 minutes). + + If timeout.disconnect is provided, this is treated as an integral number of + seconds, and sets the Foolscap "disconnect timer" to that value. For each + connection to another node, if nothing has been heard for a while, we will + drop the connection. The duration of silence that passes before dropping the + connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for + more details). If we are sending a large amount of data to the other end + (which takes more than DT-2*KT to deliver), we might incorrectly drop the + connection. The default behavior (when this value is not provided) is to + disable the disconnect timer. + + See ticket #521 for a discussion of how to pick these timeout values. Using + 30 minutes means we'll disconnect after 22 to 68 minutes of inactivity. + Receiving data will reset this timeout, however if we have more than 22min + of data in the outbound queue (such as 800kB in two pipelined segments of 10 + shares each) and the far end has no need to contact us, our ping might be + delayed, so we may disconnect them by accident. + + ssh.port = (strports string, optional) + ssh.authorized_keys_file = (filename, optional) + + This enables an SSH-based interactive Python shell, which can be used to + inspect the internal state of the node, for debugging. To cause the node to + accept SSH connections on port 8022 from the same keys as the rest of your + account, use: + + [tub] + ssh.port = 8022 + ssh.authorized_keys_file = ~/.ssh/authorized_keys + + tempdir = (string, optional) + + This specifies a temporary directory for the webapi server to use, for + holding large files while they are being uploaded. 
If a webapi client + attempts to upload a 10GB file, this tempdir will need to have at least 10GB + available for the upload to complete. + + The default value is the "tmp" directory in the node's base directory (i.e. + $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for + files that usually (on a unix system) go into /tmp . The string will be + interpreted relative to the node's base directory. + +Client Configuration +==================== + +:: + + [client] + introducer.furl = (FURL string, mandatory) + + This FURL tells the client how to connect to the introducer. Each Tahoe grid + is defined by an introducer. The introducer's furl is created by the + introducer node and written into its base directory when it starts, + whereupon it should be published to everyone who wishes to attach a client + to that grid + + helper.furl = (FURL string, optional) + + If provided, the node will attempt to connect to and use the given helper + for uploads. See docs/helper.txt for details. + + key_generator.furl = (FURL string, optional) + + If provided, the node will attempt to connect to and use the given + key-generator service, using RSA keys from the external process rather than + generating its own. + + stats_gatherer.furl = (FURL string, optional) + + If provided, the node will connect to the given stats gatherer and provide + it with operational statistics. + + shares.needed = (int, optional) aka "k", default 3 + shares.total = (int, optional) aka "N", N >= k, default 10 + shares.happy = (int, optional) 1 <= happy <= N, default 7 + + These three values set the default encoding parameters. Each time a new file + is uploaded, erasure-coding is used to break the ciphertext into separate + pieces. There will be "N" (i.e. shares.total) pieces created, and the file + will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved. + The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10). 
+ Setting k to 1 is equivalent to simple replication (uploading N copies of + the file). + + These values control the tradeoff between storage overhead, performance, and + reliability. To a first approximation, a 1MB file will use (1MB*N/k) of + backend storage space (the actual value will be a bit more, because of other + forms of overhead). Up to N-k shares can be lost before the file becomes + unrecoverable, so assuming there are at least N servers, up to N-k servers + can be offline without losing the file. So large N/k ratios are more + reliable, and small N/k ratios use less disk space. Clearly, k must never be + larger than N. + + Large values of N will slow down upload operations slightly, since more + servers must be involved, and will slightly increase storage overhead due to + the hash trees that are created. Large values of k will cause downloads to + be marginally slower, because more servers must be involved. N cannot be + larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe + uses. + + shares.happy allows you control over the distribution of your immutable file. + For a successful upload, shares are guaranteed to be initially placed on + at least 'shares.happy' distinct servers, the correct functioning of any + k of which is sufficient to guarantee the availability of the uploaded file. + This value should not be larger than the number of servers on your grid. + + A value of shares.happy <= k is allowed, but does not provide any redundancy + if some servers fail or lose shares. + + (Mutable files use a different share placement algorithm that does not + consider this parameter.) + + +Storage Server Configuration +============================ + +:: + + [storage] + enabled = (boolean, optional) + + If this is True, the node will run a storage server, offering space to other + clients. If it is False, the node will not run a storage server, meaning + that no shares will be stored on this node. 
Use False for clients who + do not wish to provide storage service. The default value is True. + + readonly = (boolean, optional) + + If True, the node will run a storage server but will not accept any shares, + making it effectively read-only. Use this for storage servers which are + being decommissioned: the storage/ directory could be mounted read-only, + while shares are moved to other servers. Note that this currently only + affects immutable shares. Mutable shares (used for directories) will be + written and modified anyway. See ticket #390 for the current status of this + bug. The default value is False. + + reserved_space = (str, optional) + + If provided, this value defines how much disk space is reserved: the storage + server will not accept any share which causes the amount of free disk space + to drop below this value. (The free space is measured by a call to statvfs(2) + on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the + user account under which the storage server runs.) + + This string contains a number, with an optional case-insensitive scale + suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So + "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same + thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing. + + expire.enabled = + expire.mode = + expire.override_lease_duration = + expire.cutoff_date = + expire.immutable = + expire.mutable = + + These settings control garbage-collection, in which the server will delete + shares that no longer have an up-to-date lease on them. Please see the + neighboring "garbage-collection.txt" document for full details. + + +Running A Helper +================ + +A "helper" is a regular client node that also offers the "upload helper" +service. + +:: + + [helper] + enabled = (boolean, optional) + + If True, the node will run a helper (see docs/helper.txt for details).
The + helper's contact FURL will be placed in private/helper.furl, from which it + can be copied to any clients which wish to use it. Clearly nodes should not + both run a helper and attempt to use one: do not create both helper.furl and + run_helper in the same node. The default is False. + + +Running An Introducer +===================== + +The introducer node uses a different '.tac' file (named introducer.tac), and +pays attention to the "[node]" section, but not the others. + +The Introducer node maintains some different state than regular client +nodes. + +BASEDIR/introducer.furl : This is generated the first time the introducer +node is started, and used again on subsequent runs, to give the introduction +service a persistent long-term identity. This file should be published and +copied into new client nodes before they are started for the first time. + + +Other Files in BASEDIR +====================== + +Some configuration is not kept in tahoe.cfg, for the following reasons: + +* it is generated by the node at startup, e.g. encryption keys. The node + never writes to tahoe.cfg +* it is generated by user action, e.g. the 'tahoe create-alias' command + +In addition, non-configuration persistent state is kept in the node's base +directory, next to the configuration knobs. + +This section describes these other files. + +private/node.pem + This contains an SSL private-key certificate. The node + generates this the first time it is started, and re-uses it on subsequent + runs. This certificate allows the node to have a cryptographically-strong + identifier (the Foolscap "TubID"), and to establish secure connections to + other nodes. + +storage/ + Nodes which host StorageServers will create this directory to hold + shares of files on behalf of other clients. There will be a directory + underneath it for each StorageIndex for which this node is holding shares. 
+ There is also an "incoming" directory where partially-completed shares are + held while they are being received. + +client.tac + this file defines the client, by constructing the actual Client + instance each time the node is started. It is used by the 'twistd' + daemonization program (in the "-y" mode), which is run internally by the + "tahoe start" command. This file is created by the "tahoe create-node" or + "tahoe create-client" commands. + +private/control.furl + this file contains a FURL that provides access to a + control port on the client node, from which files can be uploaded and + downloaded. This file is created with permissions that prevent anyone else + from reading it (on operating systems that support such a concept), to insure + that only the owner of the client node can use this feature. This port is + intended for debugging and testing use. + +private/logport.furl + this file contains a FURL that provides access to a + 'log port' on the client node, from which operational logs can be retrieved. + Do not grant logport access to strangers, because occasionally secret + information may be placed in the logs. + +private/helper.furl + if the node is running a helper (for use by other + clients), its contact FURL will be placed here. See docs/helper.txt for more + details. + +private/root_dir.cap (optional) + The command-line tools will read a directory + cap out of this file and use it, if you don't specify a '--dir-cap' option or + if you specify '--dir-cap=root'. + +private/convergence (automatically generated) + An added secret for encrypting + immutable files. Everyone who has this same string in their + private/convergence file encrypts their immutable files in the same way when + uploading them. 
This causes identical files to "converge" -- to share the + same storage space since they have identical ciphertext -- which conserves + space and optimizes upload time, but it also exposes files to the possibility + of a brute-force attack by people who know that string. In this attack, if + the attacker can guess most of the contents of a file, then they can use + brute-force to learn the remaining contents. + +So the set of people who know your private/convergence string is the set of +people who converge their storage space with you when you and they upload +identical immutable files, and it is also the set of people who could mount +such an attack. + +The content of the private/convergence file is a base-32 encoded string. If +the file doesn't exist, then when the Tahoe client starts up it will generate +a random 256-bit string and write the base-32 encoding of this string into +the file. If you want to converge your immutable files with as many people as +possible, put the empty string (so that private/convergence is a zero-length +file). + +Other files +=========== + +logs/ + Each Tahoe node creates a directory to hold the log messages produced + as the node runs. These logfiles are created and rotated by the "twistd" + daemonization program, so logs/twistd.log will contain the most recent + messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2 + will be older still, and so on. twistd rotates logfiles after they grow + beyond 1MB in size. If the space consumed by logfiles becomes troublesome, + they should be pruned: a cron job to delete all files that were created more + than a month ago in this logs/ directory should be sufficient. + +my_nodeid + this is written by all nodes after startup, and contains a + base32-encoded (i.e. human-readable) NodeID that identifies this specific + node. 
This NodeID is the same string that gets displayed on the web page (in + the "which peers am I connected to" list), and the shortened form (the first + characters) is recorded in various log messages. + +Backwards Compatibility Files +============================= + +Tahoe releases before 1.3.0 had no 'tahoe.cfg' file, and used distinct files +for each item listed below. For each configuration knob, if the distinct file +exists, it will take precedence over the corresponding item in tahoe.cfg. + +=========================== =============================== ================= +Config setting File Comment +=========================== =============================== ================= +[node]nickname BASEDIR/nickname +[node]web.port BASEDIR/webport +[node]tub.port BASEDIR/client.port (for Clients, not Introducers) +[node]tub.port BASEDIR/introducer.port (for Introducers, not Clients) (note that, unlike other keys, tahoe.cfg overrides this file) +[node]tub.location BASEDIR/advertised_ip_addresses +[node]log_gatherer.furl BASEDIR/log_gatherer.furl (one per line) +[node]timeout.keepalive BASEDIR/keepalive_timeout +[node]timeout.disconnect BASEDIR/disconnect_timeout +[client]introducer.furl BASEDIR/introducer.furl +[client]helper.furl BASEDIR/helper.furl +[client]key_generator.furl BASEDIR/key_generator.furl +[client]stats_gatherer.furl BASEDIR/stats_gatherer.furl +[storage]enabled BASEDIR/no_storage (False if no_storage exists) +[storage]readonly BASEDIR/readonly_storage (True if readonly_storage exists) +[storage]sizelimit BASEDIR/sizelimit +[storage]debug_discard BASEDIR/debug_discard_storage +[helper]enabled BASEDIR/run_helper (True if run_helper exists) +=========================== =============================== ================= + +Note: the functionality of [node]ssh.port and [node]ssh.authorized_keys_file +were previously combined, controlled by the presence of a +BASEDIR/authorized_keys.SSHPORT file, in which the suffix of the filename +indicated which port the ssh 
server should listen on, and the contents of the +file provided the ssh public keys to accept. Support for these files has been +removed completely. To ssh into your Tahoe node, add [node]ssh.port and +[node]ssh.authorized_keys_file statements to your tahoe.cfg. + +Likewise, the functionality of [node]tub.location is a variant of the +now-unsupported BASEDIR/advertised_ip_addresses . The old file was additive +(the addresses specified in advertised_ip_addresses were used in addition to +any that were automatically discovered), whereas the new tahoe.cfg directive +is not (tub.location is used verbatim). + + +Example +======= + +The following is a sample tahoe.cfg file, containing values for all keys +described above. Note that this is not a recommended configuration (most of +these are not the default values), merely a legal one. + +:: + + [node] + nickname = Bob's Tahoe Node + tub.port = 34912 + tub.location = 123.45.67.89:8098,44.55.66.77:8098 + web.port = 3456 + log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm + timeout.keepalive = 240 + timeout.disconnect = 1800 + ssh.port = 8022 + ssh.authorized_keys_file = ~/.ssh/authorized_keys + + [client] + introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo + helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr + + [storage] + enabled = True + readonly = True + sizelimit = 10000000000 + + [helper] + enabled = True diff --git a/docs/configuration.txt b/docs/configuration.txt deleted file mode 100644 index 2f932140..00000000 --- a/docs/configuration.txt +++ /dev/null @@ -1,529 +0,0 @@ - -= Configuring a Tahoe node = - -A Tahoe node is configured by writing to files in its base directory. These -files are read by the node when it starts, so each time you change them, you -need to restart the node. - -The node also writes state to its base directory, so it will create files on -its own.
- -This document contains a complete list of the config files that are examined -by the client node, as well as the state files that you'll observe in its -base directory. - -The main file is named 'tahoe.cfg', which is an ".INI"-style configuration -file (parsed by the Python stdlib 'ConfigParser' module: "[name]" section -markers, lines with "key.subkey: value", rfc822-style continuations). There -are other files that contain information which does not easily fit into this -format. The 'tahoe create-node' or 'tahoe create-client' command will create -an initial tahoe.cfg file for you. After creation, the node will never modify -the 'tahoe.cfg' file: all persistent state is put in other files. - -The item descriptions below use the following types: - - boolean: one of (True, yes, on, 1, False, off, no, 0), case-insensitive - strports string: a Twisted listening-port specification string, like "tcp:80" - or "tcp:3456:interface=127.0.0.1". For a full description of - the format, see - http://twistedmatrix.com/documents/current/api/twisted.application.strports.html - FURL string: a Foolscap endpoint identifier, like - pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm - - -== Overall Node Configuration == - -This section controls the network behavior of the node overall: which ports -and IP addresses are used, when connections are timed out, etc. This -configuration is independent of the services that the node is offering: the -same controls are used for client and introducer nodes. - -If your node is behind a firewall or NAT device and you want other clients to -connect to it, you'll need to open a port in the firewall or NAT, and specify -that port number in the tub.port option. If behind a NAT, you *may* need to -set the tub.location option described below. - - -[node] - -nickname = (UTF-8 string, optional) - - This value will be displayed in management tools as this node's "nickname". - If not provided, the nickname will be set to "". 
This string - shall be a UTF-8 encoded unicode string. - -web.port = (strports string, optional) - - This controls where the node's webserver should listen, providing filesystem - access and node status as defined in webapi.txt . This file contains a - Twisted "strports" specification such as "3456" or - "tcp:3456:interface=127.0.0.1". The 'tahoe create-node' or 'tahoe create-client' - commands set the web.port to "tcp:3456:interface=127.0.0.1" by default; this - is overridable by the "--webport" option. You can make it use SSL by writing - "ssl:3456:privateKey=mykey.pem:certKey=cert.pem" instead. - - If this is not provided, the node will not run a web server. - -web.static = (string, optional) - - This controls where the /static portion of the URL space is served. The - value is a directory name (~username is allowed, and non-absolute names are - interpreted relative to the node's basedir) which can contain HTML and other - files. This can be used to serve a javascript-based frontend to the Tahoe - node, or other services. - - The default value is "public_html", which will serve $BASEDIR/public_html . - With the default settings, http://127.0.0.1:3456/static/foo.html will serve - the contents of $BASEDIR/public_html/foo.html . - -tub.port = (integer, optional) - - This controls which port the node uses to accept Foolscap connections from - other nodes. If not provided, the node will ask the kernel for any available - port. The port will be written to a separate file (named client.port or - introducer.port), so that subsequent runs will re-use the same port. - -tub.location = (string, optional) - - In addition to running as a client, each Tahoe node also runs as a server, - listening for connections from other Tahoe clients. The node announces its - location by publishing a "FURL" (a string with some connection hints) to the - Introducer. The string it publishes can be found in - $BASEDIR/private/storage.furl . 
The "tub.location" configuration controls - what location is published in this announcement. - - If you don't provide tub.location, the node will try to figure out a useful - one by itself, by using tools like 'ifconfig' to determine the set of IP - addresses on which it can be reached from nodes both near and far. It will - also include the TCP port number on which it is listening (either the one - specified by tub.port, or whichever port was assigned by the kernel when - tub.port is left unspecified). - - You might want to override this value if your node lives behind a firewall - that is doing inbound port forwarding, or if you are using other proxies - such that the local IP address or port number is not the same one that - remote clients should use to connect. You might also want to control this - when using a Tor proxy to avoid revealing your actual IP address through the - Introducer announcement. - - The value is a comma-separated string of host:port location hints, like - this: - - 123.45.67.89:8098,tahoe.example.com:8098,127.0.0.1:8098 - - A few examples: - - Emulate default behavior, assuming your host has IP address 123.45.67.89 - and the kernel-allocated port number was 8098: - - tub.port = 8098 - tub.location = 123.45.67.89:8098,127.0.0.1:8098 - - Use a DNS name so you can change the IP address more easily: - - tub.port = 8098 - tub.location = tahoe.example.com:8098 - - Run a node behind a firewall (which has an external IP address) that has - been configured to forward port 7912 to our internal node's port 8098: - - tub.port = 8098 - tub.location = external-firewall.example.com:7912 - - Run a node behind a Tor proxy (perhaps via torsocks), in client-only mode - (i.e. we can make outbound connections, but other nodes will not be able to - connect to us). The literal 'unreachable.example.org' will not resolve, but - will serve as a reminder to human observers that this node cannot be - reached. "Don't call us.. 
we'll call you": - - tub.port = 8098 - tub.location = unreachable.example.org:0 - - Run a node behind a Tor proxy, and make the server available as a Tor - "hidden service". (this assumes that other clients are running their node - with torsocks, such that they are prepared to connect to a .onion address). - The hidden service must first be configured in Tor, by giving it a local - port number and then obtaining a .onion name, using something in the torrc - file like: - - HiddenServiceDir /var/lib/tor/hidden_services/tahoe - HiddenServicePort 29212 127.0.0.1:8098 - - once Tor is restarted, the .onion hostname will be in - /var/lib/tor/hidden_services/tahoe/hostname . Then set up your tahoe.cfg - like: - - tub.port = 8098 - tub.location = ualhejtq2p7ohfbb.onion:29212 - - Most users will not need to set tub.location . - - Note that the old 'advertised_ip_addresses' file from earlier releases is no - longer supported. Tahoe 1.3.0 and later will ignore this file. - -log_gatherer.furl = (FURL, optional) - - If provided, this contains a single FURL string which is used to contact a - 'log gatherer', which will be granted access to the logport. This can be - used by centralized storage meshes to gather operational logs in a single - place. Note that when an old-style BASEDIR/log_gatherer.furl file exists - (see 'Backwards Compatibility Files', below), both are used. (for most other - items, the separate config file overrides the entry in tahoe.cfg) - -timeout.keepalive = (integer in seconds, optional) -timeout.disconnect = (integer in seconds, optional) - - If timeout.keepalive is provided, it is treated as an integral number of - seconds, and sets the Foolscap "keepalive timer" to that value. For each - connection to another node, if nothing has been heard for a while, we will - attempt to provoke the other end into saying something. The duration of - silence that passes before sending the PING will be between KT and 2*KT. 
- This is mainly intended to keep NAT boxes from expiring idle TCP sessions, - but also gives TCP's long-duration keepalive/disconnect timers some traffic - to work with. The default value is 240 (i.e. 4 minutes). - - If timeout.disconnect is provided, this is treated as an integral number of - seconds, and sets the Foolscap "disconnect timer" to that value. For each - connection to another node, if nothing has been heard for a while, we will - drop the connection. The duration of silence that passes before dropping the - connection will be between DT-2*KT and 2*DT+2*KT (please see ticket #521 for - more details). If we are sending a large amount of data to the other end - (which takes more than DT-2*KT to deliver), we might incorrectly drop the - connection. The default behavior (when this value is not provided) is to - disable the disconnect timer. - - See ticket #521 for a discussion of how to pick these timeout values. Using - 30 minutes means we'll disconnect after 22 to 68 minutes of inactivity. - Receiving data will reset this timeout, however if we have more than 22min - of data in the outbound queue (such as 800kB in two pipelined segments of 10 - shares each) and the far end has no need to contact us, our ping might be - delayed, so we may disconnect them by accident. - -ssh.port = (strports string, optional) -ssh.authorized_keys_file = (filename, optional) - - This enables an SSH-based interactive Python shell, which can be used to - inspect the internal state of the node, for debugging. To cause the node to - accept SSH connections on port 8022 from the same keys as the rest of your - account, use: - - [tub] - ssh.port = 8022 - ssh.authorized_keys_file = ~/.ssh/authorized_keys - -tempdir = (string, optional) - - This specifies a temporary directory for the webapi server to use, for - holding large files while they are being uploaded. 
If a webapi client - attempts to upload a 10GB file, this tempdir will need to have at least 10GB - available for the upload to complete. - - The default value is the "tmp" directory in the node's base directory (i.e. - $NODEDIR/tmp), but it can be placed elsewhere. This directory is used for - files that usually (on a unix system) go into /tmp . The string will be - interpreted relative to the node's base directory. - -== Client Configuration == - -[client] -introducer.furl = (FURL string, mandatory) - - This FURL tells the client how to connect to the introducer. Each Tahoe grid - is defined by an introducer. The introducer's furl is created by the - introducer node and written into its base directory when it starts, - whereupon it should be published to everyone who wishes to attach a client - to that grid - -helper.furl = (FURL string, optional) - - If provided, the node will attempt to connect to and use the given helper - for uploads. See docs/helper.txt for details. - -key_generator.furl = (FURL string, optional) - - If provided, the node will attempt to connect to and use the given - key-generator service, using RSA keys from the external process rather than - generating its own. - -stats_gatherer.furl = (FURL string, optional) - - If provided, the node will connect to the given stats gatherer and provide - it with operational statistics. - -shares.needed = (int, optional) aka "k", default 3 -shares.total = (int, optional) aka "N", N >= k, default 10 -shares.happy = (int, optional) 1 <= happy <= N, default 7 - - These three values set the default encoding parameters. Each time a new file - is uploaded, erasure-coding is used to break the ciphertext into separate - pieces. There will be "N" (i.e. shares.total) pieces created, and the file - will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved. - The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10). 
- Setting k to 1 is equivalent to simple replication (uploading N copies of - the file). - - These values control the tradeoff between storage overhead, performance, and - reliability. To a first approximation, a 1MB file will use (1MB*N/k) of - backend storage space (the actual value will be a bit more, because of other - forms of overhead). Up to N-k shares can be lost before the file becomes - unrecoverable, so assuming there are at least N servers, up to N-k servers - can be offline without losing the file. So large N/k ratios are more - reliable, and small N/k ratios use less disk space. Clearly, k must never be - smaller than N. - - Large values of N will slow down upload operations slightly, since more - servers must be involved, and will slightly increase storage overhead due to - the hash trees that are created. Large values of k will cause downloads to - be marginally slower, because more servers must be involved. N cannot be - larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe - uses. - - shares.happy allows you control over the distribution of your immutable file. - For a successful upload, shares are guaranteed to be initially placed on - at least 'shares.happy' distinct servers, the correct functioning of any - k of which is sufficient to guarantee the availability of the uploaded file. - This value should not be larger than the number of servers on your grid. - - A value of shares.happy <= k is allowed, but does not provide any redundancy - if some servers fail or lose shares. - - (Mutable files use a different share placement algorithm that does not - consider this parameter.) - - -== Storage Server Configuration == - -[storage] -enabled = (boolean, optional) - - If this is True, the node will run a storage server, offering space to other - clients. If it is False, the node will not run a storage server, meaning - that no shares will be stored on this node. 
Use False this for clients who - do not wish to provide storage service. The default value is True. - -readonly = (boolean, optional) - - If True, the node will run a storage server but will not accept any shares, - making it effectively read-only. Use this for storage servers which are - being decommissioned: the storage/ directory could be mounted read-only, - while shares are moved to other servers. Note that this currently only - affects immutable shares. Mutable shares (used for directories) will be - written and modified anyway. See ticket #390 for the current status of this - bug. The default value is False. - -reserved_space = (str, optional) - - If provided, this value defines how much disk space is reserved: the storage - server will not accept any share which causes the amount of free disk space - to drop below this value. (The free space is measured by a call to statvfs(2) - on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the - user account under which the storage server runs.) - - This string contains a number, with an optional case-insensitive scale - suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So - "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same - thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing. - -expire.enabled = -expire.mode = -expire.override_lease_duration = -expire.cutoff_date = -expire.immutable = -expire.mutable = - - These settings control garbage-collection, in which the server will delete - shares that no longer have an up-to-date lease on them. Please see the - neighboring "garbage-collection.txt" document for full details. - - -== Running A Helper == - -A "helper" is a regular client node that also offers the "upload helper" -service. - -[helper] -enabled = (boolean, optional) - - If True, the node will run a helper (see docs/helper.txt for details). 
The - helper's contact FURL will be placed in private/helper.furl, from which it - can be copied to any clients which wish to use it. Clearly nodes should not - both run a helper and attempt to use one: do not create both helper.furl and - run_helper in the same node. The default is False. - - -== Running An Introducer == - -The introducer node uses a different '.tac' file (named introducer.tac), and -pays attention to the "[node]" section, but not the others. - -The Introducer node maintains some different state than regular client -nodes. - -BASEDIR/introducer.furl : This is generated the first time the introducer -node is started, and used again on subsequent runs, to give the introduction -service a persistent long-term identity. This file should be published and -copied into new client nodes before they are started for the first time. - - -== Other Files in BASEDIR == - -Some configuration is not kept in tahoe.cfg, for the following reasons: - - * it is generated by the node at startup, e.g. encryption keys. The node - never writes to tahoe.cfg - * it is generated by user action, e.g. the 'tahoe create-alias' command - -In addition, non-configuration persistent state is kept in the node's base -directory, next to the configuration knobs. - -This section describes these other files. - - -private/node.pem : This contains an SSL private-key certificate. The node -generates this the first time it is started, and re-uses it on subsequent -runs. This certificate allows the node to have a cryptographically-strong -identifier (the Foolscap "TubID"), and to establish secure connections to -other nodes. - -storage/ : Nodes which host StorageServers will create this directory to hold -shares of files on behalf of other clients. There will be a directory -underneath it for each StorageIndex for which this node is holding shares. -There is also an "incoming" directory where partially-completed shares are -held while they are being received. 
- -client.tac : this file defines the client, by constructing the actual Client -instance each time the node is started. It is used by the 'twistd' -daemonization program (in the "-y" mode), which is run internally by the -"tahoe start" command. This file is created by the "tahoe create-node" or -"tahoe create-client" commands. - -private/control.furl : this file contains a FURL that provides access to a -control port on the client node, from which files can be uploaded and -downloaded. This file is created with permissions that prevent anyone else -from reading it (on operating systems that support such a concept), to insure -that only the owner of the client node can use this feature. This port is -intended for debugging and testing use. - -private/logport.furl : this file contains a FURL that provides access to a -'log port' on the client node, from which operational logs can be retrieved. -Do not grant logport access to strangers, because occasionally secret -information may be placed in the logs. - -private/helper.furl : if the node is running a helper (for use by other -clients), its contact FURL will be placed here. See docs/helper.txt for more -details. - -private/root_dir.cap (optional): The command-line tools will read a directory -cap out of this file and use it, if you don't specify a '--dir-cap' option or -if you specify '--dir-cap=root'. - -private/convergence (automatically generated): An added secret for encrypting -immutable files. Everyone who has this same string in their -private/convergence file encrypts their immutable files in the same way when -uploading them. This causes identical files to "converge" -- to share the -same storage space since they have identical ciphertext -- which conserves -space and optimizes upload time, but it also exposes files to the possibility -of a brute-force attack by people who know that string. 
In this attack, if -the attacker can guess most of the contents of a file, then they can use -brute-force to learn the remaining contents. - -So the set of people who know your private/convergence string is the set of -people who converge their storage space with you when you and they upload -identical immutable files, and it is also the set of people who could mount -such an attack. - -The content of the private/convergence file is a base-32 encoded string. If -the file doesn't exist, then when the Tahoe client starts up it will generate -a random 256-bit string and write the base-32 encoding of this string into -the file. If you want to converge your immutable files with as many people as -possible, put the empty string (so that private/convergence is a zero-length -file). - - -== Other files == - -logs/ : Each Tahoe node creates a directory to hold the log messages produced -as the node runs. These logfiles are created and rotated by the "twistd" -daemonization program, so logs/twistd.log will contain the most recent -messages, logs/twistd.log.1 will contain the previous ones, logs/twistd.log.2 -will be older still, and so on. twistd rotates logfiles after they grow -beyond 1MB in size. If the space consumed by logfiles becomes troublesome, -they should be pruned: a cron job to delete all files that were created more -than a month ago in this logs/ directory should be sufficient. - -my_nodeid : this is written by all nodes after startup, and contains a -base32-encoded (i.e. human-readable) NodeID that identifies this specific -node. This NodeID is the same string that gets displayed on the web page (in -the "which peers am I connected to" list), and the shortened form (the first -characters) is recorded in various log messages. - - -== Backwards Compatibility Files == - -Tahoe releases before 1.3.0 had no 'tahoe.cfg' file, and used distinct files -for each item listed below. 
For each configuration knob, if the distinct file -exists, it will take precedence over the corresponding item in tahoe.cfg . - - -[node]nickname : BASEDIR/nickname -[node]web.port : BASEDIR/webport -[node]tub.port : BASEDIR/client.port (for Clients, not Introducers) -[node]tub.port : BASEDIR/introducer.port (for Introducers, not Clients) - (note that, unlike other keys, tahoe.cfg overrides the *.port file) -[node]tub.location : replaces BASEDIR/advertised_ip_addresses -[node]log_gatherer.furl : BASEDIR/log_gatherer.furl (one per line) -[node]timeout.keepalive : BASEDIR/keepalive_timeout -[node]timeout.disconnect : BASEDIR/disconnect_timeout -[client]introducer.furl : BASEDIR/introducer.furl -[client]helper.furl : BASEDIR/helper.furl -[client]key_generator.furl : BASEDIR/key_generator.furl -[client]stats_gatherer.furl : BASEDIR/stats_gatherer.furl -[storage]enabled : BASEDIR/no_storage (False if no_storage exists) -[storage]readonly : BASEDIR/readonly_storage (True if readonly_storage exists) -[storage]sizelimit : BASEDIR/sizelimit -[storage]debug_discard : BASEDIR/debug_discard_storage -[helper]enabled : BASEDIR/run_helper (True if run_helper exists) - -Note: the functionality of [node]ssh.port and [node]ssh.authorized_keys_file -were previously combined, controlled by the presence of a -BASEDIR/authorized_keys.SSHPORT file, in which the suffix of the filename -indicated which port the ssh server should listen on, and the contents of the -file provided the ssh public keys to accept. Support for these files has been -removed completely. To ssh into your Tahoe node, add [node]ssh.port and -[node].ssh_authorized_keys_file statements to your tahoe.cfg . - -Likewise, the functionality of [node]tub.location is a variant of the -now-unsupported BASEDIR/advertised_ip_addresses . 
The old file was additive -(the addresses specified in advertised_ip_addresses were used in addition to -any that were automatically discovered), whereas the new tahoe.cfg directive -is not (tub.location is used verbatim). - - -== Example == - -The following is a sample tahoe.cfg file, containing values for all keys -described above. Note that this is not a recommended configuration (most of -these are not the default values), merely a legal one. - -[node] -nickname = Bob's Tahoe Node -tub.port = 34912 -tub.location = 123.45.67.89:8098,44.55.66.77:8098 -web.port = 3456 -log_gatherer.furl = pb://soklj4y7eok5c3xkmjeqpw@192.168.69.247:44801/eqpwqtzm -timeout.keepalive = 240 -timeout.disconnect = 1800 -ssh.port = 8022 -ssh.authorized_keys_file = ~/.ssh/authorized_keys - -[client] -introducer.furl = pb://ok45ssoklj4y7eok5c3xkmj@tahoe.example:44801/ii3uumo -helper.furl = pb://ggti5ssoklj4y7eok5c3xkmj@helper.tahoe.example:7054/kk8lhr - -[storage] -enabled = True -readonly_storage = True -sizelimit = 10000000000 - -[helper] -run_helper = True diff --git a/docs/debian.txt b/docs/debian.rst similarity index 83% rename from docs/debian.txt rename to docs/debian.rst index afa0a7fe..a0248f18 100644 --- a/docs/debian.txt +++ b/docs/debian.rst @@ -1,27 +1,31 @@ -= Debian Support = +============== +Debian Support +============== -1. Overview -2. TL;DR supporting package building instructions -3. TL;DR package building instructions for Tahoe -4. Building Debian Packages -5. Using Pre-Built Debian Packages -6. Building From Source on Debian Systems +1. `Overview`_ +2. `TL;DR supporting package building instructions`_ +3. `TL;DR package building instructions for Tahoe`_ +4. `Building Debian Packages`_ +5. `Using Pre-Built Debian Packages`_ +6. `Building From Source on Debian Systems`_ -= Overview == +Overview +======== One convenient way to install Tahoe-LAFS is with debian packages. This document attempts to explain how to complete a desert island build for people in a hurry. 
It also attempts to explain more about our Debian packaging for those willing to read beyond the simple pragmatic packaging exercises. -== TL;DR supporting package building instructions == +TL;DR supporting package building instructions +============================================== There are only four supporting packages that are currently not available from -the debian apt repositories in Debian Lenny: +the debian apt repositories in Debian Lenny:: python-foolscap python-zfec argparse zbase32 -First, we'll install some common packages for development: +First, we'll install some common packages for development:: sudo apt-get install -y build-essential debhelper cdbs python-central \ python-setuptools python python-dev python-twisted-core \ @@ -31,7 +35,7 @@ First, we'll install some common packages for development: sudo apt-file update -To create packages for Lenny, we'll also install stdeb: +To create packages for Lenny, we'll also install stdeb:: sudo apt-get install python-all-dev STDEB_VERSION="0.5.1" @@ -41,7 +45,7 @@ To create packages for Lenny, we'll also install stdeb: python setup.py --command-packages=stdeb.command bdist_deb sudo dpkg -i deb_dist/python-stdeb_$STDEB_VERSION-1_all.deb -Now we're ready to build and install the zfec Debian package: +Now we're ready to build and install the zfec Debian package:: darcs get http://allmydata.org/source/zfec/trunk zfac cd zfac/zfec/ @@ -50,7 +54,7 @@ Now we're ready to build and install the zfec Debian package: dpkg-buildpackage -rfakeroot -uc -us sudo dpkg -i ../python-zfec_1.4.6-r333-1_amd64.deb -We need to build a pyutil package: +We need to build a pyutil package:: wget http://pypi.python.org/packages/source/p/pyutil/pyutil-1.6.1.tar.gz tar -xvzf pyutil-1.6.1.tar.gz @@ -60,12 +64,12 @@ We need to build a pyutil package: dpkg-buildpackage -rfakeroot -uc -us sudo dpkg -i ../python-pyutil_1.6.1-1_all.deb -We also need to install argparse and zbase32: +We also need to install argparse and zbase32:: sudo 
easy_install argparse # argparse won't install with stdeb (!) :-( sudo easy_install zbase32 # XXX TODO: package with stdeb -Finally, we'll fetch, unpack, build and install foolscap: +Finally, we'll fetch, unpack, build and install foolscap:: # You may not already have Brian's key: # gpg --recv-key 0x1514A7BD @@ -79,10 +83,11 @@ Finally, we'll fetch, unpack, build and install foolscap: dpkg-buildpackage -rfakeroot -uc -us sudo dpkg -i ../python-foolscap_0.5.0-1_all.deb -== TL;DR package building instructions for Tahoe == +TL;DR package building instructions for Tahoe +============================================= If you want to build your own Debian packages from the darcs tree or from -a source release, do the following: +a source release, do the following:: cd ~/ mkdir src && cd src/ @@ -98,7 +103,8 @@ supported libraries as .deb packages. You'll need to edit the Debian specific /etc/defaults/allmydata-tahoe file to get Tahoe started. Data is by default stored in /var/lib/tahoelafsd/ and Tahoe runs as the 'tahoelafsd' user. -== Building Debian Packages == +Building Debian Packages +======================== The Tahoe source tree comes with limited support for building debian packages on a variety of Debian and Ubuntu platforms. For each supported platform, @@ -109,7 +115,7 @@ the tree (e.g. "1.1-r2678"). To create debian packages from a Tahoe tree, you will need some additional tools installed. The canonical list of these packages is in the -"Build-Depends" clause of misc/sid/debian/control , and includes: +"Build-Depends" clause of misc/sid/debian/control , and includes:: build-essential debhelper @@ -130,18 +136,20 @@ release, for example if there is no "deb-hardy-head" target, try building Note that we haven't tried to build source packages (.orig.tar.gz + dsc) yet, and there are no such source packages in our APT repository. 
-== Using Pre-Built Debian Packages == +Using Pre-Built Debian Packages +=============================== The allmydata.org site hosts an APT repository with debian packages that are -built after each checkin. The following wiki page describes this repository: - - http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages +built after each checkin. `This wiki page +<http://allmydata.org/trac/tahoe/wiki/DownloadDebianPackages>`_ describes this +repository. The allmydata.org APT repository also includes debian packages of support libraries, like Foolscap, zfec, pycryptopp, and everything else you need that isn't already in debian. -== Building From Source on Debian Systems == +Building From Source on Debian Systems +====================================== Many of Tahoe's build dependencies can be satisfied by first installing certain debian packages: simplejson is one of these. Some debian/ubuntu diff --git a/docs/filesystem-notes.txt b/docs/filesystem-notes.rst similarity index 60% rename from docs/filesystem-notes.txt rename to docs/filesystem-notes.rst index 34e02029..7c261b6d 100644 --- a/docs/filesystem-notes.txt +++ b/docs/filesystem-notes.rst @@ -1,18 +1,24 @@ +========================= +Filesystem-specific notes +========================= + +1. ext3_ Tahoe storage servers use a large number of subdirectories to store their shares on local disk. This format is simple and robust, but depends upon the local filesystem to provide fast access to those directories. -= ext3 = +ext3 +==== For moderate- or large-sized storage servers, you'll want to make sure the "directory index" feature is enabled on your ext3 directories, otherwise share lookup may be very slow. Recent versions of ext3 enable this -automatically, but older filesystems may not have it enabled.
+automatically, but older filesystems may not have it enabled:: -$ sudo tune2fs -l /dev/sda1 |grep feature -Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file + $ sudo tune2fs -l /dev/sda1 |grep feature + Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file If "dir_index" is present in the "features:" line, then you're all set. If not, you'll need to use tune2fs and e2fsck to enable and build the index. See -this page for some hints: http://wiki.dovecot.org/MailboxFormat/Maildir . +<http://wiki.dovecot.org/MailboxFormat/Maildir> for some hints. diff --git a/docs/garbage-collection.txt b/docs/garbage-collection.rst similarity index 76% rename from docs/garbage-collection.txt rename to docs/garbage-collection.rst index 8543a3fb..5bfe6548 100644 --- a/docs/garbage-collection.txt +++ b/docs/garbage-collection.rst @@ -1,12 +1,15 @@ -= Garbage Collection in Tahoe = +=========================== +Garbage Collection in Tahoe +=========================== -1. Overview -2. Client-side Renewal -3. Server Side Expiration -4. Expiration Progress -5. Future Directions +1. `Overview`_ +2. `Client-side Renewal`_ +3. `Server Side Expiration`_ +4. `Expiration Progress`_ +5. `Future Directions`_ -== Overview == +Overview +======== When a file or directory in the virtual filesystem is no longer referenced, the space that its shares occupied on each storage server can be freed, @@ -40,7 +43,8 @@ not yet a way for the client to request a different duration (however the server can use the "expire.override_lease_duration" configuration setting to increase or decrease the effective duration to something other than 31 days). -== Client-side Renewal == +Client-side Renewal +=================== If all of the files and directories which you care about are reachable from a single starting point (usually referred to as a "rootcap"), and you store @@ -69,7 +73,8 @@ perform these jobs in parallel.
Eventually, this daemon will be made appropriate for use by individual users as well, and may be incorporated directly into the client node. -== Server Side Expiration == +Server Side Expiration +====================== Expiration must be explicitly enabled on each storage server, since the default behavior is to never expire shares. Expiration is enabled by adding @@ -112,97 +117,98 @@ for a long period of time: once the lease-checker has examined all shares and expired whatever it is going to expire, the second and subsequent passes are not going to find any new leases to remove. -The tahoe.cfg file uses the following keys to control lease expiration: +The tahoe.cfg file uses the following keys to control lease expiration:: -[storage] + [storage] -expire.enabled = (boolean, optional) + expire.enabled = (boolean, optional) - If this is True, the storage server will delete shares on which all leases - have expired. Other controls dictate when leases are considered to have - expired. The default is False. + If this is True, the storage server will delete shares on which all leases + have expired. Other controls dictate when leases are considered to have + expired. The default is False. -expire.mode = (string, "age" or "cutoff-date", required if expiration enabled) + expire.mode = (string, "age" or "cutoff-date", required if expiration enabled) - If this string is "age", the age-based expiration scheme is used, and the - "expire.override_lease_duration" setting can be provided to influence the - lease ages. If it is "cutoff-date", the absolute-date-cutoff mode is used, - and the "expire.cutoff_date" setting must be provided to specify the cutoff - date. The mode setting currently has no default: you must provide a value. + If this string is "age", the age-based expiration scheme is used, and the + "expire.override_lease_duration" setting can be provided to influence the + lease ages. 
If it is "cutoff-date", the absolute-date-cutoff mode is used, + and the "expire.cutoff_date" setting must be provided to specify the cutoff + date. The mode setting currently has no default: you must provide a value. - In a future release, this setting is likely to default to "age", but in this - release it was deemed safer to require an explicit mode specification. + In a future release, this setting is likely to default to "age", but in this + release it was deemed safer to require an explicit mode specification. -expire.override_lease_duration = (duration string, optional) + expire.override_lease_duration = (duration string, optional) - When age-based expiration is in use, a lease will be expired if its - "lease.create_renew" timestamp plus its "lease.duration" time is - earlier/older than the current time. This key, if present, overrides the - duration value for all leases, changing the algorithm from: + When age-based expiration is in use, a lease will be expired if its + "lease.create_renew" timestamp plus its "lease.duration" time is + earlier/older than the current time. 
This key, if present, overrides the + duration value for all leases, changing the algorithm from: - if (lease.create_renew_timestamp + lease.duration) < now: - expire_lease() + if (lease.create_renew_timestamp + lease.duration) < now: + expire_lease() - to: + to: - if (lease.create_renew_timestamp + override_lease_duration) < now: - expire_lease() + if (lease.create_renew_timestamp + override_lease_duration) < now: + expire_lease() - The value of this setting is a "duration string", which is a number of days, - months, or years, followed by a units suffix, and optionally separated by a - space, such as one of the following: + The value of this setting is a "duration string", which is a number of days, + months, or years, followed by a units suffix, and optionally separated by a + space, such as one of the following: - 7days - 31day - 60 days - 2mo - 3 month - 12 months - 2years + 7days + 31day + 60 days + 2mo + 3 month + 12 months + 2years - This key is meant to compensate for the fact that clients do not yet have - the ability to ask for leases that last longer than 31 days. A grid which - wants to use faster or slower GC than a 31-day lease timer permits can use - this parameter to implement it. The current fixed 31-day lease duration - makes the server behave as if "lease.override_lease_duration = 31days" had - been passed. + This key is meant to compensate for the fact that clients do not yet have + the ability to ask for leases that last longer than 31 days. A grid which + wants to use faster or slower GC than a 31-day lease timer permits can use + this parameter to implement it. The current fixed 31-day lease duration + makes the server behave as if "lease.override_lease_duration = 31days" had + been passed. - This key is only valid when age-based expiration is in use (i.e. when - "expire.mode = age" is used). It will be rejected if cutoff-date expiration - is in use. + This key is only valid when age-based expiration is in use (i.e. 
when + "expire.mode = age" is used). It will be rejected if cutoff-date expiration + is in use. -expire.cutoff_date = (date string, required if mode=cutoff-date) + expire.cutoff_date = (date string, required if mode=cutoff-date) - When cutoff-date expiration is in use, a lease will be expired if its - create/renew timestamp is older than the cutoff date. This string will be a - date in the following format: + When cutoff-date expiration is in use, a lease will be expired if its + create/renew timestamp is older than the cutoff date. This string will be a + date in the following format: - 2009-01-16 (January 16th, 2009) - 2008-02-02 - 2007-12-25 + 2009-01-16 (January 16th, 2009) + 2008-02-02 + 2007-12-25 - The actual cutoff time shall be midnight UTC at the beginning of the given - day. Lease timers should naturally be generous enough to not depend upon - differences in timezone: there should be at least a few days between the - last renewal time and the cutoff date. + The actual cutoff time shall be midnight UTC at the beginning of the given + day. Lease timers should naturally be generous enough to not depend upon + differences in timezone: there should be at least a few days between the + last renewal time and the cutoff date. - This key is only valid when cutoff-based expiration is in use (i.e. when - "expire.mode = cutoff-date"). It will be rejected if age-based expiration is - in use. + This key is only valid when cutoff-based expiration is in use (i.e. when + "expire.mode = cutoff-date"). It will be rejected if age-based expiration is + in use. -expire.immutable = (boolean, optional) + expire.immutable = (boolean, optional) - If this is False, then immutable shares will never be deleted, even if their - leases have expired. This can be used in special situations to perform GC on - mutable files but not immutable ones. The default is True. + If this is False, then immutable shares will never be deleted, even if their + leases have expired. 
This can be used in special situations to perform GC on + mutable files but not immutable ones. The default is True. -expire.mutable = (boolean, optional) + expire.mutable = (boolean, optional) - If this is False, then mutable shares will never be deleted, even if their - leases have expired. This can be used in special situations to perform GC on - immutable files but not mutable ones. The default is True. + If this is False, then mutable shares will never be deleted, even if their + leases have expired. This can be used in special situations to perform GC on + immutable files but not mutable ones. The default is True. -== Expiration Progress == +Expiration Progress +=================== In the current release, leases are stored as metadata in each share file, and no separate database is maintained. As a result, checking and expiring leases @@ -229,7 +235,8 @@ lose significant progress. The state file is located in two files crawler can be forcibly reset by stopping the node, deleting these two files, then restarting the node. -== Future Directions == +Future Directions +================= Tahoe's GC mechanism is undergoing significant changes. The global mark-and-sweep garbage-collection scheme can require considerable network diff --git a/docs/helper.txt b/docs/helper.rst similarity index 73% rename from docs/helper.txt rename to docs/helper.rst index fa53791c..8d0283a2 100644 --- a/docs/helper.txt +++ b/docs/helper.rst @@ -1,11 +1,14 @@ -= The Tahoe Upload Helper = +======================= +The Tahoe Upload Helper +======================= -1. Overview -2. Setting Up A Helper -3. Using a Helper -4. Other Helper Modes +1. `Overview`_ +2. `Setting Up A Helper`_ +3. `Using a Helper`_ +4. `Other Helper Modes`_ -== Overview == +Overview +======== As described in the "SWARMING DOWNLOAD, TRICKLING UPLOAD" section of architecture.txt, Tahoe uploads require more bandwidth than downloads: you @@ -45,31 +48,30 @@ connection to the helper. 
This can improve TCP fairness, and should allow other applications that are sharing the same uplink to compete more evenly for the limited bandwidth. - - -== Setting Up A Helper == +Setting Up A Helper +=================== Who should consider running a helper? - * Benevolent entities which wish to provide better upload speed for clients - that have slow uplinks - * Folks which have machines with upload bandwidth to spare. - * Server grid operators who want clients to connect to a small number of - helpers rather than a large number of storage servers (a "multi-tier" - architecture) +* Benevolent entities which wish to provide better upload speed for clients + that have slow uplinks +* Folks which have machines with upload bandwidth to spare. +* Server grid operators who want clients to connect to a small number of + helpers rather than a large number of storage servers (a "multi-tier" + architecture) What sorts of machines are good candidates for running a helper? - * The Helper needs to have good bandwidth to the storage servers. In - particular, it needs to have at least 3.3x better upload bandwidth than - the client does, or the client might as well upload directly to the - storage servers. In a commercial grid, the helper should be in the same - colo (and preferably in the same rack) as the storage servers. - * The Helper will take on most of the CPU load involved in uploading a file. - So having a dedicated machine will give better results. - * The Helper buffers ciphertext on disk, so the host will need at least as - much free disk space as there will be simultaneous uploads. When an upload - is interrupted, that space will be used for a longer period of time. +* The Helper needs to have good bandwidth to the storage servers. In + particular, it needs to have at least 3.3x better upload bandwidth than + the client does, or the client might as well upload directly to the + storage servers. 
In a commercial grid, the helper should be in the same + colo (and preferably in the same rack) as the storage servers. +* The Helper will take on most of the CPU load involved in uploading a file. + So having a dedicated machine will give better results. +* The Helper buffers ciphertext on disk, so the host will need at least as + much free disk space as there will be simultaneous uploads. When an upload + is interrupted, that space will be used for a longer period of time. To turn a Tahoe-LAFS node into a helper (i.e. to run a helper service in addition to whatever else that node is doing), edit the tahoe.cfg file in your @@ -82,7 +84,9 @@ file named private/helper.furl which contains the contact information for the helper: you will need to give this FURL to any clients that wish to use your helper. - cat $BASEDIR/private/helper.furl |mail -s "helper furl" friend@example.com +:: + + cat $BASEDIR/private/helper.furl | mail -s "helper furl" friend@example.com You can tell if your node is running a helper by looking at its web status page. Assuming that you've set up the 'webport' to use port 3456, point your @@ -105,27 +109,30 @@ finished. For long-running and busy helpers, it may be a good idea to delete files in these directories that have not been modified for a week or two. Future versions of tahoe will try to self-manage these files a bit better. -== Using a Helper == +Using a Helper +============== Who should consider using a Helper? - * clients with limited upstream bandwidth, such as a consumer ADSL line - * clients who believe that the helper will give them faster uploads than - they could achieve with a direct upload - * clients who experience problems with TCP connection fairness: if other - programs or machines in the same home are getting less than their fair - share of upload bandwidth. If the connection is being shared fairly, then - a Tahoe upload that is happening at the same time as a single FTP upload - should get half the bandwidth. 
- * clients who have been given the helper.furl by someone who is running a - Helper and is willing to let them use it +* clients with limited upstream bandwidth, such as a consumer ADSL line +* clients who believe that the helper will give them faster uploads than + they could achieve with a direct upload +* clients who experience problems with TCP connection fairness: if other + programs or machines in the same home are getting less than their fair + share of upload bandwidth. If the connection is being shared fairly, then + a Tahoe upload that is happening at the same time as a single FTP upload + should get half the bandwidth. +* clients who have been given the helper.furl by someone who is running a + Helper and is willing to let them use it To take advantage of somebody else's Helper, take the helper.furl file that they give you, and copy it into your node's base directory, then restart the node: - cat email >$BASEDIR/helper.furl - tahoe restart $BASEDIR +:: + + cat email >$BASEDIR/helper.furl + tahoe restart $BASEDIR This will signal the client to try and connect to the helper. Subsequent uploads will use the helper rather than using direct connections to the @@ -146,15 +153,16 @@ tahoe/foolscap connections. The upload/download status page (http://localhost:3456/status) will announce the using-helper-or-not state of each upload, in the "Helper?" column. -== Other Helper Modes == +Other Helper Modes +================== The Tahoe Helper only currently helps with one kind of operation: uploading immutable files. There are three other things it might be able to help with in the future: - * downloading immutable files - * uploading mutable files (such as directories) - * downloading mutable files (like directories) +* downloading immutable files +* uploading mutable files (such as directories) +* downloading mutable files (like directories) Since mutable files are currently limited in size, the ADSL upstream penalty is not so severe for them. 
There is no ADSL penalty to downloads, but there diff --git a/docs/how_to_make_a_tahoe-lafs_release.txt b/docs/how_to_make_a_tahoe-lafs_release.rst similarity index 100% rename from docs/how_to_make_a_tahoe-lafs_release.txt rename to docs/how_to_make_a_tahoe-lafs_release.rst diff --git a/docs/known_issues.txt b/docs/known_issues.rst similarity index 71% rename from docs/known_issues.txt rename to docs/known_issues.rst index aa112708..58be6ab9 100644 --- a/docs/known_issues.txt +++ b/docs/known_issues.rst @@ -1,14 +1,18 @@ -= known issues = +============ +Known issues +============ -* overview -* issues in Tahoe-LAFS v1.8.0, released 2010-09-23 - - potential unauthorized access by JavaScript in unrelated files - - potential disclosure of file through embedded hyperlinks or JavaScript in that file - - command-line arguments are leaked to other local users - - capabilities may be leaked to web browser phishing filter / "safe browsing" servers === - - known issues in the FTP and SFTP frontends === +* `Overview`_ +* `Issues in Tahoe-LAFS v1.8.0, released 2010-09-23` -== overview == + * `Potential unauthorized access by JavaScript in unrelated files`_ + * `Potential disclosure of file through embedded hyperlinks or JavaScript in that file`_ + * `Command-line arguments are leaked to other local users`_ + * `Capabilities may be leaked to web browser phishing filter / "safe browsing" servers`_ + * `Known issues in the FTP and SFTP frontends`_ + +Overview +======== Below is a list of known issues in recent releases of Tahoe-LAFS, and how to manage them. 
The current version of this file can be found at @@ -21,9 +25,11 @@ want to read the "historical known issues" document: http://tahoe-lafs.org/source/tahoe-lafs/trunk/docs/historical/historical_known_issues.txt -== issues in Tahoe-LAFS v1.8.0, released 2010-09-18 == +Issues in Tahoe-LAFS v1.8.0, released 2010-09-23 +================================================ -=== potential unauthorized access by JavaScript in unrelated files === +Potential unauthorized access by JavaScript in unrelated files +-------------------------------------------------------------- If you view a file stored in Tahoe-LAFS through a web user interface, JavaScript embedded in that file might be able to access other files or @@ -33,11 +39,12 @@ those other files or directories to the author of the script, and if you have the ability to modify the contents of those files or directories, then that script could modify or delete those files or directories. -==== how to manage it ==== +how to manage it +~~~~~~~~~~~~~~~~ For future versions of Tahoe-LAFS, we are considering ways to close off this leakage of authority while preserving ease of use -- the discussion -of this issue is ticket #615. +of this issue is ticket `#615 `_. For the present, either do not view files stored in Tahoe-LAFS through a web user interface, or turn off JavaScript in your web browser before @@ -45,7 +52,8 @@ doing so, or limit your viewing to files which you know don't contain malicious JavaScript. -=== potential disclosure of file through embedded hyperlinks or JavaScript in that file === +Potential disclosure of file through embedded hyperlinks or JavaScript in that file +----------------------------------------------------------------------------------- If there is a file stored on a Tahoe-LAFS storage grid, and that file gets downloaded and displayed in a web browser, then JavaScript or @@ -61,11 +69,12 @@ file. 
Note that IMG tags are typically followed automatically by web browsers, so being careful which hyperlinks you click on is not sufficient to prevent this from happening. -==== how to manage it ==== +how to manage it +~~~~~~~~~~~~~~~~ For future versions of Tahoe-LAFS, we are considering ways to close off this leakage of authority while preserving ease of use -- the discussion -of this issue is ticket #127. +of this issue is ticket `#127 `_. For the present, a good work-around is that if you want to store and view a file on Tahoe-LAFS and you want that file to remain private, then @@ -74,7 +83,8 @@ and remove any JavaScript unless you are sure that the JavaScript is not written to maliciously leak access. -=== command-line arguments are leaked to other local users === +Command-line arguments are leaked to other local users +------------------------------------------------------ Remember that command-line arguments are visible to other users (through the 'ps' command, or the windows Process Explorer tool), so if you are @@ -83,7 +93,8 @@ be able to see (and copy) any caps that you pass as command-line arguments. This includes directory caps that you set up with the "tahoe add-alias" command. -==== how to manage it ==== +how to manage it +~~~~~~~~~~~~~~~~ As of Tahoe-LAFS v1.3.0 there is a "tahoe create-alias" command that does the following technique for you. @@ -91,7 +102,7 @@ the following technique for you. Bypass add-alias and edit the NODEDIR/private/aliases file directly, by adding a line like this: -fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa + fun: URI:DIR2:ovjy4yhylqlfoqg2vcze36dhde:4d4f47qko2xm5g7osgo2yyidi5m4muyo2vjjy53q4vjju2u55mfa By entering the dircap through the editor, the command-line arguments are bypassed, and other users will not be able to see them. Once you've @@ -102,7 +113,8 @@ arguments you type there, but not the caps that Tahoe-LAFS uses to permit access to your files and directories. 
-=== capabilities may be leaked to web browser phishing filter / "safe browsing" servers === +Capabilities may be leaked to web browser phishing filter / "safe browsing" servers +----------------------------------------------------------------------------------- Firefox, Internet Explorer, and Chrome include a "phishing filter" or "safe browing" component, which is turned on by default, and which sends @@ -134,7 +146,8 @@ Opera also has a similar facility that is disabled by default. A previous version of this file stated that Firefox had abandoned their phishing filter; this was incorrect. -==== how to manage it ==== +how to manage it +~~~~~~~~~~~~~~~~ If you use any phishing filter or "safe browsing" feature, consider either disabling it, or not using the WUI via that browser. Phishing filters have @@ -143,31 +156,47 @@ very limited effectiveness (see or malware attackers have learnt how to bypass them. To disable the filter in IE7 or IE8: - - Click Internet Options from the Tools menu. - - Click the Advanced tab. - - If an "Enable SmartScreen Filter" option is present, uncheck it. - If a "Use Phishing Filter" or "Phishing Filter" option is present, - set it to Disable. - - Confirm (click OK or Yes) out of all dialogs. +```````````````````````````````````` + +- Click Internet Options from the Tools menu. + +- Click the Advanced tab. + +- If an "Enable SmartScreen Filter" option is present, uncheck it. + If a "Use Phishing Filter" or "Phishing Filter" option is present, + set it to Disable. + +- Confirm (click OK or Yes) out of all dialogs. If you have a version of IE that splits the settings between security zones, do this for all zones. To disable the filter in Firefox: - - Click Options from the Tools menu. - - Click the Security tab. - - Uncheck both the "Block reported attack sites" and "Block reported - web forgeries" options. - - Click OK. +````````````````````````````````` + +- Click Options from the Tools menu. + +- Click the Security tab. 
+ +- Uncheck both the "Block reported attack sites" and "Block reported + web forgeries" options. + +- Click OK. To disable the filter in Chrome: - - Click Options from the Tools menu. - - Click the "Under the Hood" tab and find the "Privacy" section. - - Uncheck the "Enable phishing and malware protection" option. - - Click Close. +```````````````````````````````` + +- Click Options from the Tools menu. + +- Click the "Under the Hood" tab and find the "Privacy" section. + +- Uncheck the "Enable phishing and malware protection" option. + +- Click Close. -=== known issues in the FTP and SFTP frontends === +Known issues in the FTP and SFTP frontends +------------------------------------------ These are documented in docs/frontends/FTP-and-SFTP.txt and at . diff --git a/docs/logging.txt b/docs/logging.rst similarity index 79% rename from docs/logging.txt rename to docs/logging.rst index 641c9ca6..936e8467 100644 --- a/docs/logging.txt +++ b/docs/logging.rst @@ -1,17 +1,22 @@ -= Tahoe Logging = +============= +Tahoe Logging +============= -1. Overview -2. Realtime Logging -3. Incidents -4. Working with flogfiles -5. Gatherers - 5.1. Incident Gatherer - 5.2. Log Gatherer -6. Local twistd.log files -7. Adding log messages -8. Log Messages During Unit Tests +1. `Overview`_ +2. `Realtime Logging`_ +3. `Incidents`_ +4. `Working with flogfiles`_ +5. `Gatherers`_ -== Overview == + 1. `Incident Gatherer`_ + 2. `Log Gatherer`_ + +6. `Local twistd.log files`_ +7. `Adding log messages`_ +8. `Log Messages During Unit Tests`_ + +Overview +======== Tahoe uses the Foolscap logging mechanism (known as the "flog" subsystem) to record information about what is happening inside the Tahoe node. This is @@ -26,7 +31,8 @@ The foolscap distribution includes a utility named "flogtool" (usually at /usr/bin/flogtool) which is used to get access to many foolscap logging features. 
-== Realtime Logging == +Realtime Logging +================ When you are working on Tahoe code, and want to see what the node is doing, the easiest tool to use is "flogtool tail". This connects to the tahoe node @@ -37,7 +43,7 @@ to stdout, and optionally saved to a file. BASEDIR/private/logport.furl . The following command will connect to this port and start emitting log information: - flogtool tail BASEDIR/private/logport.furl + flogtool tail BASEDIR/private/logport.furl The "--save-to FILENAME" option will save all received events to a file, where then can be examined later with "flogtool dump" or "flogtool @@ -45,7 +51,8 @@ web-viewer". The --catch-up flag will ask the node to dump all stored events before subscribing to new ones (without --catch-up, you will only hear about events that occur after the tool has connected and subscribed). -== Incidents == +Incidents +========= Foolscap keeps a short list of recent events in memory. When something goes wrong, it writes all the history it has (and everything that gets logged in @@ -72,7 +79,8 @@ view provides more structure than the output of "flogtool dump": the parent/child relationships of log events is displayed in a nested format. "flogtool web-viewer" is still fairly immature. -== Working with flogfiles == +Working with flogfiles +====================== The "flogtool filter" command can be used to take a large flogfile (perhaps one created by the log-gatherer, see below) and copy a subset of events into @@ -85,7 +93,8 @@ retains events send by a specific tubid. --strip-facility removes events that were emitted with a given facility (like foolscap.negotiation or tahoe.upload). -== Gatherers == +Gatherers +========= In a deployed Tahoe grid, it is useful to get log information automatically transferred to a central log-gatherer host. This offloads the (admittedly @@ -101,7 +110,8 @@ gatherer will then use the logport to subscribe to hear about events. 
The gatherer will write to files in its working directory, which can then be examined with tools like "flogtool dump" as described above. -=== Incident Gatherer === +Incident Gatherer +----------------- The "incident gatherer" only collects Incidents: records of the log events that occurred just before and slightly after some high-level "trigger event" @@ -120,7 +130,7 @@ WORKDIR" command, and started with "tahoe start". The generated "gatherer.tac" file should be modified to add classifier functions. The incident gatherer writes incident names (which are simply the relative -pathname of the incident-*.flog.bz2 file) into classified/CATEGORY. For +pathname of the incident-\*.flog.bz2 file) into classified/CATEGORY. For example, the classified/mutable-retrieve-uncoordinated-write-error file contains a list of all incidents which were triggered by an uncoordinated write that was detected during mutable file retrieval (caused when somebody @@ -145,7 +155,8 @@ In our experience, each Incident takes about two seconds to transfer from the node which generated it to the gatherer. The gatherer will automatically catch up to any incidents which occurred while it is offline. -=== Log Gatherer === +Log Gatherer +------------ The "Log Gatherer" subscribes to hear about every single event published by the connected nodes, regardless of severity. This server writes these log @@ -172,7 +183,8 @@ the outbound TCP queue), publishing nodes will start dropping log events when the outbound queue grows too large. When this occurs, there will be gaps (non-sequential event numbers) in the log-gatherer's flogfiles. -== Local twistd.log files == +Local twistd.log files +====================== [TODO: not yet true, requires foolscap-0.3.1 and a change to allmydata.node] @@ -188,53 +200,55 @@ Only events at the log.OPERATIONAL level or higher are bridged to twistd.log (i.e. not the log.NOISY debugging events). 
In addition, foolscap internal events (like connection negotiation messages) are not bridged to twistd.log . -== Adding log messages == +Adding log messages +=================== When adding new code, the Tahoe developer should add a reasonable number of new log events. For details, please see the Foolscap logging documentation, but a few notes are worth stating here: - * use a facility prefix of "tahoe.", like "tahoe.mutable.publish" +* use a facility prefix of "tahoe.", like "tahoe.mutable.publish" - * assign each severe (log.WEIRD or higher) event a unique message - identifier, as the umid= argument to the log.msg() call. The - misc/coding_tools/make_umid script may be useful for this purpose. This will make it - easier to write a classification function for these messages. +* assign each severe (log.WEIRD or higher) event a unique message + identifier, as the umid= argument to the log.msg() call. The + misc/coding_tools/make_umid script may be useful for this purpose. This will make it + easier to write a classification function for these messages. - * use the parent= argument whenever the event is causally/temporally - clustered with its parent. For example, a download process that involves - three sequential hash fetches could announce the send and receipt of those - hash-fetch messages with a parent= argument that ties them to the overall - download process. However, each new wapi download request should be - unparented. +* use the parent= argument whenever the event is causally/temporally + clustered with its parent. For example, a download process that involves + three sequential hash fetches could announce the send and receipt of those + hash-fetch messages with a parent= argument that ties them to the overall + download process. However, each new wapi download request should be + unparented. - * use the format= argument in preference to the message= argument. E.g. 
- use log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k) instead of - log.msg("got %d shares, need %d" % (n,k)). This will allow later tools to - analyze the event without needing to scrape/reconstruct the structured - data out of the formatted string. +* use the format= argument in preference to the message= argument. E.g. + use log.msg(format="got %(n)d shares, need %(k)d", n=n, k=k) instead of + log.msg("got %d shares, need %d" % (n,k)). This will allow later tools to + analyze the event without needing to scrape/reconstruct the structured + data out of the formatted string. - * Pass extra information as extra keyword arguments, even if they aren't - included in the format= string. This information will be displayed in the - "flogtool dump --verbose" output, as well as being available to other - tools. The umid= argument should be passed this way. +* Pass extra information as extra keyword arguments, even if they aren't + included in the format= string. This information will be displayed in the + "flogtool dump --verbose" output, as well as being available to other + tools. The umid= argument should be passed this way. - * use log.err for the catch-all addErrback that gets attached to the end of - any given Deferred chain. When used in conjunction with LOGTOTWISTED=1, - log.err() will tell Twisted about the error-nature of the log message, - causing Trial to flunk the test (with an "ERROR" indication that prints a - copy of the Failure, including a traceback). Don't use log.err for events - that are BAD but handled (like hash failures: since these are often - deliberately provoked by test code, they should not cause test failures): - use log.msg(level=BAD) for those instead. +* use log.err for the catch-all addErrback that gets attached to the end of + any given Deferred chain. 
When used in conjunction with LOGTOTWISTED=1, + log.err() will tell Twisted about the error-nature of the log message, + causing Trial to flunk the test (with an "ERROR" indication that prints a + copy of the Failure, including a traceback). Don't use log.err for events + that are BAD but handled (like hash failures: since these are often + deliberately provoked by test code, they should not cause test failures): + use log.msg(level=BAD) for those instead. -== Log Messages During Unit Tests == +Log Messages During Unit Tests +============================== If a test is failing and you aren't sure why, start by enabling FLOGTOTWISTED=1 like this: - make test FLOGTOTWISTED=1 + make test FLOGTOTWISTED=1 With FLOGTOTWISTED=1, sufficiently-important log events will be written into _trial_temp/test.log, which may give you more ideas about why the test is @@ -246,7 +260,7 @@ below the level=OPERATIONAL threshold, due to this issue: If that isn't enough, look at the detailed foolscap logging messages instead, by running the tests like this: - make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1 + make test FLOGFILE=flog.out.bz2 FLOGLEVEL=1 FLOGTOTWISTED=1 The first environment variable will cause foolscap log events to be written to ./flog.out.bz2 (instead of merely being recorded in the circular buffers diff --git a/docs/performance.rst b/docs/performance.rst new file mode 100644 index 00000000..4165b776 --- /dev/null +++ b/docs/performance.rst @@ -0,0 +1,162 @@ +============================================ +Performance costs for some common operations +============================================ + +1. `Publishing an A-byte immutable file`_ +2. `Publishing an A-byte mutable file`_ +3. `Downloading B bytes of an A-byte immutable file`_ +4. `Downloading B bytes of an A-byte mutable file`_ +5. `Modifying B bytes of an A-byte mutable file`_ +6. `Inserting/Removing B bytes in an A-byte mutable file`_ +7. `Adding an entry to an A-entry directory`_ +8. 
`Listing an A entry directory`_ +9. `Performing a file-check on an A-byte file`_ +10. `Performing a file-verify on an A-byte file`_ +11. `Repairing an A-byte file (mutable or immutable)`_ + +Publishing an ``A``-byte immutable file +======================================= + +network: A + +memory footprint: N/k*128KiB + +notes: An immutable file upload requires an additional I/O pass over the entire +source file before the upload process can start, since convergent +encryption derives the encryption key in part from the contents of the +source file. + +Publishing an ``A``-byte mutable file +===================================== + +network: A + +memory footprint: N/k*A + +cpu: O(A) + a large constant for RSA keypair generation + +notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that it +publishes to a grid. This takes up to 1 or 2 seconds on a typical desktop PC. + +Part of the process of encrypting, encoding, and uploading a mutable file to a +Tahoe-LAFS grid requires that the entire file be in memory at once. For larger +files, this may cause Tahoe-LAFS to have an unacceptably large memory footprint +(at least when uploading a mutable file). + +Downloading ``B`` bytes of an ``A``-byte immutable file +======================================================= + +network: B + +memory footprint: 128KiB + +notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary range +of an immutable file, only the 128-KiB segments that overlap the +requested range will be downloaded. + +(Earlier versions would download from the beginning of the file up +until the end of the requested range, and then continue to download +the rest of the file even after the request was satisfied.) + +Downloading ``B`` bytes of an ``A``-byte mutable file +===================================================== + +network: A + +memory footprint: A + +notes: As currently implemented, mutable files must be downloaded in +their entirety before any part of them can be read. 
We are +exploring fixes for this; see ticket #393 for more information. + +Modifying ``B`` bytes of an ``A``-byte mutable file +=================================================== + +network: A + +memory footprint: N/k*A + +notes: If you upload a changed version of a mutable file that you +earlier put onto your grid with, say, 'tahoe put --mutable', +Tahoe-LAFS will replace the old file with the new file on the +grid, rather than attempting to modify only those portions of the +file that have changed. Modifying a file in this manner is +essentially uploading the file over again, except that it re-uses +the existing RSA keypair instead of generating a new one. + +Inserting/Removing ``B`` bytes in an ``A``-byte mutable file +============================================================ + +network: A + +memory footprint: N/k*A + +notes: Modifying any part of a mutable file in Tahoe-LAFS requires that +the entire file be downloaded, modified, held in memory while it is +encrypted and encoded, and then re-uploaded. A future version of the +mutable file layout ("LDMF") may provide efficient inserts and +deletes. Note that this sort of modification is mostly used internally +for directories, and isn't something that the WUI, CLI, or other +interfaces will do -- instead, they will simply overwrite the file to +be modified, as described in "Modifying B bytes of an A-byte mutable +file". + +Adding an entry to an ``A``-entry directory +=========================================== + +network: O(A) + +memory footprint: N/k*A + +notes: In Tahoe-LAFS, directories are implemented as specialized mutable +files. So adding an entry to a directory is essentially adding B +(actually, 300-330) bytes somewhere in an existing mutable file. + +Listing an ``A`` entry directory +================================ + +network: O(A) + +memory footprint: N/k*A + +notes: Listing a directory requires that the mutable file storing the +directory be downloaded from the grid. 
So listing an A entry +directory requires downloading a (roughly) 330 * A byte mutable +file, since each directory entry is about 300-330 bytes in size. + +Performing a file-check on an ``A``-byte file +============================================= + +network: O(S), where S is the number of servers on your grid + +memory footprint: negligible + +notes: To check a file, Tahoe-LAFS queries all the servers that it knows +about. Note that neither of these values directly depend on the size +of the file. This is relatively inexpensive, compared to the verify +and repair operations. + +Performing a file-verify on an ``A``-byte file +============================================== + +network: N/k*A + +memory footprint: N/k*128KiB + +notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext +shares that were originally uploaded to the grid and integrity +checks them. This is, for well-behaved grids, likely to be more +expensive than downloading an A-byte file, since only a fraction +of these shares are necessary to recover the file. + +Repairing an ``A``-byte file (mutable or immutable) +=================================================== + +network: variable; up to around O(A) + +memory footprint: from 128KiB to (1+N/k)*128KiB + +notes: To repair a file, Tahoe-LAFS downloads the file, and generates/uploads +missing shares in the same way as when it initially uploads the file. +So, depending on how many shares are missing, this can be about as +expensive as initially uploading the file in the first place. diff --git a/docs/performance.txt b/docs/performance.txt deleted file mode 100644 index ba9d5055..00000000 --- a/docs/performance.txt +++ /dev/null @@ -1,139 +0,0 @@ -= Performance costs for some common operations = - -1. Publishing an A-byte immutable file -2. Publishing an A-byte mutable file -3. Downloading B bytes of an A-byte immutable file -4. Downloading B bytes of an A-byte mutable file -5. Modifying B bytes of an A-byte mutable file -6. 
Inserting/Removing B bytes in an A-byte mutable file -7. Adding an entry to an A-entry directory -8. Listing an A entry directory -9. Performing a file-check on an A-byte file -10. Performing a file-verify on an A-byte file -11. Repairing an A-byte file (mutable or immutable) - -== Publishing an A-byte immutable file == - -network: A -memory footprint: N/k*128KiB - -notes: An immutable file upload requires an additional I/O pass over the entire - source file before the upload process can start, since convergent - encryption derives the encryption key in part from the contents of the - source file. - -== Publishing an A-byte mutable file == - -network: A -memory footprint: N/k*A -cpu: O(A) + a large constant for RSA keypair generation - -notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that - it publishes to a grid. This takes up to 1 or 2 seconds on a - typical desktop PC. - - Part of the process of encrypting, encoding, and uploading a - mutable file to a Tahoe-LAFS grid requires that the entire file - be in memory at once. For larger files, this may cause - Tahoe-LAFS to have an unacceptably large memory footprint (at - least when uploading a mutable file). - -== Downloading B bytes of an A-byte immutable file == - -network: B -memory footprint: 128KiB - -notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary range - of an immutable file, only the 128-KiB segments that overlap the - requested range will be downloaded. - - (Earlier versions would download from the beginning of the file up - until the end of the requested range, and then continue to download - the rest of the file even after the request was satisfied.) - -== Downloading B bytes of an A-byte mutable file == - -network: A -memory footprint: A - -notes: As currently implemented, mutable files must be downloaded in - their entirety before any part of them can be read. We are - exploring fixes for this; see ticket #393 for more information. 
- -== Modifying B bytes of an A-byte mutable file == - -network: A -memory footprint: N/k*A - -notes: If you upload a changed version of a mutable file that you - earlier put onto your grid with, say, 'tahoe put --mutable', - Tahoe-LAFS will replace the old file with the new file on the - grid, rather than attempting to modify only those portions of the - file that have changed. Modifying a file in this manner is - essentially uploading the file over again, except that it re-uses - the existing RSA keypair instead of generating a new one. - -== Inserting/Removing B bytes in an A-byte mutable file == - -network: A -memory footprint: N/k*A - -notes: Modifying any part of a mutable file in Tahoe-LAFS requires that - the entire file be downloaded, modified, held in memory while it is - encrypted and encoded, and then re-uploaded. A future version of the - mutable file layout ("LDMF") may provide efficient inserts and - deletes. Note that this sort of modification is mostly used internally - for directories, and isn't something that the WUI, CLI, or other - interfaces will do -- instead, they will simply overwrite the file to - be modified, as described in "Modifying B bytes of an A-byte mutable - file". - -== Adding an entry to an A-entry directory == - -network: O(A) -memory footprint: N/k*A - -notes: In Tahoe-LAFS, directories are implemented as specialized mutable - files. So adding an entry to a directory is essentially adding B - (actually, 300-330) bytes somewhere in an existing mutable file. - -== Listing an A entry directory == - -network: O(A) -memory footprint: N/k*A - -notes: Listing a directory requires that the mutable file storing the - directory be downloaded from the grid. So listing an A entry - directory requires downloading a (roughly) 330 * A byte mutable - file, since each directory entry is about 300-330 bytes in size. 
- -== Performing a file-check on an A-byte file == - -network: O(S), where S is the number of servers on your grid -memory footprint: negligible - -notes: To check a file, Tahoe-LAFS queries all the servers that it knows - about. Note that neither of these values directly depend on the size - of the file. This is relatively inexpensive, compared to the verify - and repair operations. - -== Performing a file-verify on an A-byte file == - -network: N/k*A -memory footprint: N/k*128KiB - -notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext - shares that were originally uploaded to the grid and integrity - checks them. This is, for well-behaved grids, likely to be more - expensive than downloading an A-byte file, since only a fraction - of these shares are necessary to recover the file. - -== Repairing an A-byte file (mutable or immutable) == - -network: variable; up to around O(A) -memory footprint: from 128KiB to (1+N/k)*128KiB - -notes: To repair a file, Tahoe-LAFS downloads the file, and generates/uploads - missing shares in the same way as when it initially uploads the file. - So, depending on how many shares are missing, this can be about as - expensive as initially uploading the file in the first place. diff --git a/docs/stats.rst b/docs/stats.rst new file mode 100644 index 00000000..47f6b04b --- /dev/null +++ b/docs/stats.rst @@ -0,0 +1,337 @@ +================ +Tahoe Statistics +================ + +1. `Overview`_ +2. `Statistics Categories`_ +3. `Running a Tahoe Stats-Gatherer Service`_ +4. `Using Munin To Graph Stats Values`_ + +Overview +======== + +Each Tahoe node collects and publishes statistics about its operations as it +runs. These include counters of how many files have been uploaded and +downloaded, CPU usage information, performance numbers like latency of +storage server operations, and available disk space. + +The easiest way to see the stats for any given node is to use the web interface.
+From the main "Welcome Page", follow the "Operational Statistics" link inside +the small "This Client" box. If the welcome page lives at +http://localhost:3456/, then the statistics page will live at +http://localhost:3456/statistics . This presents a summary of the stats +block, along with a copy of the raw counters. To obtain just the raw counters +(in JSON format), use /statistics?t=json instead. + +Statistics Categories +===================== + +The stats dictionary contains two keys: 'counters' and 'stats'. 'counters' +are strictly counters: they are reset to zero when the node is started, and +grow upwards. 'stats' are non-incrementing values, used to measure the +current state of various systems. Some stats are actually booleans, expressed +as '1' for true and '0' for false (internal restrictions require all stats +values to be numbers). + +Under both the 'counters' and 'stats' dictionaries, each individual stat has +a key with a dot-separated name, breaking them up into groups like +'cpu_monitor' and 'storage_server'. + +The currently available stats (as of release 1.6.0 or so) are described here: + +**counters.storage_server.\*** + + this group counts inbound storage-server operations. They are not provided + by client-only nodes which have been configured to not run a storage server + (with [storage]enabled=false in tahoe.cfg) + + allocate, write, close, abort + these are for immutable file uploads. 'allocate' is incremented when a + client asks if it can upload a share to the server. 'write' is + incremented for each chunk of data written. 'close' is incremented when + the share is finished. 'abort' is incremented if the client abandons + the upload. + + get, read + these are for immutable file downloads. 'get' is incremented + when a client asks if the server has a specific share. 'read' is + incremented for each chunk of data read. + + readv, writev + these are for mutable file creation, publish, and retrieve.
'readv' + is incremented each time a client reads part of a mutable share. + 'writev' is incremented each time a client sends a modification + request. + + add-lease, renew, cancel + these are for share lease modifications. 'add-lease' is incremented + when an 'add-lease' operation is performed (which either adds a new + lease or renews an existing lease). 'renew' is for the 'renew-lease' + operation (which can only be used to renew an existing one). 'cancel' + is used for the 'cancel-lease' operation. + + bytes_freed + this counts how many bytes were freed when a 'cancel-lease' + operation removed the last lease from a share and the share + was thus deleted. + + bytes_added + this counts how many bytes were consumed by immutable share + uploads. It is incremented at the same time as the 'close' + counter. + +**stats.storage_server.\*** + + allocated + this counts how many bytes are currently 'allocated', which + tracks the space that will eventually be consumed by immutable + share upload operations. The stat is increased as soon as the + upload begins (at the same time the 'allocate' counter is + incremented), and goes back to zero when the 'close' or 'abort' + message is received (at which point the 'disk_used' stat should + be incremented by the same amount). + + disk_total, disk_used, disk_free_for_root, disk_free_for_nonroot, disk_avail, reserved_space + these all reflect disk-space usage policies and status. + 'disk_total' is the total size of disk where the storage + server's BASEDIR/storage/shares directory lives, as reported + by /bin/df or equivalent. 'disk_used', 'disk_free_for_root', + and 'disk_free_for_nonroot' show related information. + 'reserved_space' reports the reservation configured by the + tahoe.cfg [storage]reserved_space value. 'disk_avail' + reports the remaining disk space available for the Tahoe + server after subtracting reserved_space from disk_free_for_nonroot. All + values are in bytes.
+ + accepting_immutable_shares + this is '1' if the storage server is currently accepting uploads of + immutable shares. It may be '0' if a server is disabled by + configuration, or if the disk is full (i.e. disk_avail is less than + reserved_space). + + total_bucket_count + this counts the number of 'buckets' (i.e. unique + storage-index values) currently managed by the storage + server. It indicates roughly how many files are managed + by the server. + + latencies.*.* + these stats keep track of local disk latencies for + storage-server operations. A number of percentile values are + tracked for many operations. For example, + 'storage_server.latencies.readv.50_0_percentile' records the + median response time for a 'readv' request. All values are in + seconds. These are recorded by the storage server, starting + from the time the request arrives (post-deserialization) and + ending when the response begins serialization. As such, they + are mostly useful for measuring disk speeds. The operations + tracked are the same as the counters.storage_server.* counter + values (allocate, write, close, get, read, add-lease, renew, + cancel, readv, writev). The percentile values tracked are: + mean, 01_0_percentile, 10_0_percentile, 50_0_percentile, + 90_0_percentile, 95_0_percentile, 99_0_percentile, + 99_9_percentile. (the last value, 99.9 percentile, means that + 999 out of the last 1000 operations were faster than the + given number, and is the same threshold used by Amazon's + internal SLA, according to the Dynamo paper). + +**counters.uploader.files_uploaded** + +**counters.uploader.bytes_uploaded** + +**counters.downloader.files_downloaded** + +**counters.downloader.bytes_downloaded** + + These count client activity: a Tahoe client will increment these when it + uploads or downloads an immutable file. 'files_uploaded' is incremented by + one for each operation, while 'bytes_uploaded' is incremented by the size of + the file. 
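The dot-separated naming described above can be explored with a few lines of Python. The following is a minimal sketch, not part of Tahoe-LAFS: the helper names ``by_group`` and ``average_upload_size`` are hypothetical, the sample values are invented, and it assumes the JSON document has the two top-level keys 'counters' and 'stats' with flat dotted names beneath them, as the text states. It is written for Python 3.

```python
import json

# Hypothetical sample shaped like the /statistics?t=json output described
# above; every number here is invented for illustration.
SAMPLE_JSON = """
{
  "counters": {
    "uploader.files_uploaded": 3,
    "uploader.bytes_uploaded": 4096,
    "storage_server.allocate": 12
  },
  "stats": {
    "storage_server.latencies.readv.50_0_percentile": 0.002,
    "storage_server.accepting_immutable_shares": 1,
    "node.uptime": 86400
  }
}
"""


def by_group(flat):
    """Split a flat {dotted_name: value} dict into {group: {rest: value}}."""
    grouped = {}
    for name, value in flat.items():
        # The group is everything before the first dot.
        group, _, rest = name.partition(".")
        grouped.setdefault(group, {})[rest] = value
    return grouped


def average_upload_size(counters):
    """Mean bytes per uploaded immutable file, or None if none were uploaded."""
    files = counters.get("uploader.files_uploaded", 0)
    if not files:
        return None
    return counters["uploader.bytes_uploaded"] / files


data = json.loads(SAMPLE_JSON)
stats_groups = by_group(data["stats"])

# Median local latency of 'readv' requests, in seconds.
median_readv = stats_groups["storage_server"]["latencies.readv.50_0_percentile"]

# Boolean stats are published as 1 or 0, so convert explicitly.
accepting = bool(data["stats"]["storage_server.accepting_immutable_shares"])
```

Against a live node, the same functions could be fed the parsed body of the /statistics?t=json page instead of ``SAMPLE_JSON``. Client-only nodes will lack the 'storage_server' group, so real code should not assume it is present.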
+ +**counters.mutable.files_published** + +**counters.mutable.bytes_published** + +**counters.mutable.files_retrieved** + +**counters.mutable.bytes_retrieved** + + These count client activity for mutable files. 'published' is the act of + changing an existing mutable file (or creating a brand-new mutable file). + 'retrieved' is the act of reading its current contents. + +**counters.chk_upload_helper.\*** + + These count activity of the "Helper", which receives ciphertext from clients + and performs erasure-coding and share upload for files that are not already + in the grid. The code which implements these counters is in + src/allmydata/immutable/offloaded.py . + + upload_requests + incremented each time a client asks to upload a file + + upload_already_present + incremented when the file is already in the grid + + upload_need_upload + incremented when the file is not already in the grid + + resumes + incremented when the helper already has partial ciphertext for + the requested upload, indicating that the client is resuming an + earlier upload + + fetched_bytes + this counts how many bytes of ciphertext have been fetched + from uploading clients + + encoded_bytes + this counts how many bytes of ciphertext have been + encoded and turned into successfully-uploaded shares. If no + uploads have failed or been abandoned, encoded_bytes should + eventually equal fetched_bytes. + +**stats.chk_upload_helper.\*** + + These also track Helper activity: + + active_uploads + how many files are currently being uploaded. 0 when idle.
+ + incoming_count + how many cache files are present in the incoming/ directory, + which holds ciphertext files that are still being fetched + from the client + + incoming_size + total size of cache files in the incoming/ directory + + incoming_size_old + total size of 'old' cache files (more than 48 hours) + + encoding_count + how many cache files are present in the encoding/ directory, + which holds ciphertext files that are being encoded and + uploaded + + encoding_size + total size of cache files in the encoding/ directory + + encoding_size_old + total size of 'old' cache files (more than 48 hours) + +**stats.node.uptime** + how many seconds since the node process was started + +**stats.cpu_monitor.\*** + + 1min_avg, 5min_avg, 15min_avg + estimate of what percentage of system CPU time was consumed by the + node process, over the given time interval. Expressed as a float, 0.0 + for 0%, 1.0 for 100% + + total + estimate of total number of CPU seconds consumed by node since + the process was started. Ticket #472 indicates that .total may + sometimes be negative due to wraparound of the kernel's counter. + +**stats.load_monitor.\*** + + When enabled, the "load monitor" continually schedules a one-second + callback, and measures how late the response is. This estimates system load + (if the system is idle, the response should be on time). This is only + enabled if a stats-gatherer is configured. + + avg_load + average "load" value (seconds late) over the last minute + + max_load + maximum "load" value over the last minute + + +Running a Tahoe Stats-Gatherer Service +====================================== + +The "stats-gatherer" is a simple daemon that periodically collects stats from +several tahoe nodes. It could be useful, e.g., in a production environment, +where you want to monitor dozens of storage servers from a central management +host. It merely gathers statistics from many nodes into a single place: it +does not do any actual analysis.
+ +The stats gatherer listens on a network port using the same Foolscap_ +connection library that Tahoe clients use to connect to storage servers. +Tahoe nodes can be configured to connect to the stats gatherer and publish +their stats on a periodic basis. (In fact, what happens is that nodes connect +to the gatherer and offer it a second FURL which points back to the node's +"stats port", which the gatherer then uses to pull stats on a periodic basis. +The initial connection is flipped to allow the nodes to live behind NAT +boxes, as long as the stats-gatherer has a reachable IP address.) + +.. _Foolscap: http://foolscap.lothar.com/trac + +The stats-gatherer is created in the same fashion as regular tahoe client +nodes and introducer nodes. Choose a base directory for the gatherer to live +in (but do not create the directory). Then run: + +:: + + tahoe create-stats-gatherer $BASEDIR + +and start it with "tahoe start $BASEDIR". Once running, the gatherer will +write a FURL into $BASEDIR/stats_gatherer.furl . + +To configure a Tahoe client/server node to contact the stats gatherer, copy +this FURL into the node's tahoe.cfg file, in a section named "[client]", +under a key named "stats_gatherer.furl", like so: + +:: + + [client] + stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y + +or simply copy the stats_gatherer.furl file into the node's base directory +(next to the tahoe.cfg file): it will be interpreted in the same way. + +The first time it is started, the gatherer will listen on a random unused TCP +port, so it should not conflict with anything else that you have running on +that host at that time. On subsequent runs, it will re-use the same port (to +keep its FURL consistent). To explicitly control which port it uses, write +the desired portnumber into a file named "portnum" (i.e. $BASEDIR/portnum), +and the next time the gatherer is started, it will start listening on the +given port. 
The portnum file is actually a "strports specification string", +as described in docs/configuration.rst . + +Once running, the stats gatherer will create a standard Python "pickle" file +in $BASEDIR/stats.pickle . Once a minute, the gatherer will pull stats +information from every connected node and write it into the pickle. The +pickle will contain a dictionary, in which node identifiers (known as "tubid" +strings) are the keys, and the values are a dict with 'timestamp', +'nickname', and 'stats' keys. d[tubid]['stats'] will contain the stats +dictionary as made available at http://localhost:3456/statistics?t=json . The +pickle file will only contain the most recent update from each node. + +Other tools can be built to examine these stats and render them into +something useful. For example, a tool could sum the +'storage_server.disk_avail' values from all servers to compute a +total-disk-available number for the entire grid (however, the "disk watcher" +daemon, in misc/operations_helpers/spacetime/, is better suited for this specific task). + +Using Munin To Graph Stats Values +================================= + +The misc/munin/ directory contains various plugins to graph stats for Tahoe +nodes. They are intended for use with the Munin_ system-management tool, which +typically polls target systems every 5 minutes and produces a web page with +graphs of various things over multiple time scales (last hour, last month, +last year). + +.. _Munin: http://munin-monitoring.org/ + +Most of the plugins are designed to pull stats from a single Tahoe node, and +are configured with a URL such as http://localhost:3456/statistics?t=json . The +"tahoe_stats" plugin is designed to read from the pickle file created by the +stats-gatherer. Some plugins are to be used with the disk watcher, and a few +(like tahoe_nodememory) are designed to watch the node processes directly +(and must therefore run on the same host as the target node).
+ +Please see the docstrings at the beginning of each plugin for details, and +the "tahoe-conf" file for notes about configuration and installing these +plugins into a Munin environment. diff --git a/docs/stats.txt b/docs/stats.txt deleted file mode 100644 index 8cb8dfb4..00000000 --- a/docs/stats.txt +++ /dev/null @@ -1,276 +0,0 @@ -= Tahoe Statistics = - -1. Overview -2. Statistics Categories -3. Running a Tahoe Stats-Gatherer Service -4. Using Munin To Graph Stats Values - -== Overview == - -Each Tahoe node collects and publishes statistics about its operations as it -runs. These include counters of how many files have been uploaded and -downloaded, CPU usage information, performance numbers like latency of -storage server operations, and available disk space. - -The easiest way to see the stats for any given node is use the web interface. -From the main "Welcome Page", follow the "Operational Statistics" link inside -the small "This Client" box. If the welcome page lives at -http://localhost:3456/, then the statistics page will live at -http://localhost:3456/statistics . This presents a summary of the stats -block, along with a copy of the raw counters. To obtain just the raw counters -(in JSON format), use /statistics?t=json instead. - -== Statistics Categories == - -The stats dictionary contains two keys: 'counters' and 'stats'. 'counters' -are strictly counters: they are reset to zero when the node is started, and -grow upwards. 'stats' are non-incrementing values, used to measure the -current state of various systems. Some stats are actually booleans, expressed -as '1' for true and '0' for false (internal restrictions require all stats -values to be numbers). - -Under both the 'counters' and 'stats' dictionaries, each individual stat has -a key with a dot-separated name, breaking them up into groups like -'cpu_monitor' and 'storage_server'. 
- -The currently available stats (as of release 1.6.0 or so) are described here: - -counters.storage_server.*: this group counts inbound storage-server - operations. They are not provided by client-only - nodes which have been configured to not run a - storage server (with [storage]enabled=false in - tahoe.cfg) - allocate, write, close, abort: these are for immutable file uploads. - 'allocate' is incremented when a client asks - if it can upload a share to the server. - 'write' is incremented for each chunk of - data written. 'close' is incremented when - the share is finished. 'abort' is - incremented if the client abandons the - uploaed. - get, read: these are for immutable file downloads. 'get' is incremented - when a client asks if the server has a specific share. 'read' is - incremented for each chunk of data read. - readv, writev: these are for immutable file creation, publish, and - retrieve. 'readv' is incremented each time a client reads - part of a mutable share. 'writev' is incremented each time a - client sends a modification request. - add-lease, renew, cancel: these are for share lease modifications. - 'add-lease' is incremented when an 'add-lease' - operation is performed (which either adds a new - lease or renews an existing lease). 'renew' is - for the 'renew-lease' operation (which can only - be used to renew an existing one). 'cancel' is - used for the 'cancel-lease' operation. - bytes_freed: this counts how many bytes were freed when a 'cancel-lease' - operation removed the last lease from a share and the share - was thus deleted. - bytes_added: this counts how many bytes were consumed by immutable share - uploads. It is incremented at the same time as the 'close' - counter. - -stats.storage_server.*: - allocated: this counts how many bytes are currently 'allocated', which - tracks the space that will eventually be consumed by immutable - share upload operations. 
The stat is increased as soon as the - upload begins (at the same time the 'allocated' counter is - incremented), and goes back to zero when the 'close' or 'abort' - message is received (at which point the 'disk_used' stat should - incremented by the same amount). - disk_total - disk_used - disk_free_for_root - disk_free_for_nonroot - disk_avail - reserved_space: these all reflect disk-space usage policies and status. - 'disk_total' is the total size of disk where the storage - server's BASEDIR/storage/shares directory lives, as reported - by /bin/df or equivalent. 'disk_used', 'disk_free_for_root', - and 'disk_free_for_nonroot' show related information. - 'reserved_space' reports the reservation configured by the - tahoe.cfg [storage]reserved_space value. 'disk_avail' - reports the remaining disk space available for the Tahoe - server after subtracting reserved_space from disk_avail. All - values are in bytes. - accepting_immutable_shares: this is '1' if the storage server is currently - accepting uploads of immutable shares. It may be - '0' if a server is disabled by configuration, or - if the disk is full (i.e. disk_avail is less - than reserved_space). - total_bucket_count: this counts the number of 'buckets' (i.e. unique - storage-index values) currently managed by the storage - server. It indicates roughly how many files are managed - by the server. - latencies.*.*: these stats keep track of local disk latencies for - storage-server operations. A number of percentile values are - tracked for many operations. For example, - 'storage_server.latencies.readv.50_0_percentile' records the - median response time for a 'readv' request. All values are in - seconds. These are recorded by the storage server, starting - from the time the request arrives (post-deserialization) and - ending when the response begins serialization. As such, they - are mostly useful for measuring disk speeds. 
The operations - tracked are the same as the counters.storage_server.* counter - values (allocate, write, close, get, read, add-lease, renew, - cancel, readv, writev). The percentile values tracked are: - mean, 01_0_percentile, 10_0_percentile, 50_0_percentile, - 90_0_percentile, 95_0_percentile, 99_0_percentile, - 99_9_percentile. (the last value, 99.9 percentile, means that - 999 out of the last 1000 operations were faster than the - given number, and is the same threshold used by Amazon's - internal SLA, according to the Dynamo paper). - -counters.uploader.files_uploaded -counters.uploader.bytes_uploaded -counters.downloader.files_downloaded -counters.downloader.bytes_downloaded - - These count client activity: a Tahoe client will increment these when it - uploads or downloads an immutable file. 'files_uploaded' is incremented by - one for each operation, while 'bytes_uploaded' is incremented by the size of - the file. - -counters.mutable.files_published -counters.mutable.bytes_published -counters.mutable.files_retrieved -counters.mutable.bytes_retrieved - - These count client activity for mutable files. 'published' is the act of - changing an existing mutable file (or creating a brand-new mutable file). - 'retrieved' is the act of reading its current contents. - -counters.chk_upload_helper.* - - These count activity of the "Helper", which receives ciphertext from clients - and performs erasure-coding and share upload for files that are not already - in the grid. The code which implements these counters is in - src/allmydata/immutable/offloaded.py . 
- - upload_requests: incremented each time a client asks to upload a file - upload_already_present: incremented when the file is already in the grid - upload_need_upload: incremented when the file is not already in the grid - resumes: incremented when the helper already has partial ciphertext for - the requested upload, indicating that the client is resuming an - earlier upload - fetched_bytes: this counts how many bytes of ciphertext have been fetched - from uploading clients - encoded_bytes: this counts how many bytes of ciphertext have been - encoded and turned into successfully-uploaded shares. If no - uploads have failed or been abandoned, encoded_bytes should - eventually equal fetched_bytes. - -stats.chk_upload_helper.* - - These also track Helper activity: - - active_uploads: how many files are currently being uploaded. 0 when idle. - incoming_count: how many cache files are present in the incoming/ directory, - which holds ciphertext files that are still being fetched - from the client - incoming_size: total size of cache files in the incoming/ directory - incoming_size_old: total size of 'old' cache files (more than 48 hours) - encoding_count: how many cache files are present in the encoding/ directory, - which holds ciphertext files that are being encoded and - uploaded - encoding_size: total size of cache files in the encoding/ directory - encoding_size_old: total size of 'old' cache files (more than 48 hours) - -stats.node.uptime: how many seconds since the node process was started - -stats.cpu_monitor.*: - .1min_avg, 5min_avg, 15min_avg: estimate of what percentage of system CPU - time was consumed by the node process, over - the given time interval. Expressed as a - float, 0.0 for 0%, 1.0 for 100% - .total: estimate of total number of CPU seconds consumed by node since - the process was started. Ticket #472 indicates that .total may - sometimes be negative due to wraparound of the kernel's counter. 
- -stats.load_monitor.*: - When enabled, the "load monitor" continually schedules a one-second - callback, and measures how late the response is. This estimates system load - (if the system is idle, the response should be on time). This is only - enabled if a stats-gatherer is configured. - - .avg_load: average "load" value (seconds late) over the last minute - .max_load: maximum "load" value over the last minute - - -== Running a Tahoe Stats-Gatherer Service == - -The "stats-gatherer" is a simple daemon that periodically collects stats from -several tahoe nodes. It could be useful, e.g., in a production environment, -where you want to monitor dozens of storage servers from a central management -host. It merely gatherers statistics from many nodes into a single place: it -does not do any actual analysis. - -The stats gatherer listens on a network port using the same Foolscap -connection library that Tahoe clients use to connect to storage servers. -Tahoe nodes can be configured to connect to the stats gatherer and publish -their stats on a periodic basis. (in fact, what happens is that nodes connect -to the gatherer and offer it a second FURL which points back to the node's -"stats port", which the gatherer then uses to pull stats on a periodic basis. -The initial connection is flipped to allow the nodes to live behind NAT -boxes, as long as the stats-gatherer has a reachable IP address) - -The stats-gatherer is created in the same fashion as regular tahoe client -nodes and introducer nodes. Choose a base directory for the gatherer to live -in (but do not create the directory). Then run: - - tahoe create-stats-gatherer $BASEDIR - -and start it with "tahoe start $BASEDIR". Once running, the gatherer will -write a FURL into $BASEDIR/stats_gatherer.furl . 
- -To configure a Tahoe client/server node to contact the stats gatherer, copy -this FURL into the node's tahoe.cfg file, in a section named "[client]", -under a key named "stats_gatherer.furl", like so: - - [client] - stats_gatherer.furl = pb://qbo4ktl667zmtiuou6lwbjryli2brv6t@192.168.0.8:49997/wxycb4kaexzskubjnauxeoptympyf45y - -or simply copy the stats_gatherer.furl file into the node's base directory -(next to the tahoe.cfg file): it will be interpreted in the same way. - -The first time it is started, the gatherer will listen on a random unused TCP -port, so it should not conflict with anything else that you have running on -that host at that time. On subsequent runs, it will re-use the same port (to -keep its FURL consistent). To explicitly control which port it uses, write -the desired portnumber into a file named "portnum" (i.e. $BASEDIR/portnum), -and the next time the gatherer is started, it will start listening on the -given port. The portnum file is actually a "strports specification string", -as described in docs/configuration.txt . - -Once running, the stats gatherer will create a standard python "pickle" file -in $BASEDIR/stats.pickle . Once a minute, the gatherer will pull stats -information from every connected node and write them into the pickle. The -pickle will contain a dictionary, in which node identifiers (known as "tubid" -strings) are the keys, and the values are a dict with 'timestamp', -'nickname', and 'stats' keys. d[tubid][stats] will contain the stats -dictionary as made available at http://localhost:3456/statistics?t=json . The -pickle file will only contain the most recent update from each node. - -Other tools can be built to examine these stats and render them into -something useful. 
For example, a tool could sum the -"storage_server.disk_avail' values from all servers to compute a -total-disk-available number for the entire grid (however, the "disk watcher" -daemon, in misc/operations_helpers/spacetime/, is better suited for this specific task). - -== Using Munin To Graph Stats Values == - -The misc/munin/ directory contains various plugins to graph stats for Tahoe -nodes. They are intended for use with the Munin system-management tool, which -typically polls target systems every 5 minutes and produces a web page with -graphs of various things over multiple time scales (last hour, last month, -last year). - -Most of the plugins are designed to pull stats from a single Tahoe node, and -are configured with the e.g. http://localhost:3456/statistics?t=json URL. The -"tahoe_stats" plugin is designed to read from the pickle file created by the -stats-gatherer. Some plugins are to be used with the disk watcher, and a few -(like tahoe_nodememory) are designed to watch the node processes directly -(and must therefore run on the same host as the target node). - -Please see the docstrings at the beginning of each plugin for details, and -the "tahoe-conf" file for notes about configuration and installing these -plugins into a Munin environment. -- 2.37.2
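
Note on the stats.pickle layout described in docs/stats.rst: the "sum disk_avail across the grid" idea mentioned there can be sketched in a few lines of Python. This is only an illustration and is not part of this patch or of Tahoe-LAFS. The function names are hypothetical. It assumes that each node's 'stats' entry has the same shape as /statistics?t=json, i.e. a dict with 'counters' and 'stats' keys whose members use dot-separated names.

```python
import pickle


def load_gathered(path):
    """Load the gatherer's pickle file.

    Unpickling can execute arbitrary code, so only load files you trust.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def total_disk_avail(gathered):
    """Sum 'storage_server.disk_avail' (bytes) over all nodes.

    `gathered` maps tubid -> {'timestamp': ..., 'nickname': ..., 'stats': ...}.
    The inner 'stats' value is ASSUMED to look like /statistics?t=json:
    {'counters': {...}, 'stats': {...}}.
    """
    total = 0
    for entry in gathered.values():
        node_stats = entry.get("stats", {}).get("stats", {})
        # The key may be absent, e.g. on a node that runs no storage server.
        total += node_stats.get("storage_server.disk_avail", 0)
    return total


if __name__ == "__main__":
    # Hypothetical sample data standing in for $BASEDIR/stats.pickle.
    sample = {
        "tubid-a": {"timestamp": 1.0, "nickname": "s1",
                    "stats": {"counters": {},
                              "stats": {"storage_server.disk_avail": 100}}},
        "tubid-b": {"timestamp": 2.0, "nickname": "s2",
                    "stats": {"counters": {},
                              "stats": {"storage_server.disk_avail": 200}}},
    }
    print(total_disk_avail(sample))
```

Against a real gatherer, `total_disk_avail(load_gathered(path))` would be the entry point, with `path` pointing at the gatherer's stats.pickle. As the document itself notes, the "disk watcher" daemon is better suited to this specific task.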