Brian Warner [Tue, 12 Feb 2008 01:17:01 +0000 (18:17 -0700)]
add 'tahoe catalog-shares' tool, to make a one-line summary of each share file. This can help do cross-server correlation of sharefiles, looking for anomalies
Brian Warner [Fri, 8 Feb 2008 03:15:37 +0000 (20:15 -0700)]
test_system: remove the hackish debug_interrupt= attribute magic used to exercise interrupted-upload resumption, instead just make the Uploadable bounce the helper halfway through the upload
Brian Warner [Thu, 7 Feb 2008 02:50:47 +0000 (19:50 -0700)]
change encryption-key hash to include encoding parameters. This is a minor compatibility break: CHK files encoded (with convergence) before and after this will have different keys and ciphertexts. Also switched to SHA-256d for both the data-to-key hash and the key-to-storageindex hash
Brian Warner [Wed, 6 Feb 2008 09:38:16 +0000 (02:38 -0700)]
webish: add when_done= to POST /uri?t=upload . I did not add a 'recent uploads' section to the welcome page, but I think the new upload-results page provides the desired data
robk-tahoe [Sat, 2 Feb 2008 01:57:31 +0000 (18:57 -0700)]
stats: fix service issues
having moved inititalisation into startService to handle tub init cleanly,
I neglected the up-call to startService, which wound up not starting the
load_monitor.
also I changed the 'running' attribute to 'started' since 'running' is
the name used internally by MultiService itself.
robk-tahoe [Sat, 2 Feb 2008 01:44:57 +0000 (18:44 -0700)]
munin stats: don't suppress series with no current data
having changed tahoe-stats to not report data series if there was no recent
data recorded for a node, I wound up making it hide the data series. this
change causes it to report all data series for which stats exist in the
'config' phase, so that they show up, but only report actual data if the
stats are recent, so that they show up as missing if the node is not
reporting stats currently
nejucomo [Wed, 30 Jan 2008 09:59:43 +0000 (02:59 -0700)]
tahoe_fuse: system test: Populate a testdir with files and empty children directories, then test the fuse interface for proper listings and size metadata.
nejucomo [Wed, 30 Jan 2008 09:57:54 +0000 (02:57 -0700)]
tahoe_fuse: system test: Create a separate directory for each test and pass the cap and local path to each test. Add two basic sanity tests for empty directories.
nejucomo [Wed, 30 Jan 2008 08:27:42 +0000 (01:27 -0700)]
docs: webapi: Add concise shorthand for options, input, output, and statuses to the operation descriptions...
I'm not convinced if this is the best way to do this, but I find it
handy to have a conscise "quick reference" for the webapi operations
which summarize all related I/O.
Another possibility is to reject this patch, but create a separate
"webapi_quickref.html" with a concise table.
this adds an interface, IStatsProducer, defining the get_stats() method
which the stats provider calls upon and registered producer, and made the
register_producer() method check that interface is implemented.
also refine the startup logic, so that the stats provider doesn't try and
connect out to the stats gatherer until after the node declares the tub
'ready'. this is to address an issue whereby providers would attach to
the gatherer without providing a valid furl, and hence the gatherer would
be unable to determine the tubid of the connected client, leading to lost
samples.
robk-tahoe [Fri, 1 Feb 2008 04:04:23 +0000 (21:04 -0700)]
munin stats: suppress output of data more that 5min old
if a node fails to report stats, the natural thing to do in re munin is to
supress the data for that data series. the previous tahoe-stats would output
whatever data was present in the stats_gatherer's stats.pickle, regardless of
how old.
this change means that if the gatherer hasn't received data within the last
5 min, then no data is reported to munin for that node.
storage: make two levels of share directories so as not to exceed certain filesystems's limitations on directory size
The filesystem which gets my vote for most undeservedly popular is ext3, and it has a hard limit of 32,000 entries in a directory. Many other filesystems (even ones that I like more than I like ext3) have either hard limits or bad performance consequences or weird edge cases when you get too many entries in a single directory.
This patch makes it so that there is a layer of intermediate directories between the "shares" directory and the actual storage-index directory (the one whose name contains the entire storage index (z-base-32 encoded) and which contains one or more share files named by their share number).
The intermediate directories are named by the first 14 bits of the storage index, which means there are at most 16384 of them. (This also means that the intermediate directory names are not a leading prefix of the storage-index directory names -- to do that would have required us to have intermediate directories limited to either 1024 (2-char), which is too few, or 32768 (3-chars of a full 5 bits each), which would overrun ext3's funny hard limit of 32,000.))
This closes #150, and please see the "convertshares.py" script attached to #150 to convert your old tahoe-0.7.0 storage/shares directory into a new tahoe-0.8.0 storage/shares directory.
robk-tahoe [Thu, 31 Jan 2008 03:11:07 +0000 (20:11 -0700)]
stats: add a simple stats gathering system
We have a desire to collect runtime statistics from multiple nodes primarily
for server monitoring purposes. This implements a simple implementation of
such a system, as a skeleton to build more sophistication upon.
Each client now looks for a 'stats_gatherer.furl' config file. If it has
been configured to use a stats gatherer, then it instantiates internally
a StatsProvider. This is a central place for code which wishes to offer
stats up for monitoring to report them to, either by calling
stats_provider.count('stat.name', value) to increment a counter, or by
registering a class as a stats producer with sp.register_producer(obj).
The StatsProvider connects to the StatsGatherer server and provides its
provider upon startup. The StatsGatherer is then responsible for polling
the attached providers periodically to retrieve the data provided.
The provider queries each registered producer when the gatherer queries
the provider. Both the internal 'counters' and the queried 'stats' are
then reported to the gatherer.
This provides a simple gatherer app, (c.f. make stats-gatherer-run)
which prints its furl and listens for incoming connections. Once a
minute, the gatherer polls all connected providers, and writes the
retrieved data into a pickle file.
Also included is a munin plugin which knows how to read the gatherer's
stats.pickle and output data munin can interpret. this plugin,
tahoe-stats.py can be symlinked as multiple different names within
munin's 'plugins' directory, and inspects argv to determine which
data to display, doing a lookup in a table within that file.
It looks in the environment for 'statsfile' to determine the path to
the gatherer's stats.pickle. An example plugins-conf.d file is
provided.