Peter Secor [Fri, 11 Apr 2008 01:48:27 +0000 (18:48 -0700)]
native client - upgrade to WinFUSE 0.6, attempting to solve the issue where the Windows client thinks there isn't enough room on the virtual drive to do a copy
I'd implemented stats gathering hooks in the helper a while back.
Brian did the same without reference to my changes. This reconciles
those two changes, encompassing all the stats in both changes,
implemented through the stats_provider interface.
this also provides templates for all 10 helper graphs in the
tahoe-stats munin plugin.
robk-tahoe [Wed, 26 Mar 2008 01:19:08 +0000 (18:19 -0700)]
stats: added stats reporting to the upload helper
adds a stats_producer for the upload helper, which provides a series of counters
to the stats gatherer, under the name 'chk_upload_helper'.
it examines both the 'incoming' directory and the 'encoding' dir, providing
inc_count, inc_size, inc_size_old, enc_count, enc_size and enc_size_old:
respectively, for each dir, the number of files, their total size, and the
aggregate size of all files older than 48hrs
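A minimal sketch of how such counters could be computed. It assumes that "older than 48hrs" is judged by file modification time, and the names dir_stats and helper_stats are hypothetical, not the helper's actual code:

```python
import os
import time

OLD_AGE = 48 * 60 * 60  # 48 hours, in seconds


def dir_stats(path, now=None):
    """Return (count, total_size, old_size) for the plain files directly in path."""
    if now is None:
        now = time.time()
    count = size = size_old = 0
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # ignore subdirectories and other non-files
        st = os.stat(full)
        count += 1
        size += st.st_size
        # Assumption: age is measured from the last modification time.
        if now - st.st_mtime > OLD_AGE:
            size_old += st.st_size
    return count, size, size_old


def helper_stats(incoming_dir, encoding_dir, now=None):
    """Build the six counters named in the commit message."""
    inc = dir_stats(incoming_dir, now)
    enc = dir_stats(encoding_dir, now)
    return {
        "inc_count": inc[0], "inc_size": inc[1], "inc_size_old": inc[2],
        "enc_count": enc[0], "enc_size": enc[1], "enc_size_old": enc[2],
    }
```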
Brian Warner [Thu, 10 Apr 2008 23:31:59 +0000 (16:31 -0700)]
back our runtime setuptools dependency down to 0.6a9. We need a newer version to build, but can handle an older version to simply run a pre-built package
Brian Warner [Thu, 10 Apr 2008 21:36:27 +0000 (14:36 -0700)]
debian: use setuptools-generated support/bin/tahoe instead of bin/tahoe, to match Zooko's change that makes our in-tree bin/tahoe spawn support/bin/tahoe
the windows (cygwin) buildslave has been failing the key generator test
it turns out that the time check on whether to refill the pool, and the
reactor, are interacting such that when the maybe_refill_pool call posted
on the reactor fires, the test on whether to fill the pool fails.
this adds a loop in the failure case to retry each 1s until it is time
to refill the pool, thus mitigating this timing accuracy problem on
windows.
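The retry described above might look roughly like the following sketch. The KeyPool class, its attribute names and the injected clock and call_later callables are assumptions made so that the example runs without a reactor; in the real service the scheduling would go through the Twisted reactor:

```python
class KeyPool:
    """Toy stand-in for the key generator's pool-refill logic."""

    def __init__(self, clock, call_later, refresh_delay=6.0, retry_interval=1.0):
        self.clock = clock            # callable returning the current time
        self.call_later = call_later  # callable(delay, fn) scheduling fn
        self.refresh_delay = refresh_delay
        self.retry_interval = retry_interval
        self.last_request = clock()
        self.refills = 0

    def maybe_refill_pool(self):
        idle = self.clock() - self.last_request
        if idle >= self.refresh_delay:
            self.refills += 1  # stand-in for actually generating keys
        else:
            # The timer fired slightly early (coarse clock): instead of
            # silently giving up, check again after retry_interval.
            self.call_later(self.retry_interval, self.maybe_refill_pool)
```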
stats gathering: fix storage server stats if not tracking consumed
the RIStatsProvider interface requires that counter and stat values be
ChoiceOf(float, int, long). the recent changes to storage server to not
track 'consumed' led to returning None as the value of a counter.
this causes violations to be experienced by nodes whose stats are being
gathered.
this patch simply omits that stat if 'consumed' is not being tracked.
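A sketch of the idea, with hypothetical stat names and a hypothetical get_stats signature (the real storage server code may differ): a value of None is never placed in the returned dict, so every value stays numeric.

```python
def get_stats(allocated, consumed=None):
    """Return a stats dict containing only numeric values.

    `consumed` is None when the server is not tracking consumed space.
    """
    stats = {"storage_server.allocated": allocated}
    if consumed is not None:
        # Only report the stat when we actually have a number for it.
        stats["storage_server.consumed"] = consumed
    return stats
```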
one of the storage servers is throwing foolscap violations about the
return value of get_stats(). this adds a log of the data returned
to the foolscap log event stream at the debug level '12' (between
NOISY(10) and OPERATIONAL(20)). hopefully this will facilitate
finding the cause of this problem.
the timeouts on uses of 'poll' were there purely to make sure a test doesn't
poll indefinitely. however having such timeouts makes tests susceptible
to premature timeouts under high load, or on slow machines. (e.g. cygwin
slaves running in virtual machines on loaded hosts)
purportedly trial by default applies a timeout to tests to prevent them
hanging out indefinitely, so these poll timeouts are redundant and cause
intermittent failures on slow hosts. hence they're more bother than they're
worth, and should be culled.
don't do a du on startup if there is no size limit configured
This also turns off the production of the "space measurement done" log message, if there is no size limit configured.
Peter Secor [Tue, 8 Apr 2008 08:19:02 +0000 (01:19 -0700)]
native client - added icon to About screen, added ability to change backup client that is launched, changed capitalization in share name to Allmydata, changed tool-tip to Allmydata 3.0, updated linking method to automatically link if the next file will take more than 60sec to upload or it's already been 5 minutes since the last link happened
previously there was an edge case in the timing of expected behaviour
of the key_generator (w.r.t. the refresh delay and twisted/foolscap
delivery). if it took >6s for a key to be generated, then it was
possible for the pool refresh delay to transpire _during_ the
synchronous creation of a key in remote_get_rsa_key_pair. this could
lead to the timer elapsing during key creation and hence the pool
being refilled before control returned to the client.
this change ensures that the time window from a get key request
until the key gen reactor blocks to refill the pool is the time
since a request was answered, not since a request was asked.
this causes the behaviour to match expectations, as embodied in
test_keygen, even if the delay window is dropped to 0.1s
in both these cases, the timeout only serves to abort a stuck test, and
the key_generator should respond more quickly, but seeing test failures
in buildbot on some platforms suggests that the test is too susceptible
to timing issues on loaded buildslaves.
key_generator: service related cleanups, incorporation into system test
this cleans up KeyGenerator to be a service (a subservice of the
KeyGeneratorService as instantiated by the key-generator.tac app)
this means that the timer which replenishes the keypool will be
shutdown cleanly when the service is stopped.
adds checks on the key_generator service and client into the system
test 'test_mutable' such that one of the nodes (clients[3]) uses
the key_generator service, and checks that mutable file creations
in that node, via a variety of means, all consume keys from
the key_generator.
Peter Secor [Thu, 3 Apr 2008 20:39:46 +0000 (13:39 -0700)]
native client - MikeB's updates to do delayed caching (introduces write delays if cache gets big), status indicator if uploading files on Windows (flashing system tray icon)
this adds a new service to pre-generate RSA key pairs. This allows
the expensive (i.e. slow) key generation to be placed into a process
outside the node, so that the node's reactor will not block when it
needs a key pair, but instead can retrieve them from a pool of already
generated key pairs in the key-generator service.
it adds a tahoe create-key-generator command which initialises an
empty dir with a tahoe-key-generator.tac file which can then be run
via twistd. it stashes its .pem and portnum for furl stability and
writes the furl of the key gen service to key_generator.furl, also
printing it to stdout.
by placing a key_generator.furl file into the nodes config directory
(e.g. ~/.tahoe) a node will attempt to connect to such a service, and
will use that when creating mutable files (i.e. directories) whenever
possible. if the keygen service is unavailable, it will perform the
key generation locally instead, as before.
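The lookup and fallback could be sketched as below. This is a synchronous simplification under stated assumptions: the function names are hypothetical, the real code is Deferred-based, and ConnectionError merely stands in for whatever failure an unavailable keygen service would produce.

```python
import os


def read_key_generator_furl(basedir):
    """Return the furl stored in basedir/key_generator.furl, or None."""
    path = os.path.join(basedir, "key_generator.furl")
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip() or None


def get_key_pair(remote_generate, local_generate):
    """Prefer the remote key generator, falling back to local generation.

    remote_generate is None when no furl is configured.
    """
    if remote_generate is not None:
        try:
            return remote_generate()
        except ConnectionError:
            pass  # service unavailable: fall through to local generation
    return local_generate()
```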
setup: rename GNUmakefile to Makefile
It's evil and wrong to call something a "Makefile" when it contains code that can't be interpreted by POSIX make and requires GNU make.
But everyone else is doing it. ;-)
Brian Warner [Mon, 31 Mar 2008 22:28:45 +0000 (15:28 -0700)]
introducer.py: accelerate reconnection after being offline. Closes #374.
When we establish any new connection, reset the delays on all the other
Reconnectors. This will trigger a new batch of connection attempts. The idea
is to detect when we (the client) have been offline for a while, and to
connect to all servers when we get back online. By accelerating the timers
inside the Reconnectors, we try to avoid spending a long time in a
partially-connected state (which increases the chances of causing problems
with mutable files, by not updating all the shares that we ought to).
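As an illustration only, the reset-on-connect idea can be sketched with a toy reconnector. FakeReconnector, its backoff numbers and on_new_connection are hypothetical stand-ins, not foolscap's Reconnector API:

```python
INITIAL_DELAY = 1.0
MAX_DELAY = 3600.0


class FakeReconnector:
    """Toy reconnector with exponential backoff between attempts."""

    def __init__(self):
        self.delay = INITIAL_DELAY

    def failed(self):
        # Each failed attempt doubles the wait before the next one.
        self.delay = min(self.delay * 2, MAX_DELAY)

    def reset(self):
        # Forget the accumulated backoff so the next attempt happens soon.
        self.delay = INITIAL_DELAY


def on_new_connection(connected, reconnectors):
    """A new connection suggests we are back online: reset all the others."""
    for rc in reconnectors:
        if rc is not connected:
            rc.reset()
```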
Peter Secor [Thu, 27 Mar 2008 04:05:30 +0000 (21:05 -0700)]
native client - updated to automatically create a Backup directory, temp directory cleanup, automatic mounting and unmounting of the drive when starting and stopping the service, lots of Vista backup error fixes
Brian Warner [Thu, 27 Mar 2008 01:20:07 +0000 (18:20 -0700)]
web-status: client methods like list_all_uploads() return Upload instances,
not status instances. Fix this. The symptom was that following a link like
'up-123' that referred to an old operation (no longer in memory) while an
upload was active would get an ugly traceback instead of a "no such resource"
message.
Peter Secor [Wed, 26 Mar 2008 00:00:59 +0000 (17:00 -0700)]
native client - updated to fix windows vista backup problems, edit word documents directly on the drive, requeue files that failed to upload from the node to the helper
Brian Warner [Tue, 25 Mar 2008 01:55:37 +0000 (18:55 -0700)]
encode: log a plaintext hash and SI for each upload. This will allow the log gatherer to correlate the two, to better measure the benefits of convergence
robk-tahoe [Mon, 24 Mar 2008 22:47:12 +0000 (15:47 -0700)]
confwiz: set a convergence domain based on root_dir upon config
when the confwiz configures a node (i.e. typically once on mac, once per
install on windows) in addition to writing the root_dir.cap retrieved from
the native_client backend into a config file, it additionally writes a hash
thereof into the 'convergence' config file.
this causes uploads from this node to use a consistent 'convergence' hashing
value matching any other nodes with the same configured root_dir, i.e. for
the most part other systems installed and configured on the same account.
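A sketch of deriving such a value. The choice of SHA-256 and of a hex encoding is an assumption for illustration (the commit does not say which hash the confwiz uses), and convergence_for_root is a hypothetical name:

```python
import hashlib


def convergence_for_root(root_cap):
    """Derive a stable convergence string from a root_dir cap string.

    Nodes configured with the same root_dir get the same value, so their
    uploads of identical files converge; other accounts get different values.
    """
    return hashlib.sha256(root_cap.encode("utf-8")).hexdigest()
```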
Brian Warner [Sun, 23 Mar 2008 22:35:54 +0000 (15:35 -0700)]
UNDO: upload: stop putting plaintext and ciphertext hashes in shares.
This removes the guess-partial-information attack vector, and reduces
the amount of overhead that we consume with each file. It also introduces
a forwards-compatibility break: older versions of the code (before the
previous download-time "make hashes optional" patch) will be unable
to read files uploaded by this version, as they will complain about the
missing hashes. This patch is experimental, and is being pushed into
trunk to obtain test coverage. We may undo it before releasing 1.0.
fix check-memory to use new upload API (which requires a "convergence" argument), and change it to measure convergence instead of random-key, since convergence is the use case we care about more
Now upload or encode methods take a required argument named "convergence" which can be either None, indicating no convergent encryption at all, or a string, which is the "added secret" to be mixed in to the content hash key. If you want traditional convergent encryption behavior, set the added secret to be the empty string.
This patch also renames "content hash key" to "convergent encryption" in argument names and variable names. (A different and larger renaming is needed in order to clarify that Tahoe supports immutable files which are not encrypted with content-hash-key, a.k.a. convergent, encryption.)
This patch also changes a few unit tests to use non-convergent encryption, because it doesn't matter for what they are testing and non-convergent encryption is slightly faster.
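The semantics of the "convergence" argument can be illustrated with the sketch below. It is not Tahoe's actual key derivation: the SHA-256 construction, the 16-byte key size and the function name encryption_key are assumptions chosen to show the three cases (None, empty string, non-empty added secret).

```python
import hashlib
import os

KEY_SIZE = 16  # assumed key length in bytes, for illustration


def encryption_key(plaintext, convergence):
    """Pick an encryption key for plaintext (bytes).

    convergence=None  -> random key, no convergent encryption at all
    convergence=b""   -> traditional convergent encryption
    convergence=b"s"  -> convergent only among holders of the added secret
    """
    if convergence is None:
        return os.urandom(KEY_SIZE)
    h = hashlib.sha256()
    # Length-prefix the secret so it cannot be confused with file content.
    h.update(b"%d:" % len(convergence) + convergence + b",")
    h.update(plaintext)
    return h.digest()[:KEY_SIZE]
```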
Brian Warner [Sun, 23 Mar 2008 22:35:54 +0000 (15:35 -0700)]
upload: stop putting plaintext and ciphertext hashes in shares.
This removes the guess-partial-information attack vector, and reduces
the amount of overhead that we consume with each file. It also introduces
a forwards-compatibility break: older versions of the code (before the
previous download-time "make hashes optional" patch) will be unable
to read files uploaded by this version, as they will complain about the
missing hashes. This patch is experimental, and is being pushed into
trunk to obtain test coverage. We may undo it before releasing 1.0.
Brian Warner [Sun, 23 Mar 2008 21:46:49 +0000 (14:46 -0700)]
download: make plaintext and ciphertext hashes in the UEB optional.
Removing the plaintext hashes can help with the guess-partial-information
attack. This does not affect compatibility, but if and when we actually
remove any hashes from the share, that will introduce a
forwards-compatibility break: tahoe-0.9 will not be able to read such files.
robk-tahoe [Tue, 18 Mar 2008 23:15:36 +0000 (16:15 -0700)]
confwiz: reworked confwiz look and feel
this changes the confwiz to have a look and feel much more consistent
with that of the innosetup installer it is launched within the context
of. this applies, naturally, primarily to windows.