Brian Warner [Fri, 10 Oct 2008 01:13:27 +0000 (18:13 -0700)]
storage: introduce v2 immutable shares, with 8-byte offsets fields, to remove two of the three size limitations in #346. This code handles v2 shares but does not generate them. We'll make a release with this v2-tolerance, wait a while, then make a second release that actually generates v2 shares, to avoid compatibility problems.
interfaces: loosen a few max-size constraints which would limit us to a mere 1.09 TB maximum file size
These constraints were originally intended to protect against attacks on the
storage server protocol layer which exhaust memory in the peer. However,
defending against that sort of DoS is hard -- probably it isn't completely
achieved -- and it costs development time to think about it, and it sometimes
imposes limits on legitimate users which we don't necessarily want to impose.
So, for now we forget about limiting the amount of RAM that a foolscap peer can
cause you to start using.
Brian Warner [Tue, 7 Oct 2008 01:06:05 +0000 (18:06 -0700)]
ftp: change the twisted hack necessary for async-write-close, to one more agreeable to the twisted-dev folks, add a copy of the necessary patch to docs/ftp.txt
docs: update architecture.txt 's section on the vdrive a.k.a. filesystem layer
Remove some obsolete parts (correct at the time, now incorrect), change terminology to reflect my preference: s/vdrive/filesystem/ and s/dirnode/directory/, and make a few other small changes.
repair: fix test to map from storage index to directory structure properly (thanks, cygwin buildbot, for being so kloodgey that you won't accept random binary filenames and thus making me notice this bug)
repairer: enhance the repairer tests
Make sure the file can actually be downloaded afterward, that it used one of the
deleted and then repaired shares to do so, and that it repairs from multiple
deletions at once (without using more than a reasonable amount of calls to
storage server allocate).
repairer: add basic test of repairer, move tests of immutable checker/repairer from test_system to test_immutable_checker, remove obsolete test helper code from test_filenode
Hm... "Checker" ought to be renamed to "CheckerRepairer" or "Repairer" at some point...
gui/macapp: rough cut of ui tweaks; configurability, auto-mount
chatting with peter, two things the mac gui needed were the ability to mount
the 'allmydata drive' automatically upon launching the app, and open the
Finder to reveal it. (also a request to hide the debug 'open webroot' stuff)
this (somewhat rough) patch implements all the above as default behaviour
it also contains a quick configuration mechanism for the gui - rather than a
preferences gui, running with a more 'tahoe' styled mechanism, the contents
of a few optional files can modify the default behaviour, specifically file
in ~/.tahoe/gui.conf control behaviour as follows:
auto-mount (bool): if set (the default) then the mac app will, upon launch
automatically mount the 'tahoe:' alias with the display name 'Allmydata'
using a mountpoint of ~/.tahoe/mnt/__auto__
auto-open (bool): if set (the default) then upon mounting a file system
(including the auto-mount if set) finder will be opened to the mountpoint
of the filesystem, which essentially reveals the newly mounted drive in a
Finder window
show-webopen (bool): if set (false by default) then the 'open webroot'
action will be made available in both the dock and file menus of the app
daemon-timout (int): sets the daemon-timeout option passed into tahoe fuse
when a filesystem is mounted. this defaults to 5 min
files of type (int) much, naturally contain a parsable int representation.
files of type (bool) are considered true if their (case-insensitive) contents
are any of ['y', 'yes', 'true', 'on', '1'] and considered false otherwise.
this was inspired by reading the fuse docs and discovering the 'fsid' option
to fuse_main, and was _intended_ to support a sort of 'stability' to the
filesystem (specifically derived from the root-uri mounted, whether directly
or via an alias) to support mac aliases across unmount/remount etc.
some experimentation shows that that doesn't actually work, and that, at
least for mac aliases in my testing, they're tied to path-to-mountpoint and
not to the fsid - which seems to have no bearing. perhaps the 'local' flag
is causing weirdness therein.
at any rate, I'm recording it simply for posterity, in case it turns out to
be useful after all somewhere down the road.
this was inspired by reading the fuse docs and discovering the 'fsid' option
to fuse_main, and was _intended_ to support a sort of 'stability' to the
filesystem (specifically derived from the root-uri mounted, whether directly
or via an alias) to support mac aliases across unmount/remount etc.
some experimentation shows that that doesn't actually work, and that, at
least for mac aliases in my testing, they're tied to path-to-mountpoint and
not to the fsid - which seems to have no bearing. perhaps the 'local' flag
is causing weirdness therein.
at any rate, I'm recording it simply for posterity, in case it turns out to
be useful after all somewhere down the road.
manhole: be more tolerant of authorized_keys. files in .tahoe
both peter and I independently tried to do the same thing to eliminate the
authorized_keys file which was causing problems with the broken mac build
(c.f. #522) namely mv authorized_keys.8223{,.bak} but the node is, ahem,
let's say 'intolerant' of the trailing .bak - rather than disable the
manhole as one might expect, it instead causes the node to explode on
startup. this patch makes it skip over anything that doesn't pass the
'parse this trailing stuff as an int' test.
fuse/impl_c: move mac tahoefuse impl out into contrib/fuse
For a variety of reasons, high amongst them the fact that many people
interested in fuse support for tahoe seem to have missed its existence,
the existing fuse implementation for tahoe, previously 'mac/tahoefuse.py'
has been renamed and moved.
It was suggested that, even though the mac build depends upon it, that
the mac/tahoefuse implementation be moved into contrib/fuse along with
the other fuse implementations. The fact that it's not as extensively
covered by unit tests as mainline tahoe was given as corroboration.
In a bid to try and stem the confusion inherent in having tahoe_fuse,
tfuse and tahoefuse jumbled together (not necessarily helped by
referring to them as impl_a, b and c respectively) I'm hereby renaming
tahoefuse as 'blackmatch' (black match is, per wikipedia "a type of
crude fuse" hey, I'm a punny guy) Maybe one day it'll be promoted to
be 'quickmatch' instead...
Anyway, this patch moves mac/tahoefuse.py out to contrib/fuse/impl_c/
as blackmatch.py, and makes appropriate changes to the mac build process
to transclude blackmatch therein. this leaves the extant fuse.py and
fuseparts business in mac/ as-is and doesn't attempt to address such
issues in contrib/fuse/impl_c.
it is left as an exercise to the reader (or the reader of a message
to follow) as to how to deal with the 'fuse' python module on the mac.
as of this time, blackmatch should work on both mac and linux, and
passes the four extant tests in runtests. (fwiw neither impl_a nor
impl_b have I managed to get working on the mac yet)
since blackmatch supports a read-write and caching fuse interface to
tahoe, some write tests obviously need to be added to runtests.
macapp: changes to support aliases, updated tahoefuse command line options
the tahoefuse command line options changed to support the runtests harness,
and as part of that gained support for named aliases via --alias
this changes the mac app's invocation of tahoefuse to match that, and also
changes the gui to present the list of defined aliases as valid mounts
this replaces the previous logic which examined the ~/.tahoe/private directory
looking for files ending in '.cap' - an ad-hoc alias mechanism.
if a file is found matching ~/.tahoe/private/ALIASNAME.icns then that will still
be passed to tahoefuse as the icon to display for that filesystem. if no such
file is found, the allmydata icon will be used by default.
the '-olocal' option is passed to tahoefuse. this is potentially contentious.
specifically this is telling the OS that this is a 'local' filesystem, which is
intended to be used to locally attached devices. however leopard (OSX 10.5)
will only display non-local filesystems in the Finder's side bar if they are of
fs types specifically known by Finder to be network file systems (nfs, cifs,
webdav, afp) hence the -olocal flag is the only way on leopard to cause finder
to display the mounted filesystem in the sidebar, but it displays as a 'device'.
there is a potential (i.e. the fuse docs carry warnings) that this may cause
vague and unspecified undesirable behaviour.
(c.f. http://code.google.com/p/macfuse/wiki/FAQ specifically Q4.3 and Q4.1)
fuse/impl_c: reworking of mac/tahoefuse, command line options, test integration
a handful of changes to the tahoefuse implementation used by the mac build, to
make command line option parsing more flexible and robust, and moreover to
facilitate integration of this implementation with the 'runtests' test harness
used to test the other two implementations.
this patch includes;
- improvements to command line option parsing [ see below ]
- support for 'aliases' akin to other tahoe tools
- tweaks to support linux (ubuntu hardy)
the linux support tweaks are, or at least seem to be, a result of the fact that
hardy ships with fuse 0.2pre3, as opposed to the fuse0.2 that macfuse is based
upon. at least the versions I was working with have discrepencies in their
interfaces, but on reflection this is probably a 'python-fuse' version issue
rather than fuse per se. At any rate, the fixes to handling the Stat objects
should be safe against either version, it's just that the bindings on hardy
lacked code that was in the 'fuse' python module on the mac...
command line options:
the need for more flexible invocation in support of the runtests harness led
me to rework the argument parsing from some simple positional hacks with a
pass-through of the remainder to the fuse binding's 'fuse_main' to a system
using twisted.usage to parse arguments, and having just one option '-o' being
explicitly a pass-through for -o options to fuse_main. the options are now:
--node-directory NODEDIR : this is used to look up the node-url to connect
to if that's not specified concretely on the command line, and also used to
determine the location of the cache directory used by the implementation,
specifically '_cache' within the nodedir. default value: ~/.tahoe
--node-url NODEURL : specify a node-url taking precendence over that found
in the node.url file within the nodedir
--alias ALIAS : specifies the named alias should be mounted. a lookup is
performed in the alias table within 'nodedir' to find the root dir cap
the named alias must exist in the alias table of the specified nodedir
--root-uri ROOTURI : specifies that the given directory uri should be mounted
at least one of --alias and --root-uri must be given (which directory to mount
must be specified somehow) if both are given --alias takes precedence.
--cache-timeout TIMEOUTSECS : specifies the number of seconds that cached
directory data should be considered valid for. this tahoefuse implementation
implements directory caching for a limited time; largely because the mac (i.e.
the Finder in particular) tends to make a large number of requests in quick
successsion when browsing the filesystem. on the flip side, the 'runtests'
unit tests fail in the face of such caching because the changes made to the
underlying tahoe directories are not reflected in the fuse presentation. by
specifying a cache-timeout of 0 seconds, runtests can force the fuse layer
into refetching directory data upon each request.
any number of -oname=value options may be specified on the command line,
and they will all be passed into the underlying fuse_main call.
a single non-optional argument, the mountpoint, must also be given.
This patch makes a significant number of changes to the fuse 'runtests' script
which stem from my efforts to integrate the third fuse implementation into this
framework. Perhaps not all were necessary to that end, and I beg nejucomo's
forebearance if I got too carried away.
- cleaned up the blank lines; imho blank lines should be empty
- made the unmount command switch based on platform, since macfuse just uses
'umount' not the 'fusermount' command (which doesn't exist)
- made the expected working dir for runtests the contrib/fuse dir, not the
top-level tahoe source tree - see also discussion of --path-to-tahoe below
- significantly reworked the ImplProcManager class. rather than subclassing
for each fuse implementation to be tested, the new version is based on
instantiating objects and providing relevant config info to the constructor.
this was motivated by a desire to eliminate the duplication of similar but
subtly different code between instances, framed by consideration of increasing
the number of platforms and implementations involved. each implementation to
test is thus reduced to the pertinent import and an entry in the
'implementations' table defining how to handle that implementation. this also
provides a way to specify which sets of tests to run for each implementation,
more on that below.
- significantly reworked the command line options parsing, using twisted.usage;
what used to be a single optional argument is now represented by the
--test-type option which allows one to choose between running unittests, the
system tests, or both.
the --implementations option allows for a specific (comma-separated) list of
implemenations to be tested, or the default 'all'
the --tests option allows for a specific (comma-separated) list of tests sets
to be run, or the default 'all'. note that only the intersection of tests
requested on the command line and tests relevant to each implementation will
be run. see below for more on tests sets.
the --path-to-tahoe open allows for the path to the 'tahoe' executable to be
specified. it defaults to '../../bin/tahoe' which is the location of the tahoe
script in the source tree relative to the contrib/fuse dir by default.
the --tmp-dir option controls where temporary directories (and hence
mountpoints) are created during the test. this defaults to /tmp - a change
from the previous behaviour of using the system default dir for calls to
tempfile.mkdtemp(), a behaviour which can be obtained by providing an empty
value, e.g. "--tmp-dir="
the --debug-wait flag causes the test runner to pause waiting upon user
input at various stages through the testing, which facilitates debugging e.g.
by allowing the user to open a browser and explore or modify the contents of
the ephemeral grid after it has been instantiated but before tests are run,
or make environmental adjustments before actually triggering fuse mounts etc.
note that the webapi url for the first client node is printed out upon its
startup to facilitate this sort of debugging also.
- the default tmp dir was changed, and made configurable. previously the
default behaviour of tempfile.mkdtemp() was used. it turns out that, at least
on the mac, that led to temporary directories to be created in a location
which ultimately led to mountpoint paths longer than could be handled by
macfuse - specifically mounted filesystems could not be unmounted and would
'leak'. by changing the default location to be rooted at /tmp this leads to
mountpoint paths short enough to be supported without problems.
- tests are now grouped into 'sets' by method name prefix. all the existing
tests have been moved into the 'read' set, i.e. with method names starting
'test_read_'. this is intended to facilitate the fact that some implementations
are read-only, and some support write, so the applicability of tests will vary
by implementation. the 'implementations' table, which governs the configuration
of the ImplProcManager responsible for a given implementation, provides a list
of 'test' (i.e test set names) which are applicable to that implementation.
note no 'write' tests yet exist, this is merely laying the groundwork.
- the 'expected output' of the tahoe command, which is checked for 'surprising'
output by regex match, can be confused by spurious output from libraries.
specfically, testing on the mac produced a warning message about zope interface
resolution various multiple eggs. the 'check_tahoe_output()' function now has
a list of 'ignorable_lines' (each a regex) which will be discarded before the
remainder of the output of the tahoe script is matched against expectation.
- cleaned up a typo, and a few spurious imports caught by pyflakes
specifically change the expectation of the code to be such that the node-url
(self.url) always includes the trailing slash to be a correctly formed url
moreover read the node-url from the 'node.url' file found in the node 'basedir'
and only if that doesn't exist, then fall back to reading the 'webport' file
from therein and assuming localhost. This then supports the general tahoe
pattern that tools needing only a webapi server can be pointed at a directory
containing the node.url file, which can optionally point to another server,
rather than requiring a complete node dir and locally running node instance.
from testing on linux (specifically ubuntu hardy) the libfuse dll has a
different name, specifically libfuse.so.2. this patch tries libfuse.so
and then falls back to trying .2 if the former fails.
it also changes the unmount behaviour, to simply return from the handler's
loop_forever() loop upon being unmounted, rather than raising an EOFError,
since none of the client code I looked at actually handled that exception,
but did seem to expect to fall off of main() when loop_forever() returned.
Additionally, from my testing unmount typically led to an OSError from the
fuse fd read, rather than an empty read, as the code seemed to expect.
also removed a spurious import pyflakes quibbled about.
setup: fix site-dirs to find system installed twisted on mac.
zooko helped me unravel a build weirdness today. somehow the system installed
twisted (/System/Library) was pulling in parts of the other twisted (/Library)
which had been installed by easy_install, and exploding.
getting rid of the latter helped, but it took this change to get the tahoe
build to stop trying to rebuild twisted and instead use the one that was
already installed. c.f. tkt #229
CLI: rework webopen, and moreover its tests w.r.t. path handling
in the recent reconciliation of webopen patches, I wound up adjusting
webopen to 'pass through' the state of the trailing slash on the given
argument to the resultant url passed to the browser. this change
removes the requirement that arguments must be directories, and allows
webopen to be used with files. it also broke the tests that assumed
that webopen would always normalise the url to have a trailing slash.
in fixing the tests, I realised that, IMHO, there's something deeply
awry with the way tahoe handles paths; specifically in the combination
of '/' being the name of the root path within an alias, but a leading
slash on paths, e.g. 'alias:/path', is catagorically incorrect. i.e.
'tahoe:' == 'tahoe:/' == '/'
but 'tahoe:/foo' is an invalid path, and must be 'tahoe:foo'
I wound up making the internals of webopen simply spot a 'path' of
'/' and smash it to '', which 'fixes' webopen to match the behaviour
of tahoe's path handling elsewhere, but that special case sort of
points to the weirdness.
(fwiw, I personally found the fact that the leading / in a path was
disallowed to be weird - I'm just used to seeing paths qualified by
the leading / I guess - so in a debate about normalising path handling
I'd vote to include the /)
I think this is largely attributable to a cleanup patch I'd made
which never got committed upstream somehow, but at any rate various
conflicting changes to webopen had been made. This cleans up the
conflicts therein, and hopefully brings 'tahoe webopen' in line with
other cli commands.
robk-tahoe [Wed, 18 Jun 2008 20:19:40 +0000 (13:19 -0700)]
cli: cleanup webopen command
moved the body of webopen out of cli.py into tahoe_webopen.py
made its invocation consistent with the other cli commands, most
notably replacing its 'vdrive path' with the same alias parsing,
allowing usage such as 'tahoe webopen private:Pictures/xti'
setup: when detecting platform, ask the Python Standard Library's platform.dist() before executing lsb_release, and cache the result in global (module) variables
This should make it sufficiently fast, while still giving a better answer on Ubuntu than platform.dist() currently does, and also falling back to lsb_release if platform.dist() says that it doesn't know.
setup: stop catching EnvironmentError when attempting to copy ./_auto_deps.py to ./src/allmydata/_auto_deps.py
It is no longer the case that we can run okay without _auto_deps.py being in place in ./src/allmydata, so if that cp fails then the build should fail.
immutable: refactor immutable filenodes and comparison thereof
* the two kinds of immutable filenode now have a common base class
* they store only an instance of their URI, not both an instance and a string
* they delegate comparison to that instance
setup: try parsing /etc/lsb-release first, then invoking lsb_release, because the latter takes half-a-second on my workstation, which is too long
Also because in some cases the former will work and the latter won't.
This patch also tightens the regexes so it won't match random junk.
setup: if executing lsb_release doesn't work, fall back to parsing /etc/lsb-release before falling back to platform.dist()
An explanatio of why we do it this way is in the docstring.
setup: if invoking lsb_release doesn't work (which it doesn't on our etch buildslave), then fall back to the Python Standard Library's platform.dist() function
setup: when using the foolscap "what versions are here?" feature, use allmydata.get_package_versions() instead of specifically importing allmydata, pycryptopp, and zfec
setup: simplify the implementation of allmydata.get_package_versions() and add "platform" which is a human-oriented summary of the underlying operating system and machine