From: Brian Warner Date: Wed, 22 Oct 2008 00:03:07 +0000 (-0700) Subject: Change deep-size/stats/check/manifest to a start+poll model instead of a single long... X-Git-Url: https://git.rkrishnan.org/%5B/%5D%20/uri/%22doc.html/architecture.txt?a=commitdiff_plain;h=ad3d9207a93ee7e731628ce05ed537f161e8c2af;p=tahoe-lafs%2Ftahoe-lafs.git Change deep-size/stats/check/manifest to a start+poll model instead of a single long-running synchronous operation. No cancel or handle-expiration yet. #514. --- diff --git a/NEWS b/NEWS index bd558446..b48c432c 100644 --- a/NEWS +++ b/NEWS @@ -42,7 +42,10 @@ histogram, etc). The client web interface now features some extra buttons to initiate check and deep-check operations. When these operations finish, they display a -results page that summarizes any problems that were encountered. +results page that summarizes any problems that were encountered. All +long-running deep-traversal operations, including deep-check, use a +start-and-poll mechanism, to avoid depending upon a single long-lived HTTP +connection. docs/webapi.txt has details. ** Configuration Changes: single INI-format tahoe.cfg file @@ -94,6 +97,21 @@ code, and obviously should not be used on user data. ** Web changes +All deep-traversal operations (start-manifest, start-deep-size, +start-deep-stats, start-deep-check) now use a start-and-poll approach, +instead of using a single (fragile) long-running synchronous HTTP connection. +All these "start-" operations use POST instead of GET. The old "GET +manifest", "GET deep-size", and "POST deep-check" operations have been +removed. + +The new "POST start-manifest" operation, when it finally completes, results +in a table of (path,cap), instead of the list of verifycaps produced by the +old "GET manifest". The table is available in several formats: use +output=html, output=text, or output=json to choose one. + +The "return_to=" and "when_done=" arguments have been removed from the +t=check and deep-check operations. 
+ The top-level status page (/status) now has a machine-readable form, via "/status/?t=json". This includes information about the currently-active uploads and downloads, which may be useful for frontends that wish to display @@ -124,10 +142,6 @@ directories). For mutable files, the "replace contents" upload form has been moved here too. As a result, the directory page is now much simpler and cleaner, and several potentially-misleading links (like t=uri) are now gone. -The "t=manifest" webapi command now generates a table of (path,cap), instead -of creating a set of verifycaps. The table is available in several formats: -use output=html, output=text, or output=json to choose one. - Slashes are discouraged in Tahoe file/directory names, since they cause problems when accessing the filesystem through the webapi. However, there are a couple of accidental ways to generate such names. This release tries to diff --git a/docs/webapi.txt b/docs/webapi.txt index 62705f21..de46302a 100644 --- a/docs/webapi.txt +++ b/docs/webapi.txt @@ -184,6 +184,67 @@ for you. If you don't know the cap, you can't access the file. This allows the security properties of Tahoe caps to be extended across the webapi interface. +== Slow Operations, Progress, and Cancelling == + +Certain operations can be expected to take a long time. The "t=deep-check", +described below, will recursively visit every file and directory reachable +from a given starting point, which can take minutes or even hours for +extremely large directory structures. A single long-running HTTP request is a +fragile thing: proxies, NAT boxes, browsers, and users may all grow impatient +with waiting and give up on the connection. + +For this reason, long-running operations have an "operation handle", which +can be used to poll for status/progress messages while the operation +proceeds. This handle can also be used to cancel the operation. 
These handles +are created by the client, and passed in as an "ophandle=" query argument +to the POST or PUT request which starts the operation. The following +operations can then be used to retrieve status: + +GET /operations/$HANDLE?t=status&output=HTML +GET /operations/$HANDLE?t=status&output=JSON + + These two retrieve the current status of the given operation. Each operation + presents a different sort of information, but in general the page retrieved + will indicate: + + * whether the operation is complete, or if it is still running + * how much of the operation is complete, and how much is left, if possible + + The HTML form will include a meta-refresh tag, which will cause a regular + web browser to reload the status page about 30 seconds later. This tag will + be removed once the operation has completed. + +POST /operations/$HANDLE?t=cancel + + This terminates the operation, and returns an HTML page explaining what was + cancelled. If the operation handle has already expired (see below), this + POST will return a 404, which indicates that the operation is no longer + running (either it was completed or terminated). + +The operation handle will eventually expire, to avoid consuming an unbounded +amount of memory. The handle's time-to-live can be reset at any time, by +passing a retain-for= argument (with a count of seconds) to either the +initial POST that starts the operation, or the subsequent 'GET t=status' +request which asks about the operation. For example, if a 'GET +/operations/$HANDLE?t=status&output=JSON&retain-for=600' query is performed, +the handle will remain active for 600 seconds (10 minutes) after the GET was +received. + +In addition, if the GET t=status includes a release-after-complete=True +argument, and the operation has completed, the operation handle will be +released immediately. 
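Editorial sketch (not part of the commit): the start-and-poll flow documented above could be driven by a client roughly as follows. This is a minimal Python 3 illustration under stated assumptions: the helper names `status_url` and `poll_until_finished`, the injected `fetch` callable, and the base URL are all hypothetical. It relies only on the documented `/operations/$HANDLE?t=status&output=JSON` URL shape and on the `finished` key that the JSON status pages in this commit are documented to carry. Note that the commit message says handle expiration is not implemented yet, so `retain-for=` may have no effect at this revision.

```python
import json
import time
from urllib.parse import quote, urlencode


def status_url(base_url, handle, retain_for=None, release_after_complete=False):
    """Build the JSON status URL for a client-chosen operation handle.

    base_url is an assumption (wherever the node's webapi is listening).
    """
    args = [("t", "status"), ("output", "JSON")]
    if retain_for is not None:
        # documented as a count of seconds
        args.append(("retain-for", str(int(retain_for))))
    if release_after_complete:
        args.append(("release-after-complete", "True"))
    return "%s/operations/%s?%s" % (
        base_url.rstrip("/"), quote(handle, safe=""), urlencode(args))


def poll_until_finished(fetch, url, interval=1.0, sleep=time.sleep, max_polls=None):
    """Call fetch(url) until the decoded JSON reports finished == True.

    fetch is any callable that returns the response body as a string; it is
    injected so the loop can be exercised without a running node.
    """
    polls = 0
    while True:
        data = json.loads(fetch(url))
        if data.get("finished"):
            return data
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("operation still running after %d polls" % polls)
        sleep(interval)
```

In real use, `fetch` would wrap an HTTP GET; passing it in keeps the polling logic independent of any particular HTTP client.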
+ +If a retain-for= argument is not used, the default handle lifetimes are: + + * handles will remain valid at least until their operation finishes + * uncollected handles for finished operations (i.e. handles for operations + which have finished but for which the t=status page has not been accessed + since completion) will remain valid for one hour, or for the total time + consumed by the operation, whichever is greater. + * collected handles (i.e. the t=status page has been retrieved at least once + since the operation completed) will remain valid for ten minutes. + + == Programmatic Operations == Now that we know how to build URLs that refer to files and directories in a @@ -674,12 +735,6 @@ POST $URL?t=check page that is returned will display the results. This can be used as a "show me detailed information about this file" page. - If a when_done=url argument is provided, the return value will be a redirect - to that URL instead of the checker results. - - If a return_to=url argument is provided, the returned page will include a - link to the given URL entitled "Return to the parent directory". - If a verify=true argument is provided, the node will perform a more intensive check, downloading and verifying every single bit of every share. @@ -741,28 +796,37 @@ POST $URL?t=check 'seq%d-%s-sh%d', containing the sequence number, the roothash, and the share number. -POST $URL?t=deep-check +POST $URL?t=start-deep-check (must add &ophandle=XYZ) - This triggers a recursive walk of all files and directories reachable from + This initiates a recursive walk of all files and directories reachable from the target, performing a check on each one just like t=check. The result page will contain a summary of the results, including details on any file/directory that was not fully healthy. - t=deep-check can only be invoked on a directory. An error (400 BAD_REQUEST) - will be signalled if it is invoked on a file. The recursive walker will - deal with loops safely. 
+ t=start-deep-check can only be invoked on a directory. An error (400 + BAD_REQUEST) will be signalled if it is invoked on a file. The recursive + walker will deal with loops safely. - This accepts the same verify=, when_done=, and return_to= arguments as - t=check. + This accepts the same verify= argument as t=check. - Be aware that this can take a long time: perhaps a second per object. No - progress information is currently provided: the server will be silent until - the full tree has been traversed, then will emit the complete response. + Since this operation can take a long time (perhaps a second per object), + the ophandle= argument is required (see "Slow Operations, Progress, and + Cancelling" above). The response to this POST will be a redirect to the + corresponding /operations/$HANDLE?t=status page (with output=HTML or + output=JSON to match the output= argument given to the POST). The + deep-check operation will continue to run in the background, and the + /operations page should be used to find out when the operation is done. - If an output=JSON argument is provided, the response will be - machine-readable JSON instead of human-oriented HTML. The data is a - dictionary with the following keys: + The HTML /operations/$HANDLE?t=status page for incomplete operations will + contain a meta-refresh tag, set to 30 seconds, so that a browser which uses + deep-check will automatically poll until the operation has completed. (TODO) + + The JSON page (/operations/$HANDLE?t=status&output=JSON) will contain a + machine-readable JSON dictionary with the following keys: + finished: a boolean, True if the operation is complete, else False. Some + of the remaining keys may not be present until the operation + is complete. root-storage-index: a base32-encoded string with the storage index of the starting point of the deep-check operation count-objects-checked: count of how many objects were checked. Note that
Note that @@ -794,9 +858,9 @@ POST $URL?t=check&repair=true or corrupted), it will perform a "repair". During repair, any missing shares will be regenerated and uploaded to new servers. - This accepts the same when_done=URL, return_to=URL, and verify=true - arguments as t=check. When an output=JSON argument is provided, the - machine-readable JSON response will contain the following keys: + This accepts the same verify=true argument as t=check. When an output=JSON + argument is provided, the machine-readable JSON response will contain the + following keys: storage-index: a base32-encoded string with the objects's storage index, or an empty string for LIT files @@ -815,19 +879,20 @@ POST $URL?t=check&repair=true as the 'results' value of the t=check response, described above. -POST $URL?t=deep-check&repair=true +POST $URL?t=start-deep-check&repair=true (must add &ophandle=XYZ) This triggers a recursive walk of all files and directories, performing a t=check&repair=true on each one. - Like t=deep-check without the repair= argument, this can only be invoked on - a directory. An error (400 BAD_REQUEST) will be signalled if it is invoked - on a file. The recursive walker will deal with loops safely. + Like t=start-deep-check without the repair= argument, this can only be + invoked on a directory. An error (400 BAD_REQUEST) will be signalled if it + is invoked on a file. The recursive walker will deal with loops safely. - This accepts the same when_done=URL, return_to=URL, and verify=true - arguments as t=deep-check. When an output=JSON argument is provided, the - response will contain the following keys: + This accepts the same verify=true argument as t=start-deep-check. It uses + the same ophandle= mechanism as start-deep-check. 
When an output=JSON + argument is provided, the response will contain the following keys: + finished: (bool) True if the operation has completed, else False root-storage-index: a base32-encoded string with the storage index of the starting point of the deep-check operation count-objects-checked: count of how many objects were checked @@ -868,35 +933,52 @@ POST $URL?t=deep-check&repair=true stats: a dictionary with the same keys as the t=deep-stats command (described below) -GET $DIRURL?t=manifest +POST $DIRURL?t=start-manifest (must add &ophandle=XYZ) - Return an HTML-formatted manifest of the given directory, for debugging. - This is a table of (path, filecap/dircap), for every object reachable from - the starting directory. The path will be slash-joined, and the - filecap/dircap will contain a link to the object in question. This page + This operation generates a "manifest" of the given directory tree, mostly + for debugging. This is a table of (path, filecap/dircap), for every object + reachable from the starting directory. The path will be slash-joined, and + the filecap/dircap will contain a link to the object in question. This page gives immediate access to every object in the virtual filesystem subtree. + This operation uses the same ophandle= mechanism as deep-check. The + corresponding /operations/$HANDLE?t=status page has three different forms. + The default is output=HTML. + If output=text is added to the query args, the results will be a text/plain - list, with one file/dir per line, slash-separated, with the filecap/dircap - separated by a space. + list. The first line is special: it is either "finished: yes" or "finished: + no"; if the operation is not finished, you must periodically reload the + page until it completes. The rest of the results are a plaintext list, with + one file/dir per line, slash-separated, with the filecap/dircap separated + by a space. 
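Editorial sketch (not part of the commit): the output=text manifest format described above might be parsed along these lines. The function name `parse_text_manifest` is hypothetical. The text does not say whether the path or the cap comes first on each line; this sketch assumes "path, then a space, then cap" and splits on the last space, on the further assumption that caps contain no spaces while path components may. The cap strings in the usage example are placeholders, not real caps.

```python
def parse_text_manifest(body):
    """Parse a t=status&output=text manifest page.

    Returns (finished, entries), where entries is a list of
    (slash_joined_path, cap) pairs.
    """
    lines = body.splitlines()
    if not lines or not lines[0].startswith("finished:"):
        raise ValueError("missing 'finished:' status line")
    finished = lines[0].split(":", 1)[1].strip() == "yes"
    entries = []
    for line in lines[1:]:
        if not line.strip():
            continue  # tolerate blank lines
        # ASSUMPTION: the cap is the last space-separated field on the line
        path, _, cap = line.rpartition(" ")
        entries.append((path, cap))
    return finished, entries


# Usage with placeholder caps; the origin directory has an empty path.
done, rows = parse_text_manifest(
    "finished: yes\n URI:DIR2:root\nsub dir/file.txt URI:CHK:abc\n")
```

A caller would keep reloading the page while `done` is False, as the documentation requires, and only treat `rows` as complete once it is True.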
If output=JSON is added to the queryargs, then the results will be a - JSON-formatted list of (path, cap) tuples, where path is a list of strings. + JSON-formatted dictionary with three keys: + + finished (bool): if False then you must reload the page until True + origin_si (str): the storage index of the starting point + manifest: list of (path, cap) tuples, where path is a list of strings. + +POST $DIRURL?t=start-deep-size (must add &ophandle=XYZ) -GET $DIRURL?t=deep-size + This operation generates a number (in bytes) containing the sum of the + filesize of all directories and immutable files reachable from the given + directory. This is a rough lower bound of the total space consumed by this + subtree. It does not include space consumed by mutable files, nor does it + take expansion or encoding overhead into account. Later versions of the + code may improve this estimate upwards. - Return a number (in bytes) containing the sum of the filesize of all - directories and immutable files reachable from the given directory. This is - a rough lower bound of the total space consumed by this subtree. It does - not include space consumed by mutable files, nor does it take expansion or - encoding overhead into account. Later versions of the code may improve this - estimate upwards. +POST $DIRURL?t=start-deep-stats (must add &ophandle=XYZ) -GET $DIRURL?t=deep-stats + This operation performs a recursive walk of all files and directories + reachable from the given directory, and generates a collection of + statistics about those objects. 
- Return a JSON-encoded dictionary that lists interesting statistics about - the set of all files and directories reachable from the given directory: + The result (obtained from the /operations/$OPHANDLE page) is a + JSON-serialized dictionary with the following keys (note that some of these + keys may be missing until 'finished' is True): + finished: (bool) True if the operation has finished, else False count-immutable-files: count of how many CHK files are in the set count-mutable-files: same, for mutable files (does not include directories) count-literal-files: same, for LIT files (data contained inside the URI) diff --git a/src/allmydata/dirnode.py b/src/allmydata/dirnode.py index c9366941..a4bebcbd 100644 --- a/src/allmydata/dirnode.py +++ b/src/allmydata/dirnode.py @@ -11,6 +11,7 @@ from allmydata.interfaces import IMutableFileNode, IDirectoryNode,\ ExistingChildError, ICheckable, IDeepCheckable from allmydata.checker_results import DeepCheckResults, \ DeepCheckAndRepairResults +from allmydata.monitor import Monitor from allmydata.util import hashutil, mathutil, base32, log from allmydata.util.hashutil import netstring from allmydata.util.limiter import ConcurrencyLimiter @@ -471,14 +472,19 @@ class NewDirectoryNode: # requires a Deferred. We use a ConcurrencyLimiter to make sure the # fan-out doesn't cause problems. 
+ monitor = Monitor() + walker.set_monitor(monitor) + found = set([self.get_verifier()]) limiter = ConcurrencyLimiter(10) d = self._deep_traverse_dirnode(self, [], walker, found, limiter) d.addCallback(lambda ignored: walker.finish()) - return d + d.addBoth(monitor.finish) + return monitor def _deep_traverse_dirnode(self, node, path, walker, found, limiter): # process this directory, then walk its children + # TODO: check monitor.is_cancelled() d = limiter.add(walker.add_node, node, path) d.addCallback(lambda ignored: limiter.add(node.list)) d.addCallback(self._deep_traverse_dirnode_children, node, path, @@ -503,25 +509,32 @@ class NewDirectoryNode: def build_manifest(self): - """Return a list of (path, cap) tuples, for all nodes (directories - and files) reachable from this one.""" - return self.deep_traverse(ManifestWalker()) + """Return a Monitor, with a ['status'] that will be a list of (path, + cap) tuples, for all nodes (directories and files) reachable from + this one.""" + walker = ManifestWalker(self) + return self.deep_traverse(walker) - def deep_stats(self): + def start_deep_stats(self): # Since deep_traverse tracks verifier caps, we avoid double-counting # children for which we've got both a write-cap and a read-cap - return self.deep_traverse(DeepStats()) + return self.deep_traverse(DeepStats(self)) - def deep_check(self, verify=False): + def start_deep_check(self, verify=False): return self.deep_traverse(DeepChecker(self, verify, repair=False)) - def deep_check_and_repair(self, verify=False): + def start_deep_check_and_repair(self, verify=False): return self.deep_traverse(DeepChecker(self, verify, repair=True)) class ManifestWalker: - def __init__(self): + def __init__(self, origin): self.manifest = [] + self.origin = origin + def set_monitor(self, monitor): + self.monitor = monitor + monitor.origin_si = self.origin.get_storage_index() + monitor.set_status(self.manifest) def add_node(self, node, path): self.manifest.append( (tuple(path), 
node.get_uri()) ) def enter_directory(self, parent, children): @@ -531,7 +544,8 @@ class ManifestWalker: class DeepStats: - def __init__(self): + def __init__(self, origin): + self.origin = origin self.stats = {} for k in ["count-immutable-files", "count-mutable-files", @@ -554,6 +568,11 @@ class DeepStats: self.buckets = [ (0,0), (1,3)] self.root = math.sqrt(10) + def set_monitor(self, monitor): + self.monitor = monitor + monitor.origin_si = self.origin.get_storage_index() + monitor.set_status(self.stats) + def add_node(self, node, childpath): if IDirectoryNode.providedBy(node): self.add("count-directories") @@ -636,7 +655,11 @@ class DeepChecker: self._results = DeepCheckAndRepairResults(root_si) else: self._results = DeepCheckResults(root_si) - self._stats = DeepStats() + self._stats = DeepStats(root) + + def set_monitor(self, monitor): + self.monitor = monitor + monitor.set_status(self._results) def add_node(self, node, childpath): if self._repair: diff --git a/src/allmydata/interfaces.py b/src/allmydata/interfaces.py index 86efd1f6..92edabc6 100644 --- a/src/allmydata/interfaces.py +++ b/src/allmydata/interfaces.py @@ -796,15 +796,18 @@ class IDirectoryNode(IMutableFilesystemNode): operation finishes. The child name must be a unicode string.""" def build_manifest(): - """Return a Deferred that fires with a list of (path, cap) tuples for - nodes (directories and files) reachable from this one. 'path' will be - a tuple of unicode strings. The origin dirnode will be represented by - an empty path tuple.""" + """Return a Monitor. The Monitor's results will be a list of (path, + cap) tuples for nodes (directories and files) reachable from this + one. 'path' will be a tuple of unicode strings. The origin dirnode + will be represented by an empty path tuple. The Monitor will also + have an .origin_si attribute with the (binary) storage index of the + starting point. 
+ """ - def deep_stats(): - """Return a Deferred that fires with a dictionary of statistics - computed by examining all nodes (directories and files) reachable - from this one, with the following keys:: + def start_deep_stats(): + """Return a Monitor, examining all nodes (directories and files) + reachable from this one. The Monitor's results will be a dictionary + with the following keys:: count-immutable-files: count of how many CHK files are in the set count-mutable-files: same, for mutable files (does not include @@ -828,6 +831,9 @@ class IDirectoryNode(IMutableFilesystemNode): size-mutable-files is not yet implemented, because it would involve even more queries than deep_stats does. + The Monitor will also have an .origin_si attribute with the (binary) + storage index of the starting point. + This operation will visit every directory node underneath this one, and can take a long time to run. On a typical workstation with good bandwidth, this can examine roughly 15 directories per second (and @@ -1494,23 +1500,24 @@ class ICheckable(Interface): ICheckAndRepairResults.""" class IDeepCheckable(Interface): - def deep_check(verify=False): + def start_deep_check(verify=False): """Check upon the health of me and everything I can reach. This is a recursive form of check(), useable only on dirnodes. - I return a Deferred that fires with an IDeepCheckResults object. + I return a Monitor, with results that are an IDeepCheckResults + object. """ - def deep_check_and_repair(verify=False): + def start_deep_check_and_repair(verify=False): """Check upon the health of me and everything I can reach. Repair anything that isn't healthy. This is a recursive form of check_and_repair(), useable only on dirnodes. - I return a Deferred that fires with an IDeepCheckAndRepairResults - object. + I return a Monitor, with results that are an + IDeepCheckAndRepairResults object. 
""" class ICheckerResults(Interface): diff --git a/src/allmydata/monitor.py b/src/allmydata/monitor.py new file mode 100644 index 00000000..dad89b85 --- /dev/null +++ b/src/allmydata/monitor.py @@ -0,0 +1,120 @@ + +from zope.interface import Interface, implements +from allmydata.util import observer + +class IMonitor(Interface): + """I manage status, progress, and cancellation for long-running operations. + + Whoever initiates the operation should create a Monitor instance and pass + it into the code that implements the operation. That code should + periodically check in with the Monitor, perhaps after each major unit of + work has been completed, for two purposes. + + The first is to inform the Monitor about progress that has been made, so + that external observers can be reassured that the operation is proceeding + normally. If the operation has a well-known amount of work to perform, + this notification should reflect that, so that an ETA or 'percentage + complete' value can be derived. + + The second purpose is to check to see if the operation has been + cancelled. The impatient observer who no longer wants the operation to + continue will inform the Monitor; the next time the operation code checks + in, it should notice that the operation has been cancelled, and wrap + things up. The same monitor can be passed to multiple operations, all of + which may check for cancellation: this pattern may be simpler than having + the original caller keep track of subtasks and cancel them individually. + """ + + # the following methods are provided for the operation code + + def is_cancelled(self): + """Returns True if the operation has been cancelled. If True, + operation code should stop creating new work, and attempt to stop any + work already in progress.""" + + def set_status(self, status): + """Sets the Monitor's 'status' object to an arbitrary value. + Different operations will store different sorts of status information + here. 
Operation code should use get+modify+set sequences to update + this.""" + + def get_status(self): + """Return the status object.""" + + def finish(self, status): + """Call this when the operation is done, successful or not. The + Monitor's lifetime is influenced by the completion of the operation + it is monitoring. The Monitor's 'status' value will be set with the + 'status' argument, just as if it had been passed to set_status(). + This value will be used to fire the Deferreds that are returned by + when_done(). + + Operations that fire a Deferred when they finish should trigger this + with d.addBoth(monitor.finish)""" + + # the following methods are provided for the initiator of the operation + + def is_finished(self): + """Return a boolean, True if the operation is done (whether + successful or failed), False if it is still running.""" + + def when_done(self): + """Return a Deferred that fires when the operation is complete. It + will fire with the operation status, the same value as returned by + get_status().""" + + def cancel(self): + """Cancel the operation as soon as possible. 
is_cancelled() will + start returning True after this is called.""" + + # get_status() is useful too, but it is operation-specific + +class Monitor: + implements(IMonitor) + + def __init__(self): + self.cancelled = False + self.finished = False + self.status = None + self.observer = observer.OneShotObserverList() + + def is_cancelled(self): + return self.cancelled + + def is_finished(self): + return self.finished + + def when_done(self): + return self.observer.when_fired() + + def cancel(self): + self.cancelled = True + + def finish(self, status_or_failure): + self.set_status(status_or_failure) + self.finished = True + self.observer.fire(status_or_failure) + return status_or_failure + + def get_status(self): + return self.status + def set_status(self, status): + self.status = status + +class MonitorTable: + def __init__(self): + self.handles = {} # maps ophandle (an arbitrary string) to a Monitor + # TODO: all timeouts, handle lifetime, retain-for=, etc, goes here. + # self.handles should probably be a WeakValueDictionary, and we need + # a table of timers, and operations which have finished should be + # handled slightly differently. 
+ + def get_monitor(self, handle): + return self.handles.get(handle) + + def add_monitor(self, handle, monitor): + self.handles[handle] = monitor + + def delete_monitor(self, handle): + if handle in self.handles: + del self.handles[handle] diff --git a/src/allmydata/test/common.py b/src/allmydata/test/common.py index da9bb766..40a51c08 100644 --- a/src/allmydata/test/common.py +++ b/src/allmydata/test/common.py @@ -4,6 +4,7 @@ from zope.interface import implements from twisted.internet import defer from twisted.python import failure from twisted.application import service +from twisted.web.error import Error as WebError from foolscap import Tub from foolscap.eventual import flushEventualQueue, fireEventually from allmydata import uri, dirnode, client @@ -890,3 +891,14 @@ class ShouldFailMixin: (which, expected_failure, res)) d.addBoth(done) return d + +class WebErrorMixin: + def explain_web_error(self, f): + # an error on the server side causes the client-side getPage() to + # return a failure(t.web.error.Error), and its str() doesn't show the + # response body, which is where the useful information lives. Attach + # this method as an errback handler, and it will reveal the hidden + # message. 
+ f.trap(WebError) + print "Web Error:", f.value, ":", f.value.response + return f diff --git a/src/allmydata/test/test_dirnode.py b/src/allmydata/test/test_dirnode.py index e90c5859..242a47ea 100644 --- a/src/allmydata/test/test_dirnode.py +++ b/src/allmydata/test/test_dirnode.py @@ -157,7 +157,7 @@ class Dirnode(unittest.TestCase, testutil.ShouldFailMixin, testutil.StallMixin): def test_deepcheck(self): d = self._test_deepcheck_create() - d.addCallback(lambda rootnode: rootnode.deep_check()) + d.addCallback(lambda rootnode: rootnode.start_deep_check().when_done()) def _check_results(r): self.failUnless(IDeepCheckResults.providedBy(r)) c = r.get_counters() @@ -174,7 +174,8 @@ class Dirnode(unittest.TestCase, testutil.ShouldFailMixin, testutil.StallMixin): def test_deepcheck_and_repair(self): d = self._test_deepcheck_create() - d.addCallback(lambda rootnode: rootnode.deep_check_and_repair()) + d.addCallback(lambda rootnode: + rootnode.start_deep_check_and_repair().when_done()) def _check_results(r): self.failUnless(IDeepCheckAndRepairResults.providedBy(r)) c = r.get_counters() @@ -204,7 +205,7 @@ class Dirnode(unittest.TestCase, testutil.ShouldFailMixin, testutil.StallMixin): def test_deepcheck_problems(self): d = self._test_deepcheck_create() d.addCallback(lambda rootnode: self._mark_file_bad(rootnode)) - d.addCallback(lambda rootnode: rootnode.deep_check()) + d.addCallback(lambda rootnode: rootnode.start_deep_check().when_done()) def _check_results(r): c = r.get_counters() self.failUnlessEqual(c, @@ -326,13 +327,13 @@ class Dirnode(unittest.TestCase, testutil.ShouldFailMixin, testutil.StallMixin): self.failUnlessEqual(sorted(children.keys()), sorted([u"child", u"subdir"]))) - d.addCallback(lambda res: n.build_manifest()) + d.addCallback(lambda res: n.build_manifest().when_done()) def _check_manifest(manifest): self.failUnlessEqual(sorted(manifest), sorted(self.expected_manifest)) d.addCallback(_check_manifest) - d.addCallback(lambda res: n.deep_stats()) + 
d.addCallback(lambda res: n.start_deep_stats().when_done()) def _check_deepstats(stats): self.failUnless(isinstance(stats, dict)) expected = {"count-immutable-files": 0, @@ -689,7 +690,7 @@ class Dirnode(unittest.TestCase, testutil.ShouldFailMixin, testutil.StallMixin): class DeepStats(unittest.TestCase): def test_stats(self): - ds = dirnode.DeepStats() + ds = dirnode.DeepStats(None) ds.add("count-files") ds.add("size-immutable-files", 123) ds.histogram("size-files-histogram", 123) @@ -714,7 +715,7 @@ class DeepStats(unittest.TestCase): self.failUnlessEqual(s["size-files-histogram"], [ (101, 316, 1), (317, 1000, 1) ]) - ds = dirnode.DeepStats() + ds = dirnode.DeepStats(None) for i in range(1, 1100): ds.histogram("size-files-histogram", i) ds.histogram("size-files-histogram", 4*1000*1000*1000*1000) # 4TB diff --git a/src/allmydata/test/test_system.py b/src/allmydata/test/test_system.py index 19c90573..9915c7cd 100644 --- a/src/allmydata/test/test_system.py +++ b/src/allmydata/test/test_system.py @@ -23,7 +23,7 @@ from twisted.python.failure import Failure from twisted.web.client import getPage from twisted.web.error import Error -from allmydata.test.common import SystemTestMixin +from allmydata.test.common import SystemTestMixin, WebErrorMixin LARGE_DATA = """ This is some data to publish to the virtual drive, which needs to be large @@ -644,7 +644,7 @@ class SystemTest(SystemTestMixin, unittest.TestCase): d1.addCallback(lambda res: dnode.set_node(u"see recursive", dnode)) d1.addCallback(lambda res: dnode.has_child(u"see recursive")) d1.addCallback(lambda answer: self.failUnlessEqual(answer, True)) - d1.addCallback(lambda res: dnode.build_manifest()) + d1.addCallback(lambda res: dnode.build_manifest().when_done()) d1.addCallback(lambda manifest: self.failUnlessEqual(len(manifest), 1)) return d1 @@ -926,7 +926,7 @@ class SystemTest(SystemTestMixin, unittest.TestCase): d1.addCallback(lambda res: home.move_child_to(u"sekrit data", personal)) - d1.addCallback(lambda 
res: home.build_manifest()) + d1.addCallback(lambda res: home.build_manifest().when_done()) d1.addCallback(self.log, "manifest") # five items: # P/ @@ -936,7 +936,7 @@ class SystemTest(SystemTestMixin, unittest.TestCase): # P/s2-rw/mydata992 (same as P/s2-rw/mydata992) d1.addCallback(lambda manifest: self.failUnlessEqual(len(manifest), 5)) - d1.addCallback(lambda res: home.deep_stats()) + d1.addCallback(lambda res: home.start_deep_stats().when_done()) def _check_stats(stats): expected = {"count-immutable-files": 1, "count-mutable-files": 0, @@ -1721,7 +1721,7 @@ class SystemTest(SystemTestMixin, unittest.TestCase): return d -class MutableChecker(SystemTestMixin, unittest.TestCase): +class MutableChecker(SystemTestMixin, unittest.TestCase, WebErrorMixin): def _run_cli(self, argv): stdout, stderr = StringIO(), StringIO() @@ -1751,6 +1751,7 @@ class MutableChecker(SystemTestMixin, unittest.TestCase): self.failIf("Unhealthy" in out, out) self.failIf("Corrupt Shares" in out, out) d.addCallback(_got_results) + d.addErrback(self.explain_web_error) return d def test_corrupt(self): @@ -1800,6 +1801,7 @@ class MutableChecker(SystemTestMixin, unittest.TestCase): self.failIf("Not Healthy!" in out, out) self.failUnless("Recoverable Versions: 10*seq" in out, out) d.addCallback(_got_postrepair_results) + d.addErrback(self.explain_web_error) return d @@ -1850,10 +1852,11 @@ class MutableChecker(SystemTestMixin, unittest.TestCase): self.failIf("Not Healthy!" in out, out) self.failUnless("Recoverable Versions: 10*seq" in out) d.addCallback(_got_postrepair_results) + d.addErrback(self.explain_web_error) return d -class DeepCheckWeb(SystemTestMixin, unittest.TestCase): +class DeepCheckWeb(SystemTestMixin, unittest.TestCase, WebErrorMixin): # construct a small directory tree (with one dir, one immutable file, one # mutable file, one LIT file, and a loop), and then check/examine it in # various ways. 
@@ -1954,11 +1957,12 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): d.addCallback(self.do_stats) d.addCallback(self.do_test_good) d.addCallback(self.do_test_web) + d.addErrback(self.explain_web_error) return d def do_stats(self, ignored): d = defer.succeed(None) - d.addCallback(lambda ign: self.root.deep_stats()) + d.addCallback(lambda ign: self.root.start_deep_stats().when_done()) d.addCallback(self.check_stats) return d @@ -1973,10 +1977,11 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): # s["size-directories"] self.failUnlessEqual(s["largest-directory-children"], 4) self.failUnlessEqual(s["largest-immutable-file"], 13000) - # to re-use this function for both the local dirnode.deep_stats() and - # the webapi t=deep-stats, we coerce the result into a list of - # tuples. dirnode.deep_stats() returns a list of tuples, but JSON - # only knows about lists., so t=deep-stats returns a list of lists. + # to re-use this function for both the local + # dirnode.start_deep_stats() and the webapi t=start-deep-stats, we + # coerce the result into a list of tuples. dirnode.start_deep_stats() + # returns a list of tuples, but JSON only knows about lists., so + # t=start-deep-stats returns a list of lists. 
histogram = [tuple(stuff) for stuff in s["size-files-histogram"]] self.failUnlessEqual(histogram, [(11, 31, 1), (10001, 31622, 1), @@ -2030,13 +2035,17 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): # now deep-check the root, with various verify= and repair= options - d.addCallback(lambda ign: self.root.deep_check()) + d.addCallback(lambda ign: + self.root.start_deep_check().when_done()) d.addCallback(self.deep_check_is_healthy, 3, "root") - d.addCallback(lambda ign: self.root.deep_check(verify=True)) + d.addCallback(lambda ign: + self.root.start_deep_check(verify=True).when_done()) d.addCallback(self.deep_check_is_healthy, 3, "root") - d.addCallback(lambda ign: self.root.deep_check_and_repair()) + d.addCallback(lambda ign: + self.root.start_deep_check_and_repair().when_done()) d.addCallback(self.deep_check_and_repair_is_healthy, 3, "root") - d.addCallback(lambda ign: self.root.deep_check_and_repair(verify=True)) + d.addCallback(lambda ign: + self.root.start_deep_check_and_repair(verify=True).when_done()) d.addCallback(self.deep_check_and_repair_is_healthy, 3, "root") return d @@ -2062,6 +2071,47 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): d.addCallback(lambda data: (data,url)) return d + def wait_for_operation(self, ignored, ophandle): + url = self.webish_url + "operations/" + ophandle + url += "?t=status&output=JSON" + d = getPage(url) + def _got(res): + try: + data = simplejson.loads(res) + except ValueError: + self.fail("%s: not JSON: '%s'" % (url, res)) + if not data["finished"]: + d = self.stall(delay=1.0) + d.addCallback(self.wait_for_operation, ophandle) + return d + return data + d.addCallback(_got) + return d + + def get_operation_results(self, ignored, ophandle, output=None): + url = self.webish_url + "operations/" + ophandle + url += "?t=status" + if output: + url += "&output=" + output + d = getPage(url) + def _got(res): + if output and output.lower() == "json": + try: + return simplejson.loads(res) + except ValueError: + 
self.fail("%s: not JSON: '%s'" % (url, res)) + return res + d.addCallback(_got) + return d + + def slow_web(self, n, output=None, **kwargs): + # use ophandle= + handle = base32.b2a(os.urandom(4)) + d = self.web(n, "POST", ophandle=handle, **kwargs) + d.addCallback(self.wait_for_operation, handle) + d.addCallback(self.get_operation_results, handle, output=output) + return d + def json_check_is_healthy(self, data, n, where, incomplete=False): self.failUnlessEqual(data["storage-index"], @@ -2146,8 +2196,9 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): d = defer.succeed(None) # stats - d.addCallback(lambda ign: self.web(self.root, t="deep-stats")) - d.addCallback(self.decode_json) + d.addCallback(lambda ign: + self.slow_web(self.root, + t="start-deep-stats", output="json")) d.addCallback(self.json_check_stats, "deep-stats") # check, no verify @@ -2204,16 +2255,18 @@ class DeepCheckWeb(SystemTestMixin, unittest.TestCase): # now run a deep-check, with various verify= and repair= flags d.addCallback(lambda ign: - self.web_json(self.root, t="deep-check")) + self.slow_web(self.root, t="start-deep-check", output="json")) d.addCallback(self.json_full_deepcheck_is_healthy, self.root, "root") d.addCallback(lambda ign: - self.web_json(self.root, t="deep-check", verify="true")) + self.slow_web(self.root, t="start-deep-check", verify="true", + output="json")) d.addCallback(self.json_full_deepcheck_is_healthy, self.root, "root") d.addCallback(lambda ign: - self.web_json(self.root, t="deep-check", repair="true")) + self.slow_web(self.root, t="start-deep-check", repair="true", + output="json")) d.addCallback(self.json_full_deepcheck_and_repair_is_healthy, self.root, "root") d.addCallback(lambda ign: - self.web_json(self.root, t="deep-check", verify="true", repair="true")) + self.slow_web(self.root, t="start-deep-check", verify="true", repair="true", output="json")) d.addCallback(self.json_full_deepcheck_and_repair_is_healthy, self.root, "root") # now look at t=info diff 
--git a/src/allmydata/test/test_web.py b/src/allmydata/test/test_web.py index 89599764..695d9bfc 100644 --- a/src/allmydata/test/test_web.py +++ b/src/allmydata/test/test_web.py @@ -8,7 +8,7 @@ from twisted.python import failure, log from allmydata import interfaces, provisioning, uri, webish from allmydata.immutable import upload, download from allmydata.web import status, common -from allmydata.util import fileutil +from allmydata.util import fileutil, testutil from allmydata.test.common import FakeDirectoryNode, FakeCHKFileNode, \ FakeMutableFileNode, create_chk_filenode from allmydata.interfaces import IURI, INewDirectoryURI, \ @@ -362,7 +362,7 @@ class WebMixin(object): return d -class Web(WebMixin, unittest.TestCase): +class Web(WebMixin, testutil.StallMixin, unittest.TestCase): def test_create(self): pass @@ -830,28 +830,40 @@ class Web(WebMixin, unittest.TestCase): d.addCallback(self.failUnlessIsFooJSON) return d - def test_GET_DIRURL_manifest(self): - def getman(ignored, suffix, followRedirect=False): - return self.GET(self.public_url + "/foo" + suffix, - followRedirect=followRedirect) + + def test_POST_DIRURL_manifest_no_ophandle(self): + d = self.shouldFail2(error.Error, + "test_POST_DIRURL_manifest_no_ophandle", + "400 Bad Request", + "slow operation requires ophandle=", + self.POST, self.public_url, t="start-manifest") + return d + + def test_POST_DIRURL_manifest(self): d = defer.succeed(None) - d.addCallback(getman, "?t=manifest", followRedirect=True) + def getman(ignored, output): + d = self.POST(self.public_url + "/foo/?t=start-manifest&ophandle=125", + followRedirect=True) + d.addCallback(self.wait_for_operation, "125") + d.addCallback(self.get_operation_results, "125", output) + return d + d.addCallback(getman, None) def _got_html(manifest): self.failUnless("Manifest of SI=" in manifest) self.failUnless("sub" in manifest) self.failUnless(self._sub_uri in manifest) self.failUnless("sub/baz.txt" in manifest) d.addCallback(_got_html) - 
d.addCallback(getman, "/?t=manifest") + d.addCallback(getman, "html") d.addCallback(_got_html) - d.addCallback(getman, "/?t=manifest&output=text") + d.addCallback(getman, "text") def _got_text(manifest): self.failUnless("\nsub " + self._sub_uri + "\n" in manifest) self.failUnless("\nsub/baz.txt URI:CHK:" in manifest) d.addCallback(_got_text) - d.addCallback(getman, "/?t=manifest&output=JSON") + d.addCallback(getman, "JSON") def _got_json(manifest): - data = simplejson.loads(manifest) + data = manifest["manifest"] got = {} for (path_list, cap) in data: got[tuple(path_list)] = cap @@ -860,20 +872,48 @@ class Web(WebMixin, unittest.TestCase): d.addCallback(_got_json) return d - def test_GET_DIRURL_deepsize(self): - d = self.GET(self.public_url + "/foo?t=deep-size", followRedirect=True) - def _got(res): - self.failUnless(re.search(r'^\d+$', res), res) - size = int(res) + def test_POST_DIRURL_deepsize_no_ophandle(self): + d = self.shouldFail2(error.Error, + "test_POST_DIRURL_deepsize_no_ophandle", + "400 Bad Request", + "slow operation requires ophandle=", + self.POST, self.public_url, t="start-deep-size") + return d + + def test_POST_DIRURL_deepsize(self): + d = self.POST(self.public_url + "/foo/?t=start-deep-size&ophandle=126", + followRedirect=True) + d.addCallback(self.wait_for_operation, "126") + d.addCallback(self.get_operation_results, "126", "json") + def _got_json(data): + self.failUnlessEqual(data["finished"], True) + size = data["size"] + self.failUnless(size > 1000) + d.addCallback(_got_json) + d.addCallback(self.get_operation_results, "126", "text") + def _got_text(res): + mo = re.search(r'^size: (\d+)$', res, re.M) + self.failUnless(mo, res) + size = int(mo.group(1)) # with directories, the size varies. 
self.failUnless(size > 1000) - d.addCallback(_got) + d.addCallback(_got_text) + return d + + def test_POST_DIRURL_deepstats_no_ophandle(self): + d = self.shouldFail2(error.Error, + "test_POST_DIRURL_deepstats_no_ophandle", + "400 Bad Request", + "slow operation requires ophandle=", + self.POST, self.public_url, t="start-deep-stats") return d - def test_GET_DIRURL_deepstats(self): - d = self.GET(self.public_url + "/foo?t=deep-stats", followRedirect=True) - def _got(stats_json): - stats = simplejson.loads(stats_json) + def test_POST_DIRURL_deepstats(self): + d = self.POST(self.public_url + "/foo/?t=start-deep-stats&ophandle=127", + followRedirect=True) + d.addCallback(self.wait_for_operation, "127") + d.addCallback(self.get_operation_results, "127", "json") + def _got_json(stats): expected = {"count-immutable-files": 3, "count-mutable-files": 0, "count-literal-files": 0, @@ -892,7 +932,7 @@ class Web(WebMixin, unittest.TestCase): (k, stats[k], v)) self.failUnlessEqual(stats["size-files-histogram"], [ [11, 31, 3] ]) - d.addCallback(_got) + d.addCallback(_got_json) return d def test_GET_DIRURL_uri(self): @@ -1521,34 +1561,80 @@ class Web(WebMixin, unittest.TestCase): d.addCallback(_check3) return d + def wait_for_operation(self, ignored, ophandle): + url = "/operations/" + ophandle + url += "?t=status&output=JSON" + d = self.GET(url) + def _got(res): + data = simplejson.loads(res) + if not data["finished"]: + d = self.stall(delay=1.0) + d.addCallback(self.wait_for_operation, ophandle) + return d + return data + d.addCallback(_got) + return d + + def get_operation_results(self, ignored, ophandle, output=None): + url = "/operations/" + ophandle + url += "?t=status" + if output: + url += "&output=" + output + d = self.GET(url) + def _got(res): + if output and output.lower() == "json": + return simplejson.loads(res) + return res + d.addCallback(_got) + return d + + def test_POST_DIRURL_deepcheck_no_ophandle(self): + d = self.shouldFail2(error.Error, + 
"test_POST_DIRURL_deepcheck_no_ophandle", + "400 Bad Request", + "slow operation requires ophandle=", + self.POST, self.public_url, t="start-deep-check") + return d + def test_POST_DIRURL_deepcheck(self): - d = self.POST(self.public_url, t="deep-check") - def _check(res): + def _check_redirect(statuscode, target): + self.failUnlessEqual(statuscode, str(http.FOUND)) + self.failUnless(target.endswith("/operations/123?t=status")) + d = self.shouldRedirect2("test_POST_DIRURL_deepcheck", _check_redirect, + self.POST, self.public_url, + t="start-deep-check", ophandle="123") + d.addCallback(self.wait_for_operation, "123") + def _check_json(data): + self.failUnlessEqual(data["finished"], True) + self.failUnlessEqual(data["count-objects-checked"], 8) + self.failUnlessEqual(data["count-objects-healthy"], 8) + d.addCallback(_check_json) + d.addCallback(self.get_operation_results, "123", "html") + def _check_html(res): self.failUnless("Objects Checked: 8" in res) self.failUnless("Objects Healthy: 8" in res) - d.addCallback(_check) - redir_url = "http://allmydata.org/TARGET" - def _check2(statuscode, target): - self.failUnlessEqual(statuscode, str(http.FOUND)) - self.failUnlessEqual(target, redir_url) - d.addCallback(lambda res: - self.shouldRedirect2("test_POST_DIRURL_check", - _check2, - self.POST, self.public_url, - t="deep-check", - when_done=redir_url)) - d.addCallback(lambda res: - self.POST(self.public_url, t="deep-check", - return_to=redir_url)) - def _check3(res): - self.failUnless("Return to parent directory" in res) - self.failUnless(redir_url in res) - d.addCallback(_check3) + d.addCallback(_check_html) return d def test_POST_DIRURL_deepcheck_and_repair(self): - d = self.POST(self.public_url, t="deep-check", repair="true") - def _check(res): + d = self.POST(self.public_url, t="start-deep-check", repair="true", + ophandle="124", output="json", followRedirect=True) + d.addCallback(self.wait_for_operation, "124") + def _check_json(data): + 
self.failUnlessEqual(data["finished"], True) + self.failUnlessEqual(data["count-objects-checked"], 8) + self.failUnlessEqual(data["count-objects-healthy-pre-repair"], 8) + self.failUnlessEqual(data["count-objects-unhealthy-pre-repair"], 0) + self.failUnlessEqual(data["count-corrupt-shares-pre-repair"], 0) + self.failUnlessEqual(data["count-repairs-attempted"], 0) + self.failUnlessEqual(data["count-repairs-successful"], 0) + self.failUnlessEqual(data["count-repairs-unsuccessful"], 0) + self.failUnlessEqual(data["count-objects-healthy-post-repair"], 8) + self.failUnlessEqual(data["count-objects-unhealthy-post-repair"], 0) + self.failUnlessEqual(data["count-corrupt-shares-post-repair"], 0) + d.addCallback(_check_json) + d.addCallback(self.get_operation_results, "124", "html") + def _check_html(res): self.failUnless("Objects Checked: 8" in res) self.failUnless("Objects Healthy (before repair): 8" in res) @@ -1562,24 +1648,7 @@ class Web(WebMixin, unittest.TestCase): self.failUnless("Objects Healthy (after repair): 8" in res) self.failUnless("Objects Unhealthy (after repair): 0" in res) self.failUnless("Corrupt Shares (after repair): 0" in res) - d.addCallback(_check) - redir_url = "http://allmydata.org/TARGET" - def _check2(statuscode, target): - self.failUnlessEqual(statuscode, str(http.FOUND)) - self.failUnlessEqual(target, redir_url) - d.addCallback(lambda res: - self.shouldRedirect2("test_POST_DIRURL_check", - _check2, - self.POST, self.public_url, - t="deep-check", - when_done=redir_url)) - d.addCallback(lambda res: - self.POST(self.public_url, t="deep-check", - return_to=redir_url)) - def _check3(res): - self.failUnless("Return to parent directory" in res) - self.failUnless(redir_url in res) - d.addCallback(_check3) + d.addCallback(_check_html) return d def test_POST_FILEURL_bad_t(self): diff --git a/src/allmydata/web/checker_results.py b/src/allmydata/web/checker_results.py index e82e63b8..5f2eb65c 100644 --- a/src/allmydata/web/checker_results.py +++ 
b/src/allmydata/web/checker_results.py @@ -4,8 +4,8 @@ import simplejson from nevow import rend, inevow, tags as T from twisted.web import html from allmydata.web.common import getxmlfile, get_arg, IClient -from allmydata.interfaces import ICheckAndRepairResults, ICheckerResults, \ - IDeepCheckResults, IDeepCheckAndRepairResults +from allmydata.web.operations import ReloadMixin +from allmydata.interfaces import ICheckAndRepairResults, ICheckerResults from allmydata.util import base32, idlib class ResultsBase: @@ -169,12 +169,13 @@ class CheckAndRepairResults(rend.Page, ResultsBase): return T.div[T.a(href=return_to)["Return to parent directory"]] return "" -class DeepCheckResults(rend.Page, ResultsBase): +class DeepCheckResults(rend.Page, ResultsBase, ReloadMixin): docFactory = getxmlfile("deep-check-results.xhtml") - def __init__(self, results): - assert IDeepCheckResults(results) - self.r = results + def __init__(self, monitor): + #assert IDeepCheckResults(results) + #self.r = results + self.monitor = monitor def renderHTTP(self, ctx): if self.want_json(ctx): @@ -184,8 +185,10 @@ class DeepCheckResults(rend.Page, ResultsBase): def json(self, ctx): inevow.IRequest(ctx).setHeader("content-type", "text/plain") data = {} - data["root-storage-index"] = self.r.get_root_storage_index_string() - c = self.r.get_counters() + data["finished"] = self.monitor.is_finished() + res = self.monitor.get_status() + data["root-storage-index"] = res.get_root_storage_index_string() + c = res.get_counters() data["count-objects-checked"] = c["count-objects-checked"] data["count-objects-healthy"] = c["count-objects-healthy"] data["count-objects-unhealthy"] = c["count-objects-unhealthy"] @@ -194,35 +197,35 @@ class DeepCheckResults(rend.Page, ResultsBase): base32.b2a(storage_index), shnum) for (serverid, storage_index, shnum) - in self.r.get_corrupt_shares() ] + in res.get_corrupt_shares() ] data["list-unhealthy-files"] = [ (path_t, self._json_check_results(r)) for (path_t, r) - in 
self.r.get_all_results().items() + in res.get_all_results().items() if not r.is_healthy() ] - data["stats"] = self.r.get_stats() + data["stats"] = res.get_stats() return simplejson.dumps(data, indent=1) + "\n" def render_root_storage_index(self, ctx, data): - return self.r.get_root_storage_index_string() + return self.monitor.get_status().get_root_storage_index_string() def data_objects_checked(self, ctx, data): - return self.r.get_counters()["count-objects-checked"] + return self.monitor.get_status().get_counters()["count-objects-checked"] def data_objects_healthy(self, ctx, data): - return self.r.get_counters()["count-objects-healthy"] + return self.monitor.get_status().get_counters()["count-objects-healthy"] def data_objects_unhealthy(self, ctx, data): - return self.r.get_counters()["count-objects-unhealthy"] + return self.monitor.get_status().get_counters()["count-objects-unhealthy"] def data_count_corrupt_shares(self, ctx, data): - return self.r.get_counters()["count-corrupt-shares"] + return self.monitor.get_status().get_counters()["count-corrupt-shares"] def render_problems_p(self, ctx, data): - c = self.r.get_counters() + c = self.monitor.get_status().get_counters() if c["count-objects-unhealthy"]: return ctx.tag return "" def data_problems(self, ctx, data): - all_objects = self.r.get_all_results() + all_objects = self.monitor.get_status().get_all_results() for path in sorted(all_objects.keys()): cr = all_objects[path] assert ICheckerResults.providedBy(cr) @@ -240,14 +243,14 @@ class DeepCheckResults(rend.Page, ResultsBase): def render_servers_with_corrupt_shares_p(self, ctx, data): - if self.r.get_counters()["count-corrupt-shares"]: + if self.monitor.get_status().get_counters()["count-corrupt-shares"]: return ctx.tag return "" def data_servers_with_corrupt_shares(self, ctx, data): servers = [serverid for (serverid, storage_index, sharenum) - in self.r.get_corrupt_shares()] + in self.monitor.get_status().get_corrupt_shares()] servers.sort() return servers 
@@ -262,11 +265,11 @@ class DeepCheckResults(rend.Page, ResultsBase): def render_corrupt_shares_p(self, ctx, data): - if self.r.get_counters()["count-corrupt-shares"]: + if self.monitor.get_status().get_counters()["count-corrupt-shares"]: return ctx.tag return "" def data_corrupt_shares(self, ctx, data): - return self.r.get_corrupt_shares() + return self.monitor.get_status().get_corrupt_shares() def render_share_problem(self, ctx, data): serverid, storage_index, sharenum = data nickname = IClient(ctx).get_nickname_for_peerid(serverid) @@ -285,7 +288,7 @@ class DeepCheckResults(rend.Page, ResultsBase): return "" def data_all_objects(self, ctx, data): - r = self.r.get_all_results() + r = self.monitor.get_status().get_all_results() for path in sorted(r.keys()): yield (path, r[path]) @@ -301,12 +304,13 @@ class DeepCheckResults(rend.Page, ResultsBase): runtime = time.time() - req.processing_started_timestamp return ctx.tag["runtime: %s seconds" % runtime] -class DeepCheckAndRepairResults(rend.Page, ResultsBase): +class DeepCheckAndRepairResults(rend.Page, ResultsBase, ReloadMixin): docFactory = getxmlfile("deep-check-and-repair-results.xhtml") - def __init__(self, results): - assert IDeepCheckAndRepairResults(results) - self.r = results + def __init__(self, monitor): + #assert IDeepCheckAndRepairResults(results) + #self.r = results + self.monitor = monitor def renderHTTP(self, ctx): if self.want_json(ctx): @@ -315,9 +319,11 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): def json(self, ctx): inevow.IRequest(ctx).setHeader("content-type", "text/plain") + res = self.monitor.get_status() data = {} - data["root-storage-index"] = self.r.get_root_storage_index_string() - c = self.r.get_counters() + data["finished"] = self.monitor.is_finished() + data["root-storage-index"] = res.get_root_storage_index_string() + c = res.get_counters() data["count-objects-checked"] = c["count-objects-checked"] data["count-objects-healthy-pre-repair"] = 
c["count-objects-healthy-pre-repair"] @@ -336,55 +342,55 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): base32.b2a(storage_index), shnum) for (serverid, storage_index, shnum) - in self.r.get_corrupt_shares() ] + in res.get_corrupt_shares() ] data["list-remaining-corrupt-shares"] = [ (idlib.nodeid_b2a(serverid), base32.b2a(storage_index), shnum) for (serverid, storage_index, shnum) - in self.r.get_remaining_corrupt_shares() ] + in res.get_remaining_corrupt_shares() ] data["list-unhealthy-files"] = [ (path_t, self._json_check_results(r)) for (path_t, r) - in self.r.get_all_results().items() + in res.get_all_results().items() if not r.get_pre_repair_results().is_healthy() ] - data["stats"] = self.r.get_stats() + data["stats"] = res.get_stats() return simplejson.dumps(data, indent=1) + "\n" def render_root_storage_index(self, ctx, data): - return self.r.get_root_storage_index_string() + return self.monitor.get_status().get_root_storage_index_string() def data_objects_checked(self, ctx, data): - return self.r.get_counters()["count-objects-checked"] + return self.monitor.get_status().get_counters()["count-objects-checked"] def data_objects_healthy(self, ctx, data): - return self.r.get_counters()["count-objects-healthy-pre-repair"] + return self.monitor.get_status().get_counters()["count-objects-healthy-pre-repair"] def data_objects_unhealthy(self, ctx, data): - return self.r.get_counters()["count-objects-unhealthy-pre-repair"] + return self.monitor.get_status().get_counters()["count-objects-unhealthy-pre-repair"] def data_corrupt_shares(self, ctx, data): - return self.r.get_counters()["count-corrupt-shares-pre-repair"] + return self.monitor.get_status().get_counters()["count-corrupt-shares-pre-repair"] def data_repairs_attempted(self, ctx, data): - return self.r.get_counters()["count-repairs-attempted"] + return self.monitor.get_status().get_counters()["count-repairs-attempted"] def data_repairs_successful(self, ctx, data): - return 
self.r.get_counters()["count-repairs-successful"] + return self.monitor.get_status().get_counters()["count-repairs-successful"] def data_repairs_unsuccessful(self, ctx, data): - return self.r.get_counters()["count-repairs-unsuccessful"] + return self.monitor.get_status().get_counters()["count-repairs-unsuccessful"] def data_objects_healthy_post(self, ctx, data): - return self.r.get_counters()["count-objects-healthy-post-repair"] + return self.monitor.get_status().get_counters()["count-objects-healthy-post-repair"] def data_objects_unhealthy_post(self, ctx, data): - return self.r.get_counters()["count-objects-unhealthy-post-repair"] + return self.monitor.get_status().get_counters()["count-objects-unhealthy-post-repair"] def data_corrupt_shares_post(self, ctx, data): - return self.r.get_counters()["count-corrupt-shares-post-repair"] + return self.monitor.get_status().get_counters()["count-corrupt-shares-post-repair"] def render_pre_repair_problems_p(self, ctx, data): - c = self.r.get_counters() + c = self.monitor.get_status().get_counters() if c["count-objects-unhealthy-pre-repair"]: return ctx.tag return "" def data_pre_repair_problems(self, ctx, data): - all_objects = self.r.get_all_results() + all_objects = self.monitor.get_status().get_all_results() for path in sorted(all_objects.keys()): r = all_objects[path] assert ICheckAndRepairResults.providedBy(r) @@ -397,14 +403,14 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): return ["/".join(self._html(path)), ": ", self._html(cr.get_summary())] def render_post_repair_problems_p(self, ctx, data): - c = self.r.get_counters() + c = self.monitor.get_status().get_counters() if (c["count-objects-unhealthy-post-repair"] or c["count-corrupt-shares-post-repair"]): return ctx.tag return "" def data_post_repair_problems(self, ctx, data): - all_objects = self.r.get_all_results() + all_objects = self.monitor.get_status().get_all_results() for path in sorted(all_objects.keys()): r = all_objects[path] assert 
ICheckAndRepairResults.providedBy(r) @@ -413,7 +419,7 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): yield path, cr def render_servers_with_corrupt_shares_p(self, ctx, data): - if self.r.get_counters()["count-corrupt-shares-pre-repair"]: + if self.monitor.get_status().get_counters()["count-corrupt-shares-pre-repair"]: return ctx.tag return "" def data_servers_with_corrupt_shares(self, ctx, data): @@ -423,7 +429,7 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): def render_remaining_corrupt_shares_p(self, ctx, data): - if self.r.get_counters()["count-corrupt-shares-post-repair"]: + if self.monitor.get_status().get_counters()["count-corrupt-shares-post-repair"]: return ctx.tag return "" def data_post_repair_corrupt_shares(self, ctx, data): @@ -441,7 +447,7 @@ class DeepCheckAndRepairResults(rend.Page, ResultsBase): return "" def data_all_objects(self, ctx, data): - r = self.r.get_all_results() + r = self.monitor.get_status().get_all_results() for path in sorted(r.keys()): yield (path, r[path]) diff --git a/src/allmydata/web/common.py b/src/allmydata/web/common.py index f11e62a8..1050e9cc 100644 --- a/src/allmydata/web/common.py +++ b/src/allmydata/web/common.py @@ -8,7 +8,8 @@ from allmydata.interfaces import ExistingChildError, FileTooLargeError class IClient(Interface): pass - +class IOpHandleTable(Interface): + pass def getxmlfile(name): return loaders.xmlfile(resource_filename('allmydata.web', '%s' % name)) @@ -18,13 +19,21 @@ def boolean_of_arg(arg): assert arg.lower() in ("true", "t", "1", "false", "f", "0", "on", "off") return arg.lower() in ("true", "t", "1", "on") -def get_arg(req, argname, default=None, multiple=False): +def get_root(ctx_or_req): + req = IRequest(ctx_or_req) + # the addSlash=True gives us one extra (empty) segment + depth = len(req.prepath) + len(req.postpath) - 1 + link = "/".join([".."] * depth) + return link + +def get_arg(ctx_or_req, argname, default=None, multiple=False): """Extract an argument from either the 
query args (req.args) or the form body fields (req.fields). If multiple=False, this returns a single value (or the default, which defaults to None), and the query args take precedence. If multiple=True, this returns a tuple of arguments (possibly empty), starting with all those in the query args. """ + req = IRequest(ctx_or_req) results = [] if argname in req.args: results.extend(req.args[argname]) @@ -103,6 +112,7 @@ class MyExceptionHandler(appserver.DefaultExceptionHandler): if isinstance(text, unicode): text = text.encode("utf-8") req.write(text) + # TODO: consider putting the requested URL here req.finishRequest(False) def renderHTTP_exception(self, ctx, f): @@ -128,6 +138,9 @@ class MyExceptionHandler(appserver.DefaultExceptionHandler): super = appserver.DefaultExceptionHandler return super.renderHTTP_exception(self, ctx, f) +class NeedOperationHandleError(WebError): + pass + class RenderMixin: def renderHTTP(self, ctx): diff --git a/src/allmydata/web/deep-check-and-repair-results.xhtml b/src/allmydata/web/deep-check-and-repair-results.xhtml index 36afd272..2b842f76 100644 --- a/src/allmydata/web/deep-check-and-repair-results.xhtml +++ b/src/allmydata/web/deep-check-and-repair-results.xhtml @@ -11,6 +11,8 @@

Deep-Check-And-Repair Results for root SI=

+

+

Counters: