From: Brian Warner Date: Wed, 2 May 2007 18:55:47 +0000 (-0700) Subject: import new foolscap snapshot, still 0.1.2+ but very close to 0.1.3 . This version... X-Git-Tag: allmydata-tahoe-0.2.0~9 X-Git-Url: https://git.rkrishnan.org/components/reedownlee?a=commitdiff_plain;h=6c295337b7e786c56c12469fccfb8412756e271e;p=tahoe-lafs%2Ftahoe-lafs.git import new foolscap snapshot, still 0.1.2+ but very close to 0.1.3 . This version is fully compatible with the previous 0.1.2+ snapshot --- diff --git a/src/foolscap/ChangeLog b/src/foolscap/ChangeLog index 5d5c289a..af1c82eb 100644 --- a/src/foolscap/ChangeLog +++ b/src/foolscap/ChangeLog @@ -1,5 +1,136 @@ +2007-05-02 Brian Warner + + * foolscap/reconnector.py (Reconnector._failed): simplify + log/no-log logic + + * foolscap/slicers/unicode.py (UnicodeConstraint): add a new + constraint that only accepts unicode objects. It isn't complete: + I've forgotten how the innards of Constraints work, and as a + result this one is too permissive: it will probably accept too + many tokens over the wire before raising a Violation (although the + post-receive just-before-the-method-is-called check should still + be enforced, so application code shouldn't notice the issue). + * foolscap/test/test_schema.py (ConformTest.testUnicode): test it + (CreateTest.testMakeConstraint): check the typemap too + * foolscap/test/test_call.py (TestCall.testMegaSchema): test in a call + * foolscap/test/common.py: same + + * foolscap/constraint.py (ByteStringConstraint): rename + StringConstraint to ByteStringConstraint, to more accurately + describe its function. This constraint will *not* accept unicode + objects. + * foolscap/call.py, foolscap/copyable.py, foolscap/referenceable.py: + * foolscap/slicers/vocab.py: same + + * foolscap/schema.py (AnyStringConstraint): add a new constraint + to accept either bytestrings or unicode objects. I don't think it + actually works yet, particularly when used inside containers. + (constraintMap): map 'str' to ByteStringConstraint for now. Maybe + someday it should be mapped to AnyStringConstraint, but not today. + Map 'unicode' to UnicodeConstraint. + + + * foolscap/pb.py (Tub.getReference): assert that the Tub is + already running, either because someone called Tub.startService(), + or because we've been attached (with tub.setServiceParent) to a + running service. This requirement appeared with the + connector-tracking code, and I hope to relax it at some + point (such that any pre-startService getReferences will be queued + and serviced when the Tub is finally started), but for this + release it is a requirement to start the service before trying to + use it. + (Tub.connectTo): same + * doc/using-pb.xhtml: document it + * doc/listings/pb1client.py: update example to match + * doc/listings/pb2client.py: update example to match + * doc/listings/pb3client.py: update example to match + + * foolscap/pb.py (Tub.connectorFinished): if, for some reason, + we're removing the same connector twice, log and ignore rather + than explode. I can't find a code path that would allow this, but + I *have* seen it occur in practice, and the results aren't pretty. + Since the whole connection-tracking thing is really for the + benefit of unit tests anyways (who want to know when + Tub.stopService is done), I think it's more important to keep + application code running. + + * foolscap/negotiate.py (TubConnector.shutdown): clear out + self.remainingLocations too, in case it helps to shut things down + faster. Add some comments. + + * foolscap/negotiate.py (Negotiation): improve error-message + delivery, by keeping track of what state the receiver is in (i.e. + whether we should send them an HTTP error block, an rfc822-style + error-block, or a banana ERROR token). + (Negotiation.switchToBanana): empty self.buffer, to make sure that + any extra data is passed entirely to the new Banana protocol and + none of it gets passed back to ourselves + (Negotiation.dataReceived): same, only recurse if there's something + still in self.buffer. In other situtations we recurse here because + we might have somehow received data for two separate phases in a + single packet. + + * foolscap/banana.py (Banana.sendError): rather than explode when + trying to send an overly-long error message, just truncate it. + +2007-04-30 Brian Warner + + * foolscap/broker.py (Broker.notifyOnDisconnect): if the + RemoteReference is already dead, notify the callback right away. + Previously we would never notify them, which was a problem. + (Broker.dontNotifyOnDisconnect): be tolerant of attempts to + unregister callbacks that have already fired. I think this makes it + easier to write correct code, but on the other hand it loses the + assertion feedback if somebody tries to unregister something that + was never registered in the first place. + * foolscap/test/test_call.py (TestCall.testNotifyOnDisconnect): + test this new tolerance + (TestCall.testNotifyOnDisconnect_unregister): same + (TestCall.testNotifyOnDisconnect_already): test that a handler + fires when the reference was already broken + + * foolscap/call.py (InboundDelivery.logFailure): don't use + f.getTraceback() on string exceptions: twisted explodes + (FailureSlicer.getStateToCopy): same + * foolscap/test/test_call.py (TestCall.testFailStringException): + skip the test on python2.5, since string exceptions are deprecated + anyways and I don't want the warning message to clutter the test + logs + + * doc/using-pb.xhtml (RemoteInterfaces): document the fact that + the default name is *not* fully-qualified, necessitating the use + of __remote_name__ to distinguish between foo.RIBar and baz.RIBar + * foolscap/remoteinterface.py: same + + * foolscap/call.py (FailureSlicer.getStateToCopy): handle string + exceptions without exploding, annoying as they are. + * foolscap/test/test_call.py (TestCall.testFail4): test them + 2007-04-27 Brian Warner + * foolscap/broker.py (Broker.freeYourReference._ignore_loss): + change the way we ignore DeadReferenceError and friends, since + f.trap is not suitable for direct use as an errback + + * foolscap/referenceable.py (SturdyRef.__init__): log the repr of + the unparseable FURL, rather than just the str, in case there are + weird control characters in it + + * foolscap/banana.py (Banana.handleData): rewrite the typebyte + scanning loop, to remove the redundant pos<64 check. Also, if we + get an overlong prefix, log it so we can figure out what's going + wrong. + * foolscap/test/test_banana.py: update to match + + * foolscap/negotiate.py (Negotiation.dataReceived): if a + non-NegotiationError exception occurs, log it, since it indicates + a foolscap coding failure rather than some disagreement with the + remote end. Log it with 'log.msg' for now, since some of the unit + tests seem to trigger startTLS errors that flunk tests which + should normally pass. I suspect some problems with error handling + in twisted's TLS implementation, but I'll have to investigate it + later. Eventually this will turn into a log.err. + * foolscap/pb.py (Tub.keepaliveTimeout): set the default keepalive timer to 4 minutes. This means that at most 8 minutes will go by without any traffic at all, which should be a reasonable value to diff --git a/src/foolscap/PKG-INFO b/src/foolscap/PKG-INFO deleted file mode 100644 index cb1ea5c7..00000000 --- a/src/foolscap/PKG-INFO +++ /dev/null @@ -1,14 +0,0 @@ -Metadata-Version: 1.0 -Name: foolscap -Version: 0.1.2+ -Summary: Foolscap contains an RPC protocol for Twisted. -Home-page: http://twistedmatrix.com/trac/wiki/FoolsCap -Author: Brian Warner -Author-email: warner@twistedmatrix.com -License: MIT -Description: Foolscap (aka newpb) is a new version of Twisted's native RPC protocol, known - as 'Perspective Broker'. This allows an object in one process to be used by - code in a distant process. This module provides data marshaling, a remote - object reference system, and a capability-based security model. - -Platform: UNKNOWN diff --git a/src/foolscap/doc/listings/pb1client.py b/src/foolscap/doc/listings/pb1client.py index 0dd29145..2d129b1f 100644 --- a/src/foolscap/doc/listings/pb1client.py +++ b/src/foolscap/doc/listings/pb1client.py @@ -22,10 +22,10 @@ def gotAnswer(answer): reactor.stop() tub = Tub() +tub.startService() d = tub.getReference("pbu://localhost:12345/math-service") d.addCallbacks(gotReference, gotError1) -tub.startService() reactor.run() diff --git a/src/foolscap/doc/listings/pb2client.py b/src/foolscap/doc/listings/pb2client.py index bb2c4ebd..5a70f82c 100644 --- a/src/foolscap/doc/listings/pb2client.py +++ b/src/foolscap/doc/listings/pb2client.py @@ -27,10 +27,10 @@ if len(sys.argv) < 2: sys.exit(1) url = sys.argv[1] tub = Tub() +tub.startService() d = tub.getReference(url) d.addCallbacks(gotReference, gotError1) -tub.startService() reactor.run() diff --git a/src/foolscap/doc/listings/pb3user.py b/src/foolscap/doc/listings/pb3user.py index 762bd179..58593f7d 100644 --- a/src/foolscap/doc/listings/pb3user.py +++ b/src/foolscap/doc/listings/pb3user.py @@ -27,8 +27,8 @@ def gotRemote(remote): url = sys.argv[1] tub = Tub() +tub.startService() d = tub.getReference(url) d.addCallback(gotRemote) -tub.startService() reactor.run() diff --git a/src/foolscap/doc/stylesheet-unprocessed.css b/src/foolscap/doc/stylesheet-unprocessed.css new file mode 100644 index 00000000..e4a62cc1 --- /dev/null +++ b/src/foolscap/doc/stylesheet-unprocessed.css @@ -0,0 +1,20 @@ + +span.footnote { + vertical-align: super; + font-size: small; +} + +span.footnote:before +{ + content: "[Footnote: "; +} + +span.footnote:after +{ + content: "]"; +} + +div.note:before +{ + content: "Note: "; +} diff --git a/src/foolscap/doc/stylesheet.css b/src/foolscap/doc/stylesheet.css new file mode 100644 index 00000000..c82fe2ec --- /dev/null +++ b/src/foolscap/doc/stylesheet.css @@ -0,0 +1,180 @@ + +body +{ + margin-left: 2em; + margin-right: 2em; + border: 0px; + padding: 0px; + font-family: sans-serif; + } + +.done { color: #005500; background-color: #99ff99 } +.notdone { color: #550000; background-color: #ff9999;} + +pre +{ + padding: 1em; + font-family: Neep Alt, Courier New, Courier; + font-size: 12pt; + border: thin black solid; +} + +.boxed +{ + padding: 1em; + border: thin black solid; +} + +.shell +{ + background-color: #ffffdd; +} + +.python +{ + background-color: #dddddd; +} + +.htmlsource +{ + background-color: #dddddd; +} + +.py-prototype +{ + background-color: #ddddff; +} + + +.python-interpreter +{ + background-color: #ddddff; +} + +.doit +{ + border: thin blue dashed ; + background-color: #0ef +} + +.py-src-comment +{ + color: #1111CC +} + +.py-src-keyword +{ + color: #3333CC; + font-weight: bold; +} + +.py-src-parameter +{ + color: #000066; + font-weight: bold; +} + +.py-src-identifier +{ + color: #CC0000 +} + +.py-src-string +{ + + color: #115511 +} + +.py-src-endmarker +{ + display: block; /* IE hack; prevents following line from being sucked into the py-listing box. */ +} + +.py-listing, .html-listing, .listing +{ + margin: 1ex; + border: thin solid black; + background-color: #eee; +} + +.py-listing pre, .html-listing pre, .listing pre +{ + margin: 0px; + border: none; + border-bottom: thin solid black; +} + +.py-listing .python +{ + margin-top: 0; + margin-bottom: 0; + border: none; + border-bottom: thin solid black; + } + +.html-listing .htmlsource +{ + margin-top: 0; + margin-bottom: 0; + border: none; + border-bottom: thin solid black; + } + +.caption +{ + text-align: center; + padding-top: 0.5em; + padding-bottom: 0.5em; +} + +.filename +{ + font-style: italic; + } + +.manhole-output +{ + color: blue; +} + +hr +{ + display: inline; + } + +ul +{ + padding: 0px; + margin: 0px; + margin-left: 1em; + padding-left: 1em; + border-left: 1em; + } + +li +{ + padding: 2px; + } + +dt +{ + font-weight: bold; + margin-left: 1ex; + } + +dd +{ + margin-bottom: 1em; + } + +div.note +{ + background-color: #FFFFCC; + margin-top: 1ex; + margin-left: 5%; + margin-right: 5%; + padding-top: 1ex; + padding-left: 5%; + padding-right: 5%; + border: thin black solid; +} diff --git a/src/foolscap/doc/template.tpl b/src/foolscap/doc/template.tpl new file mode 100644 index 00000000..62c88670 --- /dev/null +++ b/src/foolscap/doc/template.tpl @@ -0,0 +1,23 @@ + + + + + +Twisted Documentation: + + + + +

+
+
+ +
+ +

Index

+ Version: + + + diff --git a/src/foolscap/doc/using-pb.xhtml b/src/foolscap/doc/using-pb.xhtml index 7f721188..f7e8c457 100644 --- a/src/foolscap/doc/using-pb.xhtml +++ b/src/foolscap/doc/using-pb.xhtml @@ -128,6 +128,14 @@ established since last startup. If you have no parent to attach it to, you can use startService and stopService on the Tub directly.

+

Note that you must start the Tub before calling getReference +or connectTo, since both of these trigger network activity, and +Tubs are supposed to be silent until they are started. In a future release +this requirement may be relaxed, but the principle of "no network activity +until the Tub is started" will be maintained, probably by queueing the +getReference calls and handling them after the Tub been +started.

+

Making your Tub remotely accessible

To make any of your Referenceables available, you must make @@ -256,6 +264,7 @@ base="foolscap.pb.Tub">getReference:

from foolscap import Tub tub = Tub() +tub.startService() d = tub.getReference("pb://ABCD@myhost.example.com:12345/math-service") def gotReference(remote): print "Got the RemoteReference:", remote @@ -725,10 +734,10 @@ metaclass magic processes all of these attributes only once, immediately after the RemoteInterface body has been evaluated.

The RemoteInterface class has a name. Normally this is -the fully-qualified classnameRIFoo.__class__.__name__, if RemoteInterfaces were actually classes, which they're -not, like package.module.RIFoo. You can override this +not. You can override this name by setting a special __remote_name__ attribute on the RemoteInterface (again, in the body). This name is important because it is externally visible: all RemoteReferences that @@ -736,12 +745,28 @@ point at your Referenceables will remember the name of the RemoteInterfaces it implements. This is what enables the type-checking to be performed on both ends of the wire.

+

In the future, this ought to default to the fully-qualified +classname (like package.module.RIFoo), so that two +RemoteInterfaces with the same name in different modules can co-exist. In the +current release, these two RemoteInterfaces will collide (and provoke an +import-time error message complaining about the duplicate name). As a result, +if you have such classes (e.g. foo.RIBar and +baz.RIBar), you must use __remote_name__ to +distinguish them (by naming one of them something other than +RIBar to avoid this error. + +Hopefully this will be improved in a future version, but it looks like a +difficult change to implement, so the standing recommendation is to use +__remote_name__ on all your RemoteInterfaces, and set it to a +suitably unique string (like a URI).

+

Here's an example:

 from foolscap import RemoteInterface, schema
 
 class RIMath(RemoteInterface):
+    __remote_name__ = "RIMath.using-pb.docs.foolscap.twistedmatrix.com"
     def add(a=int, b=int):
         return int
     # declare it with an attribute instead of a function definition
diff --git a/src/foolscap/foolscap/banana.py b/src/foolscap/foolscap/banana.py
index 210c2a55..917f2265 100644
--- a/src/foolscap/foolscap/banana.py
+++ b/src/foolscap/foolscap/banana.py
@@ -527,8 +527,7 @@ class Banana(protocol.Protocol):
 
     def sendError(self, msg):
         if len(msg) > SIZE_LIMIT:
-            raise BananaError, \
-                  "error string is too long to send (%d)" % len(msg)
+            msg = msg[:SIZE_LIMIT-10] + "..."
         int2b128(len(msg), self.transport.write)
         self.transport.write(ERROR)
         self.transport.write(msg)
@@ -711,28 +710,27 @@ class Banana(protocol.Protocol):
                     (repr(buffer),))
             self.buffer = buffer
             pos = 0
+
             for ch in buffer:
                 if ch >= HIGH_BIT_SET:
                     break
                 pos = pos + 1
-                # TODO: the 'pos > 64' test should probably move here. If
-                # not, a huge chunk will consume more CPU than it needs to.
-                # On the other hand, the test would consume extra CPU all
-                # the time.
-            else:
                 if pos > 64:
-                    # drop the connection
+                    # drop the connection. We log more of the buffer, but not
+                    # all of it, to make it harder for someone to spam our
+                    # logs.
                     raise BananaError("token prefix is limited to 64 bytes: "
-                                      "but got %r" % (buffer[:pos],))
-                return # still waiting for header to finish
+                                      "but got %r" % (buffer[:200],))
+            else:
+                # we've run out of buffer without seeing the high bit, which
+                # means we're still waiting for header to finish
+                return
+            assert pos <= 64
 
             # At this point, the header and type byte have been received.
             # The body may or may not be complete.
 
             typebyte = buffer[pos]
-            if pos > 64:
-                # redundant?
-                raise BananaError("token prefix is limited to 64 bytes")
             if pos:
                 header = b1282int(buffer[:pos])
             else:
diff --git a/src/foolscap/foolscap/broker.py b/src/foolscap/foolscap/broker.py
index b15c362c..1977f4d6 100644
--- a/src/foolscap/foolscap/broker.py
+++ b/src/foolscap/foolscap/broker.py
@@ -257,9 +257,9 @@ class Broker(banana.Banana, referenceable.Referenceable):
         self.yourReferenceByURL = {}
         self.myGifts = {}
         self.myGiftsByGiftID = {}
-        dw, self.disconnectWatchers = self.disconnectWatchers, []
-        for (cb,args,kwargs) in dw:
+        for (cb,args,kwargs) in self.disconnectWatchers:
             eventually(cb, *args, **kwargs)
+        self.disconnectWatchers = []
         banana.Banana.connectionLost(self, why)
         if self.tub:
             # TODO: remove the conditional. It is only here to accomodate
@@ -268,10 +268,26 @@ class Broker(banana.Banana, referenceable.Referenceable):
 
     def notifyOnDisconnect(self, callback, *args, **kwargs):
         marker = (callback, args, kwargs)
-        self.disconnectWatchers.append(marker)
+        if self.disconnected:
+            eventually(callback, *args, **kwargs)
+        else:
+            self.disconnectWatchers.append(marker)
         return marker
     def dontNotifyOnDisconnect(self, marker):
-        self.disconnectWatchers.remove(marker)
+        if self.disconnected:
+            return
+        # be tolerant of attempts to unregister a callback that has already
+        # fired. I think it is hard to write safe code without this
+        # tolerance.
+
+        # TODO: on the other hand, I'm not sure this is the best policy,
+        # since you lose the feedback that tells you about
+        # unregistering-the-wrong-thing bugs. We need to look at the way that
+        # register/unregister gets used and see if there is a way to retain
+        # the typechecking that results from insisting that you can only
+        # remove something that was stil in the list.
+        if marker in self.disconnectWatchers:
+            self.disconnectWatchers.remove(marker)
 
     # methods to handle RemoteInterfaces
     def getRemoteInterfaceByName(self, name):
@@ -361,9 +377,12 @@ class Broker(banana.Banana, referenceable.Referenceable):
             d = rb.callRemote("decref", clid=tracker.clid, count=count)
             # if the connection was lost before we can get an ack, we're
             # tearing this down anyway
-            d.addErrback(lambda f: f.trap(DeadReferenceError))
-            d.addErrback(lambda f: f.trap(error.ConnectionLost))
-            d.addErrback(lambda f: f.trap(error.ConnectionDone))
+            def _ignore_loss(f):
+                f.trap(DeadReferenceError,
+                       error.ConnectionLost,
+                       error.ConnectionDone)
+                return None
+            d.addErrback(_ignore_loss)
             # once the ack comes back, or if we know we'll never get one,
             # release the tracker
             d.addCallback(self.freeYourReferenceTracker, tracker)
diff --git a/src/foolscap/foolscap/call.py b/src/foolscap/foolscap/call.py
index 5e2ba3df..8ed3af4b 100644
--- a/src/foolscap/foolscap/call.py
+++ b/src/foolscap/foolscap/call.py
@@ -1,13 +1,11 @@
 
-from cStringIO import StringIO
-
 from twisted.python import failure, log, reflect
 from twisted.internet import defer
 
 from foolscap import copyable, slicer, tokens
 from foolscap.eventual import eventually
 from foolscap.copyable import AttributeDictConstraint
-from foolscap.constraint import StringConstraint
+from foolscap.constraint import ByteStringConstraint
 from foolscap.slicers.list import ListConstraint
 from tokens import BananaError, Violation
 
@@ -18,10 +16,10 @@ class FailureConstraint(AttributeDictConstraint):
     klass = failure.Failure
 
     def __init__(self):
-        attrs = [('type', StringConstraint(200)),
-                 ('value', StringConstraint(1000)),
-                 ('traceback', StringConstraint(2000)),
-                 ('parents', ListConstraint(StringConstraint(200))),
+        attrs = [('type', ByteStringConstraint(200)),
+                 ('value', ByteStringConstraint(1000)),
+                 ('traceback', ByteStringConstraint(2000)),
+                 ('parents', ListConstraint(ByteStringConstraint(200))),
                  ]
         AttributeDictConstraint.__init__(self, *attrs)
 
@@ -188,7 +186,10 @@ class InboundDelivery:
                 (self.reqID, self.obj, self.methodname))
         log.msg(" args=%s" % (self.allargs.args,))
         log.msg(" kwargs=%s" % (self.allargs.kwargs,))
-        stack = f.getTraceback()
+        if isinstance(f.type, str):
+            stack = "getTraceback() not available for string exceptions\n"
+        else:
+            stack = f.getTraceback()
         # TODO: trim stack to everything below Broker._doCall
         stack = "LOCAL: " + stack.replace("\n", "\nLOCAL: ")
         log.msg(" the failure was:")
@@ -709,18 +710,28 @@ class FailureSlicer(slicer.BaseSlicer):
         #state['stack'] = []
 
         state = {}
+        # string exceptions show up as obj.value == None and
+        # isinstance(obj.type, str). Normal exceptions show up as obj.value
+        # == text and obj.type == exception class. We need to make sure we
+        # can handle both.
         if isinstance(obj.value, failure.Failure):
             # TODO: how can this happen? I got rid of failure2Copyable, so
             # if this case is possible, something needs to replace it
             raise RuntimeError("not implemented yet")
             #state['value'] = failure2Copyable(obj.value, banana.unsafeTracebacks)
+        elif isinstance(obj.type, str):
+            state['value'] = str(obj.value)
+            state['type'] = obj.type # a string
         else:
             state['value'] = str(obj.value) # Exception instance
-        state['type'] = reflect.qual(obj.type) # Exception class
+            state['type'] = reflect.qual(obj.type) # Exception class
+
         if broker.unsafeTracebacks:
-            io = StringIO()
-            obj.printTraceback(io)
-            state['traceback'] = io.getvalue()
+            if isinstance(obj.type, str):
+                stack = "getTraceback() not available for string exceptions\n"
+            else:
+                stack = obj.getTraceback()
+            state['traceback'] = stack
             # TODO: provide something with globals and locals and HTML and
             # all that cool stuff
         else:
diff --git a/src/foolscap/foolscap/constraint.py b/src/foolscap/foolscap/constraint.py
index c74df04e..e8c0b360 100644
--- a/src/foolscap/foolscap/constraint.py
+++ b/src/foolscap/foolscap/constraint.py
@@ -105,7 +105,8 @@ class Constraint:
         limit = self.taster.get(typebyte, "not in list")
         if limit == "not in list":
             if self.strictTaster:
-                raise BananaError("invalid token type")
+                raise BananaError("invalid token type: %s" %
+                                  tokenNames[typebyte])
             else:
                 raise Violation("%s token rejected by %s" % \
                                 (tokenNames[typebyte], self.name))
@@ -221,9 +222,9 @@ class Any(Constraint):
 
 # constraints which describe individual banana tokens
 
-class StringConstraint(Constraint):
+class ByteStringConstraint(Constraint):
     opentypes = [] # redundant, as taster doesn't accept OPEN
-    name = "StringConstraint"
+    name = "ByteStringConstraint"
 
     def __init__(self, maxLength=1000, minLength=0, regexp=None):
         self.maxLength = maxLength
@@ -236,9 +237,10 @@ class StringConstraint(Constraint):
             self.regexp = re.compile(regexp)
         self.taster = {STRING: self.maxLength,
                        VOCAB: None}
+
     def checkObject(self, obj, inbound):
-        if not isinstance(obj, (str, unicode)):
-            raise Violation("not a String")
+        if not isinstance(obj, str):
+            raise Violation("not a bytestring")
         if self.maxLength != None and len(obj) > self.maxLength:
             raise Violation("string too long (%d > %d)" %
                             (len(obj), self.maxLength))
diff --git a/src/foolscap/foolscap/copyable.py b/src/foolscap/foolscap/copyable.py
index 8259ee50..70bcbb32 100644
--- a/src/foolscap/foolscap/copyable.py
+++ b/src/foolscap/foolscap/copyable.py
@@ -10,7 +10,7 @@ from twisted.internet import defer
 import slicer, tokens
 from tokens import BananaError, Violation
 from foolscap.constraint import OpenerConstraint, IConstraint, \
-     StringConstraint, UnboundedSchema, Optional
+     ByteStringConstraint, UnboundedSchema, Optional
 
 Interface = interface.Interface
 
@@ -380,7 +380,7 @@ class AttributeDictConstraint(OpenerConstraint):
         seen.append(self)
         total = self.OPENBYTES("attributedict")
         for name, constraint in self.keys.iteritems():
-            total += StringConstraint(len(name)).maxSize(seen)
+            total += ByteStringConstraint(len(name)).maxSize(seen)
             total += constraint.maxSize(seen[:])
         return total
 
diff --git a/src/foolscap/foolscap/negotiate.py b/src/foolscap/foolscap/negotiate.py
index 0ba1fe42..f4a3e95b 100644
--- a/src/foolscap/foolscap/negotiate.py
+++ b/src/foolscap/foolscap/negotiate.py
@@ -6,8 +6,9 @@ from twisted.internet import protocol, reactor
 
 from foolscap import broker, referenceable, vocab
 from foolscap.eventual import eventually
-from foolscap.tokens import BananaError, \
-     NegotiationError, RemoteNegotiationError
+from foolscap.tokens import SIZE_LIMIT, ERROR, \
+     BananaError, NegotiationError, RemoteNegotiationError
+from foolscap.banana import int2b128
 
 crypto_available = False
 try:
@@ -36,7 +37,7 @@ def check_inrange(my_min, my_max, decision, name):
         raise NegotiationError("I can't handle %s %d" % (name, decision))
 
 # negotiation phases
-PLAINTEXT, ENCRYPTED, DECIDING, ABANDONED = range(4)
+PLAINTEXT, ENCRYPTED, DECIDING, BANANA, ABANDONED = range(5)
 
 
 # version number history:
@@ -141,7 +142,8 @@ class Negotiation(protocol.Protocol):
     tub = None
     theirTubID = None
 
-    phase = PLAINTEXT
+    receive_phase = PLAINTEXT # we are expecting this
+    send_phase = PLAINTEXT # the other end is expecting this
     encrypted = False
 
     doNegotiation = True
@@ -291,7 +293,7 @@ class Negotiation(protocol.Protocol):
             self.switchToBanana({})
 
     def connectionMadeClient(self):
-        assert self.phase == PLAINTEXT
+        assert self.receive_phase == PLAINTEXT
         # the client needs to send the HTTP-compatible tubid GET,
         # along with the TLS upgrade request
         self.sendPlaintextClient()
@@ -323,6 +325,8 @@ class Negotiation(protocol.Protocol):
         req.append("Connection: Upgrade")
         self.transport.write("\r\n".join(req))
         self.transport.write("\r\n\r\n")
+        # the next thing the other end expects to see is the encrypted phase
+        self.send_phase = ENCRYPTED
 
     def connectionMadeServer(self):
         # the server just waits for the GET message to arrive, but set up the
@@ -353,8 +357,8 @@ class Negotiation(protocol.Protocol):
     def dataReceived(self, chunk):
         if self.debugNegotiation:
             log.msg("dataReceived(isClient=%s,phase=%s,options=%s): '%s'"
-                    % (self.isClient, self.phase, self.options, chunk))
-        if self.phase == ABANDONED:
+                    % (self.isClient, self.receive_phase, self.options, chunk))
+        if self.receive_phase == ABANDONED:
             return
 
         self.buffer += chunk
@@ -371,24 +375,27 @@ class Negotiation(protocol.Protocol):
             if eoh == -1:
                 return
             header, self.buffer = self.buffer[:eoh], self.buffer[eoh+4:]
-            if self.phase == PLAINTEXT:
+            if self.receive_phase == PLAINTEXT:
                 if self.isClient:
                     self.handlePLAINTEXTClient(header)
                 else:
                     self.handlePLAINTEXTServer(header)
-            elif self.phase == ENCRYPTED:
+            elif self.receive_phase == ENCRYPTED:
                 self.handleENCRYPTED(header)
-            elif self.phase == DECIDING:
+            elif self.receive_phase == DECIDING:
                 self.handleDECIDING(header)
             else:
                 assert 0, "should not get here"
-            # there might be some leftover data for the next phase
-            self.dataReceived("")
+            # there might be some leftover data for the next phase.
+            # self.buffer will be emptied when we switchToBanana, so in that
+            # case we won't call the wrong dataReceived.
+            if self.buffer:
+                self.dataReceived("")
 
         except Exception, e:
             why = Failure()
             if self.debugNegotiation:
-                log.msg("negotation had exception: %s" % why)
+                log.msg("negotiation had exception: %s" % why)
             if isinstance(e, RemoteNegotiationError):
                 pass # they've already hung up
             else:
@@ -397,24 +404,33 @@ class Negotiation(protocol.Protocol):
                 if isinstance(e, NegotiationError):
                     errmsg = str(e)
                 else:
+                    log.msg("negotiation had internal error:")
+                    log.msg(why)
                     errmsg = "internal server error, see logs"
                 errmsg = errmsg.replace("\n", " ").replace("\r", " ")
-                if self.phase == PLAINTEXT:
+                if self.send_phase == PLAINTEXT:
                     resp = ("HTTP/1.1 500 Internal Server Error: %s\r\n\r\n"
                             % errmsg)
                     self.transport.write(resp)
-                elif self.phase in (ENCRYPTED, DECIDING):
+                elif self.send_phase in (ENCRYPTED, DECIDING):
                     block = {'banana-decision-version': 1,
                              'error': errmsg,
                              }
                     self.sendBlock(block)
+                elif self.send_phase == BANANA:
+                    self.sendBananaError(errmsg)
+
             self.failureReason = why
             self.transport.loseConnection()
             return
 
-        # TODO: the error-handling needs some work, try to tell the other end
-        # what happened. In certain states we may need to send a header
-        # block, in others we may have to send a banana ERROR token.
+    def sendBananaError(self, msg):
+        if len(msg) > SIZE_LIMIT:
+            msg = msg[:SIZE_LIMIT-10] + "..."
+        int2b128(len(msg), self.transport.write)
+        self.transport.write(ERROR)
+        self.transport.write(msg)
+        # now you should drop the connection
 
     def connectionLost(self, reason):
         # force connectionMade to happen, so connectionLost can occur
@@ -430,7 +446,7 @@ class Negotiation(protocol.Protocol):
         if self.isClient:
             l = self.tub.options.get("debug_gatherPhases")
             if l is not None:
-                l.append(self.phase)
+                l.append(self.receive_phase)
         if not self.failureReason:
             self.failureReason = reason
         self.negotiationFailed()
@@ -513,6 +529,8 @@ class Negotiation(protocol.Protocol):
                                 ])
         self.transport.write(resp)
         self.transport.write("\r\n\r\n")
+        # the next thing they expect is the encrypted block
+        self.send_phase = ENCRYPTED
         self.startENCRYPTED(encrypted)
 
     def sendRedirect(self, redirect):
@@ -553,7 +571,7 @@ class Negotiation(protocol.Protocol):
             self.startTLS(self.tub.myCertificate)
         self.encrypted = encrypted
         # TODO: can startTLS trigger dataReceived?
-        self.phase = ENCRYPTED
+        self.receive_phase = ENCRYPTED
         self.sendHello()
 
     def sendHello(self):
@@ -721,7 +739,9 @@ class Negotiation(protocol.Protocol):
         decision, params = None, None
 
         if iAmTheMaster:
-            # we get to decide everything
+            # we get to decide everything. The other side is now waiting for
+            # a decision block.
+            self.send_phase = DECIDING
             decision = {}
             # combine their 'offer' and our own self.negotiationOffer to come
             # up with a 'decision' to be sent back to the other end, and the
@@ -756,8 +776,9 @@ class Negotiation(protocol.Protocol):
                        }
 
         else:
-            # otherwise, the other side gets to decide
-            pass
+            # otherwise, the other side gets to decide. The next thing they
+            # expect to hear from us is banana.
+            self.send_phase = BANANA
 
 
         if iAmTheMaster:
@@ -769,7 +790,7 @@ class Negotiation(protocol.Protocol):
             self.sendDecision(decision, params)
         else:
             # I am not the master, I receive the decision
-            self.phase = DECIDING
+            self.receive_phase = DECIDING
 
     def evaluateNegotiationVersion2(self, offer):
         # version 2 changes the meaning of reqID=0 in a 'call' sequence, to
@@ -792,6 +813,7 @@ class Negotiation(protocol.Protocol):
                                        self.sendDecision, decision, params):
             return
         self.sendBlock(decision)
+        self.send_phase = BANANA
         self.switchToBanana(params)
 
     def handleDECIDING(self, header):
@@ -916,7 +938,8 @@ class Negotiation(protocol.Protocol):
         b.setTub(self.tub)
         self.transport.protocol = b
         b.makeConnection(self.transport)
-        b.dataReceived(self.buffer)
+        buf, self.buffer = self.buffer, "" # empty our buffer, just in case
+        b.dataReceived(buf) # and hand it to the new protocol
 
         # if we were created as a client, we'll have a TubConnector. Let them
         # know that this connection has succeeded, so they can stop any other
@@ -940,16 +963,16 @@ class Negotiation(protocol.Protocol):
             # track down
             log.msg("Negotiation.negotiationFailed: %s" % reason)
         self.stopNegotiationTimer()
-        if self.phase != ABANDONED and self.isClient:
+        if self.receive_phase != ABANDONED and self.isClient:
             eventually(self.connector.negotiationFailed, self.factory, reason)
-        self.phase = ABANDONED
+        self.receive_phase = ABANDONED
         cb = self.options.get("debug_negotiationFailed_cb")
         if cb:
             # note that this gets called with a NegotiationError, not a
             # Failure
             eventually(cb, reason)
 
-# TODO: make sure code that examines self.phase handles ABANDONED
+# TODO: make sure code that examines self.receive_phase handles ABANDONED
 
 class TubConnectorClientFactory(protocol.ClientFactory, object):
     # this is for internal use only. Application code should use
@@ -1022,7 +1045,7 @@ class TubConnector:
         self.tub = parent
         self.target = tubref
         self.remainingLocations = self.target.getLocations()
-        # attemptedLocations keeps track of where we've already try to
+        # attemptedLocations keeps track of where we've already tried to
         # connect, so we don't try them twice.
         self.attemptedLocations = []
 
@@ -1052,9 +1075,14 @@ class TubConnector:
 
     def shutdown(self):
         self.active = False
+        self.remainingLocations = []
         self.stopConnectionTimer()
         for c in self.pendingConnections.values():
             c.disconnect()
+        # as each disconnect() finishes, it will either trigger our
+        # clientConnectionFailed or our negotiationFailed methods, both of
+        # which will trigger checkForIdle, and the last such message will
+        # invoke self.tub.connectorFinished()
 
     def connectToAll(self):
         while self.remainingLocations:
diff --git a/src/foolscap/foolscap/pb.py b/src/foolscap/foolscap/pb.py
index 49865f8c..e8ed1675 100644
--- a/src/foolscap/foolscap/pb.py
+++ b/src/foolscap/foolscap/pb.py
@@ -4,6 +4,7 @@ import os.path, weakref
 from zope.interface import implements
 from twisted.internet import defer, protocol
 from twisted.application import service, strports
+from twisted.python import log
 
 from foolscap import ipb, base32, negotiate, broker, observer
 from foolscap.referenceable import SturdyRef
@@ -360,6 +361,17 @@ class Tub(service.MultiService):
         assert self.running
         self._activeConnectors.append(c)
     def connectorFinished(self, c):
+        if c not in self._activeConnectors:
+            # TODO: I've seen this happen, but I can't figure out how it
+            # could possibly happen. Log and ignore rather than exploding
+            # when we try to do .remove, since this whole connector-tracking
+            # thing is mainly for the benefit of the unit tests (applications
+            # which never shut down a Tub aren't going to care), and it is
+            # more important to let application code run normally than to
+            # force an error here.
+            log.msg("Tub.connectorFinished: WEIRD, %s is not in %s"
+                    % (c, self._activeConnectors))
+            return
         self._activeConnectors.remove(c)
         if not self.running and not self._activeConnectors:
             self._allConnectorsAreFinished.fire(self)
@@ -505,8 +517,14 @@ class Tub(service.MultiService):
     def getReference(self, sturdyOrURL):
         """Acquire a RemoteReference for the given SturdyRef/URL.
 
+        The Tub must be running (i.e. Tub.startService()) when this is
+        invoked. Future releases may relax this requirement.
+
         @return: a Deferred that fires with the RemoteReference
         """
+
+        assert self.running
+
         if isinstance(sturdyOrURL, SturdyRef):
             sturdy = sturdyOrURL
         else:
@@ -541,6 +559,9 @@ class Tub(service.MultiService):
         connection goes away. At some point after it goes away, the
         Reconnector will reconnect.
 
+        The Tub must be running (i.e. Tub.startService()) when this is
+        invoked. Future releases may relax this requirement.
+
         I return a Reconnector object. When you no longer want to maintain
         this connection, call the stopConnecting() method on the Reconnector.
         I promise to not invoke your callback after you've called
@@ -561,6 +582,7 @@ class Tub(service.MultiService):
          rc.stopConnecting() # later
         """
 
+        assert self.running
         rc = Reconnector(self, sturdyOrURL, cb, *args, **kwargs)
         self.reconnectors.append(rc)
         return rc
diff --git a/src/foolscap/foolscap/reconnector.py b/src/foolscap/foolscap/reconnector.py
index 93e8cb95..d22ab3e8 100644
--- a/src/foolscap/foolscap/reconnector.py
+++ b/src/foolscap/foolscap/reconnector.py
@@ -79,18 +79,23 @@ class Reconnector:
         cb(rref, *args, **kwargs)
 
     def _failed(self, f):
-        # I'd like to trap NegotiationError and basic TCP connection
-        # failures here, but not hide coding errors.
-        if self.verbose:
-            log.msg("Reconnector._failed: %s" % f)
+        # I'd like to quietly handle "normal" problems (basically TCP
+        # failures and NegotiationErrors that result from the target either
+        # not speaking Foolscap or not hosting the Tub that we want), but not
+        # hide coding errors or version mismatches.
+        log_it = self.verbose
+
         # log certain unusual errors, even without self.verbose, to help
         # people figure out why their reconnectors aren't connecting, since
         # the usual getReference errback chain isn't active. This doesn't
         # include ConnectError (which is a parent class of
-        # ConnectionRefusedError)
+        # ConnectionRefusedError), so it won't fire if we just had a bad
+        # host/port, since the way we use connection hints will provoke that
+        # all the time.
         if f.check(RemoteNegotiationError, NegotiationError):
-            if not self.verbose:
-                log.msg("Reconnector._failed: %s" % f)
+            log_it = True
+        if log_it:
+            log.msg("Reconnector._failed (furl=%s): %s" % (self._url, f))
         if not self._active:
             return
         self._delay = min(self._delay * self.factor, self.maxDelay)
diff --git a/src/foolscap/foolscap/referenceable.py b/src/foolscap/foolscap/referenceable.py
index baaa7ab8..283705a7 100644
--- a/src/foolscap/foolscap/referenceable.py
+++ b/src/foolscap/foolscap/referenceable.py
@@ -16,7 +16,7 @@ from twisted.python import failure
 from foolscap import ipb, slicer, tokens, call
 BananaError = tokens.BananaError
 Violation = tokens.Violation
-from foolscap.constraint import IConstraint, StringConstraint
+from foolscap.constraint import IConstraint, ByteStringConstraint
 from foolscap.remoteinterface import getRemoteInterface, \
      getRemoteInterfaceByName, RemoteInterfaceConstraint
 from foolscap.schema import constraintMap
@@ -187,8 +187,8 @@ class ReferenceUnslicer(slicer.BaseUnslicer):
     clid = None
     interfaceName = None
     url = None
-    inameConstraint = StringConstraint(200) # TODO: only known RI names?
-    urlConstraint = StringConstraint(200)
+    inameConstraint = ByteStringConstraint(200) # TODO: only known RI names?
+    urlConstraint = ByteStringConstraint(200)
 
     def checkToken(self, typebyte, size):
         if self.state == 0:
@@ -599,7 +599,7 @@ class TheirReferenceUnslicer(slicer.LeafUnslicer):
     state = 0
     giftID = None
     url = None
-    urlConstraint = StringConstraint(200)
+    urlConstraint = ByteStringConstraint(200)
 
     def checkToken(self, typebyte, size):
         if self.state == 0:
@@ -693,7 +693,7 @@ class SturdyRef(Copyable, RemoteCopy):
                 self.tubID = None
                 self.location = url[:slash]
             else:
-                raise ValueError("unknown PB-URL prefix in '%s'" % url)
+                raise ValueError("unknown FURL prefix in %r" % (url,))
 
     def getTubRef(self):
         if self.encrypted:
diff --git a/src/foolscap/foolscap/remoteinterface.py b/src/foolscap/foolscap/remoteinterface.py
index 7114407a..46df3616 100644
--- a/src/foolscap/foolscap/remoteinterface.py
+++ b/src/foolscap/foolscap/remoteinterface.py
@@ -18,7 +18,7 @@ class RemoteInterfaceClass(interface.InterfaceClass):
      __remote_name__: can be set to a string to specify the globally-unique
                       name for this interface. This should be a URL in a
                       namespace you administer. If not set, defaults to the
-                      fully qualified classname.
+                      short classname.
 
     RIFoo.names() returns the list of remote method names.
 
diff --git a/src/foolscap/foolscap/schema.py b/src/foolscap/foolscap/schema.py
index 50405cea..727ad133 100644
--- a/src/foolscap/foolscap/schema.py
+++ b/src/foolscap/foolscap/schema.py
@@ -58,9 +58,10 @@ modifiers:
 from foolscap.tokens import Violation, UnknownSchemaType
 
 # make constraints available in a single location
-from foolscap.constraint import Constraint, Any, StringConstraint, \
+from foolscap.constraint import Constraint, Any, ByteStringConstraint, \
      IntegerConstraint, NumberConstraint, \
      UnboundedSchema, IConstraint, Optional, Shared
+from foolscap.slicers.unicode import UnicodeConstraint
 from foolscap.slicers.bool import BooleanConstraint
 from foolscap.slicers.dict import DictConstraint
 from foolscap.slicers.list import ListConstraint
@@ -69,10 +70,10 @@ from foolscap.slicers.tuple import TupleConstraint
 from foolscap.slicers.none import Nothing
 #  we don't import RemoteMethodSchema from remoteinterface.py, because
 #  remoteinterface.py needs to import us (for addToConstraintTypeMap)
-ignored = [Constraint, Any, StringConstraint, IntegerConstraint,
-           NumberConstraint, BooleanConstraint, DictConstraint,
-           ListConstraint, SetConstraint, TupleConstraint, Nothing,
-           Optional, Shared,
+ignored = [Constraint, Any, ByteStringConstraint, UnicodeConstraint,
+           IntegerConstraint, NumberConstraint, BooleanConstraint,
+           DictConstraint, ListConstraint, SetConstraint, TupleConstraint,
+           Nothing, Optional, Shared,
            ] # hush pyflakes
 
 # convenience shortcuts
@@ -83,6 +84,14 @@ DictOf = DictConstraint
 SetOf = SetConstraint
 
 
+# note: using PolyConstraint (aka ChoiceOf) for inbound tasting is probably
+# not fully vetted. One of the issues would be with something like
+# ListOf(ChoiceOf(TupleOf(stuff), SetOf(stuff))). The ListUnslicer, when
+# handling an inbound Tuple, will do
+# TupleUnslicer.setConstraint(polyconstraint), since that's all it really
+# knows about, and the TupleUnslicer will then try to look inside the
+# polyconstraint for attributes that talk about tuples, and might fail.
+
 class PolyConstraint(Constraint):
     name = "PolyConstraint"
 
@@ -123,9 +132,17 @@ class PolyConstraint(Constraint):
 
 ChoiceOf = PolyConstraint
 
+def AnyStringConstraint(*args, **kwargs):
+    return ChoiceOf(ByteStringConstraint(*args, **kwargs),
+                    UnicodeConstraint(*args, **kwargs))
+
+# keep the old meaning, for now. Eventually StringConstraint should become an
+# AnyStringConstraint
+StringConstraint = ByteStringConstraint
 
 constraintMap = {
-    str: StringConstraint(),
+    str: ByteStringConstraint(),
+    unicode: UnicodeConstraint(),
     bool: BooleanConstraint(),
     int: IntegerConstraint(),
     long: IntegerConstraint(maxBytes=1024),
diff --git a/src/foolscap/foolscap/slicers/unicode.py b/src/foolscap/foolscap/slicers/unicode.py
index 97c65e61..94d79e3f 100644
--- a/src/foolscap/foolscap/slicers/unicode.py
+++ b/src/foolscap/foolscap/slicers/unicode.py
@@ -1,9 +1,10 @@
 # -*- test-case-name: foolscap.test.test_banana -*-
 
+import re
 from twisted.internet.defer import Deferred
-from foolscap.constraint import Any, StringConstraint
-from foolscap.tokens import BananaError, STRING
+from foolscap.tokens import BananaError, STRING, VOCAB, Violation
 from foolscap.slicer import BaseSlicer, LeafUnslicer
+from foolscap.constraint import OpenerConstraint, Any, UnboundedSchema
 
 class UnicodeSlicer(BaseSlicer):
     opentype = ("unicode",)
@@ -20,14 +21,14 @@ class UnicodeUnslicer(LeafUnslicer):
     def setConstraint(self, constraint):
         if isinstance(constraint, Any):
             return
-        assert isinstance(constraint, StringConstraint)
+        assert isinstance(constraint, UnicodeConstraint)
         self.constraint = constraint
 
     def checkToken(self, typebyte, size):
-        if typebyte != STRING:
+        if typebyte not in (STRING, VOCAB):
             raise BananaError("UnicodeUnslicer only accepts strings")
-        if self.constraint:
-            self.constraint.checkToken(typebyte, size)
+        #if self.constraint:
+        #    self.constraint.checkToken(typebyte, size)
 
     def receiveChild(self, obj, ready_deferred=None):
         assert not isinstance(obj, Deferred)
@@ -40,3 +41,53 @@ class UnicodeUnslicer(LeafUnslicer):
         return self.string, None
     def describe(self):
         return ""
+
+class UnicodeConstraint(OpenerConstraint):
+    """The object must be a unicode object. The maxLength and minLength
+    parameters restrict the number of characters (code points, *not* bytes)
+    that may be present in the object, which means that the on-wire (UTF-8)
+    representation may take up to 6 times as many bytes as characters.
+    """
+
+    strictTaster = True
+    opentypes = [("unicode",)]
+    name = "UnicodeConstraint"
+
+    def __init__(self, maxLength=1000, minLength=0, regexp=None):
+        self.maxLength = maxLength
+        self.minLength = minLength
+        # allow VOCAB in case the Banana-level tokenizer decides to tokenize
+        # the UTF-8 encoded body of a unicode object, since this is just as
+        # likely as tokenizing regular bytestrings. TODO: this is disabled
+        # because it doesn't currently work.. once I remember how Constraints
+        # work, I'll fix this. The current version is too permissive of
+        # tokens.
+        #self.taster = {STRING: 6*self.maxLength,
+        #               VOCAB: None}
+        # regexp can either be a string or a compiled SRE_Match object..
+        # re.compile appears to notice SRE_Match objects and pass them
+        # through unchanged.
+        self.regexp = None
+        if regexp:
+            self.regexp = re.compile(regexp)
+
+    def checkObject(self, obj, inbound):
+        if not isinstance(obj, unicode):
+            raise Violation("not a String")
+        if self.maxLength != None and len(obj) > self.maxLength:
+            raise Violation("string too long (%d > %d)" %
+                            (len(obj), self.maxLength))
+        if len(obj) < self.minLength:
+            raise Violation("string too short (%d < %d)" %
+                            (len(obj), self.minLength))
+        if self.regexp:
+            if not self.regexp.search(obj):
+                raise Violation("regexp failed to match")
+
+    def maxSize(self, seen=None):
+        if self.maxLength == None:
+            raise UnboundedSchema
+        return self.OPENBYTES("unicode") + self.maxLength * 6
+
+    def maxDepth(self, seen=None):
+        return 1+1
diff --git a/src/foolscap/foolscap/slicers/vocab.py b/src/foolscap/foolscap/slicers/vocab.py
index c2574f3b..e4d2a44c 100644
--- a/src/foolscap/foolscap/slicers/vocab.py
+++ b/src/foolscap/foolscap/slicers/vocab.py
@@ -1,7 +1,7 @@
 # -*- test-case-name: foolscap.test.test_banana -*-
 
 from twisted.internet.defer import Deferred
-from foolscap.constraint import Any, StringConstraint
+from foolscap.constraint import Any, ByteStringConstraint
 from foolscap.tokens import Violation, BananaError, INT, STRING
 from foolscap.slicer import BaseSlicer, BaseUnslicer, LeafUnslicer
 from foolscap.slicer import BananaUnslicerRegistry
@@ -57,12 +57,12 @@ class ReplaceVocabUnslicer(LeafUnslicer):
     opentype = ('set-vocab',)
     unslicerRegistry = BananaUnslicerRegistry
     maxKeys = None
-    valueConstraint = StringConstraint(100)
+    valueConstraint = ByteStringConstraint(100)
 
     def setConstraint(self, constraint):
         if isinstance(constraint, Any):
             return
-        assert isinstance(constraint, StringConstraint)
+        assert isinstance(constraint, ByteStringConstraint)
         self.valueConstraint = constraint
 
     def start(self, count):
@@ -144,12 +144,12 @@ class AddVocabUnslicer(BaseUnslicer):
     unslicerRegistry = BananaUnslicerRegistry
     index = None
     value = None
-    valueConstraint = StringConstraint(100)
+    valueConstraint = ByteStringConstraint(100)
 
     def setConstraint(self, constraint):
         if isinstance(constraint, Any):
             return
-        assert isinstance(constraint, StringConstraint)
+        assert isinstance(constraint, ByteStringConstraint)
         self.valueConstraint = constraint
 
     def checkToken(self, typebyte, size):
diff --git a/src/foolscap/foolscap/test/common.py b/src/foolscap/foolscap/test/common.py
index 2719239e..b53d0a72 100644
--- a/src/foolscap/foolscap/test/common.py
+++ b/src/foolscap/foolscap/test/common.py
@@ -10,7 +10,8 @@ from foolscap.eventual import eventually, fireEventually, flushEventualQueue
 from foolscap.remoteinterface import getRemoteInterface, RemoteMethodSchema, \
      UnconstrainedMethod
 from foolscap.schema import Any, SetOf, DictOf, ListOf, TupleOf, \
-     NumberConstraint, StringConstraint, IntegerConstraint
+     NumberConstraint, ByteStringConstraint, IntegerConstraint, \
+     UnicodeConstraint
 
 from twisted.python import failure
 from twisted.internet.main import CONNECTION_DONE
@@ -60,12 +61,14 @@ Digits = re.compile("\d*")
 MegaSchema1 = DictOf(str,
                      ListOf(TupleOf(SetOf(int, maxLength=10, mutable=True),
                                     str, bool, int, long, float, None,
+                                    UnicodeConstraint(),
+                                    ByteStringConstraint(),
                                     Any(), NumberConstraint(),
                                     IntegerConstraint(),
-                                    StringConstraint(maxLength=100,
-                                                     minLength=90,
-                                                     regexp="\w+"),
-                                    StringConstraint(regexp=Digits),
+                                    ByteStringConstraint(maxLength=100,
+                                                         minLength=90,
+                                                         regexp="\w+"),
+                                    ByteStringConstraint(regexp=Digits),
                                     ),
                             maxLength=20),
                      maxKeys=5)
@@ -215,6 +218,7 @@ class RIMyTarget(RemoteInterface):
     def getName(): return str
     disputed = RemoteMethodSchema(_response=int, a=int)
     def fail(): return str  # actually raises an exception
+    def failstring(): return str # raises a string exception
 
 class RIMyTarget2(RemoteInterface):
     __remote_name__ = "RIMyTargetInterface2"
@@ -262,6 +266,8 @@ class Target(Referenceable):
         return 24
     def remote_fail(self):
         raise ValueError("you asked me to fail")
+    def remote_failstring(self):
+        raise "string exceptions are annoying"
 
 class TargetWithoutInterfaces(Target):
     # undeclare the RIMyTarget interface
diff --git a/src/foolscap/foolscap/test/test_banana.py b/src/foolscap/foolscap/test/test_banana.py
index 9f72dd32..5951cafe 100644
--- a/src/foolscap/foolscap/test/test_banana.py
+++ b/src/foolscap/foolscap/test/test_banana.py
@@ -1052,8 +1052,7 @@ class DecodeFailureTest(TestBananaMixin, unittest.TestCase):
         # would be a string but the header is too long
         s = "\x01" * 66 + "\x82" + "stupidly long string"
         f = self.shouldDropConnection(s)
-        self.failUnlessEqual(f.value.args[0],
-                             "token prefix is limited to 64 bytes")
+        self.failUnless(f.value.args[0].startswith("token prefix is limited to 64 bytes"))
 
     def testLongHeader2(self):
         # bad string while discarding
@@ -1061,8 +1060,7 @@ class DecodeFailureTest(TestBananaMixin, unittest.TestCase):
         s = bOPEN("errorful",0) + bINT(1) + s + bINT(2) + bCLOSE(0)
         self.banana.mode = "start"
         f = self.shouldDropConnection(s)
-        self.failUnlessEqual(f.value.args[0],
-                             "token prefix is limited to 64 bytes")
+        self.failUnless(f.value.args[0].startswith("token prefix is limited to 64 bytes"))
 
     def testCheckToken1(self):
         # violation raised in top.openerCheckToken
@@ -1275,8 +1273,7 @@ class InboundByteStream(TestBananaMixin, unittest.TestCase):
         self.failUnless(f.value.args[0].startswith("token prefix is limited to 64 bytes"))
         f = self.shouldDropConnection("\x00" * 65 + "\x82")
         self.failUnlessEqual(f.value.where, "")
-        self.failUnlessEqual(f.value.args[0],
-                             "token prefix is limited to 64 bytes")
+        self.failUnless(f.value.args[0].startswith("token prefix is limited to 64 bytes"))
 
         self.check("a", "\x01\x82a")
         self.check("b"*130, "\x02\x01\x82" + "b"*130 + "extra")
diff --git a/src/foolscap/foolscap/test/test_call.py b/src/foolscap/foolscap/test/test_call.py
index f64ad675..04dc4abc 100644
--- a/src/foolscap/foolscap/test/test_call.py
+++ b/src/foolscap/foolscap/test/test_call.py
@@ -2,13 +2,13 @@
 import gc
 import re
 import sets
+import sys
 
 if False:
-    import sys
     from twisted.python import log
     log.startLogging(sys.stderr)
 
-from twisted.python import failure
+from twisted.python import failure, log
 from twisted.internet import reactor, defer
 from twisted.trial import unittest
 from twisted.internet.main import CONNECTION_LOST
@@ -124,6 +124,24 @@ class TestCall(TargetMixin, unittest.TestCase):
         self.failUnlessSubstring("TargetWithoutInterfaces", str(f))
         self.failUnlessSubstring(" has no attribute 'remote_bogus'", str(f))
 
+    def testFailStringException(self):
+        # make sure we handle string exceptions correctly
+        if sys.version_info >= (2,5):
+            log.msg("skipping test: string exceptions are deprecated in 2.5")
+            return
+        rr, target = self.setupTarget(TargetWithoutInterfaces())
+        d = rr.callRemote("failstring")
+        self.failIf(target.calls)
+        d.addBoth(self._testFailStringException_1)
+        return d
+    def _testFailStringException_1(self, f):
+        # f should be a CopiedFailure
+        self.failUnless(isinstance(f, failure.Failure),
+                        "Hey, we didn't fail: %s" % f)
+        self.failUnless(f.check("string exceptions are annoying"),
+                        "wrong exception type: %s" % f)
+
+
     def testCall2(self):
         # server end uses an interface this time, but not the client end
         rr, target = self.setupTarget(Target(), True)
@@ -154,6 +172,8 @@ class TestCall(TargetMixin, unittest.TestCase):
         rr, target = self.setupTarget(HelperTarget())
         t = (sets.Set([1, 2, 3]),
              "str", True, 12, 12L, 19.3, None,
+             u"unicode",
+             "bytestring",
              "any", 14.3,
              15,
              "a"*95,
@@ -291,7 +311,7 @@ class TestCall(TargetMixin, unittest.TestCase):
     testFailWrongReturnLocal.timeout = 2
     def _testFailWrongReturnLocal_1(self, f):
         self.failUnless(f.check(Violation))
-        self.failUnlessSubstring("INT token rejected by StringConstraint",
+        self.failUnlessSubstring("INT token rejected by ByteStringConstraint",
                                  str(f))
         self.failUnlessSubstring("in inbound method results", str(f))
         self.failUnlessSubstring(".Answer(req=1)", str(f))
@@ -305,7 +325,7 @@ class TestCall(TargetMixin, unittest.TestCase):
         return d
     testDefer.timeout = 2
 
-    def testDisconnect1(self):
+    def testDisconnect_during_call(self):
         rr, target = self.setupTarget(HelperTarget())
         d = rr.callRemote("hang")
         e = RuntimeError("lost connection")
@@ -313,13 +333,12 @@ class TestCall(TargetMixin, unittest.TestCase):
         d.addCallbacks(lambda res: self.fail("should have failed"),
                        lambda why: why.trap(RuntimeError) and None)
         return d
-    testDisconnect1.timeout = 2
 
     def disconnected(self, *args, **kwargs):
         self.lost = 1
         self.lost_args = (args, kwargs)
 
-    def testDisconnect2(self):
+    def testNotifyOnDisconnect(self):
         rr, target = self.setupTarget(HelperTarget())
         self.lost = 0
         rr.notifyOnDisconnect(self.disconnected)
@@ -328,20 +347,27 @@ class TestCall(TargetMixin, unittest.TestCase):
         def _check(res):
             self.failUnless(self.lost)
             self.failUnlessEqual(self.lost_args, ((),{}))
+            # it should be safe to unregister now, even though the callback
+            # has already fired, since dontNotifyOnDisconnect is tolerant
+            rr.dontNotifyOnDisconnect(self.disconnected)
         d.addCallback(_check)
         return d
 
-    def testDisconnect3(self):
+    def testNotifyOnDisconnect_unregister(self):
         rr, target = self.setupTarget(HelperTarget())
         self.lost = 0
         m = rr.notifyOnDisconnect(self.disconnected)
         rr.dontNotifyOnDisconnect(m)
+        # dontNotifyOnDisconnect is supposed to be tolerant of duplicate
+        # unregisters, because otherwise it is hard to avoid race conditions.
+        # Validate that we can unregister something multiple times.
+        rr.dontNotifyOnDisconnect(m)
         rr.tracker.broker.transport.loseConnection(CONNECTION_LOST)
         d = flushEventualQueue()
         d.addCallback(lambda res: self.failIf(self.lost))
         return d
 
-    def testDisconnect4(self):
+    def testNotifyOnDisconnect_args(self):
         rr, target = self.setupTarget(HelperTarget())
         self.lost = 0
         rr.notifyOnDisconnect(self.disconnected, "arg", foo="kwarg")
@@ -354,6 +380,21 @@ class TestCall(TargetMixin, unittest.TestCase):
         d.addCallback(_check)
         return d
 
+    def testNotifyOnDisconnect_already(self):
+        # make sure notifyOnDisconnect works even if the reference was already
+        # broken
+        rr, target = self.setupTarget(HelperTarget())
+        self.lost = 0
+        rr.tracker.broker.transport.loseConnection(CONNECTION_LOST)
+        d = flushEventualQueue()
+        d.addCallback(lambda res: rr.notifyOnDisconnect(self.disconnected))
+        d.addCallback(lambda res: flushEventualQueue())
+        def _check(res):
+            self.failUnless(self.lost, "disconnect handler not run")
+            self.failUnlessEqual(self.lost_args, ((),{}))
+        d.addCallback(_check)
+        return d
+
     def testUnsendable(self):
         rr, target = self.setupTarget(HelperTarget())
         d = rr.callRemote("set", obj=Unsendable())
diff --git a/src/foolscap/foolscap/test/test_crypto.py b/src/foolscap/foolscap/test/test_crypto.py
index 54b6f9bf..acf2b8ce 100644
--- a/src/foolscap/foolscap/test/test_crypto.py
+++ b/src/foolscap/foolscap/test/test_crypto.py
@@ -78,7 +78,7 @@ class TestPersist(UsefulMixin, unittest.TestCase):
         d = defer.maybeDeferred(s1.stopService)
         d.addCallback(self._testPersist_1, s1, s2, t1, public_url, port)
         return d
-    testPersist.timeout = 10
+    testPersist.timeout = 5
     def _testPersist_1(self, res, s1, s2, t1, public_url, port):
         self.services.remove(s1)
         s3 = Tub(certData=s1.getCertData())
@@ -161,7 +161,7 @@ class TestListeners(UsefulMixin, unittest.TestCase):
         d.addCallback(lambda ref: ref.callRemote('add', a=2, b=2))
         d.addCallback(self._testShared_1)
         return d
-    testShared.timeout = 10
+    testShared.timeout = 5
     def _testShared_1(self, res):
         t1,t2 = self.targets
         self.failUnlessEqual(t1.calls, [(1,1)])
diff --git a/src/foolscap/foolscap/test/test_schema.py b/src/foolscap/foolscap/test/test_schema.py
index ab18e2af..671b7a73 100644
--- a/src/foolscap/foolscap/test/test_schema.py
+++ b/src/foolscap/foolscap/test/test_schema.py
@@ -76,8 +76,8 @@ class ConformTest(unittest.TestCase):
         self.conforms(c, -2**512+1)
         self.violates(c, -2**512)
 
-    def testString(self):
-        c = schema.StringConstraint(10)
+    def testByteString(self):
+        c = schema.ByteStringConstraint(10)
         self.assertSize(c, STR10)
         self.assertSize(c, STR10) # twice to test seen=[] logic
         self.assertDepth(c, 1)
@@ -89,22 +89,66 @@ class ConformTest(unittest.TestCase):
         self.violates(c, Dummy())
         self.violates(c, None)
 
-        c2 = schema.StringConstraint(15, 10)
+        c2 = schema.ByteStringConstraint(15, 10)
         self.violates(c2, "too short")
         self.conforms(c2, "long enough")
         self.violates(c2, "this is too long")
+        self.violates(c2, u"I am unicode")
 
-        c3 = schema.StringConstraint(regexp="needle")
+        c3 = schema.ByteStringConstraint(regexp="needle")
         self.violates(c3, "no present")
         self.conforms(c3, "needle in a haystack")
-        c4 = schema.StringConstraint(regexp="[abc]+")
+        c4 = schema.ByteStringConstraint(regexp="[abc]+")
         self.violates(c4, "spelled entirely without those letters")
         self.conforms(c4, "add better cases")
-        c5 = schema.StringConstraint(regexp=re.compile("\d+\s\w+"))
+        c5 = schema.ByteStringConstraint(regexp=re.compile("\d+\s\w+"))
         self.conforms(c5, ": 123 boo")
         self.violates(c5, "more than 1  spaces")
         self.violates(c5, "letters first 123")
 
+    def testString(self):
+        # this test will change once the definition of "StringConstraint"
+        # changes. For now, we assert that StringConstraint is the same as
+        # ByteStringConstraint.
+
+        c = schema.StringConstraint(20)
+        self.conforms(c, "I'm short")
+        self.violates(c, u"I am unicode")
+
+    def testUnicode(self):
+        c = schema.UnicodeConstraint(10)
+        #self.assertSize(c, USTR10)
+        #self.assertSize(c, USTR10) # twice to test seen=[] logic
+        self.assertDepth(c, 2)
+        self.violates(c, "I'm a bytestring")
+        self.conforms(c, u"I'm short")
+        self.violates(c, u"I am too long")
+        self.conforms(c, u"a" * 10)
+        self.violates(c, u"a" * 11)
+        self.violates(c, 123)
+        self.violates(c, Dummy())
+        self.violates(c, None)
+
+        c2 = schema.UnicodeConstraint(15, 10)
+        self.violates(c2, "I'm a bytestring")
+        self.violates(c2, u"too short")
+        self.conforms(c2, u"long enough")
+        self.violates(c2, u"this is too long")
+
+        c3 = schema.UnicodeConstraint(regexp="needle")
+        self.violates(c3, "I'm a bytestring")
+        self.violates(c3, u"no present")
+        self.conforms(c3, u"needle in a haystack")
+        c4 = schema.UnicodeConstraint(regexp="[abc]+")
+        self.violates(c4, "I'm a bytestring")
+        self.violates(c4, u"spelled entirely without those letters")
+        self.conforms(c4, u"add better cases")
+        c5 = schema.UnicodeConstraint(regexp=re.compile("\d+\s\w+"))
+        self.violates(c5, "I'm a bytestring")
+        self.conforms(c5, u": 123 boo")
+        self.violates(c5, u"more than 1  spaces")
+        self.violates(c5, u"letters first 123")
+
     def testBool(self):
         c = schema.BooleanConstraint()
         self.assertSize(c, 147)
@@ -118,14 +162,14 @@ class ConformTest(unittest.TestCase):
         self.violates(c, None)
         
     def testPoly(self):
-        c = schema.PolyConstraint(schema.StringConstraint(100),
+        c = schema.PolyConstraint(schema.ByteStringConstraint(100),
                                   schema.IntegerConstraint())
         self.assertSize(c, 165)
         self.assertDepth(c, 1)
 
     def testTuple(self):
-        c = schema.TupleConstraint(schema.StringConstraint(10),
-                                   schema.StringConstraint(100),
+        c = schema.TupleConstraint(schema.ByteStringConstraint(10),
+                                   schema.ByteStringConstraint(100),
                                    schema.IntegerConstraint() )
         self.conforms(c, ("hi", "there buddy, you're number", 1))
         self.violates(c, "nope")
@@ -136,11 +180,11 @@ class ConformTest(unittest.TestCase):
         self.assertDepth(c, 2)
 
     def testNestedTuple(self):
-        inner = schema.TupleConstraint(schema.StringConstraint(10),
+        inner = schema.TupleConstraint(schema.ByteStringConstraint(10),
                                        schema.IntegerConstraint())
         self.assertSize(inner, 72+75+73)
         self.assertDepth(inner, 2)
-        outer = schema.TupleConstraint(schema.StringConstraint(100),
+        outer = schema.TupleConstraint(schema.ByteStringConstraint(100),
                                        inner)
         self.assertSize(outer, 72+165 + 72+75+73)
         self.assertDepth(outer, 3)
@@ -157,7 +201,7 @@ class ConformTest(unittest.TestCase):
         self.violates(outer2, ("hi", 1, "flat", 2) )
 
     def testUnbounded(self):
-        big = schema.StringConstraint(None)
+        big = schema.ByteStringConstraint(None)
         self.assertUnboundedSize(big)
         self.assertDepth(big, 1)
         self.conforms(big, "blah blah blah blah blah" * 1024)
@@ -175,7 +219,7 @@ class ConformTest(unittest.TestCase):
 
     def testRecursion(self):
         # we have to fiddle with PolyConstraint's innards
-        value = schema.ChoiceOf(schema.StringConstraint(),
+        value = schema.ChoiceOf(schema.ByteStringConstraint(),
                                 schema.IntegerConstraint(),
                                 # will add 'value' here
                                 )
@@ -185,7 +229,7 @@ class ConformTest(unittest.TestCase):
         self.conforms(value, 123)
         self.violates(value, [])
 
-        mapping = schema.TupleConstraint(schema.StringConstraint(10),
+        mapping = schema.TupleConstraint(schema.ByteStringConstraint(10),
                                          value)
         self.assertSize(mapping, 72+75+1065)
         self.assertDepth(mapping, 2)
@@ -209,7 +253,7 @@ class ConformTest(unittest.TestCase):
         self.violates(mapping, ("name", l))
 
     def testList(self):
-        l = schema.ListOf(schema.StringConstraint(10))
+        l = schema.ListOf(schema.ByteStringConstraint(10))
         self.assertSize(l, 71 + 30*75)
         self.assertDepth(l, 2)
         self.conforms(l, ["one", "two", "three"])
@@ -218,19 +262,19 @@ class ConformTest(unittest.TestCase):
         self.violates(l, [0, "numbers", "allowed"])
         self.conforms(l, ["short", "sweet"])
 
-        l2 = schema.ListOf(schema.StringConstraint(10), 3)
+        l2 = schema.ListOf(schema.ByteStringConstraint(10), 3)
         self.assertSize(l2, 71 + 3*75)
         self.assertDepth(l2, 2)
         self.conforms(l2, ["the number", "shall be", "three"])
         self.violates(l2, ["five", "is", "...", "right", "out"])
 
-        l3 = schema.ListOf(schema.StringConstraint(10), None)
+        l3 = schema.ListOf(schema.ByteStringConstraint(10), None)
         self.assertUnboundedSize(l3)
         self.assertDepth(l3, 2)
         self.conforms(l3, ["long"] * 35)
         self.violates(l3, ["number", 1, "rule", "is", 0, "numbers"])
 
-        l4 = schema.ListOf(schema.StringConstraint(10), 3, 3)
+        l4 = schema.ListOf(schema.ByteStringConstraint(10), 3, 3)
         self.conforms(l4, ["three", "is", "good"])
         self.violates(l4, ["but", "four", "is", "bad"])
         self.violates(l4, ["two", "too"])
@@ -308,7 +352,7 @@ class ConformTest(unittest.TestCase):
 
 
     def testDict(self):
-        d = schema.DictOf(schema.StringConstraint(10),
+        d = schema.DictOf(schema.ByteStringConstraint(10),
                           schema.IntegerConstraint(),
                           maxKeys=4)
         
@@ -352,7 +396,11 @@ class CreateTest(unittest.TestCase):
         self.failUnlessEqual(c.maxBytes, -1)
 
         c = make(str)
-        self.check(c, schema.StringConstraint)
+        self.check(c, schema.ByteStringConstraint)
+        self.failUnlessEqual(c.maxLength, 1000)
+
+        c = make(unicode)
+        self.check(c, schema.UnicodeConstraint)
         self.failUnlessEqual(c.maxLength, 1000)
 
         self.check(make(bool), schema.BooleanConstraint)
@@ -362,7 +410,7 @@ class CreateTest(unittest.TestCase):
         c = make((int, str))
         self.check(c, schema.TupleConstraint)
         self.check(c.constraints[0], schema.IntegerConstraint)
-        self.check(c.constraints[1], schema.StringConstraint)
+        self.check(c.constraints[1], schema.ByteStringConstraint)
 
         c = make(common.RIHelper)
         self.check(c, RemoteInterfaceConstraint)
@@ -392,7 +440,7 @@ class Arguments(unittest.TestCase):
         self.failUnless(isinstance(getkw("c")[1], schema.IntegerConstraint))
 
         self.failUnless(isinstance(r.getResponseConstraint(),
-                                   schema.StringConstraint))
+                                   schema.ByteStringConstraint))
 
         self.failUnless(isinstance(getkw("c", 1, [])[1],
                                    schema.IntegerConstraint))
diff --git a/src/foolscap/misc/testutils/figleaf.py b/src/foolscap/misc/testutils/figleaf.py
new file mode 100644
index 00000000..a24d4fab
--- /dev/null
+++ b/src/foolscap/misc/testutils/figleaf.py
@@ -0,0 +1,401 @@
+#! /usr/bin/env python
+"""
+figleaf is another tool to trace code coverage (yes, in Python ;).
+
+figleaf uses the sys.settrace hook to record which statements are
+executed by the CPython interpreter; this record can then be saved
+into a file, or otherwise communicated back to a reporting script.
+
+figleaf differs from the gold standard of coverage tools
+('coverage.py') in several ways.  First and foremost, figleaf uses the
+same criterion for "interesting" lines of code as the sys.settrace
+function, which obviates some of the complexity in coverage.py (but
+does mean that your "loc" count goes down).  Second, figleaf does not
+record code executed in the Python standard library, which results in
+a significant speedup.  And third, the format in which the coverage
+format is saved is very simple and easy to work with.
+
+You might want to use figleaf if you're recording coverage from
+multiple types of tests and need to aggregate the coverage in
+interesting ways, and/or control when coverage is recorded.
+coverage.py is a better choice for command-line execution, and its
+reporting is a fair bit nicer.
+
+Command line usage: ::
+
+  figleaf.py  
+
+The figleaf output is saved into the file '.figleaf', which is an
+*aggregate* of coverage reports from all figleaf runs from this
+directory.  '.figleaf' contains a pickled dictionary of sets; the keys
+are source code filenames, and the sets contain all line numbers
+executed by the Python interpreter. See the docs or command-line
+programs in bin/ for more information.
+
+High level API: ::
+
+ * ``start(ignore_lib=True)`` -- start recording code coverage.
+ * ``stop()``                 -- stop recording code coverage.
+ * ``get_trace_obj()``        -- return the (singleton) trace object.
+ * ``get_info()``             -- get the coverage dictionary
+
+Classes & functions worth knowing about, i.e. a lower level API:
+
+ * ``get_lines(fp)`` -- return the set of interesting lines in the fp.
+ * ``combine_coverage(d1, d2)`` -- combine coverage info from two dicts.
+ * ``read_coverage(filename)`` -- load the coverage dictionary
+ * ``write_coverage(filename)`` -- write the coverage out.
+ * ``annotate_coverage(...)`` -- annotate a Python file with its coverage info.
+
+Known problems:
+
+ -- module docstrings are *covered* but not found.
+
+AUTHOR: C. Titus Brown, titus@idyll.org
+
+'figleaf' is Copyright (C) 2006.  It will be released under the BSD license.
+"""
+import sys
+import os
+import threading
+from cPickle import dump, load
+
+### import builtin sets if in > 2.4, otherwise use 'sets' module.
+# we require 2.4 or later
+assert set
+
+
+from token import tok_name, NEWLINE, STRING, INDENT, DEDENT, COLON
+import parser, types, symbol
+
+def get_token_name(x):
+    """
+    Utility to help pretty-print AST symbols/Python tokens.
+    """
+    if symbol.sym_name.has_key(x):
+        return symbol.sym_name[x]
+    return tok_name.get(x, '-')
+
+class LineGrabber:
+    """
+    Count 'interesting' lines of Python in source files, where
+    'interesting' is defined as 'lines that could possibly be
+    executed'.
+
+    @CTB this badly needs to be refactored... once I have automated
+    tests ;)
+    """
+    def __init__(self, fp):
+        """
+        Count lines of code in 'fp'.
+        """
+        self.lines = set()
+
+        self.ast = parser.suite(fp.read())
+        self.tree = parser.ast2tuple(self.ast, True)
+
+        self.find_terminal_nodes(self.tree)
+
+    def find_terminal_nodes(self, tup):
+        """
+        Recursively eat an AST in tuple form, finding the first line
+        number for "interesting" code.
+        """
+        (sym, rest) = tup[0], tup[1:]
+
+        line_nos = set()
+        if type(rest[0]) == types.TupleType:  ### node
+
+            for x in rest:
+                token_line_no = self.find_terminal_nodes(x)
+                if token_line_no is not None:
+                    line_nos.add(token_line_no)
+
+            if symbol.sym_name[sym] in ('stmt', 'suite', 'lambdef',
+                                        'except_clause') and line_nos:
+                # store the line number that this statement started at
+                self.lines.add(min(line_nos))
+            elif symbol.sym_name[sym] in ('if_stmt',):
+                # add all lines under this
+                self.lines.update(line_nos)
+            elif symbol.sym_name[sym] in ('global_stmt',): # IGNORE
+                return
+            else:
+                if line_nos:
+                    return min(line_nos)
+
+        else:                               ### leaf
+            if sym not in (NEWLINE, STRING, INDENT, DEDENT, COLON) and \
+               tup[1] != 'else':
+                return tup[2]
+            return None
+
+    def pretty_print(self, tup=None, indent=0):
+        """
+        Pretty print the AST.
+        """
+        if tup is None:
+            tup = self.tree
+
+        s = tup[1]
+
+        if type(s) == types.TupleType:
+            print ' '*indent, get_token_name(tup[0])
+            for x in tup[1:]:
+                self.pretty_print(x, indent+1)
+        else:
+            print ' '*indent, get_token_name(tup[0]), tup[1:]
+
+def get_lines(fp):
+    """
+    Return the set of interesting lines in the source code read from
+    this file handle.
+    """
+    l = LineGrabber(fp)
+    return l.lines
+
+class CodeTracer:
+    """
+    Basic code coverage tracking, using sys.settrace.
+    """
+    def __init__(self, ignore_prefixes=[]):
+        self.c = {}
+        self.started = False
+        self.ignore_prefixes = ignore_prefixes
+
+    def start(self):
+        """
+        Start recording.
+        """
+        if not self.started:
+            self.LOG = open("/tmp/flog.out", "w")
+            self.started = True
+
+            sys.settrace(self.g)
+            if hasattr(threading, 'settrace'):
+                threading.settrace(self.g)
+
+    def stop(self):
+        if self.started:
+            sys.settrace(None)
+            if hasattr(threading, 'settrace'):
+                threading.settrace(None)
+
+            self.started = False
+
+    def g(self, f, e, a):
+        """
+        global trace function.
+        """
+        if e is 'call':
+            for p in self.ignore_prefixes:
+                if f.f_code.co_filename.startswith(p):
+                    return
+
+            return self.t
+
+    def t(self, f, e, a):
+        """
+        local trace function.
+        """
+
+        if e is 'line':
+            self.c[(f.f_code.co_filename, f.f_lineno)] = 1
+        return self.t
+
+    def clear(self):
+        """
+        wipe out coverage info
+        """
+
+        self.c = {}
+
+    def gather_files(self):
+        """
+        Return the dictionary of lines of executed code; the dict
+        contains items (k, v), where 'k' is the filename and 'v'
+        is a set of line numbers.
+        """
+        files = {}
+        for (filename, line) in self.c.keys():
+            d = files.get(filename, set())
+            d.add(line)
+            files[filename] = d
+
+        return files
+
+def combine_coverage(d1, d2):
+    """
+    Given two coverage dictionaries, combine the recorded coverage
+    and return a new dictionary.
+    """
+    keys = set(d1.keys())
+    keys.update(set(d2.keys()))
+
+    new_d = {}
+    for k in keys:
+        v = d1.get(k, set())
+        v2 = d2.get(k, set())
+
+        s = set(v)
+        s.update(v2)
+        new_d[k] = s
+
+    return new_d
+
+def write_coverage(filename, combine=True):
+    """
+    Write the current coverage info out to the given filename.  If
+    'combine' is false, destroy any previously recorded coverage info.
+    """
+    if _t is None:
+        return
+
+    d = _t.gather_files()
+
+    # combine?
+    if combine:
+        old = {}
+        fp = None
+        try:
+            fp = open(filename)
+        except IOError:
+            pass
+
+        if fp:
+            old = load(fp)
+            fp.close()
+            d = combine_coverage(d, old)
+
+    # ok, save.
+    outfp = open(filename, 'w')
+    try:
+        dump(d, outfp)
+    finally:
+        outfp.close()
+
+def read_coverage(filename):
+    """
+    Read a coverage dictionary in from the given file.
+    """
+    fp = open(filename)
+    try:
+        d = load(fp)
+    finally:
+        fp.close()
+
+    return d
+
+def annotate_coverage(in_fp, out_fp, covered, all_lines,
+                      mark_possible_lines=False):
+    """
+    A simple example coverage annotator that outputs text.
+    """
+    for i, line in enumerate(in_fp):
+        i = i + 1
+
+        if i in covered:
+            symbol = '>'
+        elif i in all_lines:
+            symbol = '!'
+        else:
+            symbol = ' '
+
+        symbol2 = ''
+        if mark_possible_lines:
+            symbol2 = ' '
+            if i in all_lines:
+                symbol2 = '-'
+
+        out_fp.write('%s%s %s' % (symbol, symbol2, line,))
+
+#######################
+
+#
+# singleton functions/top-level API
+#
+
+_t = None
+
+def start(ignore_python_lib=True, ignore_prefixes=[]):
+    """
+    Start tracing code coverage.  If 'ignore_python_lib' is True,
+    ignore all files that live below the same directory as the 'os'
+    module.
+    """
+    global _t
+    if _t is None:
+        ignore_prefixes = ignore_prefixes[:]
+        if ignore_python_lib:
+            ignore_prefixes.append(os.path.realpath(os.path.dirname(os.__file__)))
+        _t = CodeTracer(ignore_prefixes)
+
+    _t.start()
+
+def stop():
+    """
+    Stop tracing code coverage.
+    """
+    global _t
+    if _t is not None:
+        _t.stop()
+
+def get_trace_obj():
+    """
+    Return the (singleton) trace object, if it exists.
+    """
+    return _t
+
+def get_info():
+    """
+    Get the coverage dictionary from the trace object.
+    """
+    if _t:
+        return _t.gather_files()
+
+#############
+
+def display_ast():
+    l = LineGrabber(open(sys.argv[1]))
+    l.pretty_print()
+
+def main():
+    """
+    Execute the given Python file with coverage, making it look like it is
+    __main__.
+    """
+    ignore_pylibs = False
+
+    def print_help():
+        print 'Usage: figleaf [-i]  '
+        print ''
+        print 'Options:'
+        print '  -i             Ignore Python standard libraries when calculating coverage'
+
+    args = sys.argv[1:]
+
+    if len(args) < 1:
+        print_help()
+        raise SystemExit()
+    elif len(args) > 2 and args[0] == '-i':
+        ignore_pylibs = True
+
+        ## Make sure to strip off the -i or --ignore-python-libs option if it exists
+        args = args[1:]
+
+    ## Reset system args so that the subsequently exec'd file can read from sys.argv
+    sys.argv = args
+
+    sys.path[0] = os.path.dirname(args[0])
+
+    cwd = os.getcwd()
+
+    start(ignore_pylibs)        # START code coverage
+
+    import __main__
+    try:
+        execfile(args[0], __main__.__dict__)
+    finally:
+        stop()                          # STOP code coverage
+
+        write_coverage(os.path.join(cwd, '.figleaf'))
diff --git a/src/foolscap/misc/testutils/figleaf2html b/src/foolscap/misc/testutils/figleaf2html
new file mode 100644
index 00000000..68524669
--- /dev/null
+++ b/src/foolscap/misc/testutils/figleaf2html
@@ -0,0 +1,3 @@
+#! /usr/bin/env python
+import figleaf_htmlizer
+figleaf_htmlizer.main()
diff --git a/src/foolscap/misc/testutils/figleaf_htmlizer.py b/src/foolscap/misc/testutils/figleaf_htmlizer.py
new file mode 100644
index 00000000..9b009373
--- /dev/null
+++ b/src/foolscap/misc/testutils/figleaf_htmlizer.py
@@ -0,0 +1,272 @@
+#! /usr/bin/env python
+import sys
+import figleaf
+import os
+import re
+
+from optparse import OptionParser
+
+import logging
+logging.basicConfig(level=logging.DEBUG)
+
+logger = logging.getLogger('figleaf.htmlizer')
+
+def read_exclude_patterns(f):
+	if not f:
+		return []
+	exclude_patterns = []
+
+	fp = open(f)
+	for line in fp:
+		line = line.rstrip()
+		if line and not line.startswith('#'):
+			pattern = re.compile(line)
+		exclude_patterns.append(pattern)
+
+	return exclude_patterns
+
+def report_as_html(coverage, directory, exclude_patterns=[], root=None):
+	### now, output.
+
+	keys = coverage.keys()
+	info_dict = {}
+	for k in keys:
+		skip = False
+		for pattern in exclude_patterns:
+			if pattern.search(k):
+				logger.debug('SKIPPING %s -- matches exclusion pattern' % k)
+				skip = True
+				break
+
+		if skip:
+			continue
+
+		if k.endswith('figleaf.py'):
+			continue
+
+                display_filename = k
+                if root:
+                        if not k.startswith(root):
+                                continue
+                        display_filename = k[len(root):]
+                        assert not display_filename.startswith("/")
+                        assert display_filename.endswith(".py")
+                        display_filename = display_filename[:-3] # trim .py
+                        display_filename = display_filename.replace("/", ".")
+
+                if not k.startswith("/"):
+                        continue
+
+		try:
+			pyfile = open(k)
+#            print 'opened', k
+		except IOError:
+			logger.warning('CANNOT OPEN: %s' % k)
+			continue
+
+		try:
+			lines = figleaf.get_lines(pyfile)
+		except KeyboardInterrupt:
+			raise
+		except Exception, e:
+			pyfile.close()
+			logger.warning('ERROR: %s %s' % (k, str(e)))
+			continue
+
+		# ok, got all the info.  now annotate file ==> html.
+
+		covered = coverage[k]
+		n_covered = n_lines = 0
+
+		pyfile = open(k)
+		output = []
+		for i, line in enumerate(pyfile):
+			is_covered = False
+			is_line = False
+
+			i += 1
+
+			if i in covered:
+				is_covered = True
+
+				n_covered += 1
+				n_lines += 1
+			elif i in lines:
+				is_line = True
+
+				n_lines += 1
+
+			color = 'black'
+			if is_covered:
+				color = 'green'
+			elif is_line:
+				color = 'red'
+
+			line = escape_html(line.rstrip())
+			output.append('%4d. %s' % (color, i, line.rstrip()))
+
+		try:
+			pcnt = n_covered * 100. / n_lines
+		except ZeroDivisionError:
+			pcnt = 100
+		info_dict[k] = (n_lines, n_covered, pcnt, display_filename)
+
+		html_outfile = make_html_filename(display_filename)
+		html_outfp = open(os.path.join(directory, html_outfile), 'w')
+		html_outfp.write('source file: %s
\n' % (k,)) + html_outfp.write('file stats: %d lines, %d executed: %.1f%% covered\n' % (n_lines, n_covered, pcnt)) + + html_outfp.write('
\n')
+		html_outfp.write("\n".join(output))
+		html_outfp.close()
+
+	### print a summary, too.
+
+	info_dict_items = info_dict.items()
+
+	def sort_by_pcnt(a, b):
+		a = a[1][2]
+		b = b[1][2]
+
+		return -cmp(a,b)
+	info_dict_items.sort(sort_by_pcnt)
+
+	summary_lines = sum([ v[0] for (k, v) in info_dict_items])
+	summary_cover = sum([ v[1] for (k, v) in info_dict_items])
+
+	summary_pcnt = 100
+	if summary_lines:
+		summary_pcnt = float(summary_cover) * 100. / float(summary_lines)
+
+
+	pcnts = [ float(v[1]) * 100. / float(v[0]) for (k, v) in info_dict_items if v[0] ]
+	pcnt_90 = [ x for x in pcnts if x >= 90 ]
+	pcnt_75 = [ x for x in pcnts if x >= 75 ]
+	pcnt_50 = [ x for x in pcnts if x >= 50 ]
+
+        stats_fp = open('%s/stats.out' % (directory,), 'w')
+        stats_fp.write("total files: %d\n" % len(pcnts))
+        stats_fp.write("total source lines: %d\n" % summary_lines)
+        stats_fp.write("total covered lines: %d\n" % summary_cover)
+        stats_fp.write("total coverage percentage: %.1f\n" % summary_pcnt)
+        stats_fp.close()
+
+        ## index.html
+	index_fp = open('%s/index.html' % (directory,), 'w')
+        # summary info
+	index_fp.write('figleaf code coverage report\n')
+	index_fp.write('

Summary

%d files total: %d files > ' + '90%%, %d files > 75%%, %d files > 50%%

' + % (len(pcnts), len(pcnt_90), + len(pcnt_75), len(pcnt_50))) + # sorted by percentage covered + index_fp.write('

Sorted by Coverage Percentage

\n') + index_fp.write('' + '' + '\n') + index_fp.write('' + '' + '\n' + % (summary_lines, summary_cover, summary_pcnt,)) + + for filename, stuff in info_dict_items: + (n_lines, n_covered, percent_covered, display_filename) = stuff + html_outfile = make_html_filename(display_filename) + + index_fp.write('' + '\n' + % (html_outfile, display_filename, n_lines, + n_covered, percent_covered,)) + + index_fp.write('
Filename# lines# covered% covered
totals:%d%d%.1f%%
%s%d%d%.1f
\n') + + # sorted by module name + index_fp.write('

Sorted by Module Name (alphabetical)

\n') + info_dict_items.sort() + index_fp.write('' + '' + '\n') + + for filename, stuff in info_dict_items: + (n_lines, n_covered, percent_covered, display_filename) = stuff + html_outfile = make_html_filename(display_filename) + + index_fp.write('' + '\n' + % (html_outfile, display_filename, n_lines, + n_covered, percent_covered,)) + + index_fp.write('
Filename# lines# covered% covered
%s%d%d%.1f
\n') + + index_fp.close() + + logger.info('reported on %d file(s) total\n' % len(info_dict)) + return len(info_dict) + +def prepare_reportdir(dirname='html'): + try: + os.mkdir(dirname) + except OSError: # already exists + pass + +def make_html_filename(orig): + return orig + ".html" + +def escape_html(s): + s = s.replace("&", "&") + s = s.replace("<", "<") + s = s.replace(">", ">") + s = s.replace('"', """) + return s + +def main(): + ### + + option_parser = OptionParser() + + option_parser.add_option('-x', '--exclude-patterns', action="store", + dest="exclude_patterns_file", + help="file containing regexp patterns to exclude") + + option_parser.add_option('-d', '--output-directory', action='store', + dest="output_dir", + default = "html", + help="directory for HTML output") + option_parser.add_option('-r', '--root', action="store", + dest="root", + default=None, + help="only pay attention to modules under this directory") + + option_parser.add_option('-q', '--quiet', action='store_true', dest='quiet', help='Suppress all but error messages') + + (options, args) = option_parser.parse_args() + + if options.quiet: + logging.disable(logging.DEBUG) + + if options.root: + options.root = os.path.abspath(options.root) + if options.root[-1] != "/": + options.root = options.root + "/" + + ### load + + if not args: + args = ['.figleaf'] + + coverage = {} + for filename in args: + logger.debug("loading coverage info from '%s'\n" % (filename,)) + d = figleaf.read_coverage(filename) + coverage = figleaf.combine_coverage(coverage, d) + + if not coverage: + logger.warning('EXITING -- no coverage info!\n') + sys.exit(-1) + + ### make directory + prepare_reportdir(options.output_dir) + report_as_html(coverage, options.output_dir, + read_exclude_patterns(options.exclude_patterns_file), + options.root) + diff --git a/src/foolscap/misc/testutils/trial_figleaf.py b/src/foolscap/misc/testutils/trial_figleaf.py new file mode 100644 index 00000000..7bc96b0d --- /dev/null +++ b/src/foolscap/misc/testutils/trial_figleaf.py @@ -0,0 +1,125 @@ + +"""A Trial IReporter plugin that gathers figleaf code-coverage information. + +Once this plugin is installed, trial can be invoked with one of two new +--reporter options: + + trial --reporter=verbose-figleaf ARGS + trial --reporter-bwverbose-figleaf ARGS + +Once such a test run has finished, there will be a .figleaf file in the +top-level directory. This file can be turned into a directory of .html files +(with index.html as the starting point) by running: + + figleaf2html -d OUTPUTDIR [-x EXCLUDEFILE] + +Figleaf thinks of everyting in terms of absolute filenames rather than +modules. The EXCLUDEFILE may be necessary to keep it from providing reports +on non-Code-Under-Test files that live in unusual locations. In particular, +if you use extra PYTHONPATH arguments to point at some alternate version of +an upstream library (like Twisted), or if something like debian's +python-support puts symlinks to .py files in sys.path but not the .py files +themselves, figleaf will present coverage information on both of these. The +EXCLUDEFILE option might help to inhibit these. + +Other figleaf problems: + + the annotated code files are written to BASENAME(file).html, which results + in collisions between similarly-named source files. + + The line-wise coverage information isn't quite right. Blank lines are + counted as unreached code, lambdas aren't quite right, and some multiline + comments (docstrings?) aren't quite right. + +""" + +from twisted.trial.reporter import TreeReporter, VerboseTextReporter + +# These plugins are registered via twisted/plugins/figleaf_trial.py . See +# the notes there for an explanation of how that works. + + + +# Reporters don't really get told about the suite starting and stopping. + +# The Reporter class is imported before the test classes are. + +# The test classes are imported before the Reporter is created. To get +# control earlier than that requires modifying twisted/scripts/trial.py . + +# Then Reporter.__init__ is called. + +# Then tests run, calling things like write() and addSuccess(). Each test is +# framed by a startTest/stopTest call. + +# Then the results are emitted, calling things like printErrors, +# printSummary, and wasSuccessful. + +# So for code-coverage (not including import), start in __init__ and finish +# in printSummary. To include import, we have to start in our own import and +# finish in printSummary. + +import figleaf +figleaf.start() + + +class FigleafReporter(TreeReporter): + def __init__(self, *args, **kwargs): + TreeReporter.__init__(self, *args, **kwargs) + + def printSummary(self): + figleaf.stop() + figleaf.write_coverage(".figleaf") + print "Figleaf results written to .figleaf" + return TreeReporter.printSummary(self) + +class FigleafTextReporter(VerboseTextReporter): + def __init__(self, *args, **kwargs): + VerboseTextReporter.__init__(self, *args, **kwargs) + + def printSummary(self): + figleaf.stop() + figleaf.write_coverage(".figleaf") + print "Figleaf results written to .figleaf" + return VerboseTextReporter.printSummary(self) + +class not_FigleafReporter(object): + # this class, used as a reporter on a fully-passing test suite, doesn't + # trigger exceptions. So it is a guide to what methods are invoked on a + # Reporter. + def __init__(self, *args, **kwargs): + print "FIGLEAF HERE" + self.r = TreeReporter(*args, **kwargs) + self.shouldStop = self.r.shouldStop + self.separator = self.r.separator + self.testsRun = self.r.testsRun + self._starting2 = False + + def write(self, *args): + if not self._starting2: + self._starting2 = True + print "FIRST WRITE" + return self.r.write(*args) + + def startTest(self, *args, **kwargs): + return self.r.startTest(*args, **kwargs) + + def stopTest(self, *args, **kwargs): + return self.r.stopTest(*args, **kwargs) + + def addSuccess(self, *args, **kwargs): + return self.r.addSuccess(*args, **kwargs) + + def printErrors(self, *args, **kwargs): + return self.r.printErrors(*args, **kwargs) + + def writeln(self, *args, **kwargs): + return self.r.writeln(*args, **kwargs) + + def printSummary(self, *args, **kwargs): + print "PRINT SUMMARY" + return self.r.printSummary(*args, **kwargs) + + def wasSuccessful(self, *args, **kwargs): + return self.r.wasSuccessful(*args, **kwargs) + diff --git a/src/foolscap/misc/testutils/twisted/plugins/figleaf_trial_plugin.py b/src/foolscap/misc/testutils/twisted/plugins/figleaf_trial_plugin.py new file mode 100644 index 00000000..ee94984e --- /dev/null +++ b/src/foolscap/misc/testutils/twisted/plugins/figleaf_trial_plugin.py @@ -0,0 +1,47 @@ + +from zope.interface import implements +from twisted.trial.itrial import IReporter +from twisted.plugin import IPlugin + +# register a plugin that can create our FigleafReporter. The reporter itself +# lives in a separate place + +# note that this .py file is *not* in a package: there is no __init__.py in +# our parent directory. This is important, because otherwise ours would fight +# with Twisted's. When trial looks for plugins, it merely executes all the +# *.py files it finds in and twisted/plugins/ subdirectories of anything on +# sys.path . The namespace that results from executing these .py files is +# examined for instances which provide both IPlugin and the target interface +# (in this case, trial is looking for IReporter instances). Each such +# instance tells the application how to create a plugin by naming the module +# and class that should be instantiated. + +# When installing our package via setup.py, arrange for this file to be +# installed to the system-wide twisted/plugins/ directory. + +class _Reporter(object): + implements(IPlugin, IReporter) + + def __init__(self, name, module, description, longOpt, shortOpt, klass): + self.name = name + self.module = module + self.description = description + self.longOpt = longOpt + self.shortOpt = shortOpt + self.klass = klass + + +fig = _Reporter("Figleaf Code-Coverage Reporter", + "trial_figleaf", + description="verbose color output (with figleaf coverage)", + longOpt="verbose-figleaf", + shortOpt="f", + klass="FigleafReporter") + +bwfig = _Reporter("Figleaf Code-Coverage Reporter (colorless)", + "trial_figleaf", + description="Colorless verbose output (with figleaf coverage)", + longOpt="bwverbose-figleaf", + shortOpt=None, + klass="FigleafTextReporter") + diff --git a/src/foolscap/setup.py b/src/foolscap/setup.py index ae7c0dd6..e734e946 100644 --- a/src/foolscap/setup.py +++ b/src/foolscap/setup.py @@ -1,4 +1,4 @@ -#!/usr/bin/env python +#!/usr/bin/python import sys from distutils.core import setup