+2007-08-07 Brian Warner <warner@lothar.com>
+
+ * foolscap/__init__.py: release Foolscap-0.1.5
+ * misc/{sid|sarge|dapper|edgy|feisty}/debian/changelog: same
+
+2007-08-07 Brian Warner <warner@lothar.com>
+
+ * NEWS: update for the upcoming release
+
+ * foolscap/pb.py (Tub.registerNameLookupHandler): new function to
+ augment Tub.registerReference(). This allows names to be looked up
+ at request time, rather than requiring all Referenceables be
+ pre-registered with registerReference(). The chief use of this
+ would be for FURLs which point at objects that live on disk in
+ some persistent state until they are needed. Closes #6.
+ (Tub.unregisterNameLookupHandler): allow handlers to be removed
+ (Tub.getReferenceForName): use the handler during lookup
+ * foolscap/test/test_tub.py (NameLookup): test it
+
+2007-07-27 Brian Warner <warner@lothar.com>
+
+ * foolscap/referenceable.py (LocalReferenceable): implement an
+ adapter that allows code to do IRemoteReference(t).callRemote(...)
+ and have it work for both RemoteReferences and local
+ Referenceables. You might want to do this if you're getting back
+ introductions to a variety of remote Referenceables, some of which
+ might actually be on your local system, and you want to treat all
+ of them the same way. Local Referenceables will be wrapped with a
+ class that implements callRemote() and makes it behave like an
+ actual remote callRemote() would. Closes ticket #1.
+ * foolscap/test/test_reference.py (LocalReference): test it
+
+2007-07-26 Brian Warner <warner@lothar.com>
+
+ * foolscap/call.py (AnswerUnslicer.receiveChild): accept a
+ ready_deferred, to accommodate Gifts in return values. Closes #5.
+ (AnswerUnslicer.receiveClose): .. and don't fire the response
+ until any such Gifts resolve
+ * foolscap/test/test_gifts.py (Gifts.testReturn): test it
+ (Gifts.testReturnInContainer): same
+ (Bad.testReturn_swissnum): and test the failure case too
+
+ * foolscap/test/test_pb.py (TestAnswer.testAccept1): fix a test
+ which wasn't calling start() properly and was broken by that change
+ (TestAnswer.testAccept2): same
+
+ * foolscap/test/test_gifts.py (Bad.setUp): disable these tests when
+ we don't have crypto, since TubIDs are not mangleable in the same
+ way without crypto.
+
+ * foolscap/slicer.py (BaseUnslicer.receiveChild): new convention:
+ Unslicers should accumulate their children's ready_deferreds into
+ an AsyncAND, and pass it to the parent. If something goes wrong,
+ the ready_deferred should errback, which will abandon the method
+ call that contains it.
+ * foolscap/slicers/dict.py (DictUnslicer.receiveClose): same
+ * foolscap/slicers/tuple.py (TupleUnslicer.receiveClose): same
+ (TupleUnslicer.complete): same
+ * foolscap/slicers/set.py (SetUnslicer.receiveClose): same
+ * foolscap/slicers/list.py (ListUnslicer.receiveClose): same
+ * foolscap/call.py (CallUnslicer.receiveClose): same
+
+ * foolscap/referenceable.py (TheirReferenceUnslicer.receiveClose):
+ use our ready_deferred to signal whether the gift resolves
+ correctly or not. If it fails, errback ready_deferred (to prevent
+ the message from being delivered without the resolved gift), but
+ callback obj_deferred with a placeholder to avoid causing too much
+ distress to the container.
+
+ * foolscap/broker.py (PBRootUnslicer.receiveChild): accept
+ ready_deferred in the InboundDelivery, stash both of them in the
+ broker.
+ (Broker.scheduleCall): rewrite inbound delivery handling: use a
+ self._call_is_running flag to prevent concurrent deliveries, and
+ wait for the ready_deferred before delivering the top-most
+ message. If the ready_deferred errbacks, that gets routed to
+ self.callFailed so the caller hears about the problem. This closes
+ ticket #2.
+
+ * foolscap/call.py (InboundDelivery): remove whenRunnable, relying
+ upon the ready_deferred to let the Broker know when the message
+ can be delivered.
+ (ArgumentUnslicer): significant cleanup, using ready_deferred.
+ Remove isReady and whenReady.
+
+ * foolscap/test/test_gifts.py (Base): factor setup code out
+ (Base.createCharacters): registerReference(tubname), for debugging
+ (Bad): add a bunch of tests to make sure that gifts which fail to
+ resolve (for various reasons) will inform the caller about the
+ problem, via an errback on the original callRemote()'s Deferred.
+
+2007-07-25 Brian Warner <warner@lothar.com>
+
+ * foolscap/util.py (AsyncAND): new utility class, which is like
+ DeferredList but is specifically for control flow rather than data
+ flow.
+ * foolscap/test/test_util.py: test it
+
+ * foolscap/call.py (CopiedFailure.setCopyableState): set .type to
+ a class that behaves (at least as far as reflect.qual() is
+ concerned) just like the original exception class. This improves
+ the behavior of derived Failure objects, as well as trial's
+ handling of CopiedFailures that get handed to log.err().
+ CopiedFailures are now a bit more like actual Failures. See ticket
+ #4 (http://foolscap.lothar.com/trac/ticket/4) for more details.
+ (CopiedFailureSlicer): make sure that CopiedFailures can be
+ serialized, so that A-calls-B-calls-C can return a failure all
+ the way back.
+ * foolscap/test/test_call.py (TestCall.testCopiedFailure): test it
+ * foolscap/test/test_copyable.py: update to match, now we must
+ compare reflect.qual(f.type) against some extension classname,
+ rather than just f.type.
+ * foolscap/test/test_pb.py: same
+ * foolscap/test/common.py: same
+
+2007-07-15 Brian Warner <warner@lothar.com>
+
+ * foolscap/test/test_interfaces.py (TestInterface.testStack):
+ don't look for a '/' in the stacktrace, since it won't be there
+ under windows. Thanks to 'strank'. Closes Twisted#2731.
+
+2007-06-29 Brian Warner <warner@lothar.com>
+
+ * foolscap/__init__.py: bump revision to 0.1.4+ while between releases
+ * misc/{sid|sarge|dapper|edgy|feisty}/debian/changelog: same
+
2007-05-14 Brian Warner <warner@lothar.com>
* foolscap/__init__.py: release Foolscap-0.1.4
+
User visible changes in Foolscap (aka newpb/pb2). -*- outline -*-
+* Release 0.1.5 (07 Aug 2007)
+
+** Compatibility
+
+This release is fully compatible with 0.1.4 and 0.1.3.
+
+** CopiedFailure improvements
+
+When a remote method call fails, the calling side gets back a CopiedFailure
+instance. These instances now behave slightly more like the (local) Failure
+objects that they are intended to mirror, in that .type now behaves much like
+the original class. This should allow trial tests which result in a
+CopiedFailure to be logged without exploding. In addition, chained failures
+(where A calls B, and B calls C, and C fails, so C's Failure is eventually
+returned back to A) should work correctly now.
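+
+For example, a caller can now trap a remote exception by class, much as it
+would with a local Failure. A minimal sketch ('rref' and the remote method
+that raises ValueError are hypothetical):
+
+  d = rref.callRemote("parse", data)
+  def _handle(f):
+      f.trap(ValueError)  # works because f.type mimics the original class
+      return "recovered"
+  d.addErrback(_handle)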
+
+** Gift improvements
+
+Gifts inside return values should properly stall the delivery of the response
+until the gift is resolved. Gifts in all sorts of containers should work
+properly now. Gifts which cannot be resolved successfully (either because the
+hosting Tub cannot be reached, or because the name cannot be found) will now
+cause a proper error rather than hanging forever. Unresolvable gifts in
+method arguments will cause the message to not be delivered and an error to
+be returned to the caller. Unresolvable gifts in method return values will
+cause the caller to receive an error.
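+
+The failure is reported as an ordinary errback. A minimal sketch (the method
+name and the 'gift' argument are hypothetical; 'gift' is a reference living
+on a third Tub):
+
+  d = rref.callRemote("deliver", gift)
+  def _failed(f):
+      # fires if the gift cannot be resolved (unreachable Tub, bad name),
+      # rather than the call hanging forever
+      print "gift delivery failed:", f
+  d.addErrback(_failed)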
+
+** IRemoteReference() adapter
+
+The IRemoteReference() interface now has an adapter from Referenceable which
+creates a wrapper that enables the use of callRemote() and other
+IRemoteReference methods on a local object.
+
+This might be useful when you have a central introducer and a bunch of
+clients, and the clients are introducing themselves to each other (to create
+a fully-connected mesh) using live references (i.e. Gifts). When a specific
+client learns about itself from the introducer, that client will receive a
+local object instead of a RemoteReference: each client winds up with n-1
+RemoteReferences and a single local object.
+
+This adapter allows the client to treat all these introductions as equal. A
+client that wishes to send a message to everyone it's been introduced to
+(including itself) can use:
+
+  for i in introductions:
+      IRemoteReference(i).callRemote("hello", args)
+
+In the future, if we implement coercing Guards (instead of
+compliance-asserting Constraints), then IRemoteReference will be useful as a
+guard on methods that want to ensure that they can do callRemote (and
+notifyOnDisconnect, etc) on their argument.
+
+** Tub.registerNameLookupHandler
+
+This method allows a one-argument name-lookup callable to be attached to the
+Tub. This augments the table maintained by Tub.registerReference, allowing
+Referenceables to be created on the fly, or persisted to disk and retrieved
+on demand, instead of requiring all of them to be generated and registered
+at startup.
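+
+A minimal sketch (the return-None-for-unknown-names convention and the
+persistence helpers are assumptions for illustration):
+
+  def lookup(name):
+      record = load_record_from_disk(name)   # hypothetical helper
+      if record is None:
+          return None                        # name is unknown
+      return make_referenceable(record)      # hypothetical helper
+
+  tub.registerNameLookupHandler(lookup)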
+
+
* Release 0.1.4 (14 May 2007)
** Compatibility
--- /dev/null
+-*- outline -*-
+
+Reasonably independent newpb sub-tasks that need doing. Most important come
+first.
+
+* decide on a version negotiation scheme
+
+Should be able to telnet into a PB server and find out that it is a PB
+server. Pointing a PB client at an HTTP server (or an HTTP client at a PB
+server) should result in an error, not a timeout. Implement in
+banana.Banana.connectionMade().
+
+desiderata:
+
+ negotiation should take place with regular banana sequences: don't invent a
+ new protocol that is only used at the start of the connection
+
+ Banana should be usable one-way, for storage or high-latency RPC (the mnet
+ folks want to create a method call, serialize it to a string, then encrypt
+ and forward it on to other nodes, sometimes storing it in relays along the
+ way if a node is offline for a few days). It should be easy for the layer
+ above Banana to feed it the results of what its negotiation would have been
+ (if it had actually used an interactive connection to its peer). Feeding the
+ same results to both sides should have them proceed as if they'd agreed to
+ those results.
+
+ negotiation should be flexible enough to be extended but still allow old
+ code to talk with new code. Magically predict every conceivable extension
+ and provide for it from the very first release :).
+
+There are many levels to banana, all of which could be useful targets of
+negotiation:
+
+ which basic tokens are in use? Is there a BOOLEAN token? a NONE token? Can
+ it accept a LONGINT token or is the target limited to 32-bit integers?
+
+ are there any variations in the basic Banana protocol being used? Could the
+ smaller-scope OPEN-counter decision be deferred until after the first
+ release and handled later with a compatibility negotiation flag?
+
+ What "base" OPEN sequences are known? 'unicode'? 'boolean'? 'dict'? This is
+ an overlap between expressing the capabilities of the host language, the
+ Banana implementation, and the needs of the application. How about
+ 'instance', probably only used for StorageBanana?
+
+ What "top-level" OPEN sequences are known? PB stuff (like 'call', and
+ 'your-reference')? Are there any variations or versions that need to be
+ known? We may add new functionality in the future, it might be useful for
+ one end to know whether this functionality is available or not. (the PB
+ 'call' sequence could some day take numeric argument names to convey
+ positional parameters, a 'reference' sequence could take a string to
+ indicate globally-visible PB URLs, it could become possible to pass
+ target.remote_foo directly to a peer and have a callable RemoteMethod object
+ pop out the other side).
+
+ What "application-level" sequences are available? (Which RemoteInterface
+ classes are known and valid in 'call' sequences? Which RemoteCopy names are
+ valid for targets of the 'copy' sequence?). This is not necessarily within
+ the realm of Banana negotiation, but applications may need to negotiate this
+ sort of thing, and any disagreements will be manifested when Banana starts
+ raising Violations, so it may be useful to include it in the Banana-level
+ negotiation.
+
+On the other hand, negotiation is only useful if one side is prepared to
+accommodate a peer which cannot do some of the things it would prefer to use,
+or if it wants to know about the incapabilities so it can report a useful
+failure rather than have an obscure protocol-level error message pop up an
+hour later. So negotiation isn't the only goal: simple capability awareness
+is a useful lesser goal.
+
+It kind of makes sense for the first object of a stream to be a negotiation
+blob. We could make a new 'version' opentype, and declare that the contents
+will be something simple and forever-after-parseable (like a dict, with heavy
+constraints on the keys and values, all strings emitted in full).
+
+DONE, at least the framework is in place. Uses HTTP-style header-block
+exchange instead of banana sequences, with client-sends-first and
+server-decides. This correctly handles PB-vs-HTTP, but requires a timeout to
+detect oldpb clients vs newpb servers. No actual feature negotiation is
+performed yet, because we still only have the one version of the code.
+
+* connection initiation
+
+** define PB URLs
+
+[newcred is the most important part of this, the URL stuff can wait]
+
+A URL defines an endpoint: a pb.Referenceable, with methods. Somewhere along
+the way it defines a transport (tcp+host+port, or unix+path) and an object
+reference (pathname). It might also define a RemoteInterface, or that might
+be put off until we actually invoke a method.
+
+ URL = f("pb:", host, port, pathname)
+ d = pb.callRemoteURL(URL, ifacename, methodname, args)
+
+probably give an actual RemoteInterface instead of just its name
+
+a pb.RemoteReference claims to provide access to zero-or-more
+RemoteInterfaces. You may choose which one you want to use when invoking
+callRemote.
+
+TODO: decide upon a syntax for URLs that refer to non-TCP transports
+ pb+foo://stuff, pby://stuff (for yURL-style self-authenticating names)
+
+TODO: write the URL parser, implementing pb.getRemoteURL and pb.callRemoteURL
+ DONE: use a Tub/PBService instead
+
+TODO: decide upon a calling convention for callRemote when specifying which
+RemoteInterface is being used.
+
+
+DONE, PB-URL is the way to go.
+** more URLs
+
+relative URLs (those without a host part) refer to objects on the same
+Broker. Absolute URLs (those with a host part) refer to objects on other
+Brokers.
+
+SKIP, interesting but not really useful
+
+** build/port pb.login: newcred for newpb
+
+Leave cred work for Glyph.
+
+<thomasvs> has some enhanced PB cred stuff (challenge/response, pb.Copyable
+credentials, etc).
+
+URL = pb.parseURL("pb://lothar.com:8789/users/warner/services/petmail",
+                  IAuthorization)
+URL = doFullLogin(URL, "warner", "x8yzzy")
+URL.callRemote(methodname, args)
+
+NOTDONE
+
+* constrain ReferenceUnslicer properly
+
+The schema can use a ReferenceConstraint to indicate that the object must be
+a RemoteReference, and can also require that the remote object be capable of
+handling a particular Interface.
+
+This needs to be implemented. slicer.ReferenceUnslicer must somehow actually
+ask the constraint about the incoming tokens.
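+
+Intended usage might look like this (a sketch only; the ReferenceConstraint
+constructor signature is a guess):
+
+  class RIPublisher(RemoteInterface):
+      def subscribe(watcher=ReferenceConstraint(RIWatcher)):
+          # 'watcher' must arrive as a RemoteReference that claims to
+          # implement RIWatcher
+          return None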
+
+An outstanding question is "what counts". The general idea is that
+RemoteReferences come over the wire as a connection-scoped ID number and an
+optional list of Interface names (strings and version numbers). In this case
+it is the far end which asserts that its object can implement any given
+Interface, and the receiving end just checks to see if the schema-imposed
+required Interface is in the list.
+
+This becomes more interesting when applied to local objects, or if a
+constraint is created which asserts that its object is *something* (maybe a
+RemoteReference, maybe a RemoteCopy) which implements a given Interface. In
+this case, the incoming object could be an actual instance, but the class
+name must be looked up in the unjellyableRegistry (and the class located, and
+the __implements__ list consulted) before any of the object's tokens are
+accepted.
+
+* security TODOs:
+
+** size constraints on the set-vocab sequence
+
+* implement schema.maxSize()
+
+In newpb, schemas serve two purposes:
+
+ a) make programs safer by reducing the surprises that can appear in their
+ arguments (i.e. factoring out argument-checking in a useful way)
+
+ b) remove memory-consumption DoS attacks by putting an upper bound on the
+ memory consumed by any particular message.
+
+Each schema has a pair of methods named maxSize() and maxDepth() which
+provide this upper bound. While the schema is in effect (say, during the
+receipt of a particular named argument to a remotely-invokable method), at
+most X bytes and Y slicer frames will be in use before either the object is
+accepted and processed or the schema notes the violation and the object is
+rejected (whereupon the temporary storage is released and all further bytes
+in the rejected object are simply discarded). Strictly speaking, the number
+returned by maxSize() is the largest string on the wire which has not yet
+been rejected as violating the constraint, but it is also a reasonable
+metric to describe how much internal storage must be used while processing
+it. (To achieve greater accuracy would involve knowing exactly how large
+each Python type is; not a sensible thing to attempt).
+
+The idea is that someone who is worried about an attacker throwing a really
+long string or an infinitely-nested list at them can ask the schema just what
+exactly their current exposure is. The tradeoff between flexibility ("accept
+any object whatsoever here") and exposure to DoS attack is then user-visible
+and thus user-selectable.
+
+To implement maxSize() for a basic schema (like a string), you simply need
+to look at banana.xhtml and see how basic tokens are encoded (you will also
+need to look at banana.py and see how deserialization is actually
+implemented). For a schema.StringConstraint(32) (which accepts strings <= 32
+characters in length), the largest serialized form that has not yet been
+either accepted or rejected is:
+
+ 64 bytes (header indicating 0x000000..0020 with lots of leading zeros)
+ + 1 byte (STRING token)
+ + 32 bytes (string contents)
+ = 97
+
+If the header indicates a conforming length (<=32) then just after the 32nd
+byte is received, the string object is created and handed up the stack, so
+the temporary storage tops out at 97. If someone is trying to spam us with a
+million-character string, the serialized form would look like:
+
+ 64 bytes (header indicating 1-million in hex, with leading zeros)
++ 1 byte (STRING token)
+= 65
+
+at which point the receive parser would check the constraint, decide that
+1000000 > 32, and reject the remainder of the object.
+
+So (with the exception of pass/fail maxSize values, see below), the following
+should hold true:
+
+ schema.StringConstraint(32).maxSize() == 97
+
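+In code, the arithmetic above might look like this (a sketch, not the actual
+schema.py implementation):
+
+  class StringConstraint:
+      def __init__(self, maxLength):
+          self.maxLength = maxLength
+      def maxSize(self):
+          # worst case: 64-byte length header, 1-byte STRING token,
+          # then the longest conforming string body
+          return 64 + 1 + self.maxLength
+
+  assert StringConstraint(32).maxSize() == 97
+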
+Now, schemas which represent containers have size limits that are the sum of
+their contents, plus some overhead (and a stack level) for the container
+itself. For example, a list of two small integers is represented in newbanana
+as:
+
+ OPEN(list)
+ INT
+ INT
+ CLOSE()
+
+which really looks like:
+
+ opencount-OPEN
+ len-STRING-"list"
+ value-INT
+ value-INT
+ opencount-CLOSE
+
+This sequence takes at most:
+
+ opencount-OPEN: 64+1
+ len-STRING-"list": 64+1+1000 (opentypes are confined to be <= 1k long)
+ value-INT: 64+1
+ value-INT: 64+1
+ opencount-CLOSE: 64+1
+
+or 5*(64+1)+1000 = 1325, or rather:
+
+ 3*(64+1)+1000 + N*(IntConstraint().maxSize())
+
+So ListConstraint.maxSize is computed by doing some math involving the
+.maxSize value of the objects that go into it (the ListConstraint.constraint
+attribute). This suggests a recursive algorithm. If any constraint is
+unbounded (say a ListConstraint with no limit on the length of the list),
+then maxSize() raises UnboundedSchema to indicate that there is no limit on
+the size of a conforming string. Clearly, if any constraint is found to
+include itself, UnboundedSchema must also be raised.
+
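+A sketch of the recursive computation (the overhead terms follow the
+arithmetic above; the 'seen' set is one way to detect self-inclusion):
+
+  class UnboundedSchema(Exception):
+      pass
+
+  class ListConstraint:
+      def __init__(self, constraint, maxLength=None):
+          self.constraint = constraint
+          self.maxLength = maxLength
+      def maxSize(self, seen=None):
+          if seen is None:
+              seen = set()
+          if id(self) in seen:
+              raise UnboundedSchema   # constraint includes itself
+          if self.maxLength is None:
+              raise UnboundedSchema   # no limit on the list length
+          seen.add(id(self))
+          # OPEN + opentype string + CLOSE overhead, plus the items
+          return (3*(64+1) + 1000
+                  + self.maxLength * self.constraint.maxSize(seen))
+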
+This is a loose upper bound. For example, one non-conforming input string
+would be:
+
+ opencount-OPEN: 64+1
+ len-STRING-"x"*1000: 64+1+1000
+
+The entire string would be accepted before checking to see which opentypes
+were valid: the ListConstraint only accepts the "list" opentype and would
+reject this string immediately after the 1000th "x" was received. So a
+tighter upper bound would be 2*65+1000 = 1130.
+
+In general, the bound is computed by walking through the deserialization
+process and identifying the largest string that could make it past the
+validity checks. There may be later checks that will reject the string, but
+if it has not yet been rejected, then it still represents exposure for a
+memory consumption DoS.
+
+** pass/fail sizes
+
+I started to think that it was necessary to have each constraint provide two
+maxSize numbers: one of the largest sequence that could possibly be accepted
+as valid, and a second which was the largest sequence that could still be
+undecided. This would provide a more accurate upper bound because most
+containers will respond to an invalid object by abandoning the rest of the
+container: i.e. if the current active constraint is:
+
+ ListConstraint(StringConstraint(32), maxLength=30)
+
+then the first thing that doesn't match the string constraint (say an
+instance, or a number, or a 33-character string) will cause the ListUnslicer
+to go into discard-everything mode. This makes a significant difference when
+the per-item constraint allows opentypes, because the OPEN type (a string) is
+constrained to 1k bytes. The item constraint probably imposes a much smaller
+limit on the set of actual strings that would be accepted, so no
+kilobyte-long opentype will possibly make it past that constraint. That means
+there can only be one outstanding invalid object. So the worst case (maximal
+length) string that has not yet been rejected would be something like:
+
+ OPEN(list)
+ validthing [0]
+ validthing [1]
+ ...
+ validthing [n-1]
+ long-invalid-thing
+
+because if the long-invalid thing had been received earlier, the entire list
+would have been abandoned.
+
+This suggests that the calculation for ListConstraint.maxSize() really needs
+to be like
+ overhead
+ +(len-1)*itemConstraint.maxSize(valid)
+ +(1)*itemConstraint.maxSize(invalid)
+
+I'm still not sure about this. I think it provides a significantly tighter
+upper bound. The deserialization process itself does not try to achieve the
+absolute minimal exposure (i.e., the opentype checker could take the set of
+all known-valid open types, compute the maximum length, and then impose a
+StringConstraint with that length instead of 1000), because it is, in
+general, an inefficient hassle. There is a tradeoff between computational
+efficiency and removing the slack in the maxSize bound, both in the
+deserialization process (where the memory is actually consumed) and in
+maxSize (where we estimate how much memory could be consumed).
+
+Anyway, maxSize() and maxDepth() (which is easier: containers add 1 to the
+maximum of the maxDepth values of their possible children) need to be
+implemented for all the Constraint classes. There are some tests (disabled)
+in test_schema.py for this code: those tests assert specific values for
+maxSize. Those values are probably wrong, so they must be updated to match
+however maxSize actually works.
+
+* decide upon what the "Shared" constraint should mean
+
+The idea of this one was to avoid some vulnerabilities by rejecting arbitrary
+object graphs. Fundamentally Banana can represent most anything (just like
+pickle), including objects that refer to each other in exciting loops and
+whorls. There are two problems with this: it is hard to enforce a schema that
+allows cycles in the object graph (indeed it is tricky to even describe one),
+and the shared references could be used to temporarily violate a schema.
+
+I think these might be fixable (the sample case is where one tuple is
+referenced in two different places, each with a different constraint, but the
+tuple is incomplete until some higher-level node in the graph has become
+referenceable, so [maybe] the schema can't be enforced until somewhat after
+the object has actually finished arriving).
+
+However, Banana is aimed at two different use-cases. One is kind of a
+replacement for pickle, where the goal is to allow arbitrary object graphs to
+be serialized but have more control over the process (in particular we still
+have an unjellyableRegistry to prevent arbitrary constructors from being
+executed during deserialization). In this mode, a larger set of Unslicers are
+available (for modules, bound methods, etc), and schemas may still be useful
+but are not enforced by default.
+
+PB will use the other mode, where the set of conveyable objects is much
+smaller, and security is the primary goal (including putting limits on
+resource consumption). Schemas are enforced by default, and all constraints
+default to sensible size limits (strings to 1k, lists to [currently] 30
+items). Because complex object graphs are not commonly transported across
+process boundaries, the default is to not allow any Copyable object to be
+referenced multiple times in the same serialization stream. The default is to
+reject both cycles and shared references in the object graph, allowing only
+strict trees, making life easier (and safer) for the remote methods which are
+being given this object tree.
+
+The "Shared" constraint is intended as a way to turn off this default
+strictness and allow the object to be referenced multiple times. The
+outstanding question is what this should really mean: must it be marked as
+such on all places where it could be referenced, what is the scope of the
+multiple-reference region (per-method-call, per-connection?), and finally
+what should be done when the limit is violated. Currently Unslicers see an
+Error object which they can respond to any way they please: the default
+containers abandon the rest of their contents and hand an Error to their
+parent, the MethodCallUnslicer returns an exception to the caller, etc. With
+shared references, the first recipient sees a valid object, while the second
+and later recipients see an error.
+
+
+* figure out Deferred errors for immutable containers
+
+Somewhat related to the previous one. The now-classic example of an immutable
+container which cannot be created right away is the object created by this
+sequence:
+
+ t = ([],)
+ t[0].append((t,))
+
+This serializes into (with implicit reference numbers on the left):
+
+[0] OPEN(tuple)
+[1] OPEN(list)
+[2] OPEN(tuple)
+[3] OPEN(reference #0)
+ CLOSE
+ CLOSE
+ CLOSE
+
+In newbanana, the second TupleUnslicer cannot return a fully-formed tuple to
+its parent (the ListUnslicer), because that tuple cannot be created until the
+contents are all referenceable, and that cannot happen until the first
+TupleUnslicer has completed. So the second TupleUnslicer returns a Deferred
+instead of a tuple, and the ListUnslicer adds a callback which updates the
+list's item when the tuple is complete.
+
+The problem here is that of error handling. In general, if an exception is
+raised (perhaps a protocol error, perhaps a schema violation) while an
+Unslicer is active, that Unslicer is abandoned (all its remaining tokens are
+discarded) and the parent gets an Error object. (the parent may give up too..
+the basic Unslicers all behave this way, so any exception will cause
+everything up to the RootUnslicer to go boom, and the RootUnslicer has the
+option of dropping the connection altogether). When the error is noticed, the
+Unslicer stack is queried to figure out what path was taken from the root of
+the object graph to the site that had an error. This is really useful when
+trying to figure out which exact object caused a SchemaViolation: rather than
+being told a call trace or a description of the *object* which had a problem,
+you get a description of the path to that object (the same series of
+dereferences you'd use to print the object: obj.children[12].peer.foo.bar).
+
+When references are allowed, these exceptions could occur after the original
+object has been received, when that Deferred fires. There are two problems:
+one is that the error path is now misleading, the other is that it might not
+have been possible to enforce a schema because the object was incomplete.
+
+The most important thing is to make sure that an exception that occurs while
+the Deferred is being fired is caught properly and flunks the object just as
+if the problem were caught synchronously. This may involve discarding an
+otherwise complete object graph and blaming the problem on a node much closer
+to the root than the one which really caused the failure.
+
+* adaptive VOCAB compression
+
+We want to let banana figure out a good set of strings to compress on its
+own. In Banana.sendToken, keep a list of the last N strings that had to be
+sent in full (i.e. they weren't in the table). If the string being sent
+appears more than M times in that table, before we send the token, emit an
+ADDVOCAB sequence, add a vocab entry for it, then send a numeric VOCAB token
+instead of the string.
+
+Make sure the vocab mapping is not used until the ADDVOCAB sequence has been
+queued. Sending it inline should take care of this, but if for some reason we
+need to push it on the top-level object queue, we need to make sure the vocab
+table is not updated until it gets serialized. Queuing a VocabUpdate object,
+which updates the table when it gets serialized, would take care of this. The
+advantage of doing it inline is that later strings in the same object graph
+would benefit from the mapping. The disadvantage is that the receiving
+Unslicers must be prepared to deal with ADDVOCAB sequences at any time (so
+really they have to be stripped out). This disadvantage goes away if ADDVOCAB
+is a token instead of a sequence.
+
+Reasonable starting values for N and M might be 30 and 3.
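+
+A sketch of the bookkeeping (the function name is a placeholder):
+
+  N, M = 30, 3
+  recent = []   # last N strings that had to be sent in full
+
+  def deservesVocabEntry(s):
+      recent.append(s)
+      if len(recent) > N:
+          recent.pop(0)
+      return recent.count(s) > M   # frequent enough to earn an ADDVOCAB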
+
+* write oldbanana compatibility code?
+
+An oldbanana peer can be detected because the server side sends its dialect
+list from connectionMade, and oldbanana lists are sent with OLDLIST tokens
+(the explicit-length kind).
+
+
+* add .describe methods to all Slicers
+
+This involves setting an attribute between each yield call, to indicate what
+part is about to be serialized.
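+
+For example (a sketch; the attribute and method names are guesses):
+
+  class DictSlicer(BaseSlicer):
+      def slice(self):
+          for key, value in self.obj.items():
+              self._describe = "key %r" % (key,)
+              yield key
+              self._describe = "value[%r]" % (key,)
+              yield value
+      def describe(self):
+          return self._describe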
+
+
+* serialize remotely-callable methods?
+
+It might be useful to be able to do something like:
+
+  class Watcher(pb.Referenceable):
+      def remote_foo(self, args): blah
+
+  w = Watcher()
+  ref.callRemote("subscribe", w.remote_foo)
+
+That would involve looking up the method and its parent object, reversing
+the remote_*->* transformation, then sending a sequence which contained both
+the object's RemoteReference and the appropriate method name.
+
+It might also be useful to generalize this: passing a lambda expression to
+the remote end could stash the callable in a local table and send a Callable
+Reference to the other side. I can smell a good general-purpose object
+classification framework here, but I haven't quite been able to nail it down
+exactly.
+
+* testing
+
+** finish testing of LONGINT/LONGNEG
+
+test_banana.InboundByteStream.testConstrainedInt needs implementation
+
+** thoroughly test failure-handling at all points of in/out serialization
+
+places where BananaError or Violation might be raised
+
+sending side:
+ Slicer creation (schema pre-validation? no): no no
+ pre-validation is done before sending the object, Broker.callFinished,
+ RemoteReference.doCall
+ slicer creation is done in newSlicerFor
+
+ .slice (called in pushSlicer) ?
+ .slice.next raising Violation
+ .slice.next returning Deferrable when streaming isn't allowed
+ .sendToken (non-primitive token, can't happen)
+ .newSlicerFor (no ISlicer adapter)
+ top.childAborted
+
+receiving side:
+ long header (>64 bytes)
+ checkToken (top.openerCheckToken)
+ checkToken (top.checkToken)
+ typebyte == LIST (oldbanana)
+ bad VOCAB key
+ too-long vocab key
+ bad FLOAT encoding
+ top.receiveClose
+ top.finish
+ top.reportViolation
+ oldtop.finish (in from handleViolation)
+ top.doOpen
+ top.start
+plus all of these when discardCount != 0
+OPENOPEN
+
+send-side uses:
+ f = top.reportViolation(f)
+receive-side should use it too (instead of f.raiseException)
+
+** test failure-handing during callRemote argument serialization
+
+** implement/test some streaming Slicers
+
+** test producer Banana
+
+* profiling/optimization
+
+Several areas where I suspect performance issues but am unwilling to fix
+them before having proof that there is a problem:
+
+** Banana.produce
+
+This is the main loop which creates outbound tokens. It is called once at
+connectionMade() (after version negotiation) and thereafter is fired as the
+result of a Deferred whose callback is triggered by a new item being pushed
+on the output queue. It runs until the output queue is empty, or the
+production process is paused (by a consumer who is full), or streaming is
+enabled and one of the Slicers wants to pause.
+
+Each pass through the loop pushes a single token into the transport,
+resulting in a number of short writes. We can do better than this by telling
+the transport to buffer the individual writes and calling a flush() method
+when we leave the loop. I think Itamar's new cprotocol work provides this
+sort of hook, but it would be nice if there were a generalized Transport
+interface so that Protocols could promise their transports that they will
+use flush() when they've stopped writing for a little while.
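+
+In miniature, the buffering hook might look like this (a sketch; the real
+method would live on the transport):
+
+  class BufferingTransport:
+      def __init__(self, transport):
+          self.transport = transport
+          self._buf = []
+      def write(self, data):
+          self._buf.append(data)    # coalesce the per-token short writes
+      def flush(self):
+          self.transport.write("".join(self._buf))
+          self._buf = []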
+
+Also, I want to be able to move produce() into C code. This means defining a
+CSlicer in addition to the cprotocol stuff mentioned above. The goal is to be able to
+slice a large tree of basic objects (lists, tuples, dicts, strings) without
+surfacing into Python code at all, only coming "up for air" when we hit an
+object type that we don't recognize as having a CSlicer available.
+
+** Banana.handleData
+
+The receive-tokenization process wants to be moved into C code. It's
+definitely on the critical path, but it's ugly because it has to keep
+calling into python code to handle each extracted token. Maybe there is a
+way to have fast C code peek through the incoming buffers for token
+boundaries, then give a list of offsets and lengths to the python code. The
+b128 conversion should also happen in C. The data shouldn't be pulled out of
+the input buffer until we've decided to accept it (i.e. the
+memory-consumption guarantees that the schemas provide do not take any
+transport-level buffering into account, and doing cprotocol tokenization
+would represent memory that an attacker can make us spend without triggering
+a schema violation). Itamar's CLineReceiver is a good example: you tokenize
+a big buffer as much as you can, pass the tokens upstairs to Python code,
+then hand the leftover tail to the next read() call. The tokenizer always
+works on the concatenation of two buffers: the tail of the previous read()
+and the complete contents of the current one.
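+
+The tail-concatenation pattern in Python (a sketch of the approach, not the
+proposed C code; tokenize() and handleToken() are placeholders):
+
+  class Tokenizer:
+      def __init__(self):
+          self.tail = ""
+      def dataReceived(self, data):
+          buf = self.tail + data
+          # tokenize as much as possible, keep the leftover partial token
+          tokens, self.tail = self.tokenize(buf)
+          for t in tokens:
+              self.handleToken(t)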
+
+** Unslicer.doOpen delegation
+
+Unslicers form a stack, and each Unslicer gets to exert control over the way
+that its descendants are deserialized. Most don't bother; they just delegate
+the control methods up to the RootUnslicer. For example, doOpen() takes an
+opentype and may return a new Unslicer to handle the new OPEN sequence. Most
+of the time, each Unslicer delegates doOpen() to its parent, all the way
+up the stack to the RootUnslicer who actually performs the UnslicerRegistry
+lookup.
+
+This provides an optimization point. In general, the Unslicer knows ahead of
+time whether it cares to be involved in these methods or not (i.e. whether
+it wants to pay attention to its children/descendants or not). So instead of
+delegating all the time, we could just have a separate Opener stack.
+Unslicers that care would be pushed on the Opener stack at the same time
+they are pushed on the regular unslicer stack, likewise removed. The
+doOpen() method would only be invoked on the top-most Opener, removing a lot
+of method calls. (I think the math is something like turning
+avg(treedepth)*avg(nodes) into avg(nodes)).
+
+There are some other methods that are delegated in this way. open() is
+related to doOpen(). setObject()/getObject() keep track of references to
+shared objects and are typically only intercepted by a second-level object
+which defines a "serialization scope" (like a single remote method call), as
+well as connection-wide references (like pb.Referenceables) tracked by the
+PBRootUnslicer. These would also be targets for optimization.
+
+The fundamental reason for this optimization is that most Unslicers don't
+care about these methods. There are far more uses of doOpen() (one per
+object node) than there are changes to the desired behavior of doOpen().
+
+** CUnslicer
+
+Like CSlicer, the unslicing process wants to be able to be implemented (for
+built-in objects) entirely in C. This means a CUnslicer "object" (a struct
+full of function pointers), a table accessible from C that maps opentypes to
+both CUnslicers and regular python-based Unslicers, and a CProtocol
+tokenization code fed by a CTransport. It should be possible for the
+python->C transition to occur in the reactor when it calls ctransport.doRead
+and then not come back up to Python until Banana.receivedObject(),
+at least for built-in types like dicts and strings.
+++ /dev/null
--*- outline -*-
-
-Reasonably independent newpb sub-tasks that need doing. Most important come
-first.
-
-* decide on a version negotiation scheme
-
-Should be able to telnet into a PB server and find out that it is a PB
-server. Pointing a PB client at an HTTP server (or an HTTP client at a PB
-server) should result in an error, not a timeout. Implement in
-banana.Banana.connectionMade().
-
-desiderata:
-
- negotiation should take place with regular banana sequences: don't invent a
- new protocol that is only used at the start of the connection
-
- Banana should be useable one-way, for storage or high-latency RPC (the mnet
- folks want to create a method call, serialize it to a string, then encrypt
- and forward it on to other nodes, sometimes storing it in relays along the
- way if a node is offline for a few days). It should be easy for the layer
- above Banana to feed it the results of what its negotiation would have been
- (if it had actually used an interactive connection to its peer). Feeding the
- same results to both sides should have them proceed as if they'd agreed to
- those results.
-
- negotiation should be flexible enough to be extended but still allow old
- code to talk with new code. Magically predict every conceivable extension
- and provide for it from the very first release :).
-
-There are many levels to banana, all of which could be useful targets of
-negotiation:
-
- which basic tokens are in use? Is there a BOOLEAN token? a NONE token? Can
- it accept a LONGINT token or is the target limited to 32-bit integers?
-
- are there any variations in the basic Banana protocol being used? Could the
- smaller-scope OPEN-counter decision be deferred until after the first
- release and handled later with a compatibility negotiation flag?
-
- What "base" OPEN sequences are known? 'unicode'? 'boolean'? 'dict'? This is
- an overlap between expressing the capabilities of the host language, the
- Banana implementation, and the needs of the application. How about
- 'instance', probably only used for StorageBanana?
-
- What "top-level" OPEN sequences are known? PB stuff (like 'call', and
- 'your-reference')? Are there any variations or versions that need to be
- known? We may add new functionality in the future, it might be useful for
- one end to know whether this functionality is available or not. (the PB
- 'call' sequence could some day take numeric argument names to convey
- positional parameters, a 'reference' sequence could take a string to
- indicate globally-visible PB URLs, it could become possible to pass
- target.remote_foo directly to a peer and have a callable RemoteMethod object
- pop out the other side).
-
- What "application-level" sequences are available? (Which RemoteInterface
- classes are known and valid in 'call' sequences? Which RemoteCopy names are
- valid for targets of the 'copy' sequence?). This is not necessarily within
- the realm of Banana negotiation, but applications may need to negotiate this
- sort of thing, and any disagreements will be manifested when Banana starts
- raising Violations, so it may be useful to include it in the Banana-level
- negotiation.
-
-On the other hand, negotiation is only useful if one side is prepared to
-accomodate a peer which cannot do some of the things it would prefer to use,
-or if it wants to know about the incapabilities so it can report a useful
-failure rather than have an obscure protocol-level error message pop up an
-hour later. So negotiation isn't the only goal: simple capability awareness
-is a useful lesser goal.
-
-It kind of makes sense for the first object of a stream to be a negotiation
-blob. We could make a new 'version' opentype, and declare that the contents
-will be something simple and forever-after-parseable (like a dict, with heavy
-constraints on the keys and values, all strings emitted in full).
-
-DONE, at least the framework is in place. Uses HTTP-style header-block
-exchange instead of banana sequences, with client-sends-first and
-server-decides. This correctly handles PB-vs-HTTP, but requires a timeout to
-detect oldpb clients vs newpb servers. No actual feature negotiation is
-performed yet, because we still only have the one version of the code.
-
-* connection initiation
-
-** define PB URLs
-
-[newcred is the most important part of this, the URL stuff can wait]
-
-A URL defines an endpoint: a pb.Referenceable, with methods. Somewhere along
-the way it defines a transport (tcp+host+port, or unix+path) and an object
-reference (pathname). It might also define a RemoteInterface, or that might
-be put off until we actually invoke a method.
-
- URL = f("pb:", host, port, pathname)
- d = pb.callRemoteURL(URL, ifacename, methodname, args)
-
-probably give an actual RemoteInterface instead of just its name
-
-a pb.RemoteReference claims to provide access to zero-or-more
-RemoteInterfaces. You may choose which one you want to use when invoking
-callRemote.
-
-TODO: decide upon a syntax for URLs that refer to non-TCP transports
- pb+foo://stuff, pby://stuff (for yURL-style self-authenticating names)
-
-TODO: write the URL parser, implementing pb.getRemoteURL and pb.callRemoteURL
- DONE: use a Tub/PBService instead
-
-TODO: decide upon a calling convention for callRemote when specifying which
-RemoteInterface is being used.
-
-
-DONE, PB-URL is the way to go.
-** more URLs
-
-relative URLs (those without a host part) refer to objects on the same
-Broker. Absolute URLs (those with a host part) refer to objects on other
-Brokers.
-
-SKIP, interesting but not really useful
-
-** build/port pb.login: newcred for newpb
-
-Leave cred work for Glyph.
-
-<thomasvs> has some enhanced PB cred stuff (challenge/response, pb.Copyable
-credentials, etc).
-
-URL = pb.parseURL("pb://lothar.com:8789/users/warner/services/petmail",
- IAuthorization)
-URL = doFullLogin(URL, "warner", "x8yzzy")
-URL.callRemote(methodname, args)
-
-NOTDONE
-
-* constrain ReferenceUnslicer properly
-
-The schema can use a ReferenceConstraint to indicate that the object must be
-a RemoteReference, and can also require that the remote object be capable of
-handling a particular Interface.
-
-This needs to be implemented. slicer.ReferenceUnslicer must somehow actually
-ask the constraint about the incoming tokens.
-
-An outstanding question is "what counts". The general idea is that
-RemoteReferences come over the wire as a connection-scoped ID number and an
-optional list of Interface names (strings and version numbers). In this case
-it is the far end which asserts that its object can implement any given
-Interface, and the receiving end just checks to see if the schema-imposed
-required Interface is in the list.
-
-This becomes more interesting when applied to local objects, or if a
-constraint is created which asserts that its object is *something* (maybe a
-RemoteReference, maybe a RemoteCopy) which implements a given Interface. In
-this case, the incoming object could be an actual instance, but the class
-name must be looked up in the unjellyableRegistry (and the class located, and
-the __implements__ list consulted) before any of the object's tokens are
-accepted.
-
-* security TODOs:
-
-** size constraints on the set-vocab sequence
-
-* implement schema.maxSize()
-
-In newpb, schemas serve two purposes:
-
- a) make programs safer by reducing the surprises that can appear in their
- arguments (i.e. factoring out argument-checking in a useful way)
-
- b) remove memory-consumption DoS attacks by putting an upper bound on the
- memory consumed by any particular message.
-
-Each schema has a pair of methods named maxSize() and maxDepth() which
-provide this upper bound. While the schema is in effect (say, during the
-receipt of a particular named argument to a remotely-invokable method), at
-most X bytes and Y slicer frames will be in use before either the object is
-accepted and processed or the schema notes the violation and the object is
-rejected (whereupon the temporary storage is released and all further bytes
-in the rejected object are simply discarded). Strictly speaking, the number
-returned by maxSize() is the largest string on the wire which has not yet
-been rejected as violating the constraint, but it is also a reasonable
-metric to describe how much internal storage must be used while processing
-it. (To achieve greater accuracy would involve knowing exactly how large
-each Python type is; not a sensible thing to attempt).
-
-The idea is that someone who is worried about an attacker throwing a really
-long string or an infinitely-nested list at them can ask the schema just what
-exactly their current exposure is. The tradeoff between flexibility ("accept
-any object whatsoever here") and exposure to DoS attack is then user-visible
-and thus user-selectable.
-
-To implement maxSize() for a basic schema (like a string), you simply need
-to look at banana.xhtml and see how basic tokens are encoded (you will also
-need to look at banana.py and see how deserialization is actually
-implemented). For a schema.StringConstraint(32) (which accepts strings <= 32
-characters in length), the largest serialized form that has not yet been
-either accepted or rejected is:
-
- 64 bytes (header indicating 0x000000..0020 with lots of leading zeros)
- + 1 byte (STRING token)
- + 32 bytes (string contents)
- = 97
-
-If the header indicates a conforming length (<=32) then just after the 32nd
-byte is received, the string object is created and handed to up the stack, so
-the temporary storage tops out at 97. If someone is trying to spam us with a
-million-character string, the serialized form would look like:
-
- 64 bytes (header indicating 1-million in hex, with leading zeros)
-+ 1 byte (STRING token)
-= 65
-
-at which point the receive parser would check the constraint, decide that
-1000000 > 32, and reject the remainder of the object.
-
-So (with the exception of pass/fail maxSize values, see below), the following
-should hold true:
-
- schema.StringConstraint(32).maxSize() == 97
-
-Now, schemas which represent containers have size limits that are the sum of
-their contents, plus some overhead (and a stack level) for the container
-itself. For example, a list of two small integers is represented in newbanana
-as:
-
- OPEN(list)
- INT
- INT
- CLOSE()
-
-which really looks like:
-
- opencount-OPEN
- len-STRING-"list"
- value-INT
- value-INT
- opencount-CLOSE
-
-This sequence takes at most:
-
- opencount-OPEN: 64+1
- len-STRING-"list": 64+1+1000 (opentypes are confined to be <= 1k long)
- value-INT: 64+1
- value-INT: 64+1
- opencount-CLOSE: 64+1
-
-or 5*(64+1)+1000 = 1325, or rather:
-
- 3*(64+1)+1000 + N*(IntConstraint().maxSize())
-
-So ListConstraint.maxSize is computed by doing some math involving the
-.maxSize value of the objects that go into it (the ListConstraint.constraint
-attribute). This suggests a recursive algorithm. If any constraint is
-unbounded (say a ListConstraint with no limit on the length of the list),
-then maxSize() raises UnboundedSchema to indicate that there is no limit on
-the size of a conforming string. Clearly, if any constraint is found to
-include itself, UnboundedSchema must also be raised.
-
-This is a loose upper bound. For example, one non-conforming input string
-would be:
-
- opencount-OPEN: 64+1
- len-STRING-"x"*1000: 64+1+1000
-
-The entire string would be accepted before checking to see which opentypes
-were valid: the ListConstraint only accepts the "list" opentype and would
-reject this string immediately after the 1000th "x" was received. So a
-tighter upper bound would be 2*65+1000 = 1130.
-
-In general, the bound is computed by walking through the deserialization
-process and identifying the largest string that could make it past the
-validity checks. There may be later checks that will reject the string, but
-if it has not yet been rejected, then it still represents exposure for a
-memory consumption DoS.
-
-** pass/fail sizes
-
-I started to think that it was necessary to have each constraint provide two
-maxSize numbers: one of the largest sequence that could possibly be accepted
-as valid, and a second which was the largest sequence that could be still
-undecided. This would provide a more accurate upper bound because most
-containers will respond to an invalid object by abandoning the rest of the
-container: i.e. if the current active constraint is:
-
- ListConstraint(StringConstraint(32), maxLength=30)
-
-then the first thing that doesn't match the string constraint (say an
-instance, or a number, or a 33-character string) will cause the ListUnslicer
-to go into discard-everything mode. This makes a significant difference when
-the per-item constraint allows opentypes, because the OPEN type (a string) is
-constrained to 1k bytes. The item constraint probably imposes a much smaller
-limit on the set of actual strings that would be accepted, so no
-kilobyte-long opentype will possibly make it past that constraint. That means
-there can only be one outstanding invalid object. So the worst case (maximal
-length) string that has not yet been rejected would be something like:
-
- OPEN(list)
- validthing [0]
- validthing [1]
- ...
- validthing [n-1]
- long-invalid-thing
-
-because if the long-invalid thing had been received earlier, the entire list
-would have been abandoned.
-
-This suggests that the calculation for ListConstraint.maxSize() really needs
-to be like
- overhead
- +(len-1)*itemConstraint.maxSize(valid)
- +(1)*itemConstraint.maxSize(invalid)
-
-I'm still not sure about this. I think it provides a significantly tighter
-upper bound. The deserialization process itself does not try to achieve the
-absolute minimal exposure (i.e., the opentype checker could take the set of
-all known-valid open types, compute the maximum length, and then impose a
-StringConstraint with that length instead of 1000), because it is, in
-general, a inefficient hassle. There is a tradeoff between computational
-efficiency and removing the slack in the maxSize bound, both in the
-deserialization process (where the memory is actually consumed) and in
-maxSize (where we estimate how much memory could be consumed).
-
-Anyway, maxSize() and maxDepth() (which is easier: containers add 1 to the
-maximum of the maxDepth values of their possible children) need to be
-implemented for all the Constraint classes. There are some tests (disabled)
-in test_schema.py for this code: those tests assert specific values for
-maxSize. Those values are probably wrong, so they must be updated to match
-however maxSize actually works.
-
-* decide upon what the "Shared" constraint should mean
-
-The idea of this one was to avoid some vulnerabilities by rejecting arbitrary
-object graphs. Fundamentally Banana can represent most anything (just like
-pickle), including objects that refer to each other in exciting loops and
-whorls. There are two problems with this: it is hard to enforce a schema that
-allows cycles in the object graph (indeed it is tricky to even describe one),
-and the shared references could be used to temporarily violate a schema.
-
-I think these might be fixable (the sample case is where one tuple is
-referenced in two different places, each with a different constraint, but the
-tuple is incomplete until some higher-level node in the graph has become
-referenceable, so [maybe] the schema can't be enforced until somewhat after
-the object has actually finished arriving).
-
-However, Banana is aimed at two different use-cases. One is kind of a
-replacement for pickle, where the goal is to allow arbitrary object graphs to
-be serialized but have more control over the process (in particular we still
-have an unjellyableRegistry to prevent arbitrary constructors from being
-executed during deserialization). In this mode, a larger set of Unslicers are
-available (for modules, bound methods, etc), and schemas may still be useful
-but are not enforced by default.
-
-PB will use the other mode, where the set of conveyable objects is much
-smaller, and security is the primary goal (including putting limits on
-resource consumption). Schemas are enforced by default, and all constraints
-default to sensible size limits (strings to 1k, lists to [currently] 30
-items). Because complex object graphs are not commonly transported across
-process boundaries, the default is to not allow any Copyable object to be
-referenced multiple times in the same serialization stream. The default is to
-reject both cycles and shared references in the object graph, allowing only
-strict trees, making life easier (and safer) for the remote methods which are
-being given this object tree.
-
-The "Shared" constraint is intended as a way to turn off this default
-strictness and allow the object to be referenced multiple times. The
-outstanding question is what this should really mean: must it be marked as
-such on all places where it could be referenced, what is the scope of the
-multiple-reference region (per- method-call, per-connection?), and finally
-what should be done when the limit is violated. Currently Unslicers see an
-Error object which they can respond to any way they please: the default
-containers abandon the rest of their contents and hand an Error to their
-parent, the MethodCallUnslicer returns an exception to the caller, etc. With
-shared references, the first recipient sees a valid object, while the second
-and later recipient sees an error.
-
-
-* figure out Deferred errors for immutable containers
-
-Somewhat related to the previous one. The now-classic example of an immutable
-container which cannot be created right away is the object created by this
-sequence:
-
- t = ([],)
- t[0].append((t,))
-
-This serializes into (with implicit reference numbers on the left):
-
-[0] OPEN(tuple)
-[1] OPEN(list)
-[2] OPEN(tuple)
-[3] OPEN(reference #0)
- CLOSE
- CLOSE
- CLOSE
-
-In newbanana, the second TupleUnslicer cannot return a fully-formed tuple to
-its parent (the ListUnslicer), because that tuple cannot be created until the
-contents are all referenceable, and that cannot happen until the first
-TupleUnslicer has completed. So the second TupleUnslicer returns a Deferred
-instead of a tuple, and the ListUnslicer adds a callback which updates the
-list's item when the tuple is complete.
-
-The problem here is that of error handling. In general, if an exception is
-raised (perhaps a protocol error, perhaps a schema violation) while an
-Unslicer is active, that Unslicer is abandoned (all its remaining tokens are
-discarded) and the parent gets an Error object. (the parent may give up too..
-the basic Unslicers all behave this way, so any exception will cause
-everything up to the RootUnslicer to go boom, and the RootUnslicer has the
-option of dropping the connection altogether). When the error is noticed, the
-Unslicer stack is queried to figure out what path was taken from the root of
-the object graph to the site that had an error. This is really useful when
-trying to figure out which exact object cause a SchemaViolation: rather than
-being told a call trace or a description of the *object* which had a problem,
-you get a description of the path to that object (the same series of
-dereferences you'd use to print the object: obj.children[12].peer.foo.bar).
-
-When references are allowed, these exceptions could occur after the original
-object has been received, when that Deferred fires. There are two problems:
-one is that the error path is now misleading, the other is that it might not
-have been possible to enforce a schema because the object was incomplete.
-
-The most important thing is to make sure that an exception that occurs while
-the Deferred is being fired is caught properly and flunks the object just as
-if the problem were caught synchronously. This may involve discarding an
-otherwise complete object graph and blaming the problem on a node much closer
-to the root than the one which really caused the failure.
-
-* adaptive VOCAB compression
-
-We want to let banana figure out a good set of strings to compress on its
-own. In Banana.sendToken, keep a list of the last N strings that had to be
-sent in full (i.e. they weren't in the vocab table). If the string being sent
-appears more than M times in that list, then before we send the token, emit
-an ADDVOCAB sequence, add a vocab entry for it, and send a numeric VOCAB
-token instead of the string.
-
-Make sure the vocab mapping is not used until the ADDVOCAB sequence has been
-queued. Sending it inline should take care of this, but if for some reason we
-need to push it on the top-level object queue, we need to make sure the vocab
-table is not updated until it gets serialized. Queuing a VocabUpdate object,
-which updates the table when it gets serialized, would take care of this. The
-advantage of doing it inline is that later strings in the same object graph
-would benefit from the mapping. The disadvantage is that the receiving
-Unslicers must be prepared to deal with ADDVOCAB sequences at any time (so
-really they have to be stripped out). This disadvantage goes away if ADDVOCAB
-is a token instead of a sequence.
-
-Reasonable starting values for N and M might be 30 and 3.
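-
-A sketch of the bookkeeping (every method here other than sendToken is
-hypothetical):
-
- N, M = 30, 3
-
- def sendToken(self, s):
-     if s in self.outgoingVocabTable:
-         return self.sendVocabToken(self.outgoingVocabTable[s])
-     self.recentStrings.append(s)          # last-N ring buffer
-     if len(self.recentStrings) > N:
-         self.recentStrings.pop(0)
-     if self.recentStrings.count(s) > M:
-         index = self.allocateVocabIndex()
-         self.sendADDVOCAB(index, s)       # emit ADDVOCAB before the token
-         self.outgoingVocabTable[s] = index
-         self.sendVocabToken(index)
-     else:
-         self.sendStringToken(s)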
-
-* write oldbanana compatibility code?
-
-An oldbanana peer can be detected because the server side sends its dialect
-list from connectionMade, and oldbanana lists are sent with OLDLIST tokens
-(the explicit-length kind).
-
-
-* add .describe methods to all Slicers
-
-This involves setting an attribute between each yield call, to indicate what
-part is about to be serialized.
-
-
-* serialize remotely-callable methods?
-
-It might be useful to be able to do something like:
-
- class Watcher(pb.Referenceable):
- def remote_foo(self, args): blah
-
- w = Watcher()
- ref.callRemote("subscribe", w.remote_foo)
-
-That would involve looking up the method and its parent object, reversing
-the remote_*->* transformation, then sending a sequence which contained both
-the object's RemoteReference and the appropriate method name.
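-
-A sketch of that lookup, using Python 2 bound-method attributes (im_self
-and im_func are __self__ and __func__ on modern Python; the function name
-is made up):
-
- def describeRemoteMethod(m):
-     # m is a bound method like w.remote_foo
-     obj = m.im_self
-     name = m.im_func.__name__
-     assert name.startswith("remote_")
-     # reverse the remote_*->* transformation: the serializer would emit
-     # obj's RemoteReference plus the unprefixed method name
-     return (obj, name[len("remote_"):])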
-
-It might also be useful to generalize this: passing a lambda expression to
-the remote end could stash the callable in a local table and send a Callable
-Reference to the other side. I can smell a good general-purpose object
-classification framework here, but I haven't quite been able to nail it down
-exactly.
-
-* testing
-
-** finish testing of LONGINT/LONGNEG
-
-test_banana.InboundByteStream.testConstrainedInt needs implementation
-
-** thoroughly test failure-handling at all points of in/out serialization
-
-places where BananaError or Violation might be raised
-
-sending side:
- Slicer creation (schema pre-validation? no): no no
- pre-validation is done before sending the object, Broker.callFinished,
- RemoteReference.doCall
- slicer creation is done in newSlicerFor
-
- .slice (called in pushSlicer) ?
- .slice.next raising Violation
- .slice.next returning Deferrable when streaming isn't allowed
- .sendToken (non-primitive token, can't happen)
- .newSlicerFor (no ISlicer adapter)
- top.childAborted
-
-receiving side:
- long header (>64 bytes)
- checkToken (top.openerCheckToken)
- checkToken (top.checkToken)
- typebyte == LIST (oldbanana)
- bad VOCAB key
- too-long vocab key
- bad FLOAT encoding
- top.receiveClose
- top.finish
- top.reportViolation
- oldtop.finish (called from handleViolation)
- top.doOpen
- top.start
-plus all of these when discardCount != 0
-OPENOPEN
-
-send-side uses:
- f = top.reportViolation(f)
-receive-side should use it too (instead of f.raiseException)
-
-** test failure-handling during callRemote argument serialization
-
-** implement/test some streaming Slicers
-
-** test producer Banana
-
-* profiling/optimization
-
-Several areas where I suspect performance issues but am unwilling to fix
-them before having proof that there is a problem:
-
-** Banana.produce
-
-This is the main loop which creates outbound tokens. It is called once at
-connectionMade() (after version negotiation) and thereafter is fired as the
-result of a Deferred whose callback is triggered by a new item being pushed
-on the output queue. It runs until the output queue is empty, or the
-production process is paused (by a consumer who is full), or streaming is
-enabled and one of the Slicers wants to pause.
-
-Each pass through the loop pushes a single token into the transport,
-resulting in a number of short writes. We can do better than this by telling
-the transport to buffer the individual writes and calling a flush() method
-when we leave the loop. I think Itamar's new cprotocol work provides this
-sort of hook, but it would be nice if there were a generalized Transport
-interface so that Protocols could promise their transports that they will
-use flush() when they've stopped writing for a little while.
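-
-The buffering idea as a sketch (this wrapper and its flush() are
-hypothetical, not an existing transport API):
-
- class BufferingTransport:
-     def __init__(self, transport):
-         self.transport = transport
-         self.buf = []
-     def write(self, data):
-         self.buf.append(data)     # defer the real write
-     def flush(self):
-         if self.buf:
-             self.transport.write("".join(self.buf))
-             self.buf = []
-
- # Banana.produce would push one token at a time as before, then call
- # flush() once when it leaves the loop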
-
-Also, I want to be able to move produce() into C code. This means defining a
-CSlicer in addition to the cprotocol stuff described above. The goal is to
-slice a large tree of basic objects (lists, tuples, dicts, strings) without
-surfacing into Python code at all, only coming "up for air" when we hit an
-object type that we don't recognize as having a CSlicer available.
-
-** Banana.handleData
-
-The receive-tokenization process wants to be moved into C code. It's
-definitely on the critical path, but it's ugly because it has to keep
-calling into python code to handle each extracted token. Maybe there is a
-way to have fast C code peek through the incoming buffers for token
-boundaries, then give a list of offsets and lengths to the python code. The
-b128 conversion should also happen in C. The data shouldn't be pulled out of
-the input buffer until we've decided to accept it (i.e. the
-memory-consumption guarantees that the schemas provide do not take any
-transport-level buffering into account, and doing cprotocol tokenization
-would represent memory that an attacker can make us spend without triggering
-a schema violation). Itamar's CLineReceiver is a good example: you tokenize
-a big buffer as much as you can, pass the tokens upstairs to Python code,
-then hand the leftover tail to the next read() call. The tokenizer always
-works on the concatenation of two buffers: the tail of the previous read()
-and the complete contents of the current one.
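-
-The leftover-tail pattern, sketched in Python (tokenize() stands in for
-the fast C scanner; assume it returns (offset, length) pairs plus a count
-of consumed bytes, and that self.tail starts as ""):
-
- def dataReceived(self, data):
-     # always tokenize the previous tail plus the new read
-     buf = self.tail + data
-     tokens, consumed = tokenize(buf)
-     self.tail = buf[consumed:]        # partial token, if any
-     for offset, length in tokens:
-         self.handleToken(buf, offset, length)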
-
-** Unslicer.doOpen delegation
-
-Unslicers form a stack, and each Unslicer gets to exert control over the way
-that its descendents are deserialized. Most don't bother, they just delegate
-the control methods up to the RootUnslicer. For example, doOpen() takes an
-opentype and may return a new Unslicer to handle the new OPEN sequence. Most
-of the time, each Unslicer delegates doOpen() to their parent, all the way
-up the stack to the RootUnslicer who actually performs the UnslicerRegistry
-lookup.
-
-This provides an optimization point. In general, the Unslicer knows ahead of
-time whether it cares to be involved in these methods or not (i.e. whether
-it wants to pay attention to its children/descendants or not). So instead of
-delegating all the time, we could just have a separate Opener stack.
-Unslicers that care would be pushed on the Opener stack at the same time
-they are pushed on the regular unslicer stack, likewise removed. The
-doOpen() method would only be invoked on the top-most Opener, removing a lot
-of method calls. (I think the math is something like turning
-avg(treedepth)*avg(nodes) into avg(nodes)).
-
-There are some other methods that are delegated in this way. open() is
-related to doOpen(). setObject()/getObject() keep track of references to
-shared objects and are typically only intercepted by a second-level object
-which defines a "serialization scope" (like a single remote method call), as
-well as connection-wide references (like pb.Referenceables) tracked by the
-PBRootUnslicer. These would also be targets for optimization.
-
-The fundamental reason for this optimization is that most Unslicers don't
-care about these methods. There are far more uses of doOpen() (one per
-object node) than there are changes to the desired behavior of doOpen().
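-
-A sketch of the Opener-stack dispatch (the interestedInOpens flag and the
-method names are made up):
-
- def pushUnslicer(self, unslicer):
-     self.stack.append(unslicer)
-     if unslicer.interestedInOpens:
-         self.openerStack.append(unslicer)
-
- def popUnslicer(self):
-     unslicer = self.stack.pop()
-     if unslicer.interestedInOpens:
-         self.openerStack.pop()
-
- def doOpen(self, opentype):
-     # one call to the top-most interested Opener, instead of a delegation
-     # chain through every level of the Unslicer stack
-     return self.openerStack[-1].doOpen(opentype)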
-
-** CUnslicer
-
-Like CSlicer, the unslicing process wants to be able to be implemented (for
-built-in objects) entirely in C. This means a CUnslicer "object" (a struct
-full of function pointers), a table accessible from C that maps opentypes to
-both CUnslicers and regular python-based Unslicers, and a CProtocol
-tokenization code fed by a CTransport. It should be possible for the
-python->C transition to occur in the reactor when it calls ctransport.doRead
-and then not come back up to Python until Banana.receivedObject(),
-at least for built-in types like dicts and strings.
+++ /dev/null
--*- outline -*-
-
-non-independent things left to do on newpb. These require deeper magic or
-cannot otherwise be done casually. Many of these involve fundamental
-protocol issues, and therefore need to be decided sooner rather than later.
-
-* summary
-** protocol issues
-*** negotiation
-*** VOCABADD/DEL/SET sequences
-*** remove 'copy' prefix from RemoteCopy type sequences?
-*** smaller scope for OPEN-counter reference numbers?
-** implementation issues
-*** cred
-*** oldbanana compatibility
-*** Copyable/RemoteCopy default to __getstate__ or self.__dict__ ?
-*** RIFoo['bar'] vs RIFoo.bar (should RemoteInterface inherit from Interface?)
-*** constrain ReferenceUnslicer
-*** serialize target.remote_foo usefully
-
-* decide whether to accept positional args in non-constrained methods
-
-DEFERRED until after 2.0
-<glyph> warner: that would be awesome but let's do it _later_
-
-This is really a backwards-source-compatibility issue. In newpb, the
-preferred way of invoking callRemote() is with kwargs exclusively: glyph
-felt positional arguments are more fragile. If the client has a
-RemoteInterface, then they can convert any positional arguments into keyword
-arguments before sending the request.
-
-The question is what to do when the client is not using a RemoteInterface.
-Until recently, callRemote("bar") would try to find a matching RI. I changed
-that to have callRemote("bar") never use an RI, and instead you would use
-callRemote(RIFoo['bar']) to indicate that you want argument-checking.
-
-That makes positional arguments problematic in more situations than they were
-before. The decision to be made is if the OPEN(call) sequence should provide
-a way to convey positional args to the server (probably with numeric "names"
-in the (argname, argvalue) tuples). If we do this, the server (which always
-has the RemoteInterface) can do the positional-to-keyword mapping. But
-putting this in the protocol will oblige other implementations to handle them
-too.
-
-* change the method-call syntax to include an interfacename
-DONE
-
-Scope the method name to the interface. This implies (I think) one of two
-things:
-
- callRemote() must take a RemoteInterface argument
-
- each RemoteReference handles just a single Interface
-
-Probably the latter, maybe have the RR keep both default RI and a list of
-all implemented ones, then adapting the RR to a new RI can be a simple copy
-(and change of the default one) if the Referenceable knows about the RI.
-Otherwise something on the local side will need to adapt one RI to another.
-Need to handle reference-counting/DECREF properly for these shared RRs.
-
-From glyph:
-
- callRemote(methname, **args) # searches RIs
- callRemoteInterface(remoteinterface, methname, **args) # single RI
-
- getRemoteURL(url, *interfaces)
-
- URL-RRefs should turn into the original Referenceable (in args/results)
- (map through the factory's table upon receipt)
-
- URL-RRefs will not survive round trips. leave reference exchange for later.
- (like def remote_foo(): return GlobalReference(self) )
-
- move method-invocation code into pb.Referenceable (or IReferenceable
- adapter). Continue using remote_ prefix for now, but make it a property of
- that code so it can change easily.
- <warner> ok, for today I'm just going to stick with remote_foo() as a
- low-budget decorator, so the current restrictions are 1: subclass
- pb.Referenceable, 2: implements() a RemoteInterface with method named "foo",
- 3: implement a remote_foo method
- <warner> and #1 will probably go away within a week or two, to be replaced by
- #1a: subclass pb.Referenceable OR #1b: register an IReferenceable adapter
-
- try serializing with ISliceable first, then try IReferenceable. The
- IReferenceable adapter must implements() some RemoteInterfaces and gets
- serialized with a MyReferenceSlicer.
-
-http://svn.twistedmatrix.com/cvs/trunk/pynfo/admin.py?view=markup&rev=44&root=pynfo
-
-** use the methods of the RemoteInterface as the "method name"
-DONE (provisional), using RIFoo['add']
-
- rr.callRemote(RIFoo.add, **args)
-
-Nice and concise. However, #twisted doesn't like it, adding/using arbitrary
-attributes of Interfaces is not clean (think about IFoo.implements colliding
-with RIFoo.something).
-
- rr.callRemote(RIFoo['add'], **args)
- RIFoo(rr).callRemote('add', **args)
- adaptation, or narrowing?
-
-<warner> glyph: I'm adding callRemote(RIFoo.bar, **args) to newpb right now
-<radix> wow.
-<warner> seemed like a simpler interface than callRemoteInterface("RIFoo",
-"bar", **args)
-<radix> warner: Does this mean that IPerspective can be parameterized now?
-<glyph> warner: bad idea
-<exarkun> warner: Zope hates you!
-<glyph> warner: zope interfaces don't support that syntax
-<slyphon> zi does support multi-adapter syntax
-<slyphon> but i don't really know what that is
-<exarkun> warner: callRemote(RIFoo.getDescriptionFor("bar"), *a, **k)
-<warner> glyph: yeah, I fake it. In RemoteInterfaceClass, I remove those
-attributes, call InterfaceClass, and then put them all back in
-<glyph> warner: don't add 'em as attributes
-<glyph> warner: just fix the result of __getitem__ to add a slot actually
-refer back to the interface
-<glyph> radix: the problem is that IFoo['bar'] doesn't point back to IFoo
-<glyph> warner: even better, make them callable :-)
-<exarkun> glyph: IFoo['bar'].interface == 'IFoo'
-<glyph> RIFoo['bar']('hello')
-<warner> glyph: I was thinking of doing that in a later version of
-RemoteInterface
-<glyph> exarkun: >>> type(IFoo['bar'].interface)
-<glyph> <type 'str'>
-<exarkun> right
-<exarkun> 'IFoo'
-<exarkun> Just look through all the defined interfaces for ones with matching
-names
-<glyph> exarkun: ... e.g. *NOT* __main__.IFoo
-<glyph> exarkun: AAAA you die
-<radix> hee hee
-* warner struggles to keep up with his thoughts and those of people around him
-* glyph realizes he has been given the power to whine
-<warner> glyph: ok, so with RemoteInterface.__getitem__, you could still do
-rr.callRemote(RIFoo.bar, **kw), right?
-<warner> was your objection to the interface or to the implementation?
-<itamar> I really don't think you should add attributes to the interface
-<warner> ok
-<warner> I need to stash a table of method schemas somewhere
-<itamar> just make __getitem__ return better type of object
-<itamar> and ideally if this is generic we can get it into upstream
-<exarkun> Is there a reason Method.interface isn't a fully qualified name?
-<itamar> not necessarily
-<itamar> I have commit access to zope.interface
-<itamar> if you have any features you want added, post to
-interface-dev@zope.org mailing list
-<itamar> and if Jim Fulton is ok with them I can add them for you
-<warner> hmm
-<warner> does using RIFoo.bar to designate a remote method seem reasonable?
-<warner> I could always adapt it to something inside callRemote
-<warner> something PB-specific, that is
-<warner> but that adapter would have to be able to pull a few attributes off
-the method (name, schema, reference to the enclosing RemoteInterface)
-<warner> and we're really talking about __getattr__ here, not __getitem__,
-right?
-<exarkun> for x.y yes
-<itamar> no, I don't think that's a good idea
-<itamar> interfaces have all kinds od methods on them already, for
-introspection purposes
-<itamar> namespace clashes are the suck
-<itamar> unless RIFoo isn't really an Interface
-<itamar> hm
-<itamar> how about if it were a wrapper around a regular Interface?
-<warner> yeah, RemoteInterfaces are kind of a special case
-<itamar> RIFoo(IFoo, publishedMethods=['doThis', 'doThat'])
-<itamar> s/RIFoo/RIFoo = RemoteInterface(/
-<exarkun> I'm confused. Why should you have to specify which methods are
-published?
-<itamar> SECURITY!
-<itamar> not actually necessary though, no
-<itamar> and may be overkill
-<warner> the only reason I have it derive from Interface is so that we can do
-neat adapter tricks in the future
-<itamar> that's not contradictory
-<itamar> RIFoo(x) would still be able to do magic
-<itamar> you wouldn't be able to check if an object provides RIFoo, though
-<itamar> which kinda sucks
-<itamar> but in any case I am against RIFoo.bar
-<warner> pity, it makes the callRemote syntax very clean
-<radix> hm
-<radix> So how come it's a RemoteInterface and not an Interface, anyway?
-<radix> I mean, how come that needs to be done explicitly. Can't you just
-write a serializer for Interface itself?
-
-* warner goes to figure out where the RemoteInterface discussion went after he
- got distracted
-<warner> maybe I should make RemoteInterface a totally separate class and just
-implement a couple of Interface-like methods
-<warner> cause rr.callRemote(IFoo.bar, a=1) just feels so clean
-<Jerub> warner: why not IFoo(rr).bar(a=1) ?
-<warner> hmm, also a possibility
-<radix> well
-<radix> IFoo(rr).callRemote('bar')
-<radix> or RIFoo, or whatever
-<Jerub> hold on, what does rr inherit from?
-<warner> RemoteReference
-<radix> it's a RemoteReference
-<Jerub> then why not IFoo(rr) /
-<warner> I'm keeping a strong distinction between local interfaces and remote
-ones
-<Jerub> ah, oka.y
-<radix> warner: right, you can still do RIFoo
-<warner> ILocal(a).meth(args) is an immediate function call
-<Jerub> in that case, I prefer rr.callRemote(IFoo.bar, a=1)
-<radix> .meth( is definitely bad, we need callRemote
-<warner> rr.callRemote("meth", args) returns a deferred
-<Jerub> radix: I don't like from foo import IFoo, RIFoo
-<warner> you probably wouldn't have both an IFoo and an RIFoo
-<radix> warner: well, look at it this way: IFoo(rr).callRemote('foo') still
-makes it obvious that IFoo isn't local
-<radix> warner: you could implement RemoteReferen.__conform__ to implement it
-<warner> radix: I'm thinking of providing some kind of other class that would
-allow .meth() to work (without the callRemote), but it wouldn't be the default
-<radix> plus, IFoo(rr) is how you use interfaces normally, and callRemote is
-how you make remote calls normally, so it seems that's the best way to do
-interfaces + PB
-<warner> hmm
-<warner> in that case the object returned by IFoo(rr) is just rr with a tag
-that sets the "default interface name"
-<radix> right
-<warner> and callRemote(methname) looks in that default interface before
-looking anywhere else
-<warner> for some reason I want to get rid of the stringyness of the method
-name
-<warner> and the original syntax (callRemoteInterface('RIFoo', 'methname',
-args)) felt too verbose
-<radix> warner: well, isn't that what your optional .meth thing is for?
-<radix> yes, I don't like that either
-<warner> using callRemote(RIFoo.bar, args) means I can just switch on the
-_name= argument being either a string or a (whatever) that's contained in a
-RemoteInterface
-<warner> a lot of it comes down to how adapters would be most useful when
-dealing with remote objects
-<warner> and to what extent remote interfaces should be interchangeable with
-local ones
-<radix> good point. I have never had a use case where I wanted to adapt a
-remote object, I don't think
-<radix> however, I have had use cases to send interfaces across the wire
-<radix> e.g. having a parameterized portal.login() interface
-<warner> that'll be different, just callRemote('foo', RIFoo)
-<radix> yeah.
-<warner> the current issue is whether to pass them by reference or by value
-<radix> eugh
-<radix> Can you explain it without using those words? :)
-<warner> hmm
-<radix> Do you mean, Referenceable style vs Copyable style?
-<warner> at the moment, when you send a Referenceable across the wire, the
-id-number is accompanied with a list of strings that designate which
-RemoteInterfaces the original claims to provide
-<warner> the receiving end looks up each string in a local table, and
-populates the RemoteReference with a list of RemoteInterface classes
-<warner> the table is populated by metaclass magic that runs when a 'class
-RIFoo(RemoteInterface)' definition is complete
-<radix> ok
-<radix> so a RemoteInterface is simply serialized as its qual(), right?
-<warner> so as long as both sides include the same RIFoo definition, they'll
-wind up with compatible remote interfaces, defining the same method names,
-same method schemas, etc
-<warner> effectively
-<warner> you can't just send a RemoteInterface across the wire right now, but
-it would be easy to add
-<warner> the places where they are used (sending a Referenceable across the
-wire) all special case them
-<radix> ok, and you're considering actually writing a serializer for them that
-sends all the information to totally reconstruct it on the other side without
-having the definiton
-<warner> yes
-<warner> or having some kind of debug method which give you that
-<radix> I'd say, do it the way you're doing it now until someone comes up with
-a use case for actually sending it...
-<warner> right
-<warner> the only case I can come up with is some sort of generic object
-browser debug tool
-<warner> everything else turns into a form of version negotiation which is
-better handled elsewhere
-<warner> hmm
-<warner> so RIFoo(rr).callRemote('bar', **kw)
-<warner> I guess that's not too ugly
-<radix> That's my vote. :)
-<warner> one thing it lacks is the ability to cleanly state that if 'bar'
-doesn't exist in RIFoo then it should signal an error
-<warner> whereas callRemote(RIFoo.bar, **kw) would give you an AttributeError
-before callRemote ever got called
-<warner> i.e. "make it impossible to express the incorrect usage"
-<radix> mmmh
-<radix> warner: but you _can_ check it immediately when it's called
-<warner> in the direction I was heading, callRemote(str) would just send the
-method request and let the far end deal with it, no schema-checking involved
-<radix> warner: which, 99% of the time, is effectively the same time as
-IFoo.bar would happen
-<warner> whereas callRemote(RIFoo.bar) would indicate that you want schema
-checking
-<warner> yeah, true
-<radix> hm.
-<warner> (that last feature is what allowed callRemote and callRemoteInterface
-to be merged)
-<warner> or, I could say that the normal RemoteReference is "untyped" and does
-not do schema checking
-<warner> but adapting one to a RemoteInterface results in a
-TypedRemoteReference which does do schema checking
-<warner> and which refuses to be invoked with method names that are not in the
-schema
-<radix> warner: we-ell
-<radix> warner: doing method existence checking is cool
-<radix> warner: but I think tying any further "schema checking" to adaptation
-is a bad idea
-<warner> yeah, that's my hunch too
-<warner> which is why I'd rather not use adapters to express the scope of the
-method name (which RemoteInterface it is supposed to be a part of)
-<radix> warner: well, I don't think tying it to callRemote(RIFoo.methName)
-would be a good idea just the same
-<warner> hm
-<warner> so that leaves rr.callRemote(RIFoo['add']) and
-rr.callRemoteInterface(RIFoo, 'add')
-<radix> OTOH, I'm inclined to think schema checking should happen by default
-<radix> It's just a the matter of where it's parameterized
-<warner> yeah, it's just that the "default" case (rr.callRemote('name')) needs
-to work when there aren't any RemoteInterfaces declared
-<radix> warner: oh
-<warner> but if we want to encourage people to use the schemas, then we need
-to make that case simple and concise
-* radix goes over the issue in his head again
-<radix> Yes, I think I still have the same position.
-<warner> which one? :)
-<radix> IFoo(rr).callRemote("foo"); which would do schema checking because
-schema checking is on by default when it's possible
-<warner> using an adaptation-like construct to declare a scope of the method
-name that comes later
-<radix> well, it _is_ adaptation, I think.
-<radix> Adaptation always has plugged in behavior, we're just adding a bit
-more :)
-<warner> heh
-<warner> it is a narrowing of capability
-<radix> hmm, how do you mean?
-<warner> rr.callRemote("foo") will do the same thing
-<warner> but rr.callRemote("foo") can be used without the remote interfaces
-<radix> I think I lost you.
-<warner> if rr has any RIs defined, it will try to use them (and therefore
-complain if "foo" does not exist in any of them, or if the schema is violated)
-<radix> Oh. That's strange.
-<radix> So it's really quite different from how interfaces regularly work...
-<warner> yeah
-<warner> except that if you were feeling clever you could use them the normal
-way
-<radix> Well, my inclination is to make them work as similarly as possible.
-<warner> "I have a remote reference to something that implements RIFoo, but I
-want to use it in some other way"
-<radix> s/possible/practical/
-<warner> then IBar(rr) or RIBar(rr) would wrap rr in something that knows how
-to translate Bar methods into RIFoo remote methods
-<radix> Maybe it's not practical to make them very similar.
-<radix> I see.
-
-rr.callRemote(RIFoo.add, **kw)
-rr.callRemote(RIFoo['add'], **kw)
-RIFoo(rr).callRemote('add', **kw)
-
-I like the second one. Normal Interfaces behave like a dict, so IFoo['add']
-gets you the method-describing object (z.i.i.Method). My RemoteInterfaces
-don't do that right now (because I remove the attributes before handing the
-RI to z.i), but I could probably fix that. I could either add attributes to
-the Method or hook __getitem__ to return something other than a Method
-(maybe a RemoteMethodSchema).
-
-Those Method objects have a .getSignatureInfo() which provides almost
-everything I need to construct the RemoteMethodSchema. Perhaps I should
-post-process Methods rather than pre-process the RemoteInterface. I can't
-tell how to use the return value trick, and it looks like the function may
-be discarded entirely once the Method is created, so this approach may not
-work.
-
-On the server side (Referenceable), subclassing Interface is nice because it
-provides adapters and implements() queries.
-
-On the client side (RemoteReference), subclassing Interface is a hassle: I
-don't think adapters are as useful, but getting at a method (as an attribute
-of the RI) is important. We have to bypass most of Interface to parse the
-method definitions differently.
-
-* create UnslicerRegistry, registerUnslicer
-DONE (PROVISIONAL), flat registry (therefore problematic for len(opentype)>1)
-
-consider adopting the existing collection API (getChild, putChild) for this,
-or maybe allow registerUnslicer() to take a callable which behaves kind of
-like a twisted.web isLeaf=1 resource (stop walking the tree, give all index
-tokens to the isLeaf=1 node)
-
-also some APIs to get a list of everything in the registry
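-
-a sketch of the tree-shaped registry with isLeaf-style handlers (this API
-is invented for illustration):
-
- class UnslicerRegistry:
-     def __init__(self):
-         self.children = {}    # opentype token -> subtree or entry
-     def registerUnslicer(self, opentype, factory, isLeaf=False):
-         node = self
-         for token in opentype[:-1]:
-             node = node.children.setdefault(token, UnslicerRegistry())
-         node.children[opentype[-1]] = (factory, isLeaf)
-     def lookup(self, opentype):
-         node = self
-         for i, token in enumerate(opentype):
-             entry = node.children[token]   # KeyError if unregistered
-             if isinstance(entry, tuple):
-                 factory, isLeaf = entry
-                 leftover = opentype[i+1:]
-                 assert isLeaf or not leftover
-                 # an isLeaf factory receives the remaining index tokens,
-                 # like a twisted.web isLeaf=1 resource
-                 return factory, leftover
-             node = entry
-         raise KeyError(opentype)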
-
-* use metaclass to auto-register RemoteCopy classes
-DONE
-
-** use metaclass to auto-register Unslicer classes
-DONE
-
-** and maybe Slicer classes too
-DONE with name 'slices', perhaps change to 'slicerForClasses'?
-
- class FailureSlicer(slicer.BaseSlicer):
- classname = "twisted.python.failure.Failure"
- slicerForClasses = (failure.Failure,) # triggers auto-register
-
-** various registry approaches
-DONE
-
-There are currently three kinds of registries used in banana/newpb:
-
- RemoteInterface <-> interface name
- class/type -> Slicer (-> opentype) -> Unslicer (-> class/type)
- Copyable subclass -> copyable-opentype -> RemoteCopy subclass
-
-There are two basic approaches to representing the mappings that these
-registries implement. The first is implicit, where the local objects are
-subclassed from Sliceable or Copyable or RemoteInterface and have attributes
-to define the wire-side strings that represent them. On the receiving side,
-we make extensive use of metaclasses to perform automatic registration
-(taking names from class attributes and mapping them to the factory or
-RemoteInterface used to create the remote version).
-
-The second approach is explicit, where pb.registerRemoteInterface,
-pb.registerRemoteCopy, and pb.registerUnslicer are used to establish the
-receiving-side mapping. There isn't a clean way to do it explicitly on the
-sending side, since we already have instances whose classes can give us
-whatever information we want.
-
-The advantage of implicit is simplicity: no more questions about why my
-pb.RemoteCopy is giving "not unserializable" errors. The mere act of
-importing a module is enough to let PB create instances of its classes.
-
-The advantage of doing it explicitly is to remind the user about the
-existence of those maps, because the set of factory classes in the receiving
-map is precisely equal to the user's exposure (from a security point of
-view). See
-the E paper on secure-serialization for some useful concepts.
-
-A disadvantage of implicit is that you can't quite be sure what, exactly,
-you're exposed to: the registrations take place all over the place.
-
-To make explicit not so painful, we can use quotient's .wsv files
-(whitespace-separated values) which map from class to string and back again.
-The file could list fully-qualified classname, wire-side string, and
-receiving factory class on each line. The Broker (or rather the RootSlicer
-and RootUnslicer) would be given a set of .wsv files to define their
-mapping. It would get all the registrations at once (instead of having them
-scattered about). They could also demand-load the receive-side factory
-classes.
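-
-For example (the file contents and the loader are hypothetical):
-
- # class                    wire-side string   receiving factory
- myapp.records.UserRecord   myapp-user-record  myapp.remote.UserRecordCopy
- myapp.records.Session      myapp-session      myapp.remote.SessionCopy
-
- def loadWSV(filename, registry):
-     for line in open(filename):
-         line = line.strip()
-         if not line or line.startswith("#"):
-             continue
-         classname, wirename, factoryname = line.split()
-         registry.register(classname, wirename, factoryname)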
-
-For now, go implicit. Put off the decision until we have some more
-experience with using newpb.
-
-* move from VocabSlicer sequence to ADDVOCAB/DELVOCAB tokens
-
-Requires a .wantVocabString flag in the parser, which is kind of icky but
-fixes the annoying asymmetry between set (vocab sequence) and get (VOCAB
-token). Might want a CLEARVOCAB token too.
-
-On second thought, this won't work. There isn't room for both a vocab number
-and a variable-length string in a single token. It must be an open sequence.
-However, it could be an add/del/set-vocab sequence, allowing the vocab to be
-modified incrementally.
-
-** VOCABize interface/method names
-
-One possibility is to make a list of all strings used by all known
-RemoteInterfaces and all their methods, then send it at broker connection
-time as the initial vocab map. A better one (maybe) is to somehow track what
-we send and add a word to the vocab once we've sent it more than three
-times.
-
-Maybe vocabize the pairs, as "ri/name1","ri/name2", etc, or maybe do them
-separately. Should do some handwaving math to figure out which is better.
-
-* nail down some useful schema syntaxes
-
-This has two parts: parsing something like a __schema__ class attribute (see
-the sketches in schema.xhtml) into a tree of FooConstraint objects, and
-deciding how to retrieve schemas at runtime from things like the object being
-serialized or the object being called from afar. To be most useful, the
-syntax needs to mesh nicely with (read: "be identical to") things like
-formless and (maybe?) atop or whatever has replaced the high-density
-highly-structured save-to-disk scheme that twisted.world used to do.
-
-Some lingering questions in this area:
-
- When an object has a remotely-invokable method, where does the appropriate
- MethodConstraint come from? Some possibilities:
-
- an attribute of the method itself: obj.method.__schema__
-
- from inside a __schema__ attribute of the object's class
-
- from inside a __schema__ attribute of an Interface (which?) that the object
- implements
-
- Likewise, when a caller holding a RemoteReference invokes a method on it, it
- would be nice to enforce a schema on the arguments they are sending to the
- far end ("be conservative in what you send"). Where should this schema come
- from? It is likely that the sender only knows an Interface for their
- RemoteReference.
-
- When PB determines that an object wants to be copied by value instead of by
- reference (pb.Copyable subclass, Copyable(obj), schema says so), where
- should it find a schema to define what exactly gets copied over? A class
- attribute of the object's class would make sense: most objects would do
- this, some could override jellyFor to get more control, and others could
- override something else to push a new Slicer on the stack and do streaming
- serialization. Whatever the approach, it needs to be paralleled by the
- receiving side's unjellyableRegistry.
-
-* RemoteInterface instances should have an "RI-" prefix instead of "I-"
-
-DONE
-
-* merge my RemoteInterface syntax with zope.interface's
-
-I hacked up a syntax for how method definitions are parsed in
-RemoteInterface objects. That syntax isn't compatible with the one
-zope.interface uses for local methods, so I just delete them from the
-attribute dictionary to avoid causing z.i indigestion. It would be nice if
-they were compatible so I didn't have to do that. This basically translates
-into identifying the nifty extra flags (like priority classes, no-response)
-that we want on these methods and finding a z.i-compatible way to implement
-them. It also means thinking of SOAP/XML-RPC schemas and having a syntax
-that can represent everything at once.
-
-
-* use adapters to enable pass-by-reference or pass-by-value
-
-It should be possible to pass a reference with variable forms:
-
- rr.callRemote("foo", 1, Reference(obj))
- rr.callRemote("bar", 2, Copy(obj))
-
-This should probably adapt the object to IReferenceable or ICopyable, which
-are like ISliceable except they can pass the object by reference or by
-value. The slicing process should be:
-
- look up the type() in a table: this handles all basic types
- else adapt the object to ISliceable, use the result
- else raise an Unsliceable exception
- (and point the user to the docs on how to fix it)
-
-The adapter returned by IReferenceable or ICopyable should implement
-ISliceable, so no further adaptation will be done.
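-
-As a sketch (the slicer table layout and the Unsliceable exception are
-placeholders):
-
- def slicerForObject(self, obj):
-     slicerClass = self.slicerTable.get(type(obj))   # basic types
-     if slicerClass:
-         return slicerClass(obj)
-     slicer = ISliceable(obj, None)    # adaptation, with a None default
-     if slicer is not None:
-         return slicer
-     raise Unsliceable(obj)    # error text should point at the docs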
-
-* remove 'copy' prefix from remotecopy banana type names?
-
-<glyph> warner: did we ever finish our conversation on the usefulness of the
-(copy foo blah) namespace rather than just (foo blah)?
-<warner> glyph: no, I don't think we did
-<glyph> warner: do you still have (copy foo blah)?
-<warner> glyph: yup
-<warner> so far, it seems to make some things easier
-<warner> glyph: the sender can subclass pb.Copyable and not write any new
-code, while the receiver can write an Unslicer and do a registerRemoteCopy
-<warner> glyph: instead of the sender writing a whole slicer and the receiver
-registering at the top-level
-<glyph> warner: aah
-<warner> glyph: although the fact that it's easier that way may be an artifact
-of my sucky registration scheme
-<glyph> warner: so the advantage is in avoiding registration of each new
-unslicer token?
-<glyph> warner: yes. I'm thinking that a metaclass will handily remove the
-need for extra junk in the protocol ;)
-<warner> well, the real reason is my phobia about namespace purity, of course
-<glyph> warner: That's what the dots are for
-<warner> but ease of dispatch is also important
-<glyph> warner: I'm concerned about it because I consider my use of the same
-idiom in the first version of PB to be a serious wart
-* warner nods
-<warner> I will put together a list of my reasoning
-<glyph> warner: I think it's likely that PB implementors in other languages
-are going to want to introduce new standard "builtin" types; our "builtins"
-shouldn't be limited to python's provided data structures
-<moshez> glyph: wait
-<warner> ok
-<moshez> glyph: are you talking of banana types
-<moshez> glyph: or really PB
-<warner> in which case (copy blah blah) is a non-builtin type, while
-(type-foo) is a builtin type
-<glyph> warner: plus, our namespaces are already quite well separated, I can
-tell you I will never be declaring new types outside of quotient.* and
-twisted.* :)
-<warner> moshez: this is mostly banana (or what used to be jelly, really)
-<glyph> warner: my inclination is to standardize by convention
-<glyph> warner: *.* is a non-builtin type, [~.] is a builtin
-<moshez> glyph: ?
-<glyph> sorry [^.]*
-<glyph> my regular expressions and shell globs are totally confused but you
-know what I mean
-<glyph> moshez: yes
-<moshez> glyph: hrm
-<saph_w> glyph: you're making crazy anime faces
-<moshez> glyph: why do we need any non-Python builtin types
-<glyph> moshez: because I want to destroy SOAP, and doing that means working
-with people I don't like
-<glyph> moshez: outside of python
-<moshez> glyph: I meant, "what specific types"
-<moshez> I'd appreciate a blog on that
-
-* have Copyable/RemoteCopy default to __getstate__/__setstate__?
-
-At the moment, the default implementations of getStateToCopy() and
-setCopyableState() get and set __dict__ directly. Should the default instead
-be to call __getstate__() or __setstate__()?
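-
-If we switched, the defaults might look like this sketch:
-
- class Copyable:
-     def getStateToCopy(self):
-         # prefer __getstate__ if the class defines one
-         if hasattr(self, "__getstate__"):
-             return self.__getstate__()
-         return self.__dict__
-
- class RemoteCopy:
-     def setCopyableState(self, state):
-         if hasattr(self, "__setstate__"):
-             self.__setstate__(state)
-         else:
-             self.__dict__ = state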
-
-* make slicer/unslicers for pb.RemoteInterfaces
-
-exarkun's use case requires these Interfaces to be passable by reference
-(i.e. by name). It would also be interesting to let them be passed (and
-requested!) by value, so you can ask a remote peer exactly what their
-objects will respond to (the method names, the argument values, the return
-value). This also requires that constraints be serializable.
-
-do this, should be referenceable (round-trip should return the same object),
-should use the same registration lookup that RemoteReference(interfacelist)
-uses
-
-* investigate decref/Referenceable race
-
-Any object that includes some state when it is first sent across the wire
-needs more thought. The far end could drop the last reference (at time t=1)
-while a method is still pending that wants to send back the same object. If
-the method finishes at time t=2 but the decref isn't received until t=3, the
-object will be sent across the wire without the state, and the far end will
-receive it for the "first" time without that associated state.
-
-This kind of conserve-bandwidth optimization may be a bad idea. Or there
-might be a reasonable way to deal with it (maybe request the state if it
-wasn't sent and the recipient needs it, and delay delivery of the object
-until the state arrives).
-
-DONE, the RemoteReference is held until the decref has been acked. As long as
-the methods are executed in-order, this will prevent the race. TODO:
-third-party references (and other things that can cause out-of-order
-execution) could mess this up.
-
-* sketch out how to implement glyph's crazy non-compressed sexpr encoding
-
-* consider a smaller scope for OPEN-counter reference numbers
-
-For newpb, we moved to implicit reference numbers (counting OPEN tags
-instead of putting a number in the OPEN tag) because we didn't want to burn
-so much bandwidth: it isn't feasible to predict whether your object will
-need to be referenced in the future, so you always have to be prepared to
-reference it, so we always burn the memory to keep track of them (generally
-in a ScopedSlicer subclass). If we used explicit refids then we'd have to
-burn the bandwidth too.
-
-The sorta-problem is that these numbers will grow without bound as long as
-the connection remains open. After a few hours of sending 100-byte objects
-over a 100MB connection, you'll hit 1G-references and will have to start
-sending them as LONGINT tokens, which is annoying and slightly verbose (say
-3 or 4 bytes of number instead of 1 or 2). You never keep track of that many
-actual objects, because the references do not outlive their parent
-ScopedSlicer.
-
-The fact that the references themselves are scoped to the ScopedSlicer
-suggests that the reference numbers could be too. Each ScopedSlicer would
-track the number of OPEN tokens emitted (actually the number of
-slicerForObject calls made, except you'd want to use a different method to
-make sure that children who return a Slicer themselves don't corrupt the
-OPEN count).
-
-This requires careful synchronization between the ScopedSlicers on one end
-and the ScopedUnslicers on the other. I suspect it would be slightly
-fragile.
-
-One sorta-benefit would be that a somewhat human-readable sexpr-based
-encoding would be even more human readable if the reference numbers stayed
-small (you could visually correlate objects and references more easily). The
-ScopedSlicer's open-parenthesis could be represented with a curly brace or
-something, then the refNN number would refer to the NN'th left-paren from
-the last left-brace. It would also make it clear that the recipient will not
-care about objects outside that scope.
-
-* implement the FDSlicer
-
-Over a unix socket, you can pass fds. exarkun had a presentation at PyCon04
-describing the use of this to implement live application upgrade. I think
-that we could make a simple FDSlicer to hide the complexity of the
-out-of-band part of the communication.
-
-import struct, socket
-from twisted.internet import unix
-from twisted.python import log
-# sendmsg/recvmsg/SCM_RIGHTS are assumed to come from a third-party
-# sendmsg extension module; the stdlib socket module lacked them then
-from sendmsg import sendmsg, recvmsg, SCM_RIGHTS
-
-class Server(unix.Server):
-    def sendFileDescriptors(self, fileno, data="Filler"):
-        """
-        @param fileno: An iterable of the file descriptors to pass.
-        """
-        # pack the fds into a binary blob, and send it as SCM_RIGHTS
-        # ancillary data riding along with a dummy payload
-        payload = struct.pack("%di" % len(fileno), *fileno)
-        r = sendmsg(self.fileno(), data, 0,
-                    (socket.SOL_SOCKET, SCM_RIGHTS, payload))
-        return r
-
-class Client(unix.Client):
-    def doRead(self):
-        if not self.connected:
-            return
-        try:
-            msg, flags, ancillary = recvmsg(self.fileno())
-        except:
-            log.msg('recvmsg():')
-            log.err()
-        else:
-            # unpack the 32-bit fds from the ancillary data
-            buf = ancillary[0][2]
-            fds = []
-            while buf:
-                fd, buf = buf[:4], buf[4:]
-                fds.append(struct.unpack("i", fd)[0])
-            try:
-                self.protocol.fileDescriptorsReceived(fds)
-            except:
-                log.msg('protocol.fileDescriptorsReceived')
-                log.err()
-        return unix.Client.doRead(self)
-
-* implement AsyncDeferred returns
-
-dash wanted to implement a TransferrableReference object with a scheme that
-would require creating a new connection (to a third-party Broker) during
-ReferenceUnslicer.receiveClose. This would cause the object deserialization
-to be asynchronous.
-
-At the moment, Unslicers can return a Deferred from their receiveClose
-method. This is used by immutable containers (like tuples) to indicate that
-their object cannot be created yet. Other containers know to watch for these
-Deferreds and add a callback which will update their own entries
-appropriately. The implicit requirement is that all these Deferreds fire
-before the top-level parent object (usually a CallUnslicer) finishes. This
-allows for circular references involving immutable containers to be resolved
-into the final object graph before the target method is invoked.
-
-To accommodate Deferreds which will fire at arbitrary points in the future,
-it would be useful to create a marker subclass named AsyncDeferred. If an
-unslicer returns such an object, the container parent starts by treating it
-like a regular Deferred, but it also knows that its object is not
-"complete", and therefore returns an AsyncDeferred of its own. When the
-child completes, the parent can complete, etc. The difference between the
-two types: Deferred means that the object will be complete before the
-top-level parent is finished, AsyncDeferred makes claims about when the
-object will be finished.
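-
-A sketch of the marker and the propagation rule (the helper function is
-invented):
-
- from twisted.internet.defer import Deferred
-
- class AsyncDeferred(Deferred):
-     # marker: fires at some arbitrary future time, possibly after the
-     # top-level parent object has finished
-     pass
-
- def completionDeferredFor(childDeferreds):
-     # a parent with an AsyncDeferred child is itself late-bound, so it
-     # must hand an AsyncDeferred up to its own parent
-     for d in childDeferreds:
-         if isinstance(d, AsyncDeferred):
-             return AsyncDeferred()
-     return Deferred()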
-
-CallUnslicer would know that if any of its arguments are Deferreds or
-AsyncDeferreds then it needs to hold off on the broker.doCall until all those
-Deferreds have fired. Top-level objects are not required to differentiate
-between the two types, because they do not return an object to an enclosing
-parent (the CallUnslicer is a child of the RootUnslicer, but it always
-returns None).
-
-Other issues: we'll need a schema to let you say whether you'll accept these
-late-bound objects or not (because if you do accept them, you won't be able
-to impose the same sorts of type-checks as you would on immediate objects).
-Also this will impact the in-order-invocation promises of PB method calls,
-so we may need to implement the "it is ok to run this asynchronously" flag
-first, then require that TransferrableReference objects are only passed to
-methods with the flag set.
-
-Also, it may not be necessary to have a marker subclass of Deferred: perhaps
-_any_ Deferred which arrives from a child is an indication that the object
-will not be available until an unknown time in the future, and obligates the
-parent to return another Deferred upwards (even though their object could be
-created synchronously). Or, it might be better to implement this some other
-way, perhaps separating "here is my object" from "here is a Deferred that
-will fire when my object is complete", like a call to
-parent.addDependency(self.deferred) or something.
-
-DONE, needs testing
-
-* TransferrableReference
-
-class MyThing(pb.Referenceable): pass
-r1 = MyThing()
-r2 = Facet(r1)
-g1 = Global(r1)
-class MyGlobalThing(pb.GloballyReferenceable): pass
-g2 = MyGlobalThing()
-g3 = Facet(g2)
-
-broker.setLocation("pb://hostname.com:8044")
-
-rem.callRemote("m1", r1) # limited to just this connection
-rem.callRemote("m2", Global(r1)) # can be published
-g3 = Global(r1)
-rem.callRemote("m3", g1) # can also be published..
-g1.revoke() # but since we remember it, it can be revoked too
-g1.restrict() # and, as a Facet, we can revoke some functionality but not all
-
-rem.callRemote("m1", g2) # can be published
-
-E tarball: jsrc/net/captp/tables/NearGiftTable
-
-issues:
- 1: when A sends a reference on B to C, C's messages to the object
- referenced must arrive after any messages A sent before the reference forks
-
- in particular, if A does:
- B.callRemote("1", hugestring)
- B.callRemote("2_makeYourSelfSecure", args)
- C.callRemote("3_transfer", B)
-
- and C does B.callRemote("4_breakIntoYou") as soon as it gets the reference,
- then the A->B queue looks like (1, 2), and the A->C queue looks like (3).
- The transfer message can be fast, and the resulting message 4 could be
- delivered to B before the A->B queue manages to deliver message 2.
-
- 2: an object which get passed through multiple external brokers and
- eventually comes home must be recognized as a local object
-
- 3: Copyables that contain RemoteReferences must be passable between hosts
-
-E cannot do all three of these at once
-http://www.erights.org/elib/distrib/captp/WormholeOp.html
-
-I think that it's ok to tell people who want this guarantee to explicitly
-serialize it like this:
-
- B.callRemote("1", hugestring)
- d = B.callRemote("2_makeYourSelfSecure", args)
- d.addCallback(lambda res: C.callRemote("3_transfer", B))
-
-Note that E might not require that method calls even have a return value, so
-they might not have had a convenient way to express this enforced
-serialization.
-
-** more thoughts
-
-To enforce the partial-ordering, you could do the equivalent of:
- A:
- B.callRemote("1", hugestring)
- B.callRemote("2_makeYourSelfSecure", args)
- nonce = makeNonce()
- B.callRemote("makeYourSelfAvailableAs", nonce)
- C.callRemote("3_transfer", (nonce, B.name))
- C:
- B.callRemote("4_breakIntoYou")
-
-C uses the nonce when it connects to B. It knows the name of the reference,
-so it can compare it against some other reference to the same thing, but it
-can't actually use that name alone to get access.
-
-When the connection request arrives at B, it sees B.name (which is also
-unguessable), so that gives it reason to believe that it should queue C's
-request (that it isn't just a DoS attack). It queues it until it sees A's
-request to makeYourSelfAvailableAs with the matching nonce. Once that
-happens, it can provide the reference back to C.
-
-This implies that C won't be able to send *any* messages to B until that
-handshake has completed. It might be desirable to avoid the extra round-trip
-this would require.
-
-** more thoughts
-
- url = PBServerFactory.registerReference(ref, name=None)
- creates human-readable URLs or random identifiers
-
-the factory keeps a bidirectional mapping of names and Referenceables
-
-when a Referenceable gets serialized, if the factory's table doesn't have a
-name for it, the factory creates a random one. This entry in the table is
-kept alive by two things:
-
- a live reference by one of the factory's Brokers
- an entry in a Broker's "gift table"
-
-When a RemoteReference gets serialized (and it doesn't point back to the
-receiving Broker, and thus get turned into a your-reference sequence),
-
-<warner> A->C: "I'm going to send somebody a reference to you, incref your
- gift table", C->A: roger that, here's a gift nonce
-<warner> A->B: "here's Carol's reference: URL plus nonce"
-<warner> B->C: "I want a liveref to your 'Carol' object, here's my ticket
- (nonce)", C->B: "ok, ticket redeemed, here's your liveref"
-
-once more, without nonces:
- A->C: "I'm going to send somebody a reference to you, incref your
- gift table", C->A: roger that
- A->B: "here's Carol's reference: URL"
- B->C: "I want a liveref to your 'Carol' object", C->B: "ok, here's your
- liveref"
-
-really:
- on A: c.vat.callRemote("giftYourReference", c).addCallback(step2)
- c is serialized as (your-reference, clid)
- on C: vat.remote_giftYourReference(which): self.table[which] += 1; return
- on A: step2: b.introduce(c)
- c is serialized as (their-reference, url)
- on B: deserialization sees their-reference
- newvat = makeConnection(URL)
- newvat.callRemote("redeemGift", URL).addCallback(step3)
- on C: vat.remote_redeemGift(URL):
- ref = self.urls[URL]; self.table[ref] -= 1; return ref
- ref is serialized as (my-reference, clid)
- on B: step3(c): b.remote_introduce(c)
-
-problem: if alice sends a thousand copies, that means these 5 messages are
-each sent a thousand times. The makeConnection is cached, but the rest are
-not. We don't remember that we've already made this gift before, or that the
-other end probably still has it. Hm, but we also don't know that they didn't
-lose it already.
-
-** ok, a plan:
-
-concern 1: objects must be kept alive as long as there is a RemoteReference
-to them.
-
-concern 2: we should be able to tell when an object is being sent for the
-first time, to add metadata (interface list, public URL) that would be
-expensive to add to every occurrence.
-
- each (my-reference) sent over the wire increases the broker's refcount on
- both ends.
-
- the receiving Broker retains a weakref to the RemoteReference, and retains a
- copy of the metadata necessary to create it in the clid table (basically the
- entire contents of the RemoteReference). When the weakref expires, it marks
- the clid entry as "pending-free", and sends a decref(clid,N) to the other
- Broker. The decref is actually sent with broker.callRemote("decref", clid,
- N), so it can be acked.
-
- the sending broker gets the decref and reduces its count by N. If another
- reference was sent recently, this count may not drop all the way to zero,
- indicating there is a reference "in flight" and the far end should be ready
- to deal with it (by making a new RemoteReference with the same properties as
- the old one). If N!=0, it returns False to indicate that this was not the
- last decref message for the clid. If N==0, it returns True, since it is the
- last decref, and removes the entry from its table. Once remote_decref
- returns True, the clid is retired.
-
- the receiving broker receives the ack from the decref. If the ack says
- last==True, the clid table entry is freed. If it says last==False, then
- there should have been another (my-reference) received before the ack, so
- the refcount should be non-zero.
-
- message sequence:
-
- A-> : (my-reference clid metadata) [A.myrefs[clid].refcount++ = 1]
- A-> : (my-reference clid) [A.myrefs[clid].refcount++ = 2]
- ->B: receives my-ref, creates RR, B.yourrefs[clid].refcount++ = 1
- ->B: receives my-ref, B.yourrefs[clid].refcount++ = 2
- : time passes, B sees the reference go away
- <-B: d=brokerA.callRemote("decref", clid, B.yourrefs[clid].refcount)
- B.yourrefs[clid].refcount = 0; d.addCallback(B.checkref, clid)
- A-> : (my-reference clid) [A.myrefs[clid].refcount++ = 3]
- A<- : receives decref, A.myrefs[clid].refcount -= 2, now =1, returns False
- ->B: receives my-ref, re-creates RR, B.yourrefs[clid].refcount++ = 1
- ->B: receives ack(False), B.checkref asserts refcount != 0
- : time passes, B sees the reference go away again
- <-B: d=brokerA.callRemote("decref", clid, B.yourrefs[clid].refcount)
- B.yourrefs[clid].refcount = 0; d.addCallback(B.checkref, clid)
- A<- : receives decref, A.myrefs[clid].refcount -= 1, now =0, returns True
- del A.myrefs[clid]
- ->B: receives ack(True), B.checkref asserts refcount==0
- del B.yourrefs[clid]
-
-B retains the RemoteReference data until it receives confirmation from A.
-Therefore whenever A sends a reference that doesn't already exist in the clid
-table, it is sending it to a B that doesn't know about that reference, so it
-needs to send the metadata.
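-
-A sketch of the two ends of that exchange (the table layouts are guesses;
-the message sequence above is what matters):
-
- # sending broker (A)
- def remote_decref(self, clid, count):
-     self.myrefs[clid].refcount -= count
-     if self.myrefs[clid].refcount == 0:
-         del self.myrefs[clid]   # clid is now retired
-         return True             # that was the last decref
-     return False                # another reference is still in flight
-
- # receiving broker (B), run when the weakref to the RemoteReference dies
- def freeYourReference(self, clid):
-     count = self.yourrefs[clid].refcount
-     self.yourrefs[clid].refcount = 0
-     d = self.broker.callRemote("decref", clid, count)
-     d.addCallback(self.checkref, clid)
-
- def checkref(self, last, clid):
-     if last:
-         assert self.yourrefs[clid].refcount == 0
-         del self.yourrefs[clid]
-     else:
-         # a new (my-reference) must have arrived before this ack
-         assert self.yourrefs[clid].refcount != 0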
-
-concern 3: in the three-party exchange, Carol must be kept alive until Bob
-has established a reference to her, even if Alice drops her carol-reference
-immediately after sending the introduction to Bob.
-
-(my-reference, clid, [interfaces, public URL])
-(your-reference, clid)
-(their-reference, URL)
-
-Serializing a their-reference causes an entry to be placed in the Broker's
-.theirrefs[URL] table. Each time a their-reference is sent, the entry's
-refcount is incremented.
-
-Receiving a their-reference may initiate a PB connection to the target,
-followed by a getNamedReference request. When this completes (or if the
-reference was already available), the recipient sends a decgift message to
-the sender. This message includes a count, so multiple instances of the same
-gift can be acked as a group.
-
-The .theirrefs entry retains a reference to the sender's RemoteReference, so
-it cannot go away until the gift is acked.
-
-DONE, gifts are implemented, we punted on partial-ordering
-
-*** security, DoS
-
-Bob can force Alice to hold on to a reference to Carol, as long as both
-connections are open, by never acknowledging the gift.
-
-Alice can cause Bob to open up TCP connections to arbitrary hosts and ports,
-by sending third-party references to him, although the only protocol those
-connections will speak is PB.
-
-Using yURLs and StartTLS should be enough to secure and authenticate the
-connections.
-
-*** partial-ordering
-
-If we need it, the gift (their-reference message) can include a nonce, Alice
-sends a makeYourSelfAvailableAs message to Carol with the nonce, and Bob must
-do a new getReference with the nonce.
-
-Kragen came up with a good use-case for partial-ordering:
- A:
- B.callRemote("updateDocument", bigDocument)
- C.callRemote("pleaseReviewLatest", B)
- C:
- B.callRemote("getLatestDocument")
-
-
-* PBService / Tub
-
-Really, PB wants to be a Service, since third-party references mean it will
-need to make connections to arbitrary targets, and it may want to re-use
-those connections.
-
- s = pb.PBService()
- s.listenOn(strport) # provides URL base
- swissURL = s.registerReference(ref) # creates unguessable name
- publicURL = s.registerReference(ref, "name") # human-readable name
- s.unregister(URL) # also revokes all clids
- s.unregisterReference(ref)
- d = s.getReference(URL) # Deferred which fires with the RemoteReference
- d = s.shutdown() # close all servers and client connections
-
-DONE, this makes things quite clean
-
-* promise pipelining
-
-Even without third-party references, we can do E-style promise pipelining.
-
-<warner> hmm. subclass of Deferred that represents a Promise, can be
- serialized if it's being sent to the same broker as the RemoteReference it was
- generated for
-<dash> warner: hmmm. how's that help us?
-<dash> oh, pipelining?
-<warner> maybe a flag on the callRemote to say that "yeah, I want a
- DeferredPromise out of you, but I'm only going to include it as an argument to
- another method call I'm sending you, so don't bother sending *me* the result"
-<dash> aah
-<dash> yeah
-<dash> that sounds like a reasonable approach
-<warner> that would actually work
-<warner> dash: do you know if E makes any attempt to handle >2 vats in their
- pipelining implementation? seems to me it could turn into a large network
- optimization problem pretty quickly
-<dash> warner: Mmm
-<warner> hmm
-<dash> I do not think you have to
-<warner> so you have: t1=a.callRemote("foo",args1);
- t2=t1.callRemote("bar",args2), where callRemote returns a Promise, which is a
- special kind of Deferred that remembers the Broker its answer will eventually
- come from. If args2 consists of entirely immediate things (no Promises) or
- Promises that are coming from the same broker as t1 uses, then the "bar" call
- is eligible for pipelining and gets sent to the remote broker
-<warner> in the resulting newpb banana sequence, the clid of the target method
- is replaced by another kind of clid, which means "the answer you're going to
- send to method call #N", where N comes from t1
-<dash> mmm yep
-<warner> using that new I-can't-unserialize-this-yet hook we added, the second
- call sequence doesn't finish unserializing until the first call finishes and
- sends the answer. Sending answer #N fires the hook's deferred.
-<warner> that triggers the invocation of the second method
-<dash> yay
-<warner> hm, of course that totally blows away the idea of using a Constraint
- on the arguments to the second method
-<warner> because you don't even know what the object is until after the
- arguments have arrived
-<warner> but
-<dash> well
-<warner> the first method has a schema, which includes a return constraint
-<dash> okay you can't fail synchronously
-<warner> so you *can* assert that, whatever the object will be, it obeys that
- constraint
-<dash> but you can return a failure like everybody else
-<warner> and since the constraint specifies an Interface, then the Interface
- plus mehtod name is enough to come up with an argument constraint
-<warner> so you can still enforce one
-<warner> this is kind of cool
-<dash> the big advantage of pipelining is that you can have a lot of
- composable primitives on your remote interfaces rather than having to smush
- them together into things that are efficient to call remotely
-<warner> hm, yeah, as long as all the arguments are either immediate or
- reference something on the recipient
-<warner> as soon as a third party enters the equation, you have to decide
- whether to wait for the arguments to resolve locally or if it might be faster
- to throw them at someone else
-<warner> that's where the network-optimization thing I mentioned before comes
- into play
-<dash> mmm
-<warner> you send messages to A and to B, once you get both results you want
- to send the pair to C to do something with them
-<dash> spin me an example scenario
-<dash> Hmm
-<warner> if all three are close to each other, and you're far from all of
- them, it makes more sense to tell C about A and B
-<dash> how _does_ E handle that
-<warner> or maybe tell A and B about C, tell them "when you get done, send
- your results to C, who will be waiting for them"
-<dash> warner: yeah, i think that the right thing to do is to wait for them to
- resolve locally
-<Tv> assuming that C can talk to A and B is bad
-<dash> no it isn't
-<Tv> well, depends on whether you live in this world or not :)
-<dash> warner: if you want other behaviour then you should have to set it up
- explicitly, i think
-<warner> I'm not even sure how you would describe that sort of thing. It'd be
- like routing protocols, you assign a cost to each link and hope some magical
- omniscient entity can pick an optimal solution
-
-** revealing intentions
-
-<zooko> Now suppose I say "B.your_fired(C.revoke_his_rights())", or such.
-<warner> A->C: sell all my stock. A->B: declare bankruptcy
-
-If B has access to C, and the promises are pipelined, then B has a window
-during which they know something's about to happen, and they still have full
-access to C, so they can do evil.
-
-Zooko tried to explain the concern to MarkM years ago, but didn't have a
-clear example of the problem. The thing is, B can do evil all the time,
-you're just trying to revoke their capability *before* they get wind of your
-intentions. Keeping intentions secret is hard, much harder than limiting
-someone's capabilities. It's kind of the trailing edge of the capability, as
-opposed to the leading edge.
-
-Zooko feels the language needs clear support for expressing how the
-synchronization needs to take place, and which domain it needs to happen in.
-
-* web-calculus integration
-
-Tyler pointed out that it is vital for a node to be able to grant limited
-access to some held object. Specifically, Alice may want to give Bob a
-reference not to Carol as a whole, but to just a specific Carol.remote_foo
-method (and not to any other methods that Alice might be allowed to invoke).
-I had been thinking of using RemoteInterfaces to indicate method subsets,
-something like this:
-
- bob.callRemote("introduce", Facet(self, RIMinimal))
-
-but Tyler thinks that this is too coarse-grained and not likely to encourage
-the right kinds of security decisions. In his web-calculus, recipients can
-grant third-parties access to individual bound methods.
-
- bob.callRemote("introduce", carol.getMethod("howdy"))
-
-If I understand it correctly, his approach makes each Referenceable into a
-copy-by-value object that is represented by a dictionary which maps method
-names to these RemoteMethod objects, so there is no actual
-callRemote(methname) method. Instead you do something like:
-
- rr = tub.getReference(url)
- d = rr['introduce'].call(args)
-
-These RemoteMethod objects are top-level, so unguessable URLs must be
-generated for them when they are sent, and they must be reference-counted. It
-must not be possible to get from the bound method to the (unrestricted)
-referenced object.
-
-TODO: how does the web-calculus maintain reference counts for these? It feels
-like there would be an awful lot of messages being thrown around.
-
-To implement this, we'll need:
-
- banana sequences for bound methods
- ('my-method', clid, url)
- ('your-method', clid)
- ('their-method', url, RI+methname?)
- syntax to carve a single method out of a local Referenceable
- A: self.doFoo (only if we get rid of remote_)
- B: self.remote_doFoo
- C: self.getMethod("doFoo")
- D: self.getMethod(RIFoo['doFoo'])
- leaning towards C or D
- syntax to carve a single method out of a RemoteReference
- A: rr.doFoo
- B: rr.getMethod('doFoo')
- C: rr.getMethod(RIFoo['doFoo'])
- D: rr['doFoo']
- E: rr[RIFoo['doFoo']]
- leaning towards B or C
- decide whether to do getMethod early or late
- early means ('my-reference') includes a big dict of my-method values
- and a whole bunch of DECREFs when that dict goes away
- late means there is a remote_tub.getMethod(your-ref, methname) call
- and an extra round-trip to retrieve them
- dash thinks late is better
-
-We could say that the 'my-reference' sequence for any RemoteInterface-enabled
-Referenceable will include a dictionary of bound methods. The receiving end
-will just stash the whole thing.
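-
-As a minimal sketch of that carve-out (all names here are hypothetical, not
-current code): the wrapper is itself a Referenceable that forwards exactly
-one method, so holding it gives no path back to the unrestricted target.
-
-    class RemoteMethod(pb.Referenceable):
-        """Expose a single method of 'target'. The wrapped object is held
-        privately and is never itself revealed over the wire."""
-        def __init__(self, target, methname):
-            self._target = target
-            self._methname = methname
-        def remote_call(self, **kwargs):
-            m = getattr(self._target, "remote_" + self._methname)
-            return m(**kwargs)
-
-    def getMethod(self, methname):   # would live on pb.Referenceable
-        return RemoteMethod(self, methname)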
-
-* do implicit "doFoo" -> RIFoo["doFoo"] conversion
-
-I want rr.callRemote("doFoo", args) to take advantage of a RemoteInterface,
-if one is available. RemoteInterfaces aren't supposed to be overlapping (at
-least not among RemoteInterfaces that are shared by a single Referenceable),
-so there shouldn't be any ambiguity. If there is, we can raise an error.
-
-* accept Deferreds as arguments?
-
- bob.callRemote("introduce", target=self.tub.getReference(pburl))
- or
- bob.callRemote("introduce", carol.getMethod("doFoo"))
- instead of
- carol.getMethod("doFoo").addCallback(lambda r: bob.callRemote("introduce", r))
-
-If one of the top-level arguments to callRemote is a Deferred, don't send the
-method request until all the arguments resolve. If any of the arguments
-errback, the callRemote will fail with some new exception (that can contain a
-reference to the argument's exception).
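-
-A rough sketch of how the gathering could work (the helper name and its
-placement are hypothetical):
-
-    from twisted.internet import defer
-
-    def callRemoteWhenReady(rr, methname, **kwargs):
-        # resolve any Deferred arguments before sending the method request
-        names = list(kwargs.keys())
-        ds = [defer.maybeDeferred(lambda v=kwargs[n]: v) for n in names]
-        # if any argument errbacks, gatherResults fails and the method
-        # request is never sent
-        d = defer.gatherResults(ds)
-        def _send(values):
-            return rr.callRemote(methname, **dict(zip(names, values)))
-        d.addCallback(_send)
-        return d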
-
-however, this would mean the method would be invoked out-of-order w.r.t. an
-immediately-following bob.callRemote
-
-put this off until we get some actual experience.
-
-* batch decrefs?
-
-If we implement the copy-by-value Referenceable idea, then a single gc may
-result in dozens of simultaneous decrefs. It would be nice to reduce the
-traffic generated by that.
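-
-One possible shape, as a sketch only (the batched 'decrefs' message does
-not exist in the current protocol):
-
-    from twisted.internet import reactor
-
-    class DecrefBatchingMixin:
-        # hypothetical mixin for the Broker
-        _flushScheduled = False
-        def sendDecref(self, clid, count=1):
-            pending = getattr(self, '_pendingDecrefs', None)
-            if pending is None:
-                pending = self._pendingDecrefs = {}
-            pending[clid] = pending.get(clid, 0) + count
-            if not self._flushScheduled:
-                self._flushScheduled = True
-                reactor.callLater(0, self._flushDecrefs)
-        def _flushDecrefs(self):
-            # one message carries every decref from this gc pass
-            self._flushScheduled = False
-            pending, self._pendingDecrefs = self._pendingDecrefs, {}
-            self.callRemote("decrefs", pending)   # dict of clid->count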
-
-* promise pipelining
-
-Promise(Deferred).__getattr__
-
-DoS prevention techniques in CapIDL (MarkM)
-
-pb://key@ip,host,[ipv6],localhost,[/unix]/swissnumber
-tubs for lifetime management
-separate listener object, share tubs between listeners
- distinguish by key number
-
- actually, why bother with separate keys? Why allow the outside world to
- distinguish between these sub-Tubs? Use them purely for lifetime management,
- not security properties. That means a name->published-object table for each
- SubTub, maybe a hierarchy of them, and the parent-most Tub gets the
- Listeners. Incoming getReferenceByURL requests require a lookup in all Tubs
- that descend from the one attached to that listener.
-
-So one decision is whether to have implicitly-published objects have a name
-that lasts forever (well, until the Tub is destroyed), or if they should be
-reference-counted. If they are reference counted, then outstanding Gifts need
-to maintain a reference, and the gift must be turned into a live
-RemoteReference right away. It has bearing on how/if we implement SturdyRefs,
-so I need to read more about them in the E docs.
-
-Hrm, and creating new Tubs from within a remote_foo method.. to make that
-useful, you'd need to have a way to ask for the Tub through which you were
-being invoked. hrm.
-
-* creating new Tubs
-
-Tyler suggests using Tubs for namespace management. Tubs can share TCP
-listening ports, but MarkS recommends giving them all separate keys (which
-means separate SSL sessions, so separate TCP connections). Bill Frantz
-discourages using a hierarchy of Tubs, says it's not the sort of thing you
-want to be locked into.
-
-That means I'll need a separate Listener object, where the rule is that the
-last Tub to be stopped makes the Listener stop too.. probably abuse the
-Service interface in some wacky way to pull this off.
-
-Creating a new Tub.. how to conveniently create it with the same Listeners as
-the current one? If the method that's creating the Tub is receiving a
-reference, the Tub can be an attribute of the inbound RemoteReference. If
-not, that's trickier.. the _tub= argument may still be a useful way to go.
-Once you've got a source tub, then tub.newTub() should create a new one with
-the same Listeners as the source (but otherwise unassociated with it).
-
-Once you have the new Tub, registering an object in it should return
-something that can be directly serialized into a gift.
-
-class Target(pb.Referenceable):
- def remote_startGame(self, player_black, player_white):
- tub = player_black.tub.newTub()
- game = self.createGame()
- gameref = tub.register(game)
- game.setPlayer("black", tub.something(player_black))
- game.setPlayer("white", tub.something(player_white))
- return gameref
-
-Hmm. So, create a SturdyRef class, which remembers the tubid (key), list of
-location hints, and object name. These have a url() method that renders out a
-URL string, and a compare method which compares the tubid and object name but
-ignores the location hints. Serializing a SturdyRef creates a their-reference
-sequence. Tub.register takes an object (and maybe a name) and returns a
-SturdyRef. Tub.getReference takes either a URL or a SturdyRef.
-RemoteReferences should have a .getSturdyRef method.
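-
-A sketch of that class (attribute names are assumptions):
-
-    class SturdyRef:
-        def __init__(self, tubid, locationHints, name):
-            self.tubid = tubid
-            self.locationHints = locationHints   # not part of identity
-            self.name = name
-        def url(self):
-            return "pb://%s@%s/%s" % (self.tubid,
-                                      ",".join(self.locationHints),
-                                      self.name)
-        def __eq__(self, other):
-            # location hints are ignored: same tub plus same name means
-            # the same reference
-            return (self.tubid, self.name) == (other.tubid, other.name)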
-
-Actually, I think SturdyRefs should be serialized as Copyables, and create
-SturdyRefs on the other side. The new-tub sequence should be:
-
- create new tub, using the Listener from an existing tub
- register the objects in the new tub, obtaining a SturdyRef
- send/return SendLiveRef(sturdyref) to the far side
- SendLiveRef is a wrapper that causes a their-reference sequence to be sent.
- The alternative is to obtain an actual live reference (via
- player_black.tub.getReference(sturdyref) first), then send that, but it's
- kind of a waste if you don't actually want to use the liveref yourself.
-
-Note that it becomes necessary to provide for local references here: ones in
-different Tubs which happen to share a Listener. These can use real TCP
-connections (unless the Listener hint is only valid from the outside world).
-It might be possible to use some tricks to cut out some of the network overhead,
-but I suspect there are reasons why you wouldn't actually want to do that.
--- /dev/null
+-*- outline -*-
+
+non-independent things left to do on newpb. These require deeper magic or
+cannot otherwise be done casually. Many of these involve fundamental
+protocol issues, and therefore need to be decided sooner rather than later.
+
+* summary
+** protocol issues
+*** negotiation
+*** VOCABADD/DEL/SET sequences
+*** remove 'copy' prefix from RemoteCopy type sequences?
+*** smaller scope for OPEN-counter reference numbers?
+** implementation issues
+*** cred
+*** oldbanana compatibility
+*** Copyable/RemoteCopy default to __getstate__ or self.__dict__ ?
+*** RIFoo['bar'] vs RIFoo.bar (should RemoteInterface inherit from Interface?)
+*** constrain ReferenceUnslicer
+*** serialize target.remote_foo usefully
+
+* decide whether to accept positional args in non-constrained methods
+
+DEFERRED until after 2.0
+<glyph> warner: that would be awesome but let's do it _later_
+
+This is really a backwards-source-compatibility issue. In newpb, the
+preferred way of invoking callRemote() is with kwargs exclusively: glyph
+feels positional arguments are more fragile. If the client has a
+RemoteInterface, then they can convert any positional arguments into keyword
+arguments before sending the request.
+
+The question is what to do when the client is not using a RemoteInterface.
+Until recently, callRemote("bar") would try to find a matching RI. I changed
+that to have callRemote("bar") never use an RI, and instead you would use
+callRemote(RIFoo['bar']) to indicate that you want argument-checking.
+
+That makes positional arguments problematic in more situations than they were
+before. The decision to be made is if the OPEN(call) sequence should provide
+a way to convey positional args to the server (probably with numeric "names"
+in the (argname, argvalue) tuples). If we do this, the server (which always
+has the RemoteInterface) can do the positional-to-keyword mapping. But
+putting this in the protocol will oblige other implementations to handle them
+too.
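+
+If we did adopt numeric "names", the server-side mapping would be a small
+step during argument handling; a sketch (the .argumentNames attribute on
+the method schema is an assumption):
+
+    def mapPositionalArguments(methodSchema, argpairs):
+        # argpairs: the (argname, argvalue) tuples off the wire, where
+        # positional arguments carry numeric "names" like "0", "1"
+        kwargs = {}
+        for argname, argvalue in argpairs:
+            if argname.isdigit():
+                argname = methodSchema.argumentNames[int(argname)]
+            kwargs[argname] = argvalue
+        return kwargs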
+
+* change the method-call syntax to include an interfacename
+DONE
+
+Scope the method name to the interface. This implies (I think) one of two
+things:
+
+ callRemote() must take a RemoteInterface argument
+
+ each RemoteReference handles just a single Interface
+
+Probably the latter; maybe have the RR keep both a default RI and a list of
+all implemented ones, so that adapting the RR to a new RI can be a simple copy
+(and change of the default one) if the Referenceable knows about the RI.
+Otherwise something on the local side will need to adapt one RI to another.
+Need to handle reference-counting/DECREF properly for these shared RRs.
+
+From glyph:
+
+ callRemote(methname, **args) # searches RIs
+ callRemoteInterface(remoteinterface, methname, **args) # single RI
+
+ getRemoteURL(url, *interfaces)
+
+ URL-RRefs should turn into the original Referenceable (in args/results)
+ (map through the factory's table upon receipt)
+
+ URL-RRefs will not survive round trips. leave reference exchange for later.
+ (like def remote_foo(): return GlobalReference(self) )
+
+ move method-invocation code into pb.Referenceable (or IReferenceable
+ adapter). Continue using remote_ prefix for now, but make it a property of
+ that code so it can change easily.
+ <warner> ok, for today I'm just going to stick with remote_foo() as a
+ low-budget decorator, so the current restrictions are 1: subclass
+ pb.Referenceable, 2: implements() a RemoteInterface with method named "foo",
+ 3: implement a remote_foo method
+ <warner> and #1 will probably go away within a week or two, to be replaced by
+ #1a: subclass pb.Referenceable OR #1b: register an IReferenceable adapter
+
+ try serializing with ISliceable first, then try IReferenceable. The
+ IReferenceable adapter must implements() some RemoteInterfaces and gets
+ serialized with a MyReferenceSlicer.
+
+http://svn.twistedmatrix.com/cvs/trunk/pynfo/admin.py?view=markup&rev=44&root=pynfo
+
+** use the methods of the RemoteInterface as the "method name"
+DONE (provisional), using RIFoo['add']
+
+ rr.callRemote(RIFoo.add, **args)
+
+Nice and concise. However, #twisted doesn't like it, adding/using arbitrary
+attributes of Interfaces is not clean (think about IFoo.implements colliding
+with RIFoo.something).
+
+ rr.callRemote(RIFoo['add'], **args)
+ RIFoo(rr).callRemote('add', **args)
+ adaptation, or narrowing?
+
+<warner> glyph: I'm adding callRemote(RIFoo.bar, **args) to newpb right now
+<radix> wow.
+<warner> seemed like a simpler interface than callRemoteInterface("RIFoo",
+"bar", **args)
+<radix> warner: Does this mean that IPerspective can be parameterized now?
+<glyph> warner: bad idea
+<exarkun> warner: Zope hates you!
+<glyph> warner: zope interfaces don't support that syntax
+<slyphon> zi does support multi-adapter syntax
+<slyphon> but i don't really know what that is
+<exarkun> warner: callRemote(RIFoo.getDescriptionFor("bar"), *a, **k)
+<warner> glyph: yeah, I fake it. In RemoteInterfaceClass, I remove those
+attributes, call InterfaceClass, and then put them all back in
+<glyph> warner: don't add 'em as attributes
+<glyph> warner: just fix the result of __getitem__ to add a slot actually
+refer back to the interface
+<glyph> radix: the problem is that IFoo['bar'] doesn't point back to IFoo
+<glyph> warner: even better, make them callable :-)
+<exarkun> glyph: IFoo['bar'].interface == 'IFoo'
+<glyph> RIFoo['bar']('hello')
+<warner> glyph: I was thinking of doing that in a later version of
+RemoteInterface
+<glyph> exarkun: >>> type(IFoo['bar'].interface)
+<glyph> <type 'str'>
+<exarkun> right
+<exarkun> 'IFoo'
+<exarkun> Just look through all the defined interfaces for ones with matching
+names
+<glyph> exarkun: ... e.g. *NOT* __main__.IFoo
+<glyph> exarkun: AAAA you die
+<radix> hee hee
+* warner struggles to keep up with his thoughts and those of people around him
+* glyph realizes he has been given the power to whine
+<warner> glyph: ok, so with RemoteInterface.__getitem__, you could still do
+rr.callRemote(RIFoo.bar, **kw), right?
+<warner> was your objection to the interface or to the implementation?
+<itamar> I really don't think you should add attributes to the interface
+<warner> ok
+<warner> I need to stash a table of method schemas somewhere
+<itamar> just make __getitem__ return better type of object
+<itamar> and ideally if this is generic we can get it into upstream
+<exarkun> Is there a reason Method.interface isn't a fully qualified name?
+<itamar> not necessarily
+<itamar> I have commit access to zope.interface
+<itamar> if you have any features you want added, post to
+interface-dev@zope.org mailing list
+<itamar> and if Jim Fulton is ok with them I can add them for you
+<warner> hmm
+<warner> does using RIFoo.bar to designate a remote method seem reasonable?
+<warner> I could always adapt it to something inside callRemote
+<warner> something PB-specific, that is
+<warner> but that adapter would have to be able to pull a few attributes off
+the method (name, schema, reference to the enclosing RemoteInterface)
+<warner> and we're really talking about __getattr__ here, not __getitem__,
+right?
+<exarkun> for x.y yes
+<itamar> no, I don't think that's a good idea
+<itamar> interfaces have all kinds od methods on them already, for
+introspection purposes
+<itamar> namespace clashes are the suck
+<itamar> unless RIFoo isn't really an Interface
+<itamar> hm
+<itamar> how about if it were a wrapper around a regular Interface?
+<warner> yeah, RemoteInterfaces are kind of a special case
+<itamar> RIFoo(IFoo, publishedMethods=['doThis', 'doThat'])
+<itamar> s/RIFoo/RIFoo = RemoteInterface(/
+<exarkun> I'm confused. Why should you have to specify which methods are
+published?
+<itamar> SECURITY!
+<itamar> not actually necessary though, no
+<itamar> and may be overkill
+<warner> the only reason I have it derive from Interface is so that we can do
+neat adapter tricks in the future
+<itamar> that's not contradictory
+<itamar> RIFoo(x) would still be able to do magic
+<itamar> you wouldn't be able to check if an object provides RIFoo, though
+<itamar> which kinda sucks
+<itamar> but in any case I am against RIFoo.bar
+<warner> pity, it makes the callRemote syntax very clean
+<radix> hm
+<radix> So how come it's a RemoteInterface and not an Interface, anyway?
+<radix> I mean, how come that needs to be done explicitly. Can't you just
+write a serializer for Interface itself?
+
+* warner goes to figure out where the RemoteInterface discussion went after he
+ got distracted
+<warner> maybe I should make RemoteInterface a totally separate class and just
+implement a couple of Interface-like methods
+<warner> cause rr.callRemote(IFoo.bar, a=1) just feels so clean
+<Jerub> warner: why not IFoo(rr).bar(a=1) ?
+<warner> hmm, also a possibility
+<radix> well
+<radix> IFoo(rr).callRemote('bar')
+<radix> or RIFoo, or whatever
+<Jerub> hold on, what does rr inherit from?
+<warner> RemoteReference
+<radix> it's a RemoteReference
+<Jerub> then why not IFoo(rr) /
+<warner> I'm keeping a strong distinction between local interfaces and remote
+ones
+<Jerub> ah, oka.y
+<radix> warner: right, you can still do RIFoo
+<warner> ILocal(a).meth(args) is an immediate function call
+<Jerub> in that case, I prefer rr.callRemote(IFoo.bar, a=1)
+<radix> .meth( is definitely bad, we need callRemote
+<warner> rr.callRemote("meth", args) returns a deferred
+<Jerub> radix: I don't like from foo import IFoo, RIFoo
+<warner> you probably wouldn't have both an IFoo and an RIFoo
+<radix> warner: well, look at it this way: IFoo(rr).callRemote('foo') still
+makes it obvious that IFoo isn't local
+<radix> warner: you could implement RemoteReferen.__conform__ to implement it
+<warner> radix: I'm thinking of providing some kind of other class that would
+allow .meth() to work (without the callRemote), but it wouldn't be the default
+<radix> plus, IFoo(rr) is how you use interfaces normally, and callRemote is
+how you make remote calls normally, so it seems that's the best way to do
+interfaces + PB
+<warner> hmm
+<warner> in that case the object returned by IFoo(rr) is just rr with a tag
+that sets the "default interface name"
+<radix> right
+<warner> and callRemote(methname) looks in that default interface before
+looking anywhere else
+<warner> for some reason I want to get rid of the stringyness of the method
+name
+<warner> and the original syntax (callRemoteInterface('RIFoo', 'methname',
+args)) felt too verbose
+<radix> warner: well, isn't that what your optional .meth thing is for?
+<radix> yes, I don't like that either
+<warner> using callRemote(RIFoo.bar, args) means I can just switch on the
+_name= argument being either a string or a (whatever) that's contained in a
+RemoteInterface
+<warner> a lot of it comes down to how adapters would be most useful when
+dealing with remote objects
+<warner> and to what extent remote interfaces should be interchangeable with
+local ones
+<radix> good point. I have never had a use case where I wanted to adapt a
+remote object, I don't think
+<radix> however, I have had use cases to send interfaces across the wire
+<radix> e.g. having a parameterized portal.login() interface
+<warner> that'll be different, just callRemote('foo', RIFoo)
+<radix> yeah.
+<warner> the current issue is whether to pass them by reference or by value
+<radix> eugh
+<radix> Can you explain it without using those words? :)
+<warner> hmm
+<radix> Do you mean, Referenceable style vs Copyable style?
+<warner> at the moment, when you send a Referenceable across the wire, the
+id-number is accompanied with a list of strings that designate which
+RemoteInterfaces the original claims to provide
+<warner> the receiving end looks up each string in a local table, and
+populates the RemoteReference with a list of RemoteInterface classes
+<warner> the table is populated by metaclass magic that runs when a 'class
+RIFoo(RemoteInterface)' definition is complete
+<radix> ok
+<radix> so a RemoteInterface is simply serialized as its qual(), right?
+<warner> so as long as both sides include the same RIFoo definition, they'll
+wind up with compatible remote interfaces, defining the same method names,
+same method schemas, etc
+<warner> effectively
+<warner> you can't just send a RemoteInterface across the wire right now, but
+it would be easy to add
+<warner> the places where they are used (sending a Referenceable across the
+wire) all special case them
+<radix> ok, and you're considering actually writing a serializer for them that
+sends all the information to totally reconstruct it on the other side without
+having the definiton
+<warner> yes
+<warner> or having some kind of debug method which give you that
+<radix> I'd say, do it the way you're doing it now until someone comes up with
+a use case for actually sending it...
+<warner> right
+<warner> the only case I can come up with is some sort of generic object
+browser debug tool
+<warner> everything else turns into a form of version negotiation which is
+better handled elsewhere
+<warner> hmm
+<warner> so RIFoo(rr).callRemote('bar', **kw)
+<warner> I guess that's not too ugly
+<radix> That's my vote. :)
+<warner> one thing it lacks is the ability to cleanly state that if 'bar'
+doesn't exist in RIFoo then it should signal an error
+<warner> whereas callRemote(RIFoo.bar, **kw) would give you an AttributeError
+before callRemote ever got called
+<warner> i.e. "make it impossible to express the incorrect usage"
+<radix> mmmh
+<radix> warner: but you _can_ check it immediately when it's called
+<warner> in the direction I was heading, callRemote(str) would just send the
+method request and let the far end deal with it, no schema-checking involved
+<radix> warner: which, 99% of the time, is effectively the same time as
+IFoo.bar would happen
+<warner> whereas callRemote(RIFoo.bar) would indicate that you want schema
+checking
+<warner> yeah, true
+<radix> hm.
+<warner> (that last feature is what allowed callRemote and callRemoteInterface
+to be merged)
+<warner> or, I could say that the normal RemoteReference is "untyped" and does
+not do schema checking
+<warner> but adapting one to a RemoteInterface results in a
+TypedRemoteReference which does do schema checking
+<warner> and which refuses to be invoked with method names that are not in the
+schema
+<radix> warner: we-ell
+<radix> warner: doing method existence checking is cool
+<radix> warner: but I think tying any further "schema checking" to adaptation
+is a bad idea
+<warner> yeah, that's my hunch too
+<warner> which is why I'd rather not use adapters to express the scope of the
+method name (which RemoteInterface it is supposed to be a part of)
+<radix> warner: well, I don't think tying it to callRemote(RIFoo.methName)
+would be a good idea just the same
+<warner> hm
+<warner> so that leaves rr.callRemote(RIFoo['add']) and
+rr.callRemoteInterface(RIFoo, 'add')
+<radix> OTOH, I'm inclined to think schema checking should happen by default
+<radix> It's just a the matter of where it's parameterized
+<warner> yeah, it's just that the "default" case (rr.callRemote('name')) needs
+to work when there aren't any RemoteInterfaces declared
+<radix> warner: oh
+<warner> but if we want to encourage people to use the schemas, then we need
+to make that case simple and concise
+* radix goes over the issue in his head again
+<radix> Yes, I think I still have the same position.
+<warner> which one? :)
+<radix> IFoo(rr).callRemote("foo"); which would do schema checking because
+schema checking is on by default when it's possible
+<warner> using an adaptation-like construct to declare a scope of the method
+name that comes later
+<radix> well, it _is_ adaptation, I think.
+<radix> Adaptation always has plugged in behavior, we're just adding a bit
+more :)
+<warner> heh
+<warner> it is a narrowing of capability
+<radix> hmm, how do you mean?
+<warner> rr.callRemote("foo") will do the same thing
+<warner> but rr.callRemote("foo") can be used without the remote interfaces
+<radix> I think I lost you.
+<warner> if rr has any RIs defined, it will try to use them (and therefore
+complain if "foo" does not exist in any of them, or if the schema is violated)
+<radix> Oh. That's strange.
+<radix> So it's really quite different from how interfaces regularly work...
+<warner> yeah
+<warner> except that if you were feeling clever you could use them the normal
+way
+<radix> Well, my inclination is to make them work as similarly as possible.
+<warner> "I have a remote reference to something that implements RIFoo, but I
+want to use it in some other way"
+<radix> s/possible/practical/
+<warner> then IBar(rr) or RIBar(rr) would wrap rr in something that knows how
+to translate Bar methods into RIFoo remote methods
+<radix> Maybe it's not practical to make them very similar.
+<radix> I see.
+
+rr.callRemote(RIFoo.add, **kw)
+rr.callRemote(RIFoo['add'], **kw)
+RIFoo(rr).callRemote('add', **kw)
+
+I like the second one. Normal Interfaces behave like a dict, so IFoo['add']
+gets you the method-describing object (z.i.i.Method). My RemoteInterfaces
+don't do that right now (because I remove the attributes before handing the
+RI to z.i), but I could probably fix that. I could either add attributes to
+the Method or hook __getitem__ to return something other than a Method
+(maybe a RemoteMethodSchema).
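+
+A sketch of the __getitem__ hook (RemoteMethodSchema here is a stand-in,
+not the real class):
+
+    from zope.interface.interface import InterfaceClass
+
+    class RemoteMethodSchema:
+        # records where the method came from; the real class would parse
+        # the signature into constraints
+        def __init__(self, interface, name, method):
+            self.interface = interface
+            self.name = name
+            self.method = method   # keeps .getSignatureInfo() reachable
+
+    class RemoteInterfaceClass(InterfaceClass):
+        def __getitem__(self, name):
+            # hand back our schema object instead of the plain z.i.i.Method
+            method = InterfaceClass.__getitem__(self, name)
+            return RemoteMethodSchema(self, name, method)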
+
+Those Method objects have a .getSignatureInfo() which provides almost
+everything I need to construct the RemoteMethodSchema. Perhaps I should
+post-process Methods rather than pre-process the RemoteInterface. I can't
+tell how to use the return value trick, and it looks like the function may
+be discarded entirely once the Method is created, so this approach may not
+work.
+
+On the server side (Referenceable), subclassing Interface is nice because it
+provides adapters and implements() queries.
+
+On the client side (RemoteReference), subclassing Interface is a hassle: I
+don't think adapters are as useful, but getting at a method (as an attribute
+of the RI) is important. We have to bypass most of Interface to parse the
+method definitions differently.
+
+* create UnslicerRegistry, registerUnslicer
+DONE (PROVISIONAL), flat registry (therefore problematic for len(opentype)>1)
+
+consider adopting the existing collection API (getChild, putChild) for this,
+or maybe allow registerUnslicer() to take a callable which behaves kind of
+like a twisted.web isLeaf=1 resource (stop walking the tree, give all index
+tokens to the isLeaf=1 node)
+
+also some APIs to get a list of everything in the registry
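+
+A sketch of that hybrid tree lookup (class and attribute names are made
+up, borrowing getChild/putChild/isLeaf from twisted.web):
+
+    class UnslicerRegistry:
+        def __init__(self, unslicerFactory=None):
+            self.children = {}    # index token -> child registry
+            self.leaf = None      # callable: takes the remaining tokens
+            self.unslicerFactory = unslicerFactory
+        def putChild(self, token, child):
+            self.children[token] = child
+        def lookup(self, opentype):
+            node = self
+            for i, token in enumerate(opentype):
+                if node.leaf is not None:
+                    return node.leaf(opentype[i:])   # stop walking the tree
+                node = node.children[token]
+            return node.unslicerFactory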
+
+* use metaclass to auto-register RemoteCopy classes
+DONE
+
+** use metaclass to auto-register Unslicer classes
+DONE
+
+** and maybe Slicer classes too
+DONE with name 'slices', perhaps change to 'slicerForClasses'?
+
+ class FailureSlicer(slicer.BaseSlicer):
+ classname = "twisted.python.failure.Failure"
+ slicerForClasses = (failure.Failure,) # triggers auto-register
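+
+The registration half is a few lines of metaclass; a minimal sketch (the
+registry name is made up, and BaseSlicer would set __metaclass__):
+
+    SlicerRegistry = {}   # maps python type -> Slicer subclass
+
+    class SlicerClass(type):
+        def __init__(cls, name, bases, dct):
+            type.__init__(cls, name, bases, dct)
+            # 'slices' is the spelling currently in use; the example above
+            # shows the proposed 'slicerForClasses' alternative
+            for t in dct.get('slices', dct.get('slicerForClasses', ())):
+                SlicerRegistry[t] = cls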
+
+** various registry approaches
+DONE
+
+There are currently three kinds of registries used in banana/newpb:
+
+ RemoteInterface <-> interface name
+ class/type -> Slicer (-> opentype) -> Unslicer (-> class/type)
+ Copyable subclass -> copyable-opentype -> RemoteCopy subclass
+
+There are two basic approaches to representing the mappings that these
+registries implement. The first is implicit, where the local objects are
+subclassed from Sliceable or Copyable or RemoteInterface and have attributes
+to define the wire-side strings that represent them. On the receiving side,
+we make extensive use of metaclasses to perform automatic registration
+(taking names from class attributes and mapping them to the factory or
+RemoteInterface used to create the remote version).
+
+The second approach is explicit, where pb.registerRemoteInterface,
+pb.registerRemoteCopy, and pb.registerUnslicer are used to establish the
+receiving-side mapping. There isn't a clean way to do it explicitly on the
+sending side, since we already have instances whose classes can give us
+whatever information we want.
+
+The advantage of implicit is simplicity: no more questions about why my
+pb.RemoteCopy is giving "not unserializable" errors. The mere act of
+importing a module is enough to let PB create instances of its classes.
+
+The advantage of doing it explicitly is to remind the user about the
+existence of those maps, because the set of factory classes in the receiving
+map is precisely equal to the user's exposure (from a security point of
+view). See
+the E paper on secure-serialization for some useful concepts.
+
+A disadvantage of implicit is that you can't quite be sure what, exactly,
+you're exposed to: the registrations take place all over the place.
+
+To make explicit not so painful, we can use quotient's .wsv files
+(whitespace-separated values) which map from class to string and back again.
+The file could list fully-qualified classname, wire-side string, and
+receiving factory class on each line. The Broker (or rather the RootSlicer
+and RootUnslicer) would be given a set of .wsv files to define their
+mapping. It would get all the registrations at once (instead of having them
+scattered about). They could also demand-load the receive-side factory
+classes.
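+
+A loader for such a file might look like this sketch (the registerSlicer/
+registerUnslicer calls are assumed names, not existing APIs):
+
+    from twisted.python.reflect import namedAny
+
+    def loadWSVRegistrations(broker, filename):
+        for line in open(filename):
+            line = line.strip()
+            if not line or line.startswith('#'):
+                continue
+            sendclass, wirename, recvfactory = line.split()
+            # namedAny() imports on demand, giving us the demand-loading
+            # of receive-side factory classes mentioned above
+            broker.registerSlicer(namedAny(sendclass), wirename)
+            broker.registerUnslicer(wirename, namedAny(recvfactory))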
+
+For now, go implicit. Put off the decision until we have some more
+experience with using newpb.
+
+* move from VocabSlicer sequence to ADDVOCAB/DELVOCAB tokens
+
+Requires a .wantVocabString flag in the parser, which is kind of icky but
+fixes the annoying asymmetry between set (vocab sequence) and get (VOCAB
+token). Might want a CLEARVOCAB token too.
+
+On second thought, this won't work. There isn't room for both a vocab number
+and a variable-length string in a single token. It must be an open sequence.
+However, it could be an add/del/set-vocab sequence, allowing the vocab to be
+modified incrementally.
+
+** VOCABize interface/method names
+
+One possibility is to make a list of all strings used by all known
+RemoteInterfaces and all their methods, then send it at broker connection
+time as the initial vocab map. A better one (maybe) is to somehow track what
+we send and add a word to the vocab once we've sent it more than three
+times.
+
+Maybe vocabize the pairs, as "ri/name1","ri/name2", etc, or maybe do them
+separately. Should do some handwaving math to figure out which is better.
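+
+A first pass at that handwaving math: with I interfaces of M methods each,
+vocabizing the pairs costs I*M table entries but only one VOCAB token per
+method call, while separate entries cost just I+M table slots at the price
+of two tokens per call. So pairs trade roughly (I*M - I - M) extra table
+entries (each of which must also be transmitted once) for one token of
+wire savings on every call; pairs only win when a few methods are invoked
+very frequently.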
+
+* nail down some useful schema syntaxes
+
+This has two parts: parsing something like a __schema__ class attribute (see
+the sketches in schema.xhtml) into a tree of FooConstraint objects, and
+deciding how to retrieve schemas at runtime from things like the object being
+serialized or the object being called from afar. To be most useful, the
+syntax needs to mesh nicely with (read: "be identical to") things like formless and
+(maybe?) atop or whatever has replaced the high-density highly-structured
+save-to-disk scheme that twisted.world used to do.
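+
+One possible shape for the RemoteInterface side of this, in the spirit of
+the sketches in schema.xhtml (a sketch of the syntax, not a decision):
+
+    class RIMath(RemoteInterface):
+        def add(a=int, b=int):
+            # both arguments and the return value are constrained to ints
+            return int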
+
+Some lingering questions in this area:
+
+ When an object has a remotely-invokable method, where does the appropriate
+ MethodConstraint come from? Some possibilities:
+
+ an attribute of the method itself: obj.method.__schema__
+
+ from inside a __schema__ attribute of the object's class
+
+ from inside a __schema__ attribute of an Interface (which?) that the object
+ implements
+
+ Likewise, when a caller holding a RemoteReference invokes a method on it, it
+ would be nice to enforce a schema on the arguments they are sending to the
+ far end ("be conservative in what you send"). Where should this schema come
+ from? It is likely that the sender only knows an Interface for their
+ RemoteReference.
+
+ When PB determines that an object wants to be copied by value instead of by
+ reference (pb.Copyable subclass, Copyable(obj), schema says so), where
+ should it find a schema to define what exactly gets copied over? A class
+ attribute of the object's class would make sense: most objects would do
+ this, some could override jellyFor to get more control, and others could
+ override something else to push a new Slicer on the stack and do streaming
+ serialization. Whatever the approach, it needs to be paralleled by the
+ receiving side's unjellyableRegistry.
+
+* RemoteInterface instances should have an "RI-" prefix instead of "I-"
+
+DONE
+
+* merge my RemoteInterface syntax with zope.interface's
+
+I hacked up a syntax for how method definitions are parsed in
+RemoteInterface objects. That syntax isn't compatible with the one
+zope.interface uses for local methods, so I just delete them from the
+attribute dictionary to avoid causing z.i indigestion. It would be nice if
+they were compatible so I didn't have to do that. This basically translates
+into identifying the nifty extra flags (like priority classes, no-response)
+that we want on these methods and finding a z.i-compatible way to implement
+them. It also means thinking of SOAP/XML-RPC schemas and having a syntax
+that can represent everything at once.
+
+
+* use adapters to enable pass-by-reference or pass-by-value
+
+It should be possible to pass a reference with variable forms:
+
+ rr.callRemote("foo", 1, Reference(obj))
+ rr.callRemote("bar", 2, Copy(obj))
+
+This should probably adapt the object to IReferenceable or ICopyable, which
+are like ISliceable except they can pass the object by reference or by
+value. The slicing process should be:
+
+ look up the type() in a table: this handles all basic types
+ else adapt the object to ISliceable, use the result
+ else raise an Unsliceable exception
+ (and point the user to the docs on how to fix it)
+
+The adapter returned by IReferenceable or ICopyable should implement
+ISliceable, so no further adaptation will be done.
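+
+As a sketch of that lookup order (the Unsliceable exception is
+hypothetical):
+
+    def slicerForObject(self, obj):        # method on the sending Broker
+        slicerClass = self.slicerTable.get(type(obj))   # all basic types
+        if slicerClass is not None:
+            return slicerClass(obj)
+        slicer = ISliceable(obj, None)     # adaptation, None if it fails
+        if slicer is not None:
+            return slicer
+        raise Unsliceable("cannot serialize %r; see the ISliceable docs"
+                          % (obj,))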
+
+* remove 'copy' prefix from remotecopy banana type names?
+
+<glyph> warner: did we ever finish our conversation on the usefulness of the
+(copy foo blah) namespace rather than just (foo blah)?
+<warner> glyph: no, I don't think we did
+<glyph> warner: do you still have (copy foo blah)?
+<warner> glyph: yup
+<warner> so far, it seems to make some things easier
+<warner> glyph: the sender can subclass pb.Copyable and not write any new
+code, while the receiver can write an Unslicer and do a registerRemoteCopy
+<warner> glyph: instead of the sender writing a whole slicer and the receiver
+registering at the top-level
+<glyph> warner: aah
+<warner> glyph: although the fact that it's easier that way may be an artifact
+of my sucky registration scheme
+<glyph> warner: so the advantage is in avoiding registration of each new
+unslicer token?
+<glyph> warner: yes. I'm thinking that a metaclass will handily remove the
+need for extra junk in the protocol ;)
+<warner> well, the real reason is my phobia about namespace purity, of course
+<glyph> warner: That's what the dots are for
+<warner> but ease of dispatch is also important
+<glyph> warner: I'm concerned about it because I consider my use of the same
+idiom in the first version of PB to be a serious wart
+* warner nods
+<warner> I will put together a list of my reasoning
+<glyph> warner: I think it's likely that PB implementors in other languages
+are going to want to introduce new standard "builtin" types; our "builtins"
+shouldn't be limited to python's provided data structures
+<moshez> glyph: wait
+<warner> ok
+<moshez> glyph: are you talking of banana types
+<moshez> glyph: or really PB
+<warner> in which case (copy blah blah) is a non-builtin type, while
+(type-foo) is a builtin type
+<glyph> warner: plus, our namespaces are already quite well separated, I can
+tell you I will never be declaring new types outside of quotient.* and
+twisted.* :)
+<warner> moshez: this is mostly banana (or what used to be jelly, really)
+<glyph> warner: my inclination is to standardize by convention
+<glyph> warner: *.* is a non-builtin type, [~.] is a builtin
+<moshez> glyph: ?
+<glyph> sorry [^.]*
+<glyph> my regular expressions and shell globs are totally confused but you
+know what I mean
+<glyph> moshez: yes
+<moshez> glyph: hrm
+<saph_w> glyph: you're making crazy anime faces
+<moshez> glyph: why do we need any non-Python builtin types
+<glyph> moshez: because I want to destroy SOAP, and doing that means working
+with people I don't like
+<glyph> moshez: outside of python
+<moshez> glyph: I meant, "what specific types"
+<moshez> I'd appreciate a blog on that
+
+* have Copyable/RemoteCopy default to __getstate__/__setstate__?
+
+At the moment, the default implementations of getStateToCopy() and
+setCopyableState() get and set __dict__ directly. Should the default instead
+be to call __getstate__() or __setstate__()?
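+
+The alternative default would be a two-line change; a sketch:
+
+    class PickleStyleCopyable(pb.Copyable):
+        def getStateToCopy(self):
+            return self.__getstate__()     # instead of self.__dict__
+
+    class PickleStyleRemoteCopy(pb.RemoteCopy):
+        def setCopyableState(self, state):
+            self.__setstate__(state)       # instead of updating __dict__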
+
+* make slicer/unslicers for pb.RemoteInterfaces
+
+exarkun's use case requires these Interfaces to be passable by reference
+(i.e. by name). It would also be interesting to let them be passed (and
+requested!) by value, so you can ask a remote peer exactly what their
+objects will respond to (the method names, the argument values, the return
+value). This also requires that constraints be serializable.
+
+do this, should be referenceable (round-trip should return the same object),
+should use the same registration lookup that RemoteReference(interfacelist)
+uses
+
+* investigate decref/Referenceable race
+
+Any object that includes some state when it is first sent across the wire
+needs more thought. The far end could drop the last reference (at time t=1)
+while a method is still pending that wants to send back the same object. If
+the method finishes at time t=2 but the decref isn't received until t=3, the
+object will be sent across the wire without the state, and the far end will
+receive it for the "first" time without that associated state.
+
+This kind of conserve-bandwidth optimization may be a bad idea. Or there
+might be a reasonable way to deal with it (maybe request the state if it
+wasn't sent and the recipient needs it, and delay delivery of the object
+until the state arrives).
+
+DONE, the RemoteReference is held until the decref has been acked. As long as
+the methods are executed in-order, this will prevent the race. TODO:
+third-party references (and other things that can cause out-of-order
+execution) could mess this up.
+
+* sketch out how to implement glyph's crazy non-compressed sexpr encoding
+
+* consider a smaller scope for OPEN-counter reference numbers
+
+For newpb, we moved to implicit reference numbers (counting OPEN tags
+instead of putting a number in the OPEN tag) because we didn't want to burn
+so much bandwidth: it isn't feasible to predict whether your object will
+need to be referenced in the future, so you always have to be prepared to
+reference it, so we always burn the memory to keep track of them (generally
+in a ScopedSlicer subclass). If we used explicit refids then we'd have to
+burn the bandwidth too.
+
+The sorta-problem is that these numbers will grow without bound as long as
+the connection remains open. After a few hours of sending 100-byte objects
+over a 100MB connection, you'll hit 1G-references and will have to start
+sending them as LONGINT tokens, which is annoying and slightly verbose (say
+3 or 4 bytes of number instead of 1 or 2). You never keep track of that many
+actual objects, because the references do not outlive their parent
+ScopedSlicer.
+
+The fact that the references themselves are scoped to the ScopedSlicer
+suggests that the reference numbers could be too. Each ScopedSlicer would
+track the number of OPEN tokens emitted (actually the number of
+slicerForObject calls made, except you'd want to use a different method to
+make sure that children who return a Slicer themselves don't corrupt the
+OPEN count).
+
+This requires careful synchronization between the ScopedSlicers on one end
+and the ScopedUnslicers on the other. I suspect it would be slightly
+fragile.
+
+One sorta-benefit would be that a somewhat human-readable sexpr-based
+encoding would be even more human readable if the reference numbers stayed
+small (you could visually correlate objects and references more easily). The
+ScopedSlicer's open-parenthesis could be represented with a curly brace or
+something, then the refNN number would refer to the NN'th left-paren from
+the last left-brace. It would also make it clear that the recipient will not
+care about objects outside that scope.
+
+* implement the FDSlicer
+
+Over a unix socket, you can pass fds. exarkun had a presentation at PyCon04
+describing the use of this to implement live application upgrade. I think
+that we could make a simple FDSlicer to hide the complexity of the
+out-of-band part of the communication.
+
+import struct
+import socket
+
+from twisted.internet import unix
+from twisted.python import log
+# sendmsg()/recvmsg() and SCM_RIGHTS come from a platform extension module
+# (fd-passing is not exposed by the stdlib socket module); this import
+# location is an assumption
+from sendmsg import sendmsg, recvmsg, SCM_RIGHTS
+
+class Server(unix.Server):
+    def sendFileDescriptors(self, fds, data="Filler"):
+        """
+        @param fds: An iterable of the file descriptors to pass.
+        """
+        # the fds travel in the ancillary (out-of-band) part of a single
+        # sendmsg() call; 'data' is just filler for the normal band
+        payload = struct.pack("%di" % len(fds), *fds)
+        return sendmsg(self.fileno(), data, 0,
+                       (socket.SOL_SOCKET, SCM_RIGHTS, payload))
+
+class Client(unix.Client):
+    def doRead(self):
+        if not self.connected:
+            return
+        try:
+            msg, flags, ancillary = recvmsg(self.fileno())
+        except:
+            log.msg('recvmsg():')
+            log.err()
+        else:
+            # each fd arrives as a 4-byte int in the SCM_RIGHTS payload
+            buf = ancillary[0][2]
+            fds = []
+            while buf:
+                fd, buf = buf[:4], buf[4:]
+                fds.append(struct.unpack("i", fd)[0])
+            try:
+                self.protocol.fileDescriptorsReceived(fds)
+            except:
+                log.msg('protocol.fileDescriptorsReceived')
+                log.err()
+            return unix.Client.doRead(self)
+
+* implement AsyncDeferred returns
+
+dash wanted to implement a TransferrableReference object with a scheme that
+would require creating a new connection (to a third-party Broker) during
+ReferenceUnslicer.receiveClose. This would cause the object deserialization
+to be asynchronous.
+
+At the moment, Unslicers can return a Deferred from their receiveClose
+method. This is used by immutable containers (like tuples) to indicate that
+their object cannot be created yet. Other containers know to watch for these
+Deferreds and add a callback which will update their own entries
+appropriately. The implicit requirement is that all these Deferreds fire
+before the top-level parent object (usually a CallUnslicer) finishes. This
+allows for circular references involving immutable containers to be resolved
+into the final object graph before the target method is invoked.
+
+To accommodate Deferreds which will fire at arbitrary points in the future,
+it would be useful to create a marker subclass named AsyncDeferred. If an
+unslicer returns such an object, the container parent starts by treating it
+like a regular Deferred, but it also knows that its object is not
+"complete", and therefore returns an AsyncDeferred of its own. When the
+child completes, the parent can complete, etc. The difference between the
+two types: Deferred means that the object will be complete before the
+top-level parent is finished, while AsyncDeferred makes no such promise
+about when the object will be finished.
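+
+A sketch of the marker and the propagation rule (the parent-side flag name
+is made up):
+
+    from twisted.internet.defer import Deferred
+
+    class AsyncDeferred(Deferred):
+        """Marker: the object this fires with may not exist until some
+        arbitrary future time, so enclosing containers must themselves
+        return an AsyncDeferred from receiveClose."""
+
+    # inside a container Unslicer's receiveChild():
+    #   if isinstance(obj, AsyncDeferred):
+    #       self.isAsync = True   # receiveClose must now return an
+    #                             # AsyncDeferred rather than a Deferred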
+
+CallUnslicer would know that if any of its arguments are Deferreds or
+AsyncDeferreds then it needs to hold off on the broker.doCall until all those
+Deferreds have fired. Top-level objects are not required to differentiate
+between the two types, because they do not return an object to an enclosing
+parent (the CallUnslicer is a child of the RootUnslicer, but it always
+returns None).
+
+Other issues: we'll need a schema to let you say whether you'll accept these
+late-bound objects or not (because if you do accept them, you won't be able
+to impose the same sorts of type-checks as you would on immediate objects).
+Also this will impact the in-order-invocation promises of PB method calls,
+so we may need to implement the "it is ok to run this asynchronously" flag
+first, then require that TransferrableReference objects are only passed to
+methods with the flag set.
+
+Also, it may not be necessary to have a marker subclass of Deferred: perhaps
+_any_ Deferred which arrives from a child is an indication that the object
+will not be available until an unknown time in the future, and obligates the
+parent to return another Deferred upwards (even though their object could be
+created synchronously). Or, it might be better to implement this some other
+way, perhaps separating "here is my object" from "here is a Deferred that
+will fire when my object is complete", like a call to
+parent.addDependency(self.deferred) or something.
+
+DONE, needs testing
+
+* TransferrableReference
+
+class MyThing(pb.Referenceable): pass
+r1 = MyThing()
+r2 = Facet(r1)
+g1 = Global(r1)
+class MyGlobalThing(pb.GloballyReferenceable): pass
+g2 = MyGlobalThing()
+g3 = Facet(g2)
+
+broker.setLocation("pb://hostname.com:8044")
+
+rem.callRemote("m1", r1) # limited to just this connection
+rem.callRemote("m2", Global(r1)) # can be published
+g3 = Global(r1)
+rem.callRemote("m3", g1) # can also be published..
+g1.revoke() # but since we remember it, it can be revoked too
+g1.restrict() # and, as a Facet, we can revoke some functionality but not all
+
+rem.callRemote("m1", g2) # can be published
+
+E tarball: jsrc/net/captp/tables/NearGiftTable
+
+issues:
+ 1: when A sends a reference on B to C, C's messages to the object
+ referenced must arrive after any messages A sent before the reference forks
+
+ in particular, if A does:
+ B.callRemote("1", hugestring)
+ B.callRemote("2_makeYourSelfSecure", args)
+ C.callRemote("3_transfer", B)
+
+ and C does B.callRemote("4_breakIntoYou") as soon as it gets the reference,
+ then the A->B queue looks like (1, 2), and the A->C queue looks like (3).
+ The transfer message can be fast, and the resulting 4 message could be
+ delivered to B before the A->B queue manages to deliver 2.
+
+ 2: an object which gets passed through multiple external brokers and
+ eventually comes home must be recognized as a local object
+
+ 3: Copyables that contain RemoteReferences must be passable between hosts
+
+E cannot do all three of these at once
+http://www.erights.org/elib/distrib/captp/WormholeOp.html
+
+I think that it's ok to tell people who want this guarantee to explicitly
+serialize it like this:
+
+ B.callRemote("1", hugestring)
+ d = B.callRemote("2_makeYourSelfSecure", args)
+ d.addCallback(lambda res: C.callRemote("3_transfer", B))
+
+Note that E might not require that method calls even have a return value, so
+they might not have had a convenient way to express this enforced
+serialization.
+
+** more thoughts
+
+To enforce the partial-ordering, you could do the equivalent of:
+ A:
+ B.callRemote("1", hugestring)
+ B.callRemote("2_makeYourSelfSecure", args)
+ nonce = makeNonce()
+ B.callRemote("makeYourSelfAvailableAs", nonce)
+ C.callRemote("3_transfer", (nonce, B.name))
+ C:
+ B.callRemote("4_breakIntoYou")
+
+C uses the nonce when it connects to B. It knows the name of the reference,
+so it can compare it against some other reference to the same thing, but it
+can't actually use that name alone to get access.
+
+When the connection request arrives at B, it sees B.name (which is also
+unguessable), so that gives it reason to believe that it should queue C's
+request (that it isn't just a DoS attack). It queues it until it sees A's
+request to makeYourSelfAvailableAs with the matching nonce. Once that
+happens, it can provide the reference back to C.
+
+This implies that C won't be able to send *any* messages to B until that
+handshake has completed. It might be desirable to avoid the extra round-trip
+this would require.
+
+** more thoughts
+
+ url = PBServerFactory.registerReference(ref, name=None)
+ creates human-readable URLs or random identifiers
+
+the factory keeps a bidirectional mapping of names and Referenceables
+
+when a Referenceable gets serialized, if the factory's table doesn't have a
+name for it, the factory creates a random one. This entry in the table is
+kept alive by two things:
+
+ a live reference by one of the factory's Brokers
+ an entry in a Broker's "gift table"
+
+When a RemoteReference gets serialized (and it doesn't point back to the
+receiving Broker, and thus get turned into a your-reference sequence),
+
+<warner> A->C: "I'm going to send somebody a reference to you, incref your
+ gift table", C->A: roger that, here's a gift nonce
+<warner> A->B: "here's Carol's reference: URL plus nonce"
+<warner> B->C: "I want a liveref to your 'Carol' object, here's my ticket
+ (nonce)", C->B: "ok, ticket redeemed, here's your liveref"
+
+once more, without nonces:
+ A->C: "I'm going to send somebody a reference to you, incref your
+ gift table", C->A: roger that
+ A->B: "here's Carol's reference: URL"
+ B->C: "I want a liveref to your 'Carol' object", C->B: "ok, here's your
+ liveref"
+
+really:
+ on A: c.vat.callRemote("giftYourReference", c).addCallback(step2)
+ c is serialized as (your-reference, clid)
+ on C: vat.remote_giftYourReference(which): self.table[which] += 1; return
+ on A: step2: b.introduce(c)
+ c is serialized as (their-reference, url)
+ on B: deserialization sees their-reference
+ newvat = makeConnection(URL)
+ newvat.callRemote("redeemGift", URL).addCallback(step3)
+ on C: vat.remote_redeemGift(URL):
+ ref = self.urls[URL]; self.table[ref] -= 1; return ref
+ ref is serialized as (my-reference, clid)
+ on B: step3(c): b.remote_introduce(c)
+
+problem: if Alice sends a thousand copies, each of these 5 messages is sent
+a thousand times. The makeConnection is cached, but the rest are not. We
+don't remember that we've already made this gift before, and that the
+other end probably still has it. Hm, but we also don't know that they didn't
+lose it already.
+
+** ok, a plan:
+
+concern 1: objects must be kept alive as long as there is a RemoteReference
+to them.
+
+concern 2: we should be able to tell when an object is being sent for the
+first time, to add metadata (interface list, public URL) that would be
+expensive to add to every occurrence.
+
+ each (my-reference) sent over the wire increases the broker's refcount on
+ both ends.
+
+ the receiving Broker retains a weakref to the RemoteReference, and retains a
+ copy of the metadata necessary to create it in the clid table (basically the
+ entire contents of the RemoteReference). When the weakref expires, it marks
+ the clid entry as "pending-free", and sends a decref(clid,N) to the other
+ Broker. The decref is actually sent with broker.callRemote("decref", clid,
+ N), so it can be acked.
+
+ the sending broker gets the decref and reduces its count by N. If another
+ reference was sent recently, this count may not drop all the way to zero,
+ indicating there is a reference "in flight" and the far end should be ready
+ to deal with it (by making a new RemoteReference with the same properties as
+ the old one). If the count is still nonzero, it returns False to indicate
+ that this was not the last decref message for the clid. If the count
+ reaches zero, it returns True, since it is the
+ last decref, and removes the entry from its table. Once remote_decref
+ returns True, the clid is retired.
+
+ the receiving broker receives the ack from the decref. If the ack says
+ last==True, the clid table entry is freed. If it says last==False, then
+ there should have been another (my-reference) received before the ack, so
+ the refcount should be non-zero.
+
+ message sequence:
+
+ A-> : (my-reference clid metadata) [A.myrefs[clid].refcount++ = 1]
+ A-> : (my-reference clid) [A.myrefs[clid].refcount++ = 2]
+ ->B: receives my-ref, creates RR, B.yourrefs[clid].refcount++ = 1
+ ->B: receives my-ref, B.yourrefs[clid].refcount++ = 2
+ : time passes, B sees the reference go away
+ <-B: d=brokerA.callRemote("decref", clid, B.yourrefs[clid].refcount)
+ B.yourrefs[clid].refcount = 0; d.addCallback(B.checkref, clid)
+ A-> : (my-reference clid) [A.myrefs[clid].refcount++ = 3]
+ A<- : receives decref, A.myrefs[clid].refcount -= 2, now =1, returns False
+ ->B: receives my-ref, re-creates RR, B.yourrefs[clid].refcount++ = 1
+ ->B: receives ack(False), B.checkref asserts refcount != 0
+ : time passes, B sees the reference go away again
+ <-B: d=brokerA.callRemote("decref", clid, B.yourrefs[clid].refcount)
+ B.yourrefs[clid].refcount = 0; d.addCallback(B.checkref, clid)
+ A<- : receives decref, A.myrefs[clid].refcount -= 1, now =0, returns True
+ del A.myrefs[clid]
+ ->B: receives ack(True), B.checkref asserts refcount==0
+ del B.yourrefs[clid]
+
+B retains the RemoteReference data until it receives confirmation from A.
+Therefore whenever A sends a reference that doesn't already exist in the clid
+table, it is sending it to a B that doesn't know about that reference, so it
+needs to send the metadata.
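+
+A minimal sketch of that refcounting, with invented names (the real Broker
+API will differ):
+
+  class SendingSide:
+      def __init__(self):
+          self.myrefs = {}  # clid -> [refcount, metadata]
+
+      def slice_reference(self, clid, metadata):
+          # returns the banana sequence for one my-reference
+          if clid not in self.myrefs:
+              self.myrefs[clid] = [1, metadata]
+              return ("my-reference", clid, metadata)  # first time
+          self.myrefs[clid][0] += 1
+          return ("my-reference", clid)
+
+      def remote_decref(self, clid, N):
+          self.myrefs[clid][0] -= N
+          if self.myrefs[clid][0] == 0:
+              del self.myrefs[clid]  # the clid is now retired
+              return True   # that was the last decref
+          return False      # another reference is in flight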
+
+concern 3: in the three-party exchange, Carol must be kept alive until Bob
+has established a reference to her, even if Alice drops her carol-reference
+immediately after sending the introduction to Bob.
+
+(my-reference, clid, [interfaces, public URL])
+(your-reference, clid)
+(their-reference, URL)
+
+Serializing a their-reference causes an entry to be placed in the Broker's
+.theirrefs[URL] table. Each time a their-reference is sent, the entry's
+refcount is incremented.
+
+Receiving a their-reference may initiate a PB connection to the target,
+followed by a getNamedReference request. When this completes (or if the
+reference was already available), the recipient sends a decgift message to
+the sender. This message includes a count, so multiple instances of the same
+gift can be acked as a group.
+
+The .theirrefs entry retains a reference to the sender's RemoteReference, so
+it cannot go away until the gift is acked.
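+
+A similar sketch for the sender's side of the gift protocol (names invented
+again):
+
+  class GiftTable:
+      def __init__(self):
+          self.theirrefs = {}  # URL -> [count, rref_to_target]
+
+      def gift_sent(self, url, rref_to_target):
+          # called each time a their-reference for this URL is serialized;
+          # the entry keeps our RemoteReference to the target (e.g. Carol)
+          # alive until the recipient acks the gift
+          if url not in self.theirrefs:
+              self.theirrefs[url] = [0, rref_to_target]
+          self.theirrefs[url][0] += 1
+
+      def remote_decgift(self, url, count):
+          # the recipient acks 'count' copies of the gift at once
+          self.theirrefs[url][0] -= count
+          if self.theirrefs[url][0] == 0:
+              del self.theirrefs[url]  # release the target reference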
+
+DONE, gifts are implemented, we punted on partial-ordering
+
+*** security, DoS
+
+Bob can force Alice to hold on to a reference to Carol, as long as both
+connections are open, by never acknowledging the gift.
+
+Alice can cause Bob to open up TCP connections to arbitrary hosts and ports,
+by sending third-party references to him, although the only protocol those
+connections will speak is PB.
+
+Using yURLs and StartTLS should be enough to secure and authenticate the
+connections.
+
+*** partial-ordering
+
+If we need it, the gift (their-reference message) can include a nonce, Alice
+sends a makeYourSelfAvailableAs message to Carol with the nonce, and Bob must
+do a new getReference with the nonce.
+
+Kragen came up with a good use-case for partial-ordering (C's call must not
+overtake A's update on its way to B):
+ A:
+ B.callRemote("updateDocument", bigDocument)
+ C.callRemote("pleaseReviewLatest", B)
+ C:
+ B.callRemote("getLatestDocument")
+
+
+* PBService / Tub
+
+Really, PB wants to be a Service, since third-party references mean it will
+need to make connections to arbitrary targets, and it may want to re-use
+those connections.
+
+ s = pb.PBService()
+ s.listenOn(strport) # provides URL base
+ swissURL = s.registerReference(ref) # creates unguessable name
+ publicURL = s.registerReference(ref, "name") # human-readable name
+ s.unregister(URL) # also revokes all clids
+ s.unregisterReference(ref)
+ d = s.getReference(URL) # Deferred which fires with the RemoteReference
+ d = s.shutdown() # close all servers and client connections
+
+DONE, this makes things quite clean
+
+* promise pipelining
+
+Even without third-party references, we can do E-style promise pipelining.
+
+<warner> hmm. subclass of Deferred that represents a Promise, can be
+ serialized if it's being sent to the same broker as the RemoteReference it was
+ generated for
+<dash> warner: hmmm. how's that help us?
+<dash> oh, pipelining?
+<warner> maybe a flag on the callRemote to say that "yeah, I want a
+ DeferredPromise out of you, but I'm only going to include it as an argument to
+ another method call I'm sending you, so don't bother sending *me* the result"
+<dash> aah
+<dash> yeah
+<dash> that sounds like a reasonable approach
+<warner> that would actually work
+<warner> dash: do you know if E makes any attempt to handle >2 vats in their
+ pipelining implementation? seems to me it could turn into a large network
+ optimization problem pretty quickly
+<dash> warner: Mmm
+<warner> hmm
+<dash> I do not think you have to
+<warner> so you have: t1=a.callRemote("foo",args1);
+ t2=t1.callRemote("bar",args2), where callRemote returns a Promise, which is a
+ special kind of Deferred that remembers the Broker its answer will eventually
+ come from. If args2 consists of entirely immediate things (no Promises) or
+ Promises that are coming from the same broker as t1 uses, then the "bar" call
+ is eligible for pipelining and gets sent to the remote broker
+<warner> in the resulting newpb banana sequence, the clid of the target method
+ is replaced by another kind of clid, which means "the answer you're going to
+ send to method call #N", where N comes from t1
+<dash> mmm yep
+<warner> using that new I-can't-unserialize-this-yet hook we added, the second
+ call sequence doesn't finish unserializing until the first call finishes and
+ sends the answer. Sending answer #N fires the hook's deferred.
+<warner> that triggers the invocation of the second method
+<dash> yay
+<warner> hm, of course that totally blows away the idea of using a Constraint
+ on the arguments to the second method
+<warner> because you don't even know what the object is until after the
+ arguments have arrived
+<warner> but
+<dash> well
+<warner> the first method has a schema, which includes a return constraint
+<dash> okay you can't fail synchronously
+<warner> so you *can* assert that, whatever the object will be, it obeys that
+ constraint
+<dash> but you can return a failure like everybody else
+<warner> and since the constraint specifies an Interface, then the Interface
+ plus method name is enough to come up with an argument constraint
+<warner> so you can still enforce one
+<warner> this is kind of cool
+<dash> the big advantage of pipelining is that you can have a lot of
+ composable primitives on your remote interfaces rather than having to smush
+ them together into things that are efficient to call remotely
+<warner> hm, yeah, as long as all the arguments are either immediate or
+ reference something on the recipient
+<warner> as soon as a third party enters the equation, you have to decide
+ whether to wait for the arguments to resolve locally or if it might be faster
+ to throw them at someone else
+<warner> that's where the network-optimization thing I mentioned before comes
+ into play
+<dash> mmm
+<warner> you send messages to A and to B, once you get both results you want
+ to send the pair to C to do something with them
+<dash> spin me an example scenario
+<dash> Hmm
+<warner> if all three are close to each other, and you're far from all of
+ them, it makes more sense to tell C about A and B
+<dash> how _does_ E handle that
+<warner> or maybe tell A and B about C, tell them "when you get done, send
+ your results to C, who will be waiting for them"
+<dash> warner: yeah, i think that the right thing to do is to wait for them to
+ resolve locally
+<Tv> assuming that C can talk to A and B is bad
+<dash> no it isn't
+<Tv> well, depends on whether you live in this world or not :)
+<dash> warner: if you want other behaviour then you should have to set it up
+ explicitly, i think
+<warner> I'm not even sure how you would describe that sort of thing. It'd be
+ like routing protocols, you assign a cost to each link and hope some magical
+ omniscient entity can pick an optimal solution
+
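+A rough sketch of the Promise idea from that conversation (Promise and
+sendPipelinedCall are invented names; nothing like this exists yet):
+
+  from twisted.internet import defer
+
+  class Promise(defer.Deferred):
+      # a Deferred that remembers which Broker its answer will come
+      # from, so that further calls on it can be pipelined
+      def __init__(self, broker, reqID):
+          defer.Deferred.__init__(self)
+          self.broker = broker  # where the answer will come from
+          self.reqID = reqID    # "the answer to method call #N"
+
+      def callRemote(self, methname, *args, **kwargs):
+          # pipelining is only possible if every Promise argument comes
+          # from the same broker we do; a real implementation would fall
+          # back to resolving such arguments locally first
+          for a in list(args) + list(kwargs.values()):
+              if isinstance(a, Promise) and a.broker is not self.broker:
+                  raise NotImplementedError("third-party promise")
+          return self.broker.sendPipelinedCall(self.reqID, methname,
+                                               args, kwargs)
+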
+** revealing intentions
+
+<zooko> Now suppose I say "B.your_fired(C.revoke_his_rights())", or such.
+<warner> A->C: sell all my stock. A->B: declare bankruptcy
+
+If B has access to C, and the promises are pipelined, then B has a window
+during which they know something's about to happen, and they still have full
+access to C, so they can do evil.
+
+Zooko tried to explain the concern to MarkM years ago, but didn't have a
+clear example of the problem. The thing is, B can do evil all the time,
+you're just trying to revoke their capability *before* they get wind of your
+intentions. Keeping intentions secret is hard, much harder than limiting
+someone's capabilities. It's kind of the trailing edge of the capability, as
+opposed to the leading edge.
+
+Zooko feels the language needs clear support for expressing how the
+synchronization needs to take place, and which domain it needs to happen in.
+
+* web-calculus integration
+
+Tyler pointed out that it is vital for a node to be able to grant limited
+access to some held object. Specifically, Alice may want to give Bob a
+reference not to Carol as a whole, but to just a specific Carol.remote_foo
+method (and not to any other methods that Alice might be allowed to invoke).
+I had been thinking of using RemoteInterfaces to indicate method subsets,
+something like this:
+
+ bob.callRemote("introduce", Facet(self, RIMinimal))
+
+but Tyler thinks that this is too coarse-grained and not likely to encourage
+the right kinds of security decisions. In his web-calculus, recipients can
+grant third-parties access to individual bound methods.
+
+ bob.callRemote("introduce", carol.getMethod("howdy"))
+
+If I understand it correctly, his approach turns a Referenceable into a
+copy-by-value object, represented by a dictionary which maps method
+names to these RemoteMethod objects, so there is no actual
+callRemote(methname) method. Instead you do something like:
+
+ rr = tub.getReference(url)
+ d = rr['introduce'].call(args)
+
+These RemoteMethod objects are top-level, so unguessable URLs must be
+generated for them when they are sent, and they must be reference-counted. It
+must not be possible to get from the bound method to the (unrestricted)
+referenced object.
+
+TODO: how does the web-calculus maintain reference counts for these? It feels
+like there would be an awful lot of messages being thrown around.
+
+To implement this, we'll need:
+
+ banana sequences for bound methods
+ ('my-method', clid, url)
+ ('your-method', clid)
+ ('their-method', url, RI+methname?)
+ syntax to carve a single method out of a local Referenceable
+ A: self.doFoo (only if we get rid of remote_)
+ B: self.remote_doFoo
+ C: self.getMethod("doFoo")
+ D: self.getMethod(RIFoo['doFoo'])
+ leaning towards C or D
+ syntax to carve a single method out of a RemoteReference
+ A: rr.doFoo
+ B: rr.getMethod('doFoo')
+ C: rr.getMethod(RIFoo['doFoo'])
+ D: rr['doFoo']
+ E: rr[RIFoo['doFoo']]
+ leaning towards B or C
+ decide whether to do getMethod early or late
+ early means ('my-reference') includes a big dict of my-method values
+ and a whole bunch of DECREFs when that dict goes away
+ late means there is a remote_tub.getMethod(your-ref, methname) call
+ and an extra round-trip to retrieve them
+ dash thinks late is better
+
+We could say that the 'my-reference' sequence for any RemoteInterface-enabled
+Referenceable will include a dictionary of bound methods. The receiving end
+will just stash the whole thing.
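+
+In the meantime, a bound-method facet can be approximated by hand with a
+small wrapper Referenceable (a sketch, not the eventual my-method protocol):
+
+  from foolscap import Referenceable
+
+  class BoundMethodFacet(Referenceable):
+      # expose exactly one method of a target object; holding this
+      # facet grants no access to the rest of the target
+      def __init__(self, target, methname):
+          self._method = getattr(target, "remote_" + methname)
+      def remote_call(self, *args, **kwargs):
+          return self._method(*args, **kwargs)
+
+  # bob gets access to carol's 'howdy' method and nothing else:
+  #  bob.callRemote("introduce", BoundMethodFacet(carol, "howdy"))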
+
+* do implicit "doFoo" -> RIFoo["doFoo"] conversion
+
+I want rr.callRemote("doFoo", args) to take advantage of a RemoteInterface,
+if one is available. RemoteInterfaces aren't supposed to be overlapping (at
+least not among RemoteInterfaces that are shared by a single Referenceable),
+so there shouldn't be any ambiguity. If there is, we can raise an error.
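+
+A sketch of that lookup, assuming the Referenceable knows the list of
+RemoteInterfaces it implements (AmbiguousMethodError is invented):
+
+  class AmbiguousMethodError(Exception):
+      pass
+
+  def findMethodSchema(interfaces, methname):
+      # RIFoo["doFoo"] works because RemoteInterfaces are zope Interfaces
+      matches = [ri for ri in interfaces if methname in ri]
+      if len(matches) > 1:
+          raise AmbiguousMethodError(methname)
+      if matches:
+          return matches[0][methname]
+      return None  # no RemoteInterface claims it: leave unconstrained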
+
+* accept Deferreds as arguments?
+
+ bob.callRemote("introduce", target=self.tub.getReference(pburl))
+ or
+ bob.callRemote("introduce", carol.getMethod("doFoo"))
+ instead of
+ carol.getMethod("doFoo").addCallback(lambda r: bob.callRemote("introduce", r))
+
+If one of the top-level arguments to callRemote is a Deferred, don't send the
+method request until all the arguments resolve. If any of the arguments
+errback, the callRemote will fail with some new exception (that can contain a
+reference to the argument's exception).
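+
+A sketch of what this could look like as a wrapper function (gatherResults
+errbacks with defer.FirstError, which wraps the failing argument's
+exception):
+
+  from twisted.internet import defer
+
+  def callRemoteWhenReady(rref, methname, *args):
+      # resolve any Deferred arguments before sending the call
+      ds = []
+      for a in args:
+          if not isinstance(a, defer.Deferred):
+              a = defer.succeed(a)
+          ds.append(a)
+      d = defer.gatherResults(ds)
+      d.addCallback(lambda resolved: rref.callRemote(methname, *resolved))
+      return d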
+
+however, this would mean the method would be invoked out-of-order w.r.t. an
+immediately-following bob.callRemote
+
+put this off until we get some actual experience.
+
+* batch decrefs?
+
+If we implement the copy-by-value Referenceable idea, then a single gc may
+result in dozens of simultaneous decrefs. It would be nice to reduce the
+traffic generated by that.
+
+* promise pipelining
+
+Promise(Deferred).__getattr__
+
+DoS prevention techniques in CapIDL (MarkM)
+
+pb://key@ip,host,[ipv6],localhost,[/unix]/swissnumber
+tubs for lifetime management
+separate listener object, share tubs between listeners
+ distinguish by key number
+
+ actually, why bother with separate keys? Why allow the outside world to
+ distinguish between these sub-Tubs? Use them purely for lifetime management,
+ not security properties. That means a name->published-object table for each
+ SubTub, maybe a hierarchy of them, and the parent-most Tub gets the
+ Listeners. Incoming getReferenceByURL requests require a lookup in all Tubs
+ that descend from the one attached to that listener.
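+
+ A sketch of that descending lookup (subTubs and the rest are invented
+ names):
+
+   def getReferenceByName(listener_tub, name):
+       # breadth-first search of the Tub attached to the Listener and
+       # all sub-Tubs that descend from it
+       tubs = [listener_tub]
+       while tubs:
+           tub = tubs.pop(0)
+           if name in tub.nameToReference:
+               return tub.nameToReference[name]
+           tubs.extend(tub.subTubs)
+       raise KeyError(name)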
+
+So one decision is whether implicitly-published objects should have a name
+that lasts forever (well, until the Tub is destroyed), or whether they
+should be reference-counted. If they are, then outstanding Gifts need
+to maintain a reference, and the gift must be turned into a live
+RemoteReference right away. It has bearing on how/if we implement SturdyRefs,
+so I need to read more about them in the E docs.
+
+Hrm, and creating new Tubs from within a remote_foo method.. to make that
+useful, you'd need to have a way to ask for the Tub through which you were
+being invoked. hrm.
+
+* creating new Tubs
+
+Tyler suggests using Tubs for namespace management. Tubs can share TCP
+listening ports, but MarkS recommends giving them all separate keys (which
+means separate SSL sessions, so separate TCP connections). Bill Frantz
+discourages using a hierarchy of Tubs, says it's not the sort of thing you
+want to be locked into.
+
+That means I'll need a separate Listener object, where the rule is that the
+last Tub to be stopped makes the Listener stop too.. probably abuse the
+Service interface in some wacky way to pull this off.
+
+Creating a new Tub.. how to conveniently create it with the same Listeners as
+the current one? If the method that's creating the Tub is receiving a
+reference, the Tub can be an attribute of the inbound RemoteReference. If
+not, that's trickier.. the _tub= argument may still be a useful way to go.
+Once you've got a source tub, then tub.newTub() should create a new one with
+the same Listeners as the source (but otherwise unassociated with it).
+
+Once you have the new Tub, registering an object in it should return
+something that can be directly serialized into a gift.
+
+class Target(pb.Referenceable):
+ def remote_startGame(self, player_black, player_white):
+ tub = player_black.tub.newTub()
+ game = self.createGame()
+ gameref = tub.register(game)
+ game.setPlayer("black", tub.something(player_black))
+ game.setPlayer("white", tub.something(player_white))
+ return gameref
+
+Hmm. So, create a SturdyRef class, which remembers the tubid (key), list of
+location hints, and object name. These have a url() method that renders out a
+URL string, and a compare method which compares the tubid and object name but
+ignores the location hints. Serializing a SturdyRef creates a their-reference
+sequence. Tub.register takes an object (and maybe a name) and returns a
+SturdyRef. Tub.getReference takes either a URL or a SturdyRef.
+RemoteReferences should have a .getSturdyRef method.
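+
+A sketch of that class (URL parsing and validation omitted):
+
+  class SturdyRef:
+      def __init__(self, tubID, locationHints, name):
+          self.tubID = tubID
+          self.locationHints = locationHints  # not part of identity
+          self.name = name
+
+      def url(self):
+          return "pb://%s@%s/%s" % (self.tubID,
+                                    ",".join(self.locationHints),
+                                    self.name)
+
+      def __cmp__(self, other):
+          # compare tubid and object name, ignore the location hints
+          return cmp((self.tubID, self.name),
+                     (other.tubID, other.name))
+
+      def __hash__(self):
+          return hash((self.tubID, self.name))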
+
+Actually, I think SturdyRefs should be serialized as Copyables, and create
+SturdyRefs on the other side. The new-tub sequence should be:
+
+ create new tub, using the Listener from an existing tub
+ register the objects in the new tub, obtaining a SturdyRef
+ send/return SendLiveRef(sturdyref) to the far side
+ SendLiveRef is a wrapper that causes a their-reference sequence to be sent.
+ The alternative is to obtain an actual live reference (via
+ player_black.tub.getReference(sturdyref) first), then send that, but it's
+ kind of a waste if you don't actually want to use the liveref yourself.
+
+Note that it becomes necessary to provide for local references here: ones in
+different Tubs which happen to share a Listener. These can use real TCP
+connections (unless the Listener hint is only valid from the outside world).
+It might be possible to use some tricks to cut out some of the network overhead,
+but I suspect there are reasons why you wouldn't actually want to do that.
"""Foolscap"""
-__version__ = "0.1.4"
+__version__ = "0.1.5"
# here are the primary entry points
from foolscap.pb import Tub, UnauthenticatedTub, getRemoteURL_TCP
def receiveChild(self, token, ready_deferred):
if isinstance(token, call.InboundDelivery):
- assert ready_deferred is None
- self.broker.scheduleCall(token)
+ self.broker.scheduleCall(token, ready_deferred)
self.disconnectWatchers = []
# receiving side uses these
self.inboundDeliveryQueue = []
+ self._call_is_running = False
self.activeLocalCalls = {} # the other side wants an answer from us
def setTub(self, tub):
return m
return None
- def scheduleCall(self, delivery):
- self.inboundDeliveryQueue.append(delivery)
+ def scheduleCall(self, delivery, ready_deferred):
+ self.inboundDeliveryQueue.append( (delivery,ready_deferred) )
eventually(self.doNextCall)
- def doNextCall(self, ignored=None):
- if not self.inboundDeliveryQueue:
+ def doNextCall(self):
+ if self._call_is_running:
return
- nextCall = self.inboundDeliveryQueue[0]
- if nextCall.isRunnable():
- # remove it and arrange to run again soon
- self.inboundDeliveryQueue.pop(0)
- delivery = nextCall
- if self.inboundDeliveryQueue:
- eventually(self.doNextCall)
-
- # now perform the actual delivery
- d = defer.maybeDeferred(self._doCall, delivery)
- d.addCallback(self._callFinished, delivery)
- d.addErrback(self.callFailed, delivery.reqID, delivery)
+ if not self.inboundDeliveryQueue:
return
- # arrange to wake up when the next call becomes runnable
- d = nextCall.whenRunnable()
- d.addCallback(self.doNextCall)
+ delivery, ready_deferred = self.inboundDeliveryQueue.pop(0)
+ self._call_is_running = True
+ if not ready_deferred:
+ ready_deferred = defer.succeed(None)
+ d = ready_deferred
+ d.addCallback(lambda res: self._doCall(delivery))
+ d.addCallback(self._callFinished, delivery)
+ d.addErrback(self.callFailed, delivery.reqID, delivery)
+ def _done(res):
+ self._call_is_running = False
+ eventually(self.doNextCall)
+ d.addBoth(_done)
+ return None
def _doCall(self, delivery):
obj = delivery.obj
- assert delivery.allargs.isReady()
args = delivery.allargs.args
kwargs = delivery.allargs.kwargs
for i in args + kwargs.values():
from twisted.internet import defer
from foolscap import copyable, slicer, tokens
-from foolscap.eventual import eventually
from foolscap.copyable import AttributeDictConstraint
from foolscap.constraint import ByteStringConstraint
from foolscap.slicers.list import ListConstraint
from tokens import BananaError, Violation
+from foolscap.util import AsyncAND
class FailureConstraint(AttributeDictConstraint):
self.methodname = methodname
self.methodSchema = methodSchema
self.allargs = allargs
- if allargs.isReady():
- self.runnable = True
- self.runnable = False
-
- def isRunnable(self):
- if self.allargs.isReady():
- return True
- return False
-
- def whenRunnable(self):
- if self.allargs.isReady():
- return defer.succeed(self)
- d = self.allargs.whenReady()
- d.addCallback(lambda res: self)
- return d
def logFailure(self, f):
# called if tub.logLocalFailures is True
self.argname = None
self.argConstraint = None
self.num_unreferenceable_children = 0
- self.num_unready_children = 0
+ self._all_children_are_referenceable_d = None
+ self._ready_deferreds = []
self.closed = False
def checkToken(self, typebyte, size):
if self.debug:
log.msg("%s.receiveChild: %s %s %s %s %s args=%s kwargs=%s" %
(self, self.closed, self.num_unreferenceable_children,
- self.num_unready_children, token, ready_deferred,
+ len(self._ready_deferreds), token, ready_deferred,
self.args, self.kwargs))
if self.numargs is None:
# this token is the number of positional arguments
# resolved yet.
self.num_unreferenceable_children += 1
argvalue.addCallback(self.updateChild, argpos)
- argvalue.addErrback(self.explode)
if ready_deferred:
if self.debug:
log.msg("%s.receiveChild got an unready posarg" % self)
- self.num_unready_children += 1
- ready_deferred.addCallback(self.childReady)
+ self._ready_deferreds.append(ready_deferred)
if len(self.args) < self.numargs:
# more to come
ms = self.methodSchema
if self.argname is None:
# this token is the name of a keyword argument
+ assert ready_deferred is None
self.argname = token
# if the argname is invalid, this may raise Violation
ms = self.methodSchema
if isinstance(argvalue, defer.Deferred):
self.num_unreferenceable_children += 1
argvalue.addCallback(self.updateChild, self.argname)
- argvalue.addErrback(self.explode)
if ready_deferred:
if self.debug:
log.msg("%s.receiveChild got an unready kwarg" % self)
- self.num_unready_children += 1
- ready_deferred.addCallback(self.childReady)
+ self._ready_deferreds.append(ready_deferred)
self.argname = None
return
else:
self.kwargs[which] = obj
self.num_unreferenceable_children -= 1
- self.checkComplete()
+ if self.num_unreferenceable_children == 0:
+ if self._all_children_are_referenceable_d:
+ self._all_children_are_referenceable_d.callback(None)
return obj
- def childReady(self, obj):
- self.num_unready_children -= 1
- if self.debug:
- log.msg("%s.childReady, now %d left" %
- (self, self.num_unready_children))
- log.msg(" obj=%s, args=%s, kwargs=%s" %
- (obj, self.args, self.kwargs))
- self.checkComplete()
- return obj
-
- def checkComplete(self):
- # this is called each time one of our children gets updated or
- # becomes ready (like when a Gift is finally resolved)
- if self.debug:
- log.msg("%s.checkComplete: %s %s %s args=%s kwargs=%s" %
- (self, self.closed, self.num_unreferenceable_children,
- self.num_unready_children, self.args, self.kwargs))
-
- if not self.closed:
- return
- if self.num_unreferenceable_children:
- return
- if self.num_unready_children:
- return
- # yup, we're done. Notify anyone who is still waiting
- if self.debug:
- log.msg(" we are ready")
- for d in self.watchers:
- eventually(d.callback, self)
- del self.watchers
def receiveClose(self):
if self.debug:
log.msg("%s.receiveClose: %s %s %s" %
(self, self.closed, self.num_unreferenceable_children,
- self.num_unready_children))
+ len(self._ready_deferreds)))
if (self.numargs is None or
len(self.args) < self.numargs or
self.argname is not None):
raise BananaError("'arguments' sequence ended too early")
self.closed = True
- self.watchers = []
- # we don't return a ready_deferred. Instead, the InboundDelivery
- # object queries our isReady() method directly.
- return self, None
-
- def isReady(self):
- assert self.closed
+ dl = []
if self.num_unreferenceable_children:
- return False
- if self.num_unready_children:
- return False
- return True
-
- def whenReady(self):
- assert self.closed
- if self.isReady():
- return defer.succeed(self)
- d = defer.Deferred()
- self.watchers.append(d)
- return d
+ d = self._all_children_are_referenceable_d = defer.Deferred()
+ dl.append(d)
+ dl.extend(self._ready_deferreds)
+ ready_deferred = None
+ if dl:
+ ready_deferred = AsyncAND(dl)
+ return self, ready_deferred
def describe(self):
s = "<arguments"
else:
s += " arg[?]"
if self.closed:
- if self.isReady():
- # waiting to be delivered
- s += " ready"
- else:
- s += " waiting"
+ s += " closed"
+ # TODO: it would be nice to indicate if we still have unready
+ # children
s += ">"
return s
self.interface = None
self.methodname = None
self.methodSchema = None # will be a MethodArgumentsConstraint
+ self._ready_deferreds = []
def checkToken(self, typebyte, size):
# TODO: limit strings by returning a number instead of None
def receiveChild(self, token, ready_deferred=None):
assert not isinstance(token, defer.Deferred)
- assert ready_deferred is None
if self.debug:
log.msg("%s.receiveChild [s%d]: %s" %
(self, self.stage, repr(token)))
if self.stage == 0: # reqID
# we don't yet know which reqID to send any failure to
+ assert ready_deferred is None
self.reqID = token
self.stage = 1
if self.reqID != 0:
if self.stage == 1: # objID
# this might raise an exception if objID is invalid
+ assert ready_deferred is None
self.objID = token
self.obj = self.broker.getMyReferenceByCLID(token)
#iface = self.broker.getRemoteInterfaceByName(token)
# class). If this expectation were to go away, a quick
# obj.__class__ -> RemoteReferenceSchema cache could be built.
+ assert ready_deferred is None
self.stage = 3
if self.objID < 0:
# queue the message. It will not be executed until all the
# arguments are ready. The .args list and .kwargs dict may change
# before then.
+ if ready_deferred:
+ self._ready_deferreds.append(ready_deferred)
self.stage = 4
return
self.interface, self.methodname,
self.methodSchema,
self.allargs)
- return delivery, None
+ ready_deferred = None
+ if self._ready_deferreds:
+ ready_deferred = AsyncAND(self._ready_deferreds)
+ return delivery, ready_deferred
def describe(self):
s = "<methodcall"
resultConstraint = None
haveResults = False
+ def start(self, count):
+ slicer.ScopedUnslicer.start(self, count)
+ self._ready_deferreds = []
+ self._child_deferred = None
+
def checkToken(self, typebyte, size):
if self.request is None:
if typebyte != tokens.INT:
return unslicer
def receiveChild(self, token, ready_deferred=None):
- assert not isinstance(token, defer.Deferred)
- assert ready_deferred is None
if self.request == None:
+ assert not isinstance(token, defer.Deferred)
+ assert ready_deferred is None
reqID = token
# may raise Violation for bad reqIDs
self.request = self.broker.getRequest(reqID)
self.resultConstraint = self.request.constraint
else:
- self.results = token
+ if isinstance(token, defer.Deferred):
+ self._child_deferred = token
+ else:
+ self._child_deferred = defer.succeed(token)
+ if ready_deferred:
+ self._ready_deferreds.append(ready_deferred)
self.haveResults = True
def reportViolation(self, f):
return f # give up our sequence
def receiveClose(self):
- self.request.complete(self.results)
+ # three things must happen before our request is complete:
+ # receiveClose has occurred
+ # the receiveChild object deferred (if any) has fired
+ # ready_deferred has finished
+ # If ready_deferred errbacks, provide its failure object to the
+ # request. If not, provide the request with whatever receiveChild
+ # got.
+
+ if not self._child_deferred:
+ raise BananaError("Answer didn't include an answer")
+
+ if self._ready_deferreds:
+ d = AsyncAND(self._ready_deferreds)
+ else:
+ d = defer.succeed(None)
+
+ def _ready(res):
+ return self._child_deferred
+ d.addCallback(_ready)
+
+ def _done(res):
+ self.request.complete(res)
+ def _fail(f):
+ self.request.fail(f)
+ d.addCallbacks(_done, _fail)
+
return None, None
def describe(self):
self.frames = []
self.stack = []
+ # MAYBE: for native exception types, be willing to wire up a
+ # reference to the real exception class. For other exception types,
+ # our .type attribute will be a string, which (from a Failure's point
+ # of view) looks as if someone raised an old-style string exception.
+ # This is here so that trial will properly render a CopiedFailure
+ # that comes out of a test case (since it unconditionally does
+ # reflect.qual(f.type))
+
+ # ACTUALLY: replace self.type with a class that looks a lot like the
+ # original exception class (meaning that reflect.qual() will return
+ # the same string for this as for the original). If someone calls our
+ # .trap method, resulting in a new Failure with contents copied from
+ # this one, then the new Failure.printTraceback will attempt to use
+ # reflect.qual() on our self.type, so it needs to be a class instead
+ # of a string.
+
+ assert isinstance(self.type, str)
+ typepieces = self.type.split(".")
+ class ExceptionLikeString:
+ pass
+ self.type = ExceptionLikeString
+ self.type.__module__ = ".".join(typepieces[:-1])
+ self.type.__name__ = typepieces[-1]
+
def __str__(self):
return "[CopiedFailure instance: %s]" % self.getBriefTraceback()
file.write(self.traceback)
copyable.registerRemoteCopy(FailureSlicer.classname, CopiedFailure)
+
+class CopiedFailureSlicer(FailureSlicer):
+ # A calls B. B calls C. C fails and sends a Failure to B. B gets a
+ # CopiedFailure and sends it to A. A should get a CopiedFailure too. This
+ # class lives on B and slices the CopiedFailure as it is sent to A.
+ slices = CopiedFailure
+
+ def getStateToCopy(self, obj, broker):
+ state = {}
+ for k in ('value', 'type', 'parents'):
+ state[k] = getattr(obj, k)
+ if broker.unsafeTracebacks:
+ state['traceback'] = obj.traceback
+ else:
+ state['traceback'] = "Traceback unavailable\n"
+ if not isinstance(state['type'], str):
+ state['type'] = reflect.qual(state['type']) # Exception class
+ return state
notifyOnDisconnect handlers are cancelled.
"""
+ def dontNotifyOnDisconnect(cookie):
+ """Deregister a callback that was registered with notifyOnDisconnect.
+ """
+
def callRemote(name, *args, **kwargs):
"""Invoke a method on the remote object with which I am associated.
self.nameToReference = weakref.WeakValueDictionary()
self.referenceToName = weakref.WeakKeyDictionary()
self.strongReferences = []
+ self.nameLookupHandlers = []
+
# remote stuff. Most of these use a TubRef (or NoAuthTubRef) as a
# dictionary key
self.tubConnectors = {} # maps TubRef to a TubConnector
return name
def getReferenceForName(self, name):
- return self.nameToReference[name]
+ if name in self.nameToReference:
+ return self.nameToReference[name]
+ for lookup in self.nameLookupHandlers:
+ ref = lookup(name)
+ if ref:
+ if ref not in self.referenceToName:
+ self.referenceToName[ref] = name
+ return ref
+ raise KeyError("unable to find reference for name '%s'" % (name,))
def getReferenceForURL(self, url):
# TODO: who should this be used by?
self.strongReferences.remove(ref)
self.revokeReference(ref)
+ def registerNameLookupHandler(self, lookup):
+ """Add a function to help convert names to Referenceables.
+
+ When remote systems pass a FURL to their Tub.getReference(), our Tub
+ will be asked to locate a Referenceable for the name inside that
+ furl. The normal mechanism for this is to look at the table
+ maintained by registerReference() and unregisterReference(). If the
+ name does not exist in that table, other 'lookup handler' functions
+ are given a chance. Each lookup handler is asked in turn, and the
+ first which returns a non-None value wins.
+
+ This may be useful for cases where the furl represents an object that
+ lives on disk, or is generated on demand: rather than creating all
+ possible Referenceables at startup, the lookup handler can create or
+ retrieve the objects only when someone asks for them.
+
+ Note that constructing the FURLs of these objects may be non-trivial.
+ It is safe to create an object, use tub.registerReference in one
+ invocation of a program to obtain (and publish) the furl, parse the
+ furl to extract the name, save the contents of the object on disk,
+ then in a later invocation of the program use a lookup handler to
+ retrieve the object from disk. This approach means the objects that
+ are created in a given invocation stick around (inside
+ tub.strongReferences) for the rest of that invocation. An alternative
+ approach is to create the object but *not* use tub.registerReference,
+ but in that case you have to construct the FURL yourself, and the Tub
+ does not currently provide any support for doing this robustly.
+
+ @param lookup: a callable which accepts a name (as a string) and
+ returns either a Referenceable or None. Note that
+ these strings should not contain a slash, a question
+ mark, or an ampersand, as these are reserved in the
+ FURL for later expansion (to add parameters beyond the
+ object name)
+ """
+ self.nameLookupHandlers.append(lookup)
+
+ def unregisterNameLookupHandler(self, lookup):
+ self.nameLookupHandlers.remove(lookup)
+
def getReference(self, sturdyOrURL):
"""Acquire a RemoteReference for the given SturdyRef/URL.
from twisted.python.components import registerAdapter
Interface = interface.Interface
from twisted.internet import defer
-from twisted.python import failure
+from twisted.python import failure, log
from foolscap import ipb, slicer, tokens, call
BananaError = tokens.BananaError
getRemoteInterfaceByName, RemoteInterfaceConstraint
from foolscap.schema import constraintMap
from foolscap.copyable import Copyable, RemoteCopy
-from foolscap.eventual import eventually
+from foolscap.eventual import eventually, fireEventually
class OnlyReferenceable(object):
implements(ipb.IReferenceable)
methodSchema = None
return interfaceName, methodName, methodSchema
+class LocalReferenceable:
+ implements(ipb.IRemoteReference)
+ def __init__(self, original):
+ self.original = original
+
+ def notifyOnDisconnect(self, callback, *args, **kwargs):
+ # local objects never disconnect
+ return None
+ def dontNotifyOnDisconnect(self, marker):
+ pass
+
+ def callRemote(self, methname, *args, **kwargs):
+ def _try(ignored):
+ meth = getattr(self.original, "remote_" + methname)
+ return meth(*args, **kwargs)
+ d = fireEventually()
+ d.addCallback(_try)
+ return d
+
+ def callRemoteOnly(self, methname, *args, **kwargs):
+ d = self.callRemote(methname, *args, **kwargs)
+ d.addErrback(lambda f: None)
+ return None
+
+registerAdapter(LocalReferenceable, ipb.IReferenceable, ipb.IRemoteReference)
+
+
class YourReferenceSlicer(slicer.BaseSlicer):
"""I handle pb.RemoteReference objects (being sent back home to the
# but the message delivery must still wait for the getReference to
# complete. See to it that we fire the object deferred before we fire
# the ready_deferred.
- obj_deferred, ready_deferred = defer.Deferred(), defer.Deferred()
+
+ obj_deferred = defer.Deferred()
+ ready_deferred = defer.Deferred()
+
def _ready(rref):
obj_deferred.callback(rref)
ready_deferred.callback(rref)
- d.addCallback(_ready)
+ def _failed(f):
+ # if an error in getReference() occurs, log it locally (with
+ # priority UNUSUAL), because this end might need to diagnose some
+ # connection or networking problems.
+ log.msg("gift (%s) failed to resolve: %s" % (self.url, f))
+ # deliver a placeholder object to the container, but signal the
+ # ready_deferred that we've failed. This will bubble up to the
+ # enclosing InboundDelivery, and when it gets to the top of the
+ # queue, it will be flunked.
+ obj_deferred.callback("Place holder for a Gift which failed to "
+ "resolve: %s" % f)
+ ready_deferred.errback(f)
+ d.addCallbacks(_ready, _failed)
return obj_deferred, ready_deferred
# -*- test-case-name: foolscap.test.test_banana -*-
from twisted.python.components import registerAdapter
+from twisted.python import log
from zope.interface import implements
from twisted.internet.defer import Deferred
import tokens
return self.open(opentype)
def receiveChild(self, obj, ready_deferred=None):
+ """Unslicers for containers should accumulate their children's
+ ready_deferreds, then combine them in an AsyncAND when receiveClose()
+ happens, and return the AsyncAND as the ready_deferreds half of the
+ receiveClose() return value.
+ """
pass
def reportViolation(self, why):
return None
def explode(self, failure):
- """If something goes wrong in a Deferred callback, it may be too
- late to reject the token and to normal error handling. I haven't
- figured out how to do sensible error-handling in this situation.
- This method exists to make sure that the exception shows up
- *somewhere*. If this is called, it is also likely that a placeholder
- (probably a Deferred) will be left in the unserialized object about
- to be handed to the RootUnslicer.
+ """If something goes wrong in a Deferred callback, it may be too late
+ to reject the token and to normal error handling. I haven't figured
+ out how to do sensible error-handling in this situation. This method
+ exists to make sure that the exception shows up *somewhere*. If this
+ is called, it is also likely that a placeholder (probably a Deferred)
+ will be left in the unserialized object graph about to be handed to
+ the RootUnslicer.
"""
- print "KABOOM"
- print failure
+
+ # RootUnslicer pays attention to this .exploded attribute and refuses
+ # to deliver anything if it is set. But PBRootUnslicer ignores it.
+ # TODO: clean this up, and write some unit tests to trigger it (by
+ # violating schemas?)
+ log.msg("BaseUnslicer.explode: %s" % failure)
self.protocol.exploded = failure
class ScopedUnslicer(BaseUnslicer):
# -*- test-case-name: foolscap.test.test_banana -*-
from twisted.python import log
-from twisted.internet.defer import Deferred, DeferredList
+from twisted.internet.defer import Deferred
from foolscap.tokens import Violation, BananaError
from foolscap.slicer import BaseSlicer, BaseUnslicer
from foolscap.constraint import OpenerConstraint, Any, UnboundedSchema, IConstraint
+from foolscap.util import AsyncAND
class DictSlicer(BaseSlicer):
opentype = ('dict',)
def receiveClose(self):
ready_deferred = None
if self._ready_deferreds:
- ready_deferred = DeferredList(self._ready_deferreds)
+ ready_deferred = AsyncAND(self._ready_deferreds)
return self.d, ready_deferred
def describe(self):
# -*- test-case-name: foolscap.test.test_banana -*-
from twisted.python import log
-from twisted.internet.defer import Deferred, DeferredList
+from twisted.internet.defer import Deferred
from foolscap.tokens import Violation
from foolscap.slicer import BaseSlicer, BaseUnslicer
from foolscap.constraint import OpenerConstraint, Any, UnboundedSchema, IConstraint
+from foolscap.util import AsyncAND
class ListSlicer(BaseSlicer):
def receiveClose(self):
ready_deferred = None
if self._ready_deferreds:
- ready_deferred = DeferredList(self._ready_deferreds)
+ ready_deferred = AsyncAND(self._ready_deferreds)
return self.list, ready_deferred
def describe(self):
from foolscap.tokens import Violation
from foolscap.constraint import OpenerConstraint, UnboundedSchema, Any, \
IConstraint
+from foolscap.util import AsyncAND
class SetSlicer(ListSlicer):
opentype = ("set",)
def receiveClose(self):
ready_deferred = None
if self._ready_deferreds:
- ready_deferred = defer.DeferredList(self._ready_deferreds)
+ ready_deferred = AsyncAND(self._ready_deferreds)
return self.set, ready_deferred
class FrozenSetUnslicer(TupleUnslicer):
# -*- test-case-name: foolscap.test.test_banana -*-
-from twisted.internet.defer import Deferred, DeferredList
+from twisted.internet.defer import Deferred
from foolscap.tokens import Violation
from foolscap.slicer import BaseUnslicer
from foolscap.slicers.list import ListSlicer
from foolscap.constraint import OpenerConstraint, Any, UnboundedSchema, IConstraint
+from foolscap.util import AsyncAND
class TupleSlicer(ListSlicer):
def complete(self):
ready_deferred = None
if self._ready_deferreds:
- ready_deferred = DeferredList(self._ready_deferreds)
+ ready_deferred = AsyncAND(self._ready_deferreds)
t = tuple(self.list)
if self.debug:
print " not finished yet"
ready_deferred = None
if self._ready_deferreds:
- ready_deferred = DeferredList(self._ready_deferreds)
+ ready_deferred = AsyncAND(self._ready_deferreds)
return self.deferred, ready_deferred
# the list is already complete
return 24
def remote_fail(self):
raise ValueError("you asked me to fail")
+ def remote_fail_remotely(self, target):
+ return target.callRemote("fail")
+
def remote_failstring(self):
raise "string exceptions are annoying"
self.failUnless(f.check("string exceptions are annoying"),
"wrong exception type: %s" % f)
+ def testCopiedFailure(self):
+ # A calls B, who calls C. C fails. B gets a CopiedFailure and reports
+ # it back to A. What does A get?
+ rr, target = self.setupTarget(TargetWithoutInterfaces())
+ d = rr.callRemote("fail_remotely", target)
+ def _check(f):
+ # f should be a CopiedFailure
+ self.failUnless(isinstance(f, failure.Failure),
+ "Hey, we didn't fail: %s" % f)
+ self.failUnless(f.check(ValueError),
+ "wrong exception type: %s" % f)
+ self.failUnlessSubstring("you asked me to fail", f.value)
+ d.addBoth(_check)
+ return d
def testCall2(self):
# server end uses an interface this time, but not the client end
from twisted.trial import unittest
-from twisted.python import components, failure
+from twisted.python import components, failure, reflect
from foolscap.test.common import TargetMixin, HelperTarget
from foolscap import copyable, tokens
def _testFailure1_1(self, (f,)):
#print "CopiedFailure is:", f
#print f.__dict__
- self.failUnlessEqual(f.type, "exceptions.RuntimeError")
+ self.failUnlessEqual(reflect.qual(f.type), "exceptions.RuntimeError")
+ self.failUnless(f.check(RuntimeError))
self.failUnlessEqual(f.value, "message here")
self.failUnlessEqual(f.frames, [])
self.failUnlessEqual(f.tb, None)
self.failUnlessEqual(f.stack, [])
# there should be a traceback
- self.failUnless(f.traceback.find("raise RuntimeError") != -1)
+ self.failUnless(f.traceback.find("raise RuntimeError") != -1,
+ "no 'raise RuntimeError' in '%s'" % (f.traceback,))
def testFailure2(self):
self.callingBroker.unsafeTracebacks = False
def _testFailure2_1(self, (f,)):
#print "CopiedFailure is:", f
#print f.__dict__
- self.failUnlessEqual(f.type, "exceptions.RuntimeError")
+ self.failUnlessEqual(reflect.qual(f.type), "exceptions.RuntimeError")
+ self.failUnless(f.check(RuntimeError))
self.failUnlessEqual(f.value, "message here")
self.failUnlessEqual(f.frames, [])
self.failUnlessEqual(f.tb, None)
from zope.interface import implements
from twisted.trial import unittest
-from twisted.internet import defer
-from twisted.internet.error import ConnectionDone, ConnectionLost
+from twisted.internet import defer, protocol, reactor
+from twisted.internet.error import ConnectionDone, ConnectionLost, \
+ ConnectionRefusedError
+from twisted.python import failure
from foolscap import Tub, UnauthenticatedTub, RemoteInterface, Referenceable
-from foolscap.referenceable import RemoteReference
+from foolscap.referenceable import RemoteReference, SturdyRef
from foolscap.test.common import HelperTarget, RIHelper
from foolscap.eventual import flushEventualQueue
+from foolscap.tokens import BananaError, NegotiationError
crypto_available = False
try:
def remote_set(self, obj):
self.obj = obj
-class Gifts(unittest.TestCase):
- # Here we test the three-party introduction process as depicted in the
- # classic Granovetter diagram. Alice has a reference to Bob and another
- # one to Carol. Alice wants to give her Carol-reference to Bob, by
- # including it as the argument to a method she invokes on her
- # Bob-reference.
+class Base:
debug = False
def setUp(self):
- self.services = [GoodEnoughTub(), GoodEnoughTub(), GoodEnoughTub()]
- self.tubA, self.tubB, self.tubC = self.services
+ self.services = [GoodEnoughTub() for i in range(4)]
+ self.tubA, self.tubB, self.tubC, self.tubD = self.services
for s in self.services:
s.startService()
l = s.listenOn("tcp:0:interface=127.0.0.1")
def createCharacters(self):
self.alice = HelperTarget("alice")
self.bob = HelperTarget("bob")
- self.bob_url = self.tubB.registerReference(self.bob)
+ self.bob_url = self.tubB.registerReference(self.bob, "bob")
self.carol = HelperTarget("carol")
- self.carol_url = self.tubC.registerReference(self.carol)
+ self.carol_url = self.tubC.registerReference(self.carol, "carol")
# cindy is Carol's little sister. She doesn't have a phone, but
# Carol might talk about her anyway.
self.cindy = HelperTarget("cindy")
self.clarisse = HelperTarget("clarisse")
self.colette = HelperTarget("colette")
self.courtney = HelperTarget("courtney")
+ self.dave = HelperTarget("dave")
+ self.dave_url = self.tubD.registerReference(self.dave, "dave")
def createInitialReferences(self):
# we must start by giving Alice a reference to both Bob and Carol.
def _aliceGotCarol(acarol):
if self.debug: print "Alice got carol"
self.acarol = acarol # Alice's reference to Carol
+ d = self.tubB.getReference(self.dave_url)
+ return d
d.addCallback(_aliceGotCarol)
+ def _bobGotDave(bdave):
+ self.bdave = bdave
+ d.addCallback(_bobGotDave)
return d
def createMoreReferences(self):
# give Alice references to Carol's sisters
dl = []
- url = self.tubC.registerReference(self.charlene)
+ url = self.tubC.registerReference(self.charlene, "charlene")
d = self.tubA.getReference(url)
def _got_charlene(rref):
self.acharlene = rref
d.addCallback(_got_charlene)
dl.append(d)
- url = self.tubC.registerReference(self.christine)
+ url = self.tubC.registerReference(self.christine, "christine")
d = self.tubA.getReference(url)
def _got_christine(rref):
self.achristine = rref
d.addCallback(_got_christine)
dl.append(d)
- url = self.tubC.registerReference(self.clarisse)
+ url = self.tubC.registerReference(self.clarisse, "clarisse")
d = self.tubA.getReference(url)
def _got_clarisse(rref):
self.aclarisse = rref
d.addCallback(_got_clarisse)
dl.append(d)
- url = self.tubC.registerReference(self.colette)
+ url = self.tubC.registerReference(self.colette, "colette")
d = self.tubA.getReference(url)
def _got_colette(rref):
self.acolette = rref
d.addCallback(_got_colette)
dl.append(d)
- url = self.tubC.registerReference(self.courtney)
+ url = self.tubC.registerReference(self.courtney, "courtney")
d = self.tubA.getReference(url)
def _got_courtney(rref):
self.acourtney = rref
d.addCallback(_got_courtney)
dl.append(d)
+
return defer.DeferredList(dl)
+ def shouldFail(self, res, expected_failure, which, substring=None):
+ # attach this with:
+ # d = something()
+ # d.addBoth(self.shouldFail, IndexError, "something")
+ # the 'which' string helps to identify which call to shouldFail was
+ # triggered, since certain versions of Twisted don't display this
+ # very well.
+
+ if isinstance(res, failure.Failure):
+ res.trap(expected_failure)
+ if substring:
+ self.failUnless(substring in str(res),
+ "substring '%s' not in '%s'"
+ % (substring, str(res)))
+ else:
+ self.fail("%s was supposed to raise %s, not get '%s'" %
+ (which, expected_failure, res))
+
+class Gifts(Base, unittest.TestCase):
+ # Here we test the three-party introduction process as depicted in the
+ # classic Granovetter diagram. Alice has a reference to Bob and another
+ # one to Carol. Alice wants to give her Carol-reference to Bob, by
+ # including it as the argument to a method she invokes on her
+ # Bob-reference.
+
def testGift(self):
#defer.setDebugging(True)
self.createCharacters()
d.addCallback(_carolCalled)
return d
-
def testImplicitGift(self):
# in this test, Carol was registered in her Tub (using
# registerReference), but Cindy was not. Alice is given a reference
d.addCallback(_carolAndCindyCalled)
return d
+ # test gifts in return values too
+
+ def testReturn(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ def _introduce(res):
+ self.bob.obj = self.bdave
+ return self.abob.callRemote("get")
+ d.addCallback(_introduce)
+ def _check(adave):
+ # this ought to be a RemoteReference to dave, usable by alice
+ self.failUnless(isinstance(adave, RemoteReference))
+ return adave.callRemote("set", 12)
+ d.addCallback(_check)
+ def _check2(res):
+ self.failUnlessEqual(self.dave.obj, 12)
+ d.addCallback(_check2)
+ return d
+
+ def testReturnInContainer(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ def _introduce(res):
+ self.bob.obj = {"foo": [(set([self.bdave]),)]}
+ return self.abob.callRemote("get")
+ d.addCallback(_introduce)
+ def _check(obj):
+ adave = list(obj["foo"][0][0])[0]
+ # this ought to be a RemoteReference to dave, usable by alice
+ self.failUnless(isinstance(adave, RemoteReference))
+ return adave.callRemote("set", 12)
+ d.addCallback(_check)
+ def _check2(res):
+ self.failUnlessEqual(self.dave.obj, 12)
+ d.addCallback(_check2)
+ return d
def testOrdering(self):
self.createCharacters()
def create_constrained_characters(self):
self.alice = HelperTarget("alice")
self.bob = ConstrainedHelper("bob")
- self.bob_url = self.tubB.registerReference(self.bob)
+ self.bob_url = self.tubB.registerReference(self.bob, "bob")
self.carol = HelperTarget("carol")
- self.carol_url = self.tubC.registerReference(self.carol)
+ self.carol_url = self.tubC.registerReference(self.carol, "carol")
+ self.dave = HelperTarget("dave")
+ self.dave_url = self.tubD.registerReference(self.dave, "dave")
def test_constraint(self):
self.create_constrained_characters()
d.addCallback(_checkBob)
return d
+
+
# this was used to check that alice's reference to carol (self.acarol) appeared in
# alice's gift table at the right time, to make sure that the
# RemoteReference is kept alive while the gift is in transit. The whole
d.addCallback(lambda res: d1)
return d
+
+class Bad(Base, unittest.TestCase):
+
+ # if the recipient cannot claim their gift, the caller should see an
+ # errback.
+
+ def setUp(self):
+ if not crypto_available:
+ raise unittest.SkipTest("crypto not available")
+ Base.setUp(self)
+
+ def test_swissnum(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ d.addCallback(lambda res: self.tubA.getReference(self.dave_url))
+ def _introduce(adave):
+ # now break the gift to ensure that Bob is unable to claim it.
+ # The first way to do this is to simply mangle the swissnum,
+ # which will result in a failure in remote_getReferenceByName.
+ # NOTE: this will have to change when we modify the way gifts are
+ # referenced, since tracker.url is scheduled to go away.
+ r = SturdyRef(adave.tracker.url)
+ r.name += ".MANGLED"
+ adave.tracker.url = r.getURL()
+ return self.acarol.callRemote("set", adave)
+ d.addCallback(_introduce)
+ d.addBoth(self.shouldFail, KeyError, "Bad.test_swissnum")
+ # make sure we can still talk to Carol, though
+ d.addCallback(lambda res: self.acarol.callRemote("set", 14))
+ d.addCallback(lambda res: self.failUnlessEqual(self.carol.obj, 14))
+ return d
+ test_swissnum.timeout = 10
+
+ def test_tubid(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ d.addCallback(lambda res: self.tubA.getReference(self.dave_url))
+ def _introduce(adave):
+ # The second way is to mangle the tubid, which will result in a
+ # failure during negotiation. NOTE: this will have to change when
+ # we modify the way gifts are referenced, since tracker.url is
+ # scheduled to go away.
+ r = SturdyRef(adave.tracker.url)
+ r.tubID += ".MANGLED"
+ adave.tracker.url = r.getURL()
+ return self.acarol.callRemote("set", adave)
+ d.addCallback(_introduce)
+ d.addBoth(self.shouldFail, BananaError, "Bad.test_tubid",
+ "unknown TubID")
+ return d
+ test_tubid.timeout = 10
+
+ def test_location(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ d.addCallback(lambda res: self.tubA.getReference(self.dave_url))
+ def _introduce(adave):
+ # The third way is to mangle the location hints, which will
+ # result in a failure during negotiation as it attempts to
+ # establish a TCP connection.
+ r = SturdyRef(adave.tracker.url)
+ # highly unlikely that there's anything listening on this port
+ r.locationHints = ["127.0.0.47:1"]
+ adave.tracker.url = r.getURL()
+ return self.acarol.callRemote("set", adave)
+ d.addCallback(_introduce)
+ d.addBoth(self.shouldFail, ConnectionRefusedError, "Bad.test_location")
+ return d
+ test_location.timeout = 10
+
+ def test_hang(self):
+ f = protocol.Factory()
+ f.protocol = protocol.Protocol # ignores all input
+ p = reactor.listenTCP(0, f, interface="127.0.0.1")
+ self.createCharacters()
+ d = self.createInitialReferences()
+ d.addCallback(lambda res: self.tubA.getReference(self.dave_url))
+ def _introduce(adave):
+ # The next form of mangling is to connect to a port which never
+ # responds, which could happen if a firewall were silently
+ # dropping the TCP packets. We can't accurately simulate this
+ # case, but we can connect to a port which accepts the connection
+ # and then stays silent. This should trigger the overall
+ # connection timeout.
+ r = SturdyRef(adave.tracker.url)
+ r.locationHints = ["127.0.0.1:%d" % p.getHost().port]
+ adave.tracker.url = r.getURL()
+ self.tubD.options['connect_timeout'] = 2
+ return self.acarol.callRemote("set", adave)
+ d.addCallback(_introduce)
+ d.addBoth(self.shouldFail, NegotiationError, "Bad.test_hang",
+ "no connection established within client timeout")
+ def _stop_listening(res):
+ d1 = p.stopListening()
+ def _done_listening(x):
+ return res
+ d1.addCallback(_done_listening)
+ return d1
+ d.addBoth(_stop_listening)
+ return d
+ test_hang.timeout = 10
+
+
+ def testReturn_swissnum(self):
+ self.createCharacters()
+ d = self.createInitialReferences()
+ def _introduce(res):
+ # now break the gift to ensure that Alice is unable to claim it.
+ # The first way to do this is to simply mangle the swissnum,
+ # which will result in a failure in remote_getReferenceByName.
+ # NOTE: this will have to change when we modify the way gifts are
+ # referenced, since tracker.url is scheduled to go away.
+ r = SturdyRef(self.bdave.tracker.url)
+ r.name += ".MANGLED"
+ self.bdave.tracker.url = r.getURL()
+ self.bob.obj = self.bdave
+ return self.abob.callRemote("get")
+ d.addCallback(_introduce)
+ d.addBoth(self.shouldFail, KeyError, "Bad.testReturn_swissnum")
+ # make sure we can still talk to Bob, though
+ d.addCallback(lambda res: self.abob.callRemote("set", 14))
+ d.addCallback(lambda res: self.failUnlessEqual(self.bob.obj, 14))
+ return d
+ testReturn_swissnum.timeout = 10
for i in range(len(s)):
line = s[i]
#print line
- if ("test/test_interfaces.py" in line
+ if ("test_interfaces.py" in line
and i+1 < len(s)
and "rr.callRemote" in s[i+1]):
return # all good
from twisted.python import log
log.startLogging(sys.stderr)
-from twisted.python import failure, log
+from twisted.python import failure, log, reflect
from twisted.internet import defer
from twisted.trial import unittest
req = TestRequest(12)
self.broker.addRequest(req)
u = self.newUnslicer()
+ u.start(0)
u.checkToken(INT, 0)
u.receiveChild(12) # causes broker.getRequest
u.checkToken(STRING, 8)
req.setConstraint(IConstraint(str))
self.broker.addRequest(req)
u = self.newUnslicer()
+ u.start(0)
u.checkToken(INT, 0)
u.receiveChild(12) # causes broker.getRequest
u.checkToken(STRING, 15)
return d
testBadMethod2.timeout = 5
def _testBadMethod2_eb(self, f):
- self.failUnlessEqual(f.type, 'exceptions.AttributeError')
+ self.failUnlessEqual(reflect.qual(f.type), 'exceptions.AttributeError')
self.failUnlessSubstring("TargetWithoutInterfaces", f.value)
self.failUnlessSubstring(" has no attribute 'remote_missing'", f.value)
--- /dev/null
+
+from zope.interface import implements
+from twisted.trial import unittest
+from twisted.python import failure
+from foolscap.ipb import IRemoteReference
+from foolscap.test.common import HelperTarget, Target
+from foolscap.eventual import flushEventualQueue
+
+class Remote:
+ implements(IRemoteReference)
+ pass
+
+
+class LocalReference(unittest.TestCase):
+ def tearDown(self):
+ return flushEventualQueue()
+
+ def ignored(self):
+ pass
+
+ def test_remoteReference(self):
+ r = Remote()
+ rref = IRemoteReference(r)
+ self.failUnlessIdentical(r, rref)
+
+ def test_callRemote(self):
+ t = HelperTarget()
+ t.obj = None
+ rref = IRemoteReference(t)
+ marker = rref.notifyOnDisconnect(self.ignored, "args", kwargs="foo")
+ rref.dontNotifyOnDisconnect(marker)
+ d = rref.callRemote("set", 12)
+ # the callRemote should be put behind an eventual-send
+ self.failUnlessEqual(t.obj, None)
+ def _check(res):
+ self.failUnlessEqual(t.obj, 12)
+ self.failUnlessEqual(res, True)
+ d.addCallback(_check)
+ return d
+
+ def test_callRemoteOnly(self):
+ t = HelperTarget()
+ t.obj = None
+ rref = IRemoteReference(t)
+ rc = rref.callRemoteOnly("set", 12)
+ self.failUnlessEqual(rc, None)
+
+ def shouldFail(self, res, expected_failure, which, substring=None):
+ # attach this with:
+ # d = something()
+ # d.addBoth(self.shouldFail, IndexError, "something")
+ # the 'which' string helps to identify which call to shouldFail was
+ # triggered, since certain versions of Twisted don't display this
+ # very well.
+
+ if isinstance(res, failure.Failure):
+ res.trap(expected_failure)
+ if substring:
+ self.failUnless(substring in str(res),
+ "substring '%s' not in '%s'"
+ % (substring, str(res)))
+ else:
+ self.fail("%s was supposed to raise %s, not get '%s'" %
+ (which, expected_failure, res))
+
+ def test_fail(self):
+ t = Target()
+ d = IRemoteReference(t).callRemote("fail")
+ d.addBoth(self.shouldFail, ValueError, "test_fail",
+ "you asked me to fail")
+ return d
except ImportError:
pass
-from foolscap import Tub, UnauthenticatedTub
+from foolscap import Tub, UnauthenticatedTub, SturdyRef, Referenceable
from foolscap.referenceable import RemoteReference
from foolscap.eventual import eventually, flushEventualQueue
from foolscap.test.common import HelperTarget, TargetMixin
eventually(t1.startService)
return d
+
+class NameLookup(TargetMixin, unittest.TestCase):
+
+ # test registerNameLookupHandler
+
+ def setUp(self):
+ TargetMixin.setUp(self)
+ self.tubA, self.tubB = [GoodEnoughTub(), GoodEnoughTub()]
+ self.services = [self.tubA, self.tubB]
+ self.tubA.startService()
+ self.tubB.startService()
+ l = self.tubB.listenOn("tcp:0:interface=127.0.0.1")
+ self.tubB.setLocation("127.0.0.1:%d" % l.getPortnum())
+ self.url_on_b = self.tubB.registerReference(Referenceable())
+ self.lookups = []
+ self.lookups2 = []
+ self.names = {}
+ self.names2 = {}
+
+ def tearDown(self):
+ d = TargetMixin.tearDown(self)
+ def _more(res):
+ return defer.DeferredList([s.stopService() for s in self.services])
+ d.addCallback(_more)
+ d.addCallback(flushEventualQueue)
+ return d
+
+ def lookup(self, name):
+ self.lookups.append(name)
+ return self.names.get(name, None)
+
+ def lookup2(self, name):
+ self.lookups2.append(name)
+ return self.names2.get(name, None)
+
+ def testNameLookup(self):
+ t1 = HelperTarget()
+ t2 = HelperTarget()
+ self.names["foo"] = t1
+ self.names2["bar"] = t2
+ self.names2["baz"] = t2
+ self.tubB.registerNameLookupHandler(self.lookup)
+ self.tubB.registerNameLookupHandler(self.lookup2)
+ # hack up a new furl pointing at the same tub but with a name that
+ # hasn't been registered.
+ s = SturdyRef(self.url_on_b)
+ s.name = "foo"
+
+ d = self.tubA.getReference(s)
+
+ def _check(res):
+ self.failUnless(isinstance(res, RemoteReference))
+ self.failUnlessEqual(self.lookups, ["foo"])
+ # the first lookup should short-circuit the process
+ self.failUnlessEqual(self.lookups2, [])
+ self.lookups = []; self.lookups2 = []
+ s.name = "bar"
+ return self.tubA.getReference(s)
+ d.addCallback(_check)
+
+ def _check2(res):
+ self.failUnless(isinstance(res, RemoteReference))
+            # if the first handler returns None, the second handler is asked
+ self.failUnlessEqual(self.lookups, ["bar"])
+ self.failUnlessEqual(self.lookups2, ["bar"])
+ self.lookups = []; self.lookups2 = []
+ # make sure that loopbacks use this too
+ return self.tubB.getReference(s)
+ d.addCallback(_check2)
+
+ def _check3(res):
+ self.failUnless(isinstance(res, RemoteReference))
+ self.failUnlessEqual(self.lookups, ["bar"])
+ self.failUnlessEqual(self.lookups2, ["bar"])
+ self.lookups = []; self.lookups2 = []
+ # and make sure we can de-register handlers
+ self.tubB.unregisterNameLookupHandler(self.lookup)
+ s.name = "baz"
+ return self.tubA.getReference(s)
+ d.addCallback(_check3)
+
+ def _check4(res):
+ self.failUnless(isinstance(res, RemoteReference))
+ self.failUnlessEqual(self.lookups, [])
+ self.failUnlessEqual(self.lookups2, ["baz"])
+ self.lookups = []; self.lookups2 = []
+ d.addCallback(_check4)
+
+ return d
+
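(As the test demonstrates, handlers are consulted in registration order once
the static registry misses, the first handler to return an object
short-circuits the rest, and an unregistered handler drops out of the chain.
A simplified sketch of that dispatch logic; the attribute and method bodies
here are assumptions for illustration, not foolscap's real internals:)

    class SketchTub:  # illustration only, not the real Tub
        def __init__(self):
            self.nameToReference = {}     # registerReference() entries
            self.nameLookupHandlers = []  # callables: name -> obj or None

        def registerNameLookupHandler(self, handler):
            self.nameLookupHandlers.append(handler)

        def unregisterNameLookupHandler(self, handler):
            self.nameLookupHandlers.remove(handler)

        def getReferenceForName(self, name):
            # static registrations win; otherwise ask each handler in
            # turn, stopping at the first one that recognizes the name
            if name in self.nameToReference:
                return self.nameToReference[name]
            for handler in self.nameLookupHandlers:
                obj = handler(name)
                if obj is not None:
                    return obj
            raise KeyError("unable to find reference for '%s'" % name)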
--- /dev/null
+
+from twisted.trial import unittest
+from twisted.internet import defer
+from twisted.python import failure
+from foolscap import util, eventual
+
+
+class AsyncAND(unittest.TestCase):
+ def setUp(self):
+ self.fired = False
+ self.failed = False
+
+ def callback(self, res):
+ self.fired = True
+ def errback(self, res):
+ self.failed = True
+
+ def attach(self, d):
+ d.addCallbacks(self.callback, self.errback)
+ return d
+
+ def shouldNotFire(self, ignored=None):
+ self.failIf(self.fired)
+ self.failIf(self.failed)
+ def shouldFire(self, ignored=None):
+ self.failUnless(self.fired)
+ self.failIf(self.failed)
+ def shouldFail(self, ignored=None):
+ self.failUnless(self.failed)
+ self.failIf(self.fired)
+
+ def tearDown(self):
+ return eventual.flushEventualQueue()
+
+ def test_empty(self):
+ self.attach(util.AsyncAND([]))
+ self.shouldFire()
+
+ def test_simple(self):
+ d1 = eventual.fireEventually(None)
+ a = util.AsyncAND([d1])
+ self.attach(a)
+ a.addBoth(self.shouldFire)
+ return a
+
+ def test_two(self):
+ d1 = defer.Deferred()
+ d2 = defer.Deferred()
+ self.attach(util.AsyncAND([d1, d2]))
+ self.shouldNotFire()
+ d1.callback(1)
+ self.shouldNotFire()
+ d2.callback(2)
+ self.shouldFire()
+
+ def test_one_failure_1(self):
+ d1 = defer.Deferred()
+ d2 = defer.Deferred()
+ self.attach(util.AsyncAND([d1, d2]))
+ self.shouldNotFire()
+ d1.callback(1)
+ self.shouldNotFire()
+ d2.errback(RuntimeError())
+ self.shouldFail()
+
+ def test_one_failure_2(self):
+ d1 = defer.Deferred()
+ d2 = defer.Deferred()
+ self.attach(util.AsyncAND([d1, d2]))
+ self.shouldNotFire()
+ d1.errback(RuntimeError())
+ self.shouldFail()
+ d2.callback(1)
+ self.shouldFail()
+
+ def test_two_failure(self):
+ d1 = defer.Deferred()
+ d2 = defer.Deferred()
+ self.attach(util.AsyncAND([d1, d2]))
+ def _should_fire(res):
+ self.failIf(isinstance(res, failure.Failure))
+ def _should_fail(f):
+ self.failUnless(isinstance(f, failure.Failure))
+ d1.addBoth(_should_fire)
+ d2.addBoth(_should_fail)
+ self.shouldNotFire()
+ d1.errback(RuntimeError())
+ self.shouldFail()
+ d2.errback(RuntimeError())
+ self.shouldFail()
+
+
--- /dev/null
+
+from twisted.internet import defer
+
+
+class AsyncAND(defer.Deferred):
+ """Like DeferredList, but results are discarded and failures handled
+ in a more convenient fashion.
+
+ Create me with a list of Deferreds. I will fire my callback (with None)
+ if and when all of my component Deferreds fire successfully. I will fire
+ my errback when and if any of my component Deferreds errbacks, in which
+ case I will absorb the failure. If a second Deferred errbacks, I will not
+ absorb that failure.
+
+ This means that you can put a bunch of Deferreds together into an
+ AsyncAND and then forget about them. If all succeed, the AsyncAND will
+ fire. If one fails, that Failure will be propagated to the AsyncAND. If
+ multiple ones fail, the first Failure will go to the AsyncAND and the
+ rest will be left unhandled (and therefore logged).
+ """
+
+ def __init__(self, deferredList):
+ defer.Deferred.__init__(self)
+
+ if not deferredList:
+ self.callback(None)
+ return
+
+ self.remaining = len(deferredList)
+ self._fired = False
+
+ for d in deferredList:
+ d.addCallbacks(self._cbDeferred, self._cbDeferred,
+ callbackArgs=(True,), errbackArgs=(False,))
+
+ def _cbDeferred(self, result, succeeded):
+ self.remaining -= 1
+ if succeeded:
+ if not self._fired and self.remaining == 0:
+ # the last input has fired. We fire.
+ self._fired = True
+ self.callback(None)
+ return
+ else:
+ if not self._fired:
+ # the first Failure is carried into our output
+ self._fired = True
+ self.errback(result)
+ return None
+ else:
+ # second and later Failures are not absorbed
+ return result
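
(A short usage sketch for the class above: gather several independent
Deferreds, fire once when all of them have succeeded, and let only the first
failure propagate. The names below are illustrative:)

    from twisted.internet import defer
    from foolscap.util import AsyncAND

    d1 = defer.Deferred()
    d2 = defer.Deferred()

    def _all_ready(res):
        print "all inputs fired"   # res is None
    def _gave_up(f):
        # only the first Failure arrives here; later ones are left
        # unhandled, and therefore logged
        print "gave up:", f

    a = AsyncAND([d1, d2])
    a.addCallbacks(_all_ready, _gave_up)

    d1.callback(1)   # not enough yet
    d2.callback(2)   # now the AsyncAND fires with None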
+foolscap (0.1.5) unstable; urgency=low
+
+ * new release
+
+ -- Brian Warner <warner@lothar.com> Tue, 07 Aug 2007 17:47:53 -0700
+
foolscap (0.1.4) unstable; urgency=low
* new release