Updated dataflow semantics for RTT

Hi Sylvain,

thanks for your quick reply!

On Mon, Sep 21, 2015 at 8:09 PM, Sylvain Joyeux <sylvain [dot] joyeux [..] ...>
wrote:

> Quick comment -- I don't have a lot of time on my hands right now:
>
> To me, it looks like you have decided that push is shared by default,
> in your description the concept of pull/push and private/shared. Am I
> wrong ? Is "push/shared" different than "push with single buffer by
> default" ? I would not see the rationale for that, if a user wants
> "private", give him "private", especially given that there are
> shortcomings ...
>
> Using "shared" will make the system fail if, later, it gets a
> connection request for a different policy than the original policy. To
> me, this spells that using shared MUST be a conscious choice of the
> system designer in all cases OR that there should be a way for RTT to
> degrade gracefully from shared to private (use shared as an
> optimization that is handled purely internally unless the user
> requested it explicitly).
>
> In general, the connection code will have to return an error if trying
> incompatible mixes of policies.
>
> Sylvain
>

That's exactly what we have in mind and how it is hopefully implemented.
push/pull and private/shared are orthogonal concepts. The default would be
push/private. Connections are never shared unless explicitly specified by
the user. The table in section 5.1 of our design document
<https://docs.google.com/document/d/1zDnPPz4SiCVvfEFxYFUZBcVbXKj33o4KaotpCtXM4E0/pub#h.yfd89u4uuk9u>
is probably the easiest way to get a quick overview of the possible
combinations.

These are the shortest possible definitions I can come up with:

push: Every input port has a single input buffer and all connected writers
write into that buffer.
pull: Every output port has one buffer per connection and readers will poll
all connected writers for new data.

private: Every connection buffer belongs to exactly one input port, so all
read operations are independent from each other.
shared: Multiple input ports can read from the same buffer, so one read
operation can influence the outcome of another.
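
To make the private/shared distinction concrete, here is a minimal,
single-threaded sketch for the push case. All names (Buffer, InputPort,
OutputPort, connectPrivate, connectShared) are made up for illustration and
are not RTT classes; pull connections and locking are left out.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-ins for illustration only, not the RTT API.
using Buffer = std::deque<int>;

struct InputPort {
    std::shared_ptr<Buffer> buffer;  // the single input buffer of this port

    // Consume the oldest sample, if any.
    std::optional<int> read() {
        if (!buffer || buffer->empty()) return std::nullopt;
        int v = buffer->front();
        buffer->pop_front();
        return v;
    }
};

struct OutputPort {
    std::vector<std::shared_ptr<Buffer>> targets;

    // Push: the writer copies the sample into every connected buffer.
    void write(int v) {
        for (auto& b : targets) b->push_back(v);
    }
};

// private: the buffer belongs to exactly one input port.
inline void connectPrivate(OutputPort& out, InputPort& in) {
    if (!in.buffer) in.buffer = std::make_shared<Buffer>();
    out.targets.push_back(in.buffer);
}

// shared: several input ports read from one and the same buffer.
inline void connectShared(OutputPort& out, std::vector<InputPort*> ins) {
    auto shared = std::make_shared<Buffer>();
    for (InputPort* in : ins) in->buffer = shared;
    out.targets.push_back(shared);
}
```

With connectPrivate, two readers each get their own copy of every sample.
With connectShared, a sample read by one port is gone for the other, which
is the "one read can influence another" effect described above.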

The combination shared/pull is less intuitive, but we interpreted it as a
single local output buffer from which multiple polling input ports read
(vs. one buffer per connection in the pull/private case). This connection
type would only be useful for shared remote connections, where multiple
remote input ports are supposed to consume samples from a single output
port with a shared local buffer. For local connections, there is no reason
not to use a shared/push connection instead.

Clearly there are limitations for the possible combinations of different
connection types for a single port, primarily that an input port can have
either only pull or only push connections and that all push connections
have to be consistent in the buffer type, size and locking policy. For
illegal combinations the connectTo(...) request (or whatever API is used to
establish new connections) will return an error. Users have to be more
careful and aware of the implications of ConnPolicy flags, but we think the
default private/push should work for most applications and not break
existing deployments, unless they already use a mixture of connection
policies for the same port for whatever reason.
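
As a rough illustration of the consistency check, assuming a simplified
policy with only the fields mentioned above: Policy, BufferType, LockPolicy
and InputPort::connect are hypothetical names for this sketch, not the
actual ConnPolicy or connectTo(...) signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified policy; field names are assumptions.
enum class BufferType { Data, Buffer };
enum class LockPolicy { Unsync, Locked, LockFree };

struct Policy {
    bool pull = false;                       // default: push
    BufferType type = BufferType::Data;
    std::size_t size = 0;
    LockPolicy lock = LockPolicy::LockFree;
};

class InputPort {
public:
    // Returns false for an illegal combination instead of connecting.
    bool connect(const Policy& p) {
        for (const Policy& q : accepted_) {
            // An input port has either only pull or only push connections.
            if (q.pull != p.pull) return false;
            // All push connections share one input buffer, so buffer type,
            // size and locking policy have to match.
            if (!p.pull &&
                (q.type != p.type || q.size != p.size || q.lock != p.lock)) {
                return false;
            }
        }
        accepted_.push_back(p);
        return true;
    }

    std::size_t connections() const { return accepted_.size(); }

private:
    std::vector<Policy> accepted_;
};
```

The first connection fixes the buffer parameters of the port; every later
request either matches them or is rejected with an error.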

Johannes

Updated dataflow semantics for RTT

> Ahh, I just realized, what is odd about your example. As a design principle,
> a component does not know anything
> about the connections that are attached to the port. So, if you are a robot
> controller and want to always use
> the latest sample, you MUST use R.readNewest(x).
That's the bug. Nobody SHOULD use readNewest, which has been
introduced (by me ...) as a quickfix on top of issues we had.

Connection policy is a system design concern. If one wants to build a
system where the controller should process 10 samples queued in a
buffer, then so be it. Except that readNewest was really made
necessary by the bug Johannes describes.

Sylvain

Updated dataflow semantics for RTT

> If you are already changing the Connection implementation, I would
> recommend putting a new thread in every connection in the remote case. We
> experienced the issue that bad Wifi connections slowed our systems down,
> as they would hang on the write call.

This should IMO be done by the middleware ... in the CORBA case we
should have a way to spawn separate corba dispatchers (e.g. based on
the policy's name field) to isolate the domains. Moreover, there's
already been work done to use CORBA's "oneway" call to avoid this
problem completely (but I don't know what's the status of that)

Sylvain

Updated dataflow semantics for RTT

> On Sep 22, 2015, at 10:09, Sylvain Joyeux <sylvain [dot] joyeux [..] ...> wrote:
>
>> If you are already changing the Connection implementation, I would
>> recommend putting a new thread in every connection in the remote case. We
>> experienced the issue that bad Wifi connections slowed our systems down,
>> as they would hang on the write call.
>
> This should IMO be done by the middleware ... in the CORBA case we
> should have a way to spawn separate corba dispatchers (e.g. based on
> the policy's name field) to isolate the domains. Moreover, there's
> already been work done to use CORBA's "oneway" call to avoid this
> problem completely (but I don't know what's the status of that)

+1 for transport middleware.

We’ve been using the oneway implementation for some months now. It’s good.
S

Updated dataflow semantics for RTT

Am 22.09.2015 um 16:09 schrieb Sylvain Joyeux:
>> If you are already changing the Connection implementation, I would
>> recommend putting a new thread in every connection in the remote case. We
>> experienced the issue that bad Wifi connections slowed our systems down,
>> as they would hang on the write call.
> This should IMO be done by the middleware ... in the CORBA case we
> should have a way to spawn separate corba dispatchers (e.g. based on
> the policy's name field) to isolate the domains. Moreover, there's
> already been work done to use CORBA's "oneway" call to avoid this
> problem completely (but I don't know what's the status of that)
>
> Sylvain
This is basically a design decision: either run every remote connection in
its own thread, to be sure that it does not block your task, or rely on the
connection being properly implemented.
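
A minimal sketch of the first option, using only the C++ standard library.
AsyncChannel and its send callback are hypothetical names, not RTT or CORBA
code; the queue is unbounded here, which a real system would have to limit.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical per-connection sender thread, for illustration only.
class AsyncChannel {
public:
    // `send` stands in for the possibly blocking remote write.
    explicit AsyncChannel(std::function<void(int)> send)
        : send_(std::move(send)), worker_([this] { run(); }) {}

    // Sends whatever is still queued, then stops the worker.
    ~AsyncChannel() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Called from the component's thread; only enqueues, never waits on
    // the transport.
    void write(int sample) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(sample);
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stop requested, nothing left
            int sample = queue_.front();
            queue_.pop_front();
            lock.unlock();
            send_(sample);               // may block; mutex is not held
            lock.lock();
        }
    }

    std::function<void(int)> send_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<int> queue_;
    bool stop_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

The task that calls write() only pays for a mutex and a queue insertion; a
slow link delays the worker thread instead.
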
Janosch
